Full details on Data360 Analyze and Excel may be found within the following parent article :
Getting Started with Data360 Analyze and Excel
Calculations and Aggregations
Calculations and Aggregations | In Data360 Analyze |
IF statements |
IF statements in Excel are very powerful and you can use them in Data360 Analyze as well just with a slightly different statement structure. For example, I have a column 'color' and wanted to create a new column 'category' that would take the value "A" if the color was "Red" and "B" otherwise. In Excel, my IF statement would look something like IF (A3="Red", "A","B") as shown in the image below: In Data360 Analyze, there are a number of ways this can be implemented. The easiest is to use a Calculate Fields node configured like this: An alternative is to use a Transform node. The equivalient logic would be structured like this: out1.category = "A" if in1.color == "Red" else "B" as shown in the image below:
Note that correct indentation is essential when using Python scripting. As in Excel, I can have nested IF statements in Data360 Analyze as well. Following from the example above, in Excel, I might have the following nested IF statement: IF(A2="Red","A",IF(A2="Blue","B","C")) In Data360 Analyze, this would look like this: Python cheat sheet: Python Cheat Sheet |
Calculate Sum, Average, Minimum and Maximum Value |
You can use the Statistics node to perform a quick statistical analysis of numeric input data, including the sum, min, max, average, count, null count, standard deviation and variance functions. For example, I have sales data from all my distributors:
I can drag-and-drop a Statistics node onto my canvas and connect to my data source. In the 'FieldList' property, I can use the context menu to select the field that I want to run my analysis against, which is the sales amount field 'sale amount ($)'. In 'GroupBy' property, I specify 'distributor' as the field to group on, ensuring that the analysis produces one line per unique distributor. Note that when the context menu is used to insert the field name it uses the fields['my field name'] syntax. The Optional properties can be configured to output the required statistics: The Statistics node then produces the measures including the sum, average, minimum and maximum values of the sale amounts, etc:
Recommended Example: C:\Program Files\Data3SixtyAnalyze\docs\samples\nodes\Aggregating data.lna |
Cumulative Sum |
To calculate a cumulative sum, you can use the Sum operator an Aggregate node. For example, I have the total amount in sales by week and I want to view a year-to-date sales total. In an Aggregate node, the Sum operator can be applied to the 'weekly amount ($)' field using the grid editor. I'm not grouping by any field here so I can simply leave the 'GroupBy' property blank: As we want to output the cumulative sum for every record processed, the 'ImplicitWriteEvaluation' property is set to 'Per Record'. This produces the new column 'CumulativeSum', which is the cumulative sum of the weekly amount: However, the Aggregate node does not, by default, output any of the input fields. To change this requires a small modification to the underlying code used by the node. Switching to the 'Advanced' tab in the 'Operations' property displays the code generated by the grid editor: Adding this line to the ConfigureFields script instructs the node to output the input fields: outputs[0] += inputs[0] (alternatively you could use out1 += in1 instead) When the node is run this is the output: Note when the auto-generated code is changed in the Advanced tab the Grid editor can no longer be used. After initial configuration of the node using the grid editor, you may want to make a copy of the node before experimenting with changes to the code. Recommended Example: C:\Program Files\Data3SixtyAnalyze\docs\samples\nodes\Aggregating Data.lna |
Comments
0 comments
Please sign in to leave a comment.