Multi level pivot tables

Comments

6 comments

  • Avatar
    Adrian Williams

    Hi Leigh,

    can you clarify what you are trying to achieve, perhaps with a short example of the before and after data?

     

    In the interim, here is an example of using the Aggregate node to total some data using multiple grouping fields ...

     

    The following example data can be copied into a Create Data node (use Ctrl-A to select the default data and then Ctrl-V to paste it in):

    Name,Gender,Retired,Fruit:int
    Ann,F,N,2
    Bob,M,Y,4
    Chuck,M,N,1
    Dave,M,Y,2
    Emily,F,Y,1
    Freda,F,N,3
    Gerry,M,N,0
    Helen,F,Y,2
    Irene,F,N,1
    Jack,M,Y,3

    The node's output data should look like the following:

    An Aggregate node can then be added to the canvas and connected to the output of the Create Data node.

    The Aggregate node can be configured to group the data by the 'Gender' and 'Retired' fields and calculate the sum of the 'Fruit' field for each sub-group:

     

    The Aggregate node's output data will then be the following

    If required, you can apply other aggregation methods (e.g. count, average, etc) by adding further operations to the Operations grid.

     

    0
    Comment actions Permalink
  • Avatar
    Leigh Sullivan

    Hi. Thanks for that, the aggregate function gets me part of the way there, but what if I want to move fields from rows to columns and then apply and aggregate function.

    For example say I had the following dataset -

    Now I want to take the 'Year Picked' column and split it out into a column for each year. So it would look like this -

    So when I say multi level, I mean (as marked in red) having more than one column in the rows grouping when creating a pivot table.

     

    Thanks again,

    Leigh

    0
    Comment actions Permalink
  • Avatar
    Adrian Williams

    Hi Leigh,

    Thanks for the clarification. So in your case the data is only being reshaped in the pivot operation without applying an aggregation operation at that point. Yes, you can do this in Dataverse using the 'Pivot Data to Names' node.

    The attached data flow shows an example of how this can be done.

    However, there are a few points to note.

    Regarding the 'Names' field containing the values that will become the new field names, it is advisable for this to be a string data type field. In your sample data the 'Year Picked' field may be an integer which will cause problems. To overcome this the Transform node is used to create a new field where the values are strings i.e. 2009 becomes 'YR_2009'. This approach also overcomes an issue when there are missing values in the Names field as a null value will become 'YR_NULL' (You cannot have null field metadata in Dataverse so this would generate an error when pivoting the data. Also note, the field metadata cannot contain Unicode characters.

    By default, the Pivot Data to Names node expects a regular data set - it will generate an error if there are extra or missing names in the pivot groups. To overcome this you can change the 'Nameset' property from 'Exact' to 'Union' - which will fill the extra cells with NULL values, or to 'Intersection' - which will only output the fields that have complete cases.

    So in this case, the 'NamesField' property is configured to the (string data type) field containing the new year field names, the 'DataField' property specifies the measure values to be populated under the new fields, and the 'GroupBy' property identifies the dimension fields on which to group the data.

    In the example the 'NameSet' property is set to 'Union' to show the effect when there missing values in the Names field. You could alternatively have used  a Split or Filter node earlier in the data flow to remove records that contained null values.

    The pivoted data looks like this:

    The Aggregate node can then be used to apply any required summarization functions.

    Attached files

    Pivot_Example_Fruit_Production_DV_3_1_6 - 9 Feb 2018.lna

    0
    Comment actions Permalink
  • Avatar
    Adrian Williams

    One other point - the values in the NamesField and DataField properties in the Pivot Data to Names node should not be enclosed in apostrophes.

    0
    Comment actions Permalink
  • Avatar
    Leigh Sullivan

    Thanks Adrian for the response.

    So in theory that all makes sense!

    Unfortunately in practice I'm having trouble with a few errors. Maybe you might be able to shed some light on what I doing wrong!

    (Noting that now I am using a real dataset)

     
    Kind regards,

    Leigh

    0
    Comment actions Permalink
  • Avatar
    Adrian Williams

    Hi Leigh,

    Can you remove the apostrophes from around the field names in the NamesField property and DataField property (the node is a little picky about the format in those properties but apostrophes are ok in the GroupBy property).  I'm not sure this is the root cause of your problem but its best to eliminate it.

    However, its difficult to pinpoint the error without access to the data. Are there any special characters in the values in the NamesField i.e. OFFENCE_SUBGROUP field such as punctuation characters (comma, colon, semi-colon, apostrophes or symbols such as @, etc?). Similarly, are there any special characters in the GroupBy field names? Can you also check whether there are any missing values in the GroupBy fields.

    Would you be able to provide a sanitized subset of your data (specifically the fields identified in the node config and at least one other 'payload' field (with some dummy text in it)? I have opened a support ticket for this post so you should receive an email and be able to reply to it with an attachment.

    0
    Comment actions Permalink

Please sign in to leave a comment.



Powered by Zendesk