We recommend switching to the latest versions of Edge, Firefox, Chrome or Safari. Using Internet Explorer will result in a loss of website functionality.
Our Support systems migrated on Saturday, May 21. We'll automatically forward you to the new location for this content.

joining on all fields in the join node

Comments

11 comments

  • Avatar
    Infogix Support
    Originally posted by: awilliams1024

    Hi Jason,

    Assuming the use of the Join Inner node you can do the following:

    1. Configure both the LeftInputKey & RightInputKey with the following value: map(&field,inputFields(1))
    2. Remove name field conflicts by changing the second line in the Join Inner node's Output Script to be default emit 2:*
    3. Set the SortLeftInput and SortRightInput properties to true
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: residnt

    This is working great thank you. Is there a way to filter out 1 record? I can't seem to use filter(&,,,) or i'm just doing it wrong.
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: residnt

    So this does work
    map(&field,inputFields(1).splice(inputFields(1).fi nd(toLower("{{^sheetName^}}_created_ts")),-1))

    however that is assuming that the field I want to exclude from the list is the last column in the data set.
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: awilliams1024

    I think your expression should be the following as you want to remove only one field from the list (i.e. with the splice function's length attribute set to 1 rather than -1):

    map(&field,inputFields(1).splice(inputFields(1).fi nd(toLower("{{^sheetName^}}_created_ts")),1))

    However, I do not believe the inputFields() function states there is any guarantee that the order of the fields in the field list corresponds with the order of the fields in the input data - though it does appear to align in my ad-hoc test.
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: residnt

    Sorry it took me a while to respond. I'm using the -1 to return the list from the beginning assuming that the created_ts field is the last field in the list. How can we filter out the list to not include it as that would be the better way to go.

    Also on a side note. Is there a way to use the referenceFields function in the join node or any other node to pull a list that i've added to my data to only emit out those columns. I've tried map and that didn't work as it's looking for that column name as opposed to the list data associated within that field.

    Even trying the above expression in the referenceFields results in 0 records

    Thanks
    Jason
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: awilliams1024

    I'm not sure I know how to filter the list to exclude a specific field. If you are not going to use that field in the later analysis, could you use a Filter node to exclude the field prior to the join?

    After some discussion the conclusion was that you cannot use the inputFields + map functions within referenceFields as referenceFields is a macro that is evaluated at compile time where as the inputField and map functions are only evaluated during node execution, meaning the values would not be available to referenceFields (as that would have already been evaluated prior to the node execution)
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: residnt

    Ok is it a possibility to put in say a brainscript parameter where I can read in the column to use for the reference fields. I can put the comma separated values manually in a parameter which will work. I just cannot find any documentation on how to use each parameter type.
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: stonysmith

    Think for a moment about the AggEx node.

    In it you you have the "GroupBy" parameter. You can place expressions into that field, like this:

    MyField1.left(3),  MyField2.date("CCYYMMDD")
    That expression will convert the data as requested, sort using those changes and then group the data by it.

    ===========
    Then, this expression
    emit referencedFields(1,{{^GroupBy^}})
    instructs LAE to write out the fields referenced in the GroupBy parameter.

    For this example, that expression translates to this:
    emit MyField1,MyField2
    Note that the whole field is emitted, unchanged by the left() or date() functions.

    This is documented if you put your cursor on the referenceFields function and hit F1.
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: residnt

    Yes referenceFields is documented. It more so how to use the different parameter types. Ie Brainscript, BrainscriptExp, BrainscriptOutput, Java, Python.

    So that make sense with the field being output is the cell contents of the referenced field. However what I'm trying to do is dynamically setup the output which I don't always know what fields I'll need outputted. Basically I have a column that is joined on to get an alternative name. so lets say column name = card_id_1000 is changed to "Joe Smith" ${field1} ${field2}. field1 and field2 are referencing the field to replace that text with which I am able to handle with regex expressions however I don't need the entire data source from this point. I only need the name column, field1, field2. So by extracting those field names I aggregated the list into a comma separated string. Now this is where I want to filter out the data source to only those columns. I can't use reference field with a variable as it needs to be a column found on the input.

    So I'm looking for a way to emit only the columns that I find need to be replaced.
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: stonysmith

    Take a look at the ChangeMetaData node.
    Depending upon how you set the options and what data you send into the 2nd pin, you can:
    1) Include all columns
    2) Include just some columns
    3) Rename any column
    4) Change the data type of any column

    Generally the order of the records on the 2nd pin sets the order used for output by ChangeMetaData.

    What I very commonly do is:
    - use the GetMetaData node to get a list of column names
    - use a Filter to setup the OldName, NewName, NewType values
    - send this into the 2nd pin of ChangeMetaData

    But, also in a lot of cases, I just use a StaticData node to setup the values I need instead of the GetMetaData
    0
    Comment actions Permalink
  • Avatar
    Infogix Support
    Originally posted by: residnt

    Yes that did work so I am able to dynamically emit the fields that I need. However this was quite a work around.
    0
    Comment actions Permalink

Please sign in to leave a comment.