The Filter, Transform, and Split nodes all serve similar functions, but each is designed for a specific situation. The Filter node outputs only records that match certain criteria. The Split node divides records into two groups: those that match the filtering criteria and those that do not. The Transform node performs more general data manipulation.
The Filter node outputs only the input records that pass its criteria. For example, a Filter could take in a list of invoices and output only those invoices where the invoice amount is more than $2,000.
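The invoice example can be sketched in plain Python. This is only an illustration of the filtering behavior, not the node's own scripting API; the field names (`invoice_id`, `amount`), the sample data, and the `filter_records` helper are all assumptions made for the example.

```python
# Hypothetical sample data; field names are assumptions for illustration.
invoices = [
    {"invoice_id": "A-100", "amount": 1500.00},
    {"invoice_id": "A-101", "amount": 2000.00},
    {"invoice_id": "A-102", "amount": 3250.50},
]


def filter_records(records, predicate):
    """Return only the records for which predicate(record) is true."""
    return [record for record in records if predicate(record)]


# Keep invoices whose amount is strictly more than $2,000.
large_invoices = filter_records(invoices, lambda r: r["amount"] > 2000)

print([r["invoice_id"] for r in large_invoices])
```

Records that fail the test are simply dropped, which is the defining behavior of a filter.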
The Split node works similarly, except that the input data is divided into two groups: records that meet the filtering criteria and records that do not. Those that meet the criteria are output to the first output pin; those that do not are output to the second pin. Unlike the Filter node, the Split node outputs every input record to one of its two pins.
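The same idea as a plain-Python sketch, with two lists standing in for the two output pins. The `split_records` helper, field names, and sample data are hypothetical and are not part of Data360 Analyze.

```python
# Hypothetical sample data; field names are assumptions for illustration.
invoices = [
    {"invoice_id": "A-100", "amount": 1500.00},
    {"invoice_id": "A-101", "amount": 2000.00},
    {"invoice_id": "A-102", "amount": 3250.50},
]


def split_records(records, predicate):
    """Divide records into (matched, unmatched) without dropping any."""
    matched, unmatched = [], []
    for record in records:
        if predicate(record):
            matched.append(record)    # stands in for the first output pin
        else:
            unmatched.append(record)  # stands in for the second output pin
    return matched, unmatched


over_2000, not_over_2000 = split_records(invoices, lambda r: r["amount"] > 2000)

print(len(over_2000), len(not_over_2000))
```

Every input record lands in exactly one of the two lists, so the sizes of the two outputs always add up to the size of the input.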
The Transform node, on the other hand, is the Swiss Army knife of data manipulation. It can be used to perform a number of functions in Data360 Analyze. First, the Transform node can conduct more complicated filtering. For example, the script in a Transform node could be defined to output records only if the first three characters of a field match "NOR". Second, the Transform node could be provisioned with additional output pins and a script defined to group the input data into three or more output buckets, based on different criteria. Finally, the Transform node can perform many other data manipulations, such as converting the types of input fields or generating random test data.
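A rough plain-Python sketch of these three uses combined: a prefix test on the first three characters, routing into three buckets, and a type conversion. It does not use the Transform node's actual scripting interface; the `transform` function, the bucket names, the field names, and the sample data are assumptions for illustration only.

```python
# Hypothetical sample data; "amount" arrives as text and is converted below.
records = [
    {"region": "NORTH", "amount": "1200.50"},
    {"region": "NORWAY", "amount": "80"},
    {"region": "SOUTH", "amount": "430.00"},
    {"region": "EAST", "amount": "15.25"},
]


def transform(records):
    """Convert field types and route each record into one of three buckets."""
    # Each list stands in for one output pin.
    outputs = {"nor": [], "sou": [], "other": []}
    for record in records:
        # Type conversion: copy the record with amount as a float.
        converted = dict(record, amount=float(record["amount"]))
        # Routing: compare the first three characters of the field.
        prefix = converted["region"][:3]
        if prefix == "NOR":
            outputs["nor"].append(converted)
        elif prefix == "SOU":
            outputs["sou"].append(converted)
        else:
            outputs["other"].append(converted)
    return outputs


buckets = transform(records)
for name, rows in buckets.items():
    print(name, [row["region"] for row in rows])
```

A Filter or Split can only express a single pass/fail test, whereas a script like this can apply any number of conditions and reshape the records along the way.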
Under the covers, the Filter and Split nodes are actually powered by the same functionality as the Transform node. In fact, if you click on the "Advanced" tab's link in the Properties panel of the Filter node, you will see the selection criteria displayed as Python script. This script could be copied into a Transform node for finer-tuned filtering.
As you can see, each of these three nodes has its own sweet spot. The Filter outputs only records that match the filtering criteria. The Split divides the records into those that match the filtering criteria and those that do not. The Transform provides more general filtering capabilities along with the additional data manipulation functionality provided by Python scripting.