Extract particular entry
I would like to ask is it possible to extract a particular entry from a fixed format file? The fixed format file is read from notepad, and I indicate the FilePositions in Fixed Format File.
For example:
Fruits:
Apple 10
Pear 6
Rotten Fruits:
Apple 100
Orange 200
I would like to get the value 10 in Fruits only from that file, and presumably I have multiple fixed format files, and sum all values of apple in Fruits from different files. Thanks.
-
You can use the Fixed Format File node to read the file or files. See the Help documentation 'Node help' > 'Input Connectors' > 'Fixed Format File' topic (or online here for the latest version of the product).
A Directory List node can be used to select the required filenames using a pattern, which can then be input to the Fixed Format File node. For example:
As mentioned in the node help, when using this method the format of all the files needs to be exactly the same. If the files do not have the same format the data needs to be input from each file separately and then you can use the Cat node to concatenate the records from each file into a single data set.
When using the Directory List node connected to the input of the Fixed File Format node, the 'FilenameExpr' property is used to specify the input field containing the filenames.
The 'FieldPositions' property is used to specify the expected format of the data in the files. See the node help for details.
Note: there is currently a error in the help documentation relating to the 'FieldPositions' property (which is fixed in the next release)
Instead of: field1,10,5,string:field2:15,3
The specification should be: field1,10,5,string:field2,15,3
Once the data has been imported you can then use a Filter node to output records that match your criteria and then use the Aggregate node to sum the values. Note, in the example screenshot above the field type of the second field (Col_2) was specified as an int (integer) type. By default, all fields are input as string.
-
You can also specify the 'FieldPositions' property as a newline-separated list with each field on its own line. For example, the specification from the previous screenshot could also be configured as follows (which is easier to interpret):
For completeness, here is the example data used in the fixed format file:
-
Is there a way to extract 2 consecutive rows from multiple files ? I just try to filter, but can only grab apple counts from both Fruits and Rotten fruits.
For example:
File A
-----------------------------
Fruits NULL
Apple 10
Pear 6
Rotten Fruits NULL
Apple 100
Orange 200
-----------------------------
File A
-----------------------------
Fruits NULL
Apple 1
Pear 20
Rotten Fruits NULL
Apple 10
Orange 20
-----------------------------
Expected OUTPUT:
--------------------------
Fruits NULL
Apple 11
-
Set up another Filter node where the criteria for the field selects records with a value of "FRUIT". You can then aggregate the records as before. However, the sum will be 0 if all of the values in the input records are NULL. If you want to explicitly code this as NULL then you could use a Calculate Fields node with an appropriate logical expression, e.g.
Null if Produce_Sum == 0 else Produce_Sum
You can then use a Cat node to combine the records with the records from the other Aggregate node. However, if you want the counts to be in the same field you would need to rename all the *_Sum fields to have the same name using a Modify Fields node then concatenate them. If you want the sum values to be in separate fields then you can just combine them in the Cat node by setting the ConcatenationMode property to 'Union'
-
thank you for your response. Could I also specify a row's first field name so that I can filter out dedicated rows as output? Like I just want these two rows from file A. If I specify fields['Apple'], the Apple row in Rotten Fruits will also be output. I don't want that to happen. However, I can't use head too, coz I am reading multiple files. Or should I use looping for each file obtained ? Thank you.
File A:
Fruits Num<------------
Apple 10<------------
Pear 6
Rotten Fruits Num
Apple 100
Orange 200
---------------------------------
Output:
Fruits Num
Apple 10
------------------------------
Please sign in to leave a comment.
Comments
8 comments