For those who used the Execute BRX node in LAE, or for new users wishing to explore it, Stony Smith and Ernest Jones from our Professional Services team have developed a Data360 Analyze version of the node.
BACKGROUND:
The node executes a sub-graph multiple times, with parameters driven by data: it runs the specified data flow once per record in the input, using the value of each field in the record as a property available to the data flow being run. The input records to the node therefore provide the run parameter values.
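Conceptually, the node performs a loop like the one below. This is a minimal Python sketch of the per-record execution model, not the node's actual implementation; the record field names and the `execute_data_flow` callback are hypothetical stand-ins for the real launch mechanism (the REST API call).

```python
# Sketch of the per-record execution model: each input record
# becomes one run of the target data flow, with the record's
# fields exposed as run properties.

def run_per_record(records, execute_data_flow):
    """Run the data flow once per input record.

    records: list of dicts mapping field name -> value
    execute_data_flow: callable that launches one run with the
                       given properties (e.g. via the REST API)
    """
    results = []
    for record in records:
        # Every field in the record becomes a run property.
        properties = dict(record)
        results.append(execute_data_flow(properties))
    return results

# Example: two input records produce two runs.
records = [
    {"Schema": "blue", "Region": "EMEA"},
    {"Schema": "red", "Region": "APAC"},
]
runs = run_per_record(records, lambda props: f"run with {props}")
```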
There are three versions of the node (attached):
1) A node to run within another data flow (called Execute Dataflow.brg). This is in .brg format; you can import it into Data360 Analyze by selecting Import > Legacy Data Flow and browsing to the Execute Dataflow.brg file. The node uses the Analyze REST API version 3.
2) A Python script to run from a command line (called ExecuteDataFlow.py). Before running the script, first execute the following command, which adds the python command to your path.
For Windows, it's
<installDir>/bin/laeEnv.bat
For Linux, it's
source <installDir>/.profile.lavastorm
Then, the command to get a list of the parameters is:
python ExecuteDataFlow.py -h
To run the data flow, you'll need to do something like this:
python ExecuteDataFlow.py --dataFlow ExecuteDataflow_SampleFlow2 -p "Schema=blue"
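To drive the script once per input record from outside the node, you could generate one such command line per record, each field becoming a -p "name=value" argument. A small illustrative sketch (the helper function and the record values are made up; only the command shape comes from the example above):

```python
# Build one ExecuteDataFlow.py command line per input record.
# Each field in the record becomes a -p "name=value" argument.

def build_command(data_flow, record):
    params = " ".join(f'-p "{k}={v}"' for k, v in record.items())
    return f"python ExecuteDataFlow.py --dataFlow {data_flow} {params}"

cmd = build_command("ExecuteDataflow_SampleFlow2", {"Schema": "blue"})
# cmd == 'python ExecuteDataFlow.py --dataFlow ExecuteDataflow_SampleFlow2 -p "Schema=blue"'
```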
3) A single Python node version that does it all, called ExecuteDataflow_Python.lna.
The Execute Dataflow.brg (number 1) and the Python node (number 3) are essentially the same thing.
Also attached is a sample, Test Execute Dataflow.brg, which demonstrates that the node works; the screenshot below shows the configuration for the node.
NOTE: Per Mario's comment below, there are a couple of things to bear in mind:
- The data flow that the node runs must be in the Public Documents folder.
- The name of the data flow that the node runs, which is configured in the DataflowName property, cannot contain spaces.
Comments
3 comments
Christina Costello, the Execute Dataflow node works, but it doesn't run after the 5th record from the input. Are there any limitations on the Execute Dataflow node? I checked in the code but couldn't find any.
It completes successfully for 5 inputs, but on the 6th input it just fails, and I don't see any error messages either.
Can you please advise?
There is no reason the data flow should stop after 5; I have used it to run 20 at a time.
After that 5th run, if you go back out to the directory, you should be able to open the failed run and examine what went wrong. There should also be some kind of message if you drill down into the composite that waits for the job to run.
I had a play with this example and eventually got it working on my desktop installation.
A couple of things I noticed: