HOW LAVASTORM ANALYTICS ENGINE (LAE) WORKS:
The Lavastorm Analytic Engine system consists of several pieces: Servers, Controllers, Graphs, Drones and Clients.
You use the Client (BRE) to create a graph (BRG), and you use BRE to compile the BRG into a BRX (a compiled graph).
A graph is like a script of "what to execute" or "the program". The compiled graph (BRX) contains the information necessary to execute a sequence of drones to read data, filter, sort, join that data and then output the data somewhere useful. There can be, and usually are, multiple streams of data within the graph that are joined / merged together, and some graphs can grow to identify hundreds of steps to perform in sequence.
A node is a single step of the process, for example: a join, filter, or sort. Each node reads from zero or more input files and writes to one or more output files. Usually, these "temporary" files are saved in a common "temp" folder during the run of the graph, and deleted when they are no longer needed.
NOTE: There is a mode where TCP / IP can be used to replace the temp files (this is called streaming mode).
A controller reads a BRX and determines what nodes need to execute and in which sequence, and it then asks the Server to start the node(s) as the various dependencies are satisfied. When each node finishes, the server notifies the controller, and then the controller determines what node(s) to initiate next.
The Server process listens for messages from controllers, and then executes drones as requested. In general, you have only one server process for each server.
SERVER FARMS:
It is possible to configure a controller to spread nodes across multiple servers in a "farm". These nodes then operate either asynchronously in parallel, or they can pass data one-to-another in a serial fashion, depending upon the particular programming needs.
If in non-streaming mode, then each node must run to completion before the next in a particular chain is initiated.. this requires that the entire file it is processing be read from input to output, requiring a significant amount of temp space.
Alternatively, in streaming mode, the nodes communicate with each other using an IP protocol, and do not generally require as much temp space.
Note: it is not required that any one node run on the same server as its immediate predecessor. What is required is that the node can "see" the data created by the node before it.
Each server in a FARM can be configured to support a different number of nodes. This setting is often called the THREAD count.. it is simply the maximum number of nodes that the controller will try to send to that server.
So, some terms:
Threads: the number of drones that a controller will ask for from a server
Drone: a spot reserved by the server for a node to execute in
Node: an executable binary
Note: Each controller will try to use the full thread count unless it is told otherwise. There is no communication between controllers. There is a mechanism for limiting the maximum number of drones a server will accept. (see “Thread Limit Configuration.pdf” attached). Therefore, you must be aware of your total mix of graphs running at any one time or you can over-load the CPU.
CONFIGURING THE SERVER FARM:
By default, the Farm parameter is only setup to ask for one thread (drone) on the local machine. This setting can easily be changed to include other servers and other thread configurations.
The default setting is in <installdir>/conf/brain/ls_brain.prop
ls.brain.controller.farm=( "${ls.brain.server.host=localhost}:${ls.brain.server.port}:4" )
The value at the end of ls.brain.controller.farm represents the number of threads available to an individual scheduled graph and does not consider what other graphs may be running.
You can change the setting there, or you can add the setting in any one of these places:
<installdir>/conf/brain/ls_brain.prop
<installdir>/conf/site.prop
<installdir>/.profile.lavastorm
You can also set the parameter on each/every script used to invoke the controller. To change the FARM or THREAD parameters in a script, you would use this syntax:
export p_ls_brain_controller_farm='("myhost:7721:16")'
NOTE: the parameter can also be set via the -P option on the command line that invokes the cotntroller.
The syntax here is server:port:threads .. The above setting would be to run 16 threads on a LAE server with port 7721 that would be found on pacdcls05.
If you want to specify multiple servers for a single farm:
export p_ls_brain_controller_farm='("server1:7721:16","server2:7721:16"))'
The above would put 16 drones onto server1 and then 16 drones onto server2. Note: there is a bit of a problem here.. the controller will always start putting drones onto server1 first, and it won't try to balance the load between the servers. If you want to try to balance the load some, you might try this:
export p_ls_brain_controller_farm='("server1:7721:4","server2:7721:4","server1:7721:4","server2:7721:4"))'
This would put 4 drones onto server1, then 4 onto server2, then another 4 onto server1, and finally 4 onto server2again. But, this is directly a function of how many drones that any one controller determines that it needs to start at any one time.
EXAMPLE:
Let's say you have five scripts that each run only 3 nodes, and those are in parallel with each other. If you start them all at once, then you're going to be trying to run all 15 drones on server1 only, because each controller starts on server1 before it switches to ls06. Almost no matter what you try, the first server listed in your FARM parameter is going to get a heavier workout.
Comments
0 comments
Please sign in to leave a comment.