We understand some customers have been using LAE's Brainscript for many years and may find the transition to Python difficult at first, but we feel it is best to move away from a proprietary script, with one of the main benefits being that users can get far better online community help for Python than Brainscript.
With that in mind, please see the below cheat sheet on how to configure certain actions in Python.
Action |
ConfigureFields |
ProcessRecords |
Output all input fields |
out1 += in1 |
|
Create a new field in a data set |
out1.Name = str |
out1.Name = "Monday" |
Filtering on row values |
out1 += in1 |
if in1['Billing Total'] > 100.00 and in1.Status == 'Active': out1 += in1 |
Narrow down to list of fields on output from input |
out1 += in1.field1 out1 += in1.field2 out1 += in1.field3 |
out1 += in1 |
Excluding fields |
out1 += in1 out1 -= in1.Billing_Ref out1 -= in1.Company_Code |
out1 += in1.Billing_Total |
Renaming a field |
out1 += in1 out1 -= in1.Customer_ID out1.Customer_Ref = in1.Customer_ID |
|
First or last execution of data |
|
If node.firstExec: Do x |
Create and populate a new output field with record count |
out1 += in1 out1.RecordCount = long |
out1 += in1 out1.RecordCount = node.execCount |
Convert the data type to Integer |
out1 += in1 out1 -= in1.Account_Number out1.Account_Number = int |
out1 += in1 #cast each value of Account_Number to an integer out1.Account_Number = int(in1.Account_Number) |
Get a property value |
out1 += in1 out1.tmpdir = str |
node.properties
out1.tmpDir= node.properties.getString('','ls.brain.outputtempdir') #userName= node.properties.getString('','graph.userName') #dataflowName= node.properties.getString('','graph.graphname') #out1.tmpdir= tmpDir + "/" + userName + "/" + dataflowName |
Get Count or rows |
out1 += in1 out1.gc = int out1.gc=group.count() |
|
Output all fields and the running sum of a field |
out1 += in1 out1.SumTotal = float |
out1 += in1 if node.firstExec: sumTotal = 0 |
Logging an error and failing the node upon encountering an error condition |
out1 += in1 |
out1 += in1 #AccountNumber is a required field, error if not present if in1.AccountNumber ==Null: msg = 'Account Number missing on record: ' record = str(node.execCount) node.logger.error(msg + record) raise node.fail() |
Replace function |
out1 += in1
|
# Replace "a" with "b" out1.Name1 = str(in1.Name.replace('a' , 'b')) |
Formatting a string to date |
out1 += in1 out1 -= in1.Transaction_Date out1.Transaction_Date = datetime.date |
out1 += in1 out1.Transaction_Date = datetime.datetime.strptime(in1.Transaction_Date, '%Y-%m-%d').date() |
How to handle NULL values in a field |
|
if in1.type ==Null: out1.MyColumn2 = None else: out1.MyColumn2 = in1.type.replace("primary","PR***") |
Formatting string date field to a date data type where the field has NULLS and blanks in it |
out1 += in1 out1 -= in1.dueDate out1.dueDate = datetime.date |
out1 += in1 if in1.dueDate ==Null or in1.dueDate == "": out1.dueDate = None else: out1.dueDate = datetime.datetime.strptime(in1.dueDate, '%Y-%m-%d').date() |
To make string uppercase |
out1.Name = str |
out1.Name = in1.Name.upper() |
Add timestamp to record |
out1.ts = datetime.datetime |
out1.ts = datetime.datetime.now()
|
Convert a string timestamp to a timestamp data type |
out1 += in1 out1.timestamp = datetime.datetime |
out1.timestamp = datetime.datetime.strptime(in1.DateTime, '%d/%m/%Y %H:%M:%S') |
Add x hours to a timestamp |
|
In1.mydatetime + datetime.timedelta(hours=12) |
Minus x hours to a timestamp |
|
In1.mydatetime - datetime.timedelta(hours=12) |
Calculate duration between 2 datetimes (remove nanoseconds) |
start = datetime.datetime.now().replace(microsecond=0) out1.end = datetime.datetime out1.duration = str |
out1.end = datetime.datetime.now().replace(microsecond=0) out1.duration = str(out1.end-start)
#Alternate method to get duration in days duration = fields.fieldX - fields.fieldY out1.duration = duration.days |
Get Month and Year from input date field |
|
|
Combine multiple string fields together |
|
out1.Test = in1.Country + " " + in1.CountryCode |
Split node |
out1 += in1 out2 += in1 |
if inputs[0]['Original Field Name'] is not Null or inputs[0]['New Field']: outputs[0] += inputs[0]
if inputs[0]['REQUIRED'] is not Null: outputs[1] += inputs[0] |
Output only records that contain a specific column and value |
out1 += in1 f_I_ADDR_IND=False if hasattr(in1, 'I_ADDR_IND'): f_I_ADDR_IND=True |
if f_I_ADDR_IND: v_I_ADDR_IND = in1.I_ADDR_IND else: v_I_ADDR_IND = None if v_I_ADDR_IND == "D" or v_I_ADDR_IND == None: out1 += in1 else: pass |
Don’t output |
|
out1 += in1 out1 = none |
Flag Duplicates in an Aggregate GroupBy: fields["PASSFRST"],\ fields["VALCARR"],\ fields["DEPDATE"],\ fields["ORIGIN"],\ fields["DESTINAT"],\ fields["TRANTYPE"],\ fields["EXCHANGE"] |
out1 += in1 out1.DUPLICATE = bool
|
if node.firstInGroup: out1.DUPLICATE=False else: out1.DUPLICATE=True |
Regular Expressions |
|
re.match() - only find matches if they occur at the start of the string being searched. re.search() - doesn’t restrict to beginning of the string re.findall() |
Regex to find values in brackets and replace |
import re |
replaced = in1.FieldValue out1 += in1 m = re.findall('\[\'.*?\]',in1.FieldValue) str1 = '-'.join(m).replace("\'","\\\'") replaced = re.sub('\[\'.*?\]', str1, in1.FieldValue) out1.FieldValue = replaced |
Use a Regular Expression |
SpeedFound += in1 SpeedNotFound += in1 SpeedFound.Speed = str |
import re s=None p1 = re.compile('\s\d+(Mbps|Mb|MB|Kb|KB)')
if inputs[0]['OFFER'] is not Null: s = p1.search(inputs[0]['OFFER'])
if s: SpeedFound.Speed = s.group() SpeedFound += in1 else: SpeedNotFound += in1 |
Sum Field in Aggregate |
out1+="MySum" |
out1 += in1 if node.firstExec: ms = 0 ms=ms+float(in1['myinteger']) out1.MySum=ms |
Show directory parameters |
out1.directoryToDelete = str |
user=node.properties.getString('u','graph.userName') df=node.properties.getString('df','graph.graphName').replace(" ","+") dir="{{%ls.brain.node.tempdir%}}"+"/"+user+"/"+ df out1.directoryToDelete = dir |
Ordered Output |
|
Use OrderedDict objects for the input fields and newly created fields:
from collections import OrderedDict
inFlds = OrderedDict() inFlds['0'] = in1.id inFlds['1'] = in1.rand inFlds['2'] = in1.color
newFlds = OrderedDict() newFlds['Debug'] = str newFlds['Username'] = str newFlds['Event Dttm'] = str newFlds['Source IP'] = str newFlds['Resource'] = str newFlds['Request'] = str
for fld in inFlds: out1 += inFlds[fld]
for fldName in newFlds: out1[fldName] = newFlds[fldName] Get into the habit of defining all newly created string fields as 'unicode' data type rather than 'str' data type unless you really need to limit the values to ASCII characters |
Loop through input fields |
FieldStr = node.properties.getString('Columns to Concatenate','rtp_Columns_to_Concatenate', None) if FieldStr ==Null: node.logger.error('The \'Columns to Concatenate\' property must be configured.') raise node.fail() else: FieldList = FieldStr.split(",") NumFields = len(FieldList) |
i = 1 while i < NumFields: thisField = FieldList[i].strip().strip("'") if inputs[0][thisField] is not Null: ## Append the value to the concatenated string _concatStr = _concatStr + _delimiter + unicode(inputs[0][thisField]) else: if _missingValueAction == 'Error': node.logger.error('Missing Value in field \'' + thisField + '\' at record ' + str(node.execCount)) raise node.fail() if _missingValueAction == 'Insert Blank': ## Insert a delimiter _concatStr = _concatStr + _delimiter if _missingValueAction == 'Skip': ## Do nothing pass i +=1 |
Arithmetic |
|
v_result = eval(in1.fieldA + 1) |
FilePath name from filename and directory |
|
|
Replace Null function for string |
def replaceNull(x): if x is Null: return "" else: return x |
out1.color = str(in1.color) + replaceNull(in1.junk) |
Replace Special Characters |
|
|
Exclude Fields Using Pattern |
out1 -= patterns.regex('^Xsd', in1) *XSd is the String you want to exlude. This will exclude fields starting with “Xsd” |
|
Copy an object |
import copy |
out1.obj_1 = unicode out1.obj_2 = unicode
#### ProcessRecords ## Create a compound list object list_1 = ['Ann',1,['Hi','There!']]
## Assign another variable to have ## the structure and values of the ## first but use a separate object list_2 = copy.deepcopy(list_1)
## Now update some elements using the ## name of the second variable list_2[2][0] = 'Bye' list_2[2][1] = 'Bye!'
## Output string representations of ## both variables out1.obj_1 = unicode(list_1) |
Try Statement |
Try: X=1 Except: X=1 Else: X=0 Finally: X=9 |
|
Check if Field exist on input |
rc = inputs[0].find('OwnerRelationshipUID') # -1 means not found. If found it will be the index if rc < 0: ou1.OwnerRelationshipUID = unicode |
|
Replace unknown character |
str.decode("utf-8").replace(u"\u2022", "*") #Use Wikipedia to get list of characters when/if it fails for a specific unknown character https://en.wikipedia.org/wiki/List_of_Unicode_characters |
|
Encode in a different character set. For example Eagle financial data |
import sys reload(sys) sys.setdefaultencoding('ISO-8859-1') |
|
int, long, double, unicode, string, date, time, datetime and boolean
Output Types
Python |
Dataverse |
bool |
boolean |
int |
int |
long |
long |
float |
double |
datetime.date |
date |
datetime.time |
time |
datetime.datetime |
datetime |
str |
string |
unicode |
unicode |
All the date time stuff you could want
import datetime
now = datetime.datetime.now()
print "Current date and time using str method of datetime object:"
print str(now)
print "Current date and time using instance attributes:"
print "Current year: %d" % now.year
print "Current month: %d" % now.month
print "Current day: %d" % now.day
print "Current hour: %d" % now.hour
print "Current minute: %d" % now.minute
print "Current second: %d" % now.second
print "Current microsecond: %d" % now.microsecond
print "Current date and time using strftime:"
print now.strftime("%Y-%m-%d %H:%M")
print "Current date and time using isoformat:"
print now.isoformat()
Results:
Current date and time using str method of datetime object:
2014-09-26 16:34:40.278298
Current date and time using instance attributes:
Current year: 2014
Current month: 9
Current day: 26
Current hour: 16
Current minute: 34
Current second: 40
Current microsecond: 278298
Current date and time using strftime:
2014-09-26 16:34
Current date and time using isoformat:
2014-09-26T16:34:40.278298
Directly from the time module documentation, here are more options to use with strftime:
Directive |
Meaning |
Notes |
%a |
Locale's abbreviated weekday name. |
|
%A |
Locale's full weekday name. |
|
%b |
Locale's abbreviated month name. |
|
%B |
Locale's full month name. |
|
%c |
Locale's appropriate date and time representation. |
|
%d |
Day of the month as a decimal number [01,31]. |
|
%H |
Hour (24-hour clock) as a decimal number [00,23]. |
|
%I |
Hour (12-hour clock) as a decimal number [01,12]. |
|
%j |
Day of the year as a decimal number [001,366]. |
|
%m |
Month as a decimal number [01,12]. |
|
%M |
Minute as a decimal number [00,59]. |
|
%p |
Locale's equivalent of either AM or PM. |
(1) |
%S |
Second as a decimal number [00,61]. |
(2) |
%U |
Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0. |
(3) |
%w |
Weekday as a decimal number [0(Sunday),6]. |
|
%W |
Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0. |
(3) |
%x |
Locale's appropriate date representation. |
|
%X |
Locale's appropriate time representation. |
|
%y |
Year without century as a decimal number [00,99]. |
|
%Y |
Year with century as a decimal number. |
|
%Z |
Time zone name (no characters if no time zone exists). |
|
%% |
A literal "%" character. |
Create Date
import datetime
a = '2010-01-31'
datee = datetime.datetime.strptime(a, "%Y-%m-%d")
datee.month
Out[9]: 1
datee.year
Out[10]: 2010
datee.day
Out[11]: 31
Default emit
matches += rightInput.difference(leftInput)
i.e.
outputs[0] += inputs[0]
outputs[0] += inputs[1].difference(inputs[0])
Comments
0 comments
Please sign in to leave a comment.