This article provides information on how you can use Python in the Transform node to extract elements from a date data type field and format a field date field to a string field with a specified format.
[Update: this article now includes examples which use the Calculate Fields node, see the updated information at the end of the article]
You will notice that when you output a field that has been formatted as a date data type, the output takes the format YYYY-MM-DD. There are times you may want or must change the format of the output e.g. you want your report to show the date as MMM-YYYY or because a downstream application expects a specific format, such as DD-MMM-YYYY.
The Python datetime module provides a set of operators to transform date date, time and datetime objects. You can read about the details of the datetime module in the Python documentation.
The datetime module is automatically imported when the Transform node is executed so you do not have to explicitly import the module within your scripts.
The strftime() operator enables you to create string representations of date, time and datetime objects, and apply a format code to specify the format of the string representation of the datetime object. The Python documentation for strftime() provides details of the format codes that can be applied. The format codes enable you to format a complete date in a particular string representation e.g. 02/28/2017, 28-02-2017 or 28 Feb 2017. You can also extract individual elements of a datetime object e.g. just the month as a padded representation of a decimal value: '02' or as a string value: 'February'.
The attached data flow provides some examples of the use of strftime().
Update:
The Calculate Fields node can be used to streamline the creation of new fields. Like the Transform node, the Calculate Fields node leverages Python language expressions. However, the Calculate Fields node streamlines the process of creating new fields because it enables you to define the new field metadata (name and data type) within the grid editor - you only need to define the actual expression that will be evaluated to generate the output values. The node also streamlines the handling of Null values in the data.
An updated version of the example data flow is attached which includes examples using the Calculate Fields node as well as the original Transform node examples.
Note: Data360 Analyze 3.6.0 or above is required to import this data flow.
Comments
0 comments
Please sign in to leave a comment.