
Importing pure Python packages into Analyze's Transform nodes

Comments

4 comments

  • Adrian Williams

    Re. the cmd shell error: The Jython instance that is bundled with Data360 Analyze is not intended or configured to provide shell access. The above error is not specific to the pip module; it would also be generated by commands such as jython --version.

    Before commenting on your other questions, I need to state the following caveats:

    • Only modules that are part of the Jython standard library or are shipped as part of Data360 Analyze are covered by the Infogix support policy.
    • When an Analyze instance is upgraded, any packages installed into the Jython site-packages directory will be lost. For this reason it is recommended that any custom packages are installed in a directory outside of the Analyze installation directory. Such a local repository directory is not on the Python library path, so it must be added to the path in every node that uses locally installed packages.
    • When a package is installed by pip, the normal operation is to also attempt to install any packages it depends on. The primary package you want to install could be a pure Python package, but it may depend on other packages that are written in C/C++, which can lead to an installation failure. Installing non-pure Python packages on Linux typically requires the relevant compiler to be installed in the OS.
    • Not all packages have been written to the required standards, and some packages may not operate correctly when they are installed in a custom directory rather than the site-packages directory. Indeed, some packages may still fail to operate even when installed into the default site-packages directory.

     

    The version of Jython bundled with Analyze includes pip. This can be accessed from within the scripts of a Transform node. The example scripts listed under "Attached files" below show how the pure-Python PyPDF2 package can be installed by leveraging pip within a Transform node.

     

     

    The pip module is not intended to be imported programmatically from within Jython. In the example ConfigureFields script, pip is instead called using the subprocess module.

    After importing the required libraries the script defines the target directory where the package is to be installed. The script then constructs the target parameter for the pip install command.

    A function is defined to call pip within a subprocess.

    After defining the name of the package to be installed, the function is called to perform the installation. The standard output from the installation command is captured so it can be output by the node as confirmation of what was installed.

    The ProcessRecords script is then used to output the results.
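
    The steps described above can be sketched roughly as follows. This is not the attached script; it is a minimal illustration that assumes the Jython launcher can be started with -m pip. The function names and the interpreter argument are hypothetical, and the caller has to supply the real launcher path and target directory.

```python
import subprocess


def build_pip_command(interpreter, package, target_dir):
    """Build the argument list for 'pip install --target=<dir> <package>'.

    'interpreter' is the path to the Jython launcher (an assumption: it
    must be supplied by the caller, since it depends on the installation).
    """
    return [interpreter, "-m", "pip", "install",
            "--target=" + target_dir, package]


def pip_install(interpreter, package, target_dir):
    """Run pip in a subprocess; return (exit code, captured output)."""
    cmd = build_pip_command(interpreter, package, target_dir)
    # Merge stderr into stdout so error messages are captured as well.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return proc.returncode, output
```

    In a ConfigureFields script this might be called as pip_install(jython_path, "PyPDF2", target_dir), where jython_path and target_dir are placeholders for your own values, with the captured text then emitted by the ProcessRecords script.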

     

    As previously mentioned, any node that needs to leverage a package installed into the local repository must include the path in the Jython library search path. Below is an example of how the repository's path can be appended to the search path:
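
    A minimal sketch of that step is shown here. The helper name is illustrative only, and the repository directory is whatever location you chose when installing the package:

```python
import sys


def add_to_search_path(directory):
    """Append 'directory' to the module search path if not already present."""
    if directory not in sys.path:
        sys.path.append(directory)
    return sys.path
```

    After calling add_to_search_path with the repository directory, a normal import statement for the installed package should resolve from that directory.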

     

     

     

    Attached files

    Jython Transform Node Script to Install a Pure-Python Package in a Custom Repository.txt
    Jython Transform Node Script to Import Package from Custom Repository.txt

     

  • Adrian Williams

    Note: the above example imports the re module. It is not required for the installation and can be omitted.

  • M M

    Before I saw the above, what I did was:

    1. use the pip from my Python installation to download and install my desired pure Python package.

    2. it had no dependencies, so no worries there

    3. copied the package folder from the Python installation to the D360 Jython's site-packages folder

    4. set the path as you did above in your example Transform node, and then import the new package

    Clumsy, but it worked.

    Thanks for your response - very useful.

  • Adrian Williams

    I'm glad you found a solution that worked for you.

    I would still recommend using a directory outside of the Analyze install directory if possible as it would overcome the (almost inevitable) issue where the nodes that leverage the custom packages stop working after upgrading the Analyze software. 

    You may want to consider creating a target Jython site-packages directory within your Analyze instance's 'site' directory as this would automatically be retained when the software is upgraded (and the installed packages would be included in the system backups).

    The location of the site directory can be determined using the property substitution r"{{%ls.appDataDir%}}" in your code.
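
    As a rough sketch, the target directory could be derived from that value as shown below. The sub-directory name "jython-site-packages" and the helper name are assumptions made for illustration; any name under the site directory would do.

```python
import os


def custom_packages_dir(site_dir, subdir="jython-site-packages"):
    """Return the package directory under the Analyze site directory.

    'subdir' is a hypothetical folder name chosen for this example.
    """
    return os.path.join(site_dir, subdir)


# Inside a node, the site directory would come from the property
# substitution mentioned above, for example:
#   site_dir = r"{{%ls.appDataDir%}}"
#   target_dir = custom_packages_dir(site_dir)
```

    The resulting path can then be used both as the pip target when installing and as the directory added to the search path in the nodes that import the package.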

     
