Is Data360 Analyze compatible with MS Office?
To get Microsoft Office data from documents stored in SharePoint, our nodes are compatible with SharePoint 2013. Support for SharePoint 2010 is deprecated. If you pull data directly from .xls, .xlsx, or .xlsm files via the Excel node, any Microsoft Office version is supported.
Does Data360 Analyze use 64-bit features?
Yes, Data360 Analyze is optimized for and can only be installed on 64-bit architecture. It's not possible to install it on a 32-bit machine.
What is the average bandwidth per user profile?
Unfortunately, we are unable to predict this. It is a function of how many edits a person makes in the GUI. The GUI uses REST APIs to communicate with the application. There is almost zero bandwidth involved with actually processing data – that is all done on the server. Bandwidth is required to display data in the data viewer. The amount of bandwidth is typically low, as only the first 1000 records are displayed by default within each data viewer tab. The bandwidth will increase when the data has a large number (hundreds) of fields.
Does Data360 Analyze have any restrictions regarding network latency?
During the server installation process, you will be asked to accept the default port numbers for the Data360 Analyze Server, Tomcat and the Postgres database, or to specify other port numbers that are not in use. If one or more of the default ports is detected as being in use, the installer will ask you to provide an alternative. Enter any number that is not already in use; for example, for the Data360 Analyze Server you could try 7732.
The port numbers are configurable. By default, Tomcat listens on port 8080 and uses 8089 as its stop port, Postgres uses port 5432, and the Analyze Server uses port 7731.
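Before running the installer, it can help to know whether any of the default ports are already taken. The following is a minimal sketch, not part of the product: it assumes Python 3 is available on the target machine, and the helper names (port_is_free, check_ports) and dictionary labels are made up for illustration. It only detects listeners reachable on the local loopback address, so treat a "free" result as a hint rather than a guarantee.

```python
import socket

# Default ports named in the answer above; the labels are illustrative only.
DEFAULT_PORTS = {
    "Analyze Server": 7731,
    "Tomcat (listen)": 8080,
    "Tomcat (stop)": 8089,
    "Postgres": 5432,
}


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepts the connection.
        return sock.connect_ex((host, port)) != 0


def check_ports(ports=None):
    """Map each label to True (looks free) or False (in use)."""
    ports = DEFAULT_PORTS if ports is None else ports
    return {name: port_is_free(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, free in check_ports().items():
        print(f"{name}: {'looks free' if free else 'in use'}")
```

If a default port is reported as in use, pick an alternative (such as 7732 for the Analyze Server) and check it the same way before entering it in the installer.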
Are certain functions restricted when other processes (e.g. database administrative procedures, batch processing) are occurring within Data360 Analyze?
Batch processing and Database Admin procedures should not impact the running of Data360 Analyze.
The only major restriction applies while the nightly system backup is running (by default at 2 A.M.). While the backup is running, user access to the system is disabled. Notifications are provided to users to indicate this event. If you wish to move the backup to a more suitable time, please see the Help section: System administration > Editing backup settings.
Multiple users each processing separate (large) sets of data in parallel can tax system resources. You will need a large enough machine to handle the desired complexity and the desired speed of producing results.
What is the maximum database size for Data360 Analyze?
This is quite difficult to answer. User data (input and output sources) is not stored in the database. The database is for storage of the “source code” and “configuration data” – not user data. The database grows with the quantity of assets (data flows, custom library nodes, etc.).
What is the period of time stored in the database?
The database is for assets (data flows, custom nodes, schedules, etc.), system state and system settings. Apart from items you purge manually, there is no automated retention process for assets. References to historical scheduled runs are stored in the system state, and system settings determine how long scheduled runs are retained before being automatically purged.
What is the maximum volume of records stored in a table?
We don’t store “data” (e.g. temporary data generated by a data flow run) in the database. Temporary data is stored in the accessible file system of the server – the maximum size of data that you wish to process is entirely controlled by how large the attached disks are.
Does Data360 Analyze have database clusterization capabilities?
The internal database doesn’t store user data. If you configure one or more of your output data locations to be a database, then we can support whatever the JDBC driver from the database vendor supports. Data360 Analyze can connect to a database that's in a cluster, for example Hadoop, Oracle or MongoDB.
Does Data360 Analyze support scaling up?
If you increase the size of the server, the extra capacity will allow you to process more data faster. There is no upper limit on the number of CPUs or the amount of RAM or disk that the system is capable of using, except as controlled by the license you purchase.
Can Data360 Analyze work in a distributed fashion having multiple application / web servers connected to one or more database servers, all presenting the same set of data?
At this time, our software only supports a single server. The ability to run processes across an entire cluster is on the road map.
Does Data360 Analyze require high performance storage?
Generally, the faster the storage, the faster you can process your data. The faster access times of SSDs typically result in improved system performance compared with HDDs. There are slightly different specifications for the Server and Desktop products. Please refer to the Release Notes and see page 1. Both products require at least 8GB of RAM and 4 cores for installation.
Especially when using the server product, temp space can be consumed if it is not managed. There are several ways for you to monitor and reduce disk space. Please see this article for more details: Monitor and Clear Disk Space
Is Data360 Analyze designed to be available 24 x 7?
Availability can be close to 100% uptime. A short period of time is required for the nightly backup, during which normal data processing is suspended. We see more downtime due to required OS upgrades than we do from our own application.
Does Data360 Analyze provide an automated mechanism for job execution status report?
There is a GUI screen where you can review the status of runs within the limits of the history that you choose to retain. For individual scheduled data flows, a user-specified data flow can be triggered to run on successful completion of a primary data flow. Similarly, a data flow can be run or a system notification email can be generated if a primary scheduled data flow fails to complete successfully.
Does Data360 Analyze provide notification events for job processing?
Each Schedule can either run an entire process to deal with notification, or it can simply email a distribution list to notify that there was a failure. A follow-on data flow can be run when a primary scheduled data flow completes successfully. Alternatively, a data flow or system notification message can be generated if a scheduled data flow fails to complete successfully.
Does Data360 Analyze provide error handling for job processing?
This is very dependent upon what type of errors might happen. The list in the GUI tells you what jobs failed, and you can open a failed job and review the problems. A data flow or system notification message can be generated if a scheduled data flow fails to complete successfully.
Does Data360 Analyze have its own password repository, are the passwords encrypted?
Yes, they are one-way encrypted and not recoverable. The passwords are stored in the database mentioned above.
In Data360 Analyze, is it possible to encrypt the login information?
It is possible to secure communication between the client and the server with HTTPS. It is also possible to use LDAP or SSO to control passwords.
Are security rules hard-coded in the database?
Password management is not provided for Local users. In the case where more complex security rules are to be used for user password management, then Analyze can be integrated with LDAP or configured to support SSO.
Does Data360 Analyze provide data security mechanisms for communication with other applications?
Several of our connections to external data are with JDBC connectors. Any external security would be controlled by the available options on those JDBC drivers – as provided by the (external) application vendor. When using HTTPS communication the connection can be secured using TLS 1.2.
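As an illustration of the HTTPS point, a client script that talks to an HTTPS endpoint can refuse anything older than TLS 1.2. This is a minimal sketch using only the Python standard library; the function name make_tls12_context is made up for this example, and nothing here is specific to Data360 Analyze.

```python
import ssl


def make_tls12_context():
    """Build a client-side context that rejects protocols older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse TLS 1.0 / 1.1 handshakes; TLS 1.2 and newer are still allowed.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to standard library clients, for example urllib.request.urlopen(url, context=make_tls12_context()), where url is whatever HTTPS address your server is reachable at.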
In Data360 Analyze, are the communications among server / databases / connectors / gateways encrypted?
All components of Analyze are installed on a single server. Internal connections are used between the Analyze components. These connections are not encrypted.
Does Data360 Analyze operate on user session?
Yes. Session tokens for access using the REST API last for only about 4 hours before they must be renewed. (If a user is active in an edit session, the tokens are renewed automatically.)
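For scripts that call the REST API over long periods, one way to cope with the token lifetime is to cache the token and fetch a new one shortly before it expires. The sketch below is an assumption-laden illustration: the TokenCache class is hypothetical, the 4-hour lifetime is taken from the approximate figure above, and fetch_token stands for whatever login call you already use to obtain a token (no Analyze endpoint is shown or implied).

```python
import time


class TokenCache:
    """Cache a session token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, ttl_seconds=4 * 3600,
                 margin_seconds=300, clock=time.monotonic):
        self._fetch = fetch_token      # caller-supplied login function (hypothetical)
        self._ttl = ttl_seconds        # assumed lifetime: about 4 hours
        self._margin = margin_seconds  # refresh this long before expiry
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a token, fetching a new one if the cached one is near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

Because the lifetime is only approximate, a generous safety margin is the cautious choice; the script should still be prepared to log in again if the server rejects a token.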
Do customizations and enhancements persist across upgrades?
Yes, if you build your own custom nodes (or if you configure a pre-built node to create a custom node), the modified nodes will be maintained and will be available after an upgrade.
Does Data360 Analyze record access to directories?
The logs lae-audit.log and lae-access.log in the 'site' directory have the full history and can be parsed for historical information. There is also an API call that can enumerate the current permissions, but there’s no official log of the history of the permissions. You would have to parse the audit logs. The logs are stored in JSON fragments and can be parsed using the JSON Data node.
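Outside of a data flow, the same logs can be explored with a short script. The sketch below assumes each JSON fragment sits on its own line, which may not match the actual layout of lae-audit.log, and the field names "user" and "action" in the sample are invented for illustration; check a real log file for the keys it actually contains.

```python
import json
from collections import Counter


def parse_json_lines(lines):
    """Yield one dict per line that holds a JSON object; skip everything else."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a complete JSON fragment on this line
        if isinstance(record, dict):
            yield record


def count_by_field(records, field):
    """Count records per value of `field`; absent keys are grouped as '<missing>'."""
    return Counter(record.get(field, "<missing>") for record in records)


if __name__ == "__main__":
    # Invented sample lines; real audit entries will have different keys.
    sample = [
        '{"user": "alice", "action": "login"}',
        '{"user": "bob", "action": "login"}',
        'this line is not JSON',
        '{"user": "alice", "action": "edit"}',
    ]
    print(count_by_field(parse_json_lines(sample), "user"))
```

To run it against a real log, pass an open file object for the log in the 'site' directory to parse_json_lines in place of the sample list.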
Where is the authentication and identification repository located inside Data360 Analyze?
It is located inside the Postgres database server, unless you chose LDAP. Without LDAP, the local user passwords are stored (encrypted) inside the Postgres database. With LDAP, the passwords never touch Postgres; they are sent directly to your authentication server. The user identities are stored in Postgres.
In Data360 Analyze, how do access levels, or rights inside the database, work?
Access to user assets (data flows, nodes, schedules, etc.) stored in the database is controlled from within the application's GUI. We do not support customers querying, updating or modifying the packaged Postgres database. See the Managing permissions topic in the system Help documentation for further details.
Can Data360 Analyze work with Neural Networks (for instance PyTorch, TensorFlow)?
Either or both of those may be available through the use of R, or you can write your own custom Python or Java nodes to achieve almost anything those languages are capable of. For example, reference the Java node below:
Analyze can also work with third-party Python libraries such as Word2Vec Neural Networks.
Can Data360 Analyze create random forest models?
Yes, the Decision Forest node leverages the Random Forest ensemble algorithm for classification and regression tasks. Because of licensing restrictions on the name “Random Forest”, these are categorized as “Decision Forest” nodes. See below. It is also possible to use the R node and the functionality of the CRAN package 'randomForest' to model data using Random Forests.
Can Data360 Analyze implement association rules (AGRAWAL), algorithm of decision rule (QUINLAN1987) and algorithm of hypothesis test?
Yes, they can be encapsulated in the R node. The R environment needs to have the prerequisite CRAN package(s), e.g. arules, caret, etc., installed. The user would paste the custom R script into the node. The Market Basket Analysis and Market Basket Miner nodes are also able to leverage the CRAN arules functionality.
For example, using R: https://www.rdocumentation.org/packages/arulesNBMiner/versions/0.1-5/topics/Agrawal
In Data360 Analyze, is it possible to parameterize the log archiving time? For instance, to store it for only 30 days?
You can easily build a data flow to prune older logs, or modify the one in this article to meet your requirements: Delete Old Log Files
There are configuration settings in the UI for configuring the retention of temp files for runs associated with scheduled data flows.
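The pruning logic itself is simple. The following standalone sketch is not the data flow from the article mentioned above; it is an illustration of the same idea in plain Python, with hypothetical function names, an assumed "*.log" file pattern, and a 30-day default taken from the question. It defaults to a dry run so that nothing is deleted until you have reviewed the list.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_old_files(directory, max_age_days=30, pattern="*.log", now=None):
    """Return files in `directory` matching `pattern` last modified before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in Path(directory).glob(pattern)
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def prune_old_files(directory, max_age_days=30, pattern="*.log", dry_run=True):
    """List old files and delete them only when dry_run is False."""
    old_files = find_old_files(directory, max_age_days, pattern)
    if not dry_run:
        for path in old_files:
            path.unlink()
    return old_files
```

Run it first with the default dry_run=True against a copy of the log directory, confirm that the returned list contains only files you are happy to lose, and only then call it with dry_run=False.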
In Data360 Analyze, is it possible to export to PPT and PDF formats?
We do not ship with PPT or PDF output nodes; however, they are available upon request - we can provide examples of these custom nodes. These nodes do not form part of the Analyze product.
Is it possible to see the server resource consumption in Data360 Analyze?
When using the Server version of the product, resource monitoring data flows are available that can be scheduled for ongoing resource monitoring for historical analysis.
Does Data360 Analyze provide run time statistics to determine usage and efficiency?
Usage statistics can be retrieved and reported. The output from the data flow mentioned in the above question is Microsoft Excel-based but can easily be converted to input for visualization dashboards.