Advantages of using Python/R nodes within Dataverse vs scripting directly in R or Python

Comments

3 comments

  • Adrian Williams

    Data3Sixty Analyze (the new name for Dataverse) is an enterprise-grade application for agile data analysis, data preparation and advanced analytics. ETL tasks are only one use case for the application.

    A primary benefit of using Data3Sixty Analyze is the graphical data flow paradigm, which offers greater transparency of the analytic logic than a purely scripted approach. This enhances collaboration between different stakeholders within an organisation (data scientists, data engineers, data analysts and business users), who typically have differing skill sets, knowledge and tools. Data preparation also consumes a significant proportion of the time spent building analytic applications, and it is a task ideally suited to a graphical approach.

    The use of a graphical approach accelerates the development of data-rich, analytically complex applications. The application can be built out iteratively and collaboratively, so stakeholders can view intermediate results at any step in the process and gain confidence in the data. This contrasts with a traditional waterfall approach, where data is made available to 'data consumers' by a 'producer'. That approach typically leads to repeated to-ing and fro-ing to correct misunderstandings in the requirements, incorporate additional sources of data, etc., which introduces delays and reduces business agility.

    Data3Sixty Analyze can leverage the power of the R and Python languages to implement advanced analytics such as statistical analyses and predictive models. The analytic logic can be encapsulated in reusable custom library nodes that can be shared and used in other analyses, enabling data analysts and business users to exploit advanced statistical and predictive capabilities without needing to understand the R/Python code.
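
    As a rough illustration of the encapsulation idea, here is a minimal plain-Python sketch. It does not use the Data3Sixty Analyze node API: the function names fit_line and score and the field names spend and sales are hypothetical, chosen only for this example. The point is that the analytic logic sits behind a few parameters (which fields to use), so someone reusing it only supplies field names and never touches the maths.

```python
from typing import Dict, Iterable, List


def fit_line(records: Iterable[Dict[str, float]], x_field: str, y_field: str) -> Dict[str, float]:
    """Ordinary least-squares fit of y = slope * x + intercept.

    Hypothetical helper: the caller only names the input fields.
    """
    rows = list(records)
    if len(rows) < 2:
        raise ValueError("need at least two records to fit a line")
    xs = [float(r[x_field]) for r in rows]
    ys = [float(r[y_field]) for r in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def score(records: Iterable[Dict[str, float]], model: Dict[str, float],
          x_field: str, out_field: str = "prediction") -> List[Dict[str, float]]:
    """Return copies of the records with a predicted value added."""
    return [
        {**r, out_field: model["slope"] * float(r[x_field]) + model["intercept"]}
        for r in records
    ]


# Sample data (made up for illustration) lying exactly on sales = 2 * spend + 1.
history = [
    {"spend": 1.0, "sales": 3.0},
    {"spend": 2.0, "sales": 5.0},
    {"spend": 3.0, "sales": 7.0},
]
model = fit_line(history, x_field="spend", y_field="sales")
scored = score([{"spend": 4.0}], model, x_field="spend")
```

    In a data flow the equivalent logic would live inside a node and the field names would be exposed as node parameters; how that is wired up depends on the product, so treat the above as a sketch of the pattern only.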

    Data3Sixty Analyze also provides, as a separately licensed feature, pre-built statistical and predictive analytic capabilities including linear regression, logistic regression, clustering, classification, affinity analysis and time series analysis. These statistical and predictive nodes use an enterprise-grade embedded R computation engine rather than an Open Source R (OSR) environment; the embedded engine provides improved memory management compared with OSR.

    The Data3Sixty Analyze Enterprise Server edition provides governance controls that enable administrators to define granular access controls, allowing users to share and execute parameterized data flows and incorporate custom library nodes into their own data flows while (where required) preventing them from modifying the logic contained within the custom library node.

  • rakshit bhargava

    Thanks Adrian for clarifying things.

  • Adrian Williams

    You may also be interested in this article from last year which shows how you can incorporate R-based machine learning into a data flow.

    https://lavastorm.zendesk.com/hc/en-us/articles/115003069289-Machine-Learning-with-Dataverse-and-R
