Today we are excited to announce our most anticipated feature ever. Model Upload allows data scientists to upload trained Scikit-Learn models and automate their workflows to send the output to business applications quickly. Why is this important?
According to TechRepublic, 52% of data professionals say they have trouble demonstrating the impact data science has on business outcomes, and according to Gartner.com, half of all Big Data projects fail. The leading causes are the difficulty of putting models into production, security, and organization. Building a model, it turns out, is just the beginning of a Machine Learning workflow.
With our new Model Upload feature, we are looking to tackle all three challenges and make it very simple for data professionals to prove value. Here's why:
- Difficulty putting models into production: With the Upload Model tool, data scientists only need to upload their model and select the business application where the business side needs the data. They don't have to worry about scaling machines, scheduling, or developing APIs, because the process is managed automatically. This tool promises to cut time to production considerably, from weeks to minutes. The benefit is not limited to small companies with 1 or 2 data scientists and no ML Ops. It also helps big data science teams, which usually have scientists embedded in business units who struggle to prototype quickly.
- Organization: One of the main challenges in a data-driven organization is the lack of collaboration. With automated workflows, Datagran becomes an efficient XOPs tool that eliminates silos between OPs teams and business units, allowing them to control how the output of models hits their business apps.
- Security: Datagran has built an ecosystem that helps organizations protect their data like never before. It ranges from ELTs (extract, load, transform) built in-house that offer SSL (Secure Sockets Layer) and key encryption, data governance dashboards, and prevention of data download and upload, to versioning and governance over how the data and models are used.
Here is a step-by-step guide on how to use the new Model Upload feature:
- Create a dump, or a pipeline dump, of your locally trained model. Here's a code example.
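As a rough illustration of this step, the sketch below trains a small Scikit-Learn pipeline and dumps it to disk. It assumes `joblib` as the serializer and uses the iris dataset as stand-in training data. The file name, the helper functions `train_and_dump` and `load_and_predict`, and the model itself are hypothetical, and the exact dump format Datagran expects is not specified here, so treat this as one possible approach rather than the required one.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def train_and_dump(path):
    """Train a small pipeline on the iris dataset and dump it to `path`."""
    X, y = load_iris(return_X_y=True)
    # A pipeline bundles preprocessing and the estimator into one object,
    # so a single dump captures the whole prediction path.
    pipeline = Pipeline([
        ("scaler", StandardScaler()),
        ("classifier", LogisticRegression(max_iter=200)),
    ])
    pipeline.fit(X, y)
    # Assumption: joblib is an acceptable serializer for the upload.
    joblib.dump(pipeline, path)
    return pipeline


def load_and_predict(path, rows):
    """Reload the dumped pipeline and predict, as a local sanity check."""
    model = joblib.load(path)
    return model.predict(rows)


if __name__ == "__main__":
    # Hypothetical file name; any writable location works.
    dump_path = os.path.join(tempfile.gettempdir(), "iris_pipeline.joblib")
    train_and_dump(dump_path)
    print(load_and_predict(dump_path, [[5.1, 3.5, 1.4, 0.2]]))
```

Reloading the dump locally before uploading is a cheap way to confirm that the file is complete and that the Python and library versions you plan to select match the ones used for training.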
Head to the Models section in your project and upload your model. You will need to upload a .txt file and choose your Python version and parameters as well.
- Add your model to a project.
- Head to Pipelines and build a workflow. Drag and drop the Models Operator and connect it to your source or to another operator. In this example, we are connecting it to an SQL operator.
Then, press the edit button to select the model and version you want to work with. Save it and run it by pressing the green button.
- Connect another operator to add logic or drag and drop an Action to send the output to a business application.
We are incredibly excited about how Datagran will impact the future of Data Science and would love to get feedback from you. Our team worked hard to ship this new feature, and we look forward to hearing how you make use of it.