Action Workflows Introduction

Understand & Use Workflows in Diffgram

Introduction

Workflows allow you to:

  • Access and use a growing ecosystem of machine learning and training-data-specific tools, such as Hugging Face, AutoML training, custom model training, Cleanlab, and more.
  • Surface your training data processes. Create custom views, steps, and more.
  • Manage the data that comes in and out of Diffgram. Workflows let you perform specific actions whenever certain events happen in the system, and they are completely customizable.

Concepts

More than just task stages, more than just mappings: it's a whole new way to work with training data. Workflows blend a variety of existing concepts to create the best fit for training data. Here are a few examples of what we mean:

  • Flexible steps. Manual and automatic steps live comfortably beside each other.
  • Bring all your existing pipelines and training data/ML/AI tools, and discover some new ones. This is not yet another pipeline tool. Everything is specific to training data, with concepts accessible to a variety of people. For example, you can work with each step directly; it's not just a configuration system.
  • Built to be extensible and scalable. It is easy to extend existing actions and add new ones. It scales with a dedicated queue service that uses any AMQP-compatible provider, with our reference installation using RabbitMQ.
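To illustrate why a dedicated queue helps with scaling, here is a minimal sketch of the producer/consumer pattern. It uses Python's in-process `queue.Queue` as a stand-in for an AMQP broker such as RabbitMQ. The event names and the `publish` and `worker` helpers are hypothetical and are not Diffgram's actual implementation.

```python
import json
import queue
import threading

# Stand-in for an AMQP broker such as RabbitMQ. A real deployment would
# publish to and consume from the broker instead of an in-process queue.
event_queue = queue.Queue()
processed = []  # event types handled by the worker, in order


def publish(event_type, payload):
    """Serialize an event and place it on the queue (hypothetical helper)."""
    event_queue.put(json.dumps({"type": event_type, "payload": payload}))


def worker():
    """Consume events until the None sentinel arrives."""
    while True:
        message = event_queue.get()
        if message is None:
            event_queue.task_done()
            break
        event = json.loads(message)
        processed.append(event["type"])  # a real worker would run an action here
        event_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

publish("file_uploaded", {"file": "cat.jpg"})
publish("task_completed", {"task_id": 7})
event_queue.put(None)  # tell the worker to stop

event_queue.join()
thread.join()
print(processed)
```

Because the part of the system that raises events only enqueues messages, slow actions do not block it, and more workers can be added to consume from the same queue.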

Why

Previously, the approach was "figure it out yourself". Now there's a clear path to go from "zero to hero" with training data, including pre-labeling, model training, sampling, and more.

Getting Started

📘

Alpha Preview Release

Please note that our current release is an early preview. See the roadmap.

Actions

A workflow is composed of multiple actions. Each action is a "step" in the workflow. For example, you can have one action to create a labeling task when a file is uploaded and a second action to export that into JSON automatically when the task is completed.
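As a rough mental model, a workflow can be sketched as a list of actions, each bound to a trigger event. The class names, event names, and functions below are hypothetical illustrations of the idea and do not reflect Diffgram's internal data model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical event names; Diffgram's real trigger identifiers may differ.
FILE_UPLOADED = "file_uploaded"
TASK_COMPLETED = "task_completed"


@dataclass
class Action:
    """One step in a workflow: calls `run` when `trigger` fires."""
    name: str
    trigger: str
    run: Callable[[dict], dict]


@dataclass
class Workflow:
    name: str
    actions: List[Action] = field(default_factory=list)
    active: bool = False

    def handle(self, event: str, payload: dict) -> List[dict]:
        """Run every action whose trigger matches the event."""
        if not self.active:
            return []
        return [a.run(payload) for a in self.actions if a.trigger == event]


def create_labeling_task(payload: dict) -> dict:
    # Placeholder for "create a labeling task when a file is uploaded".
    return {"type": "task", "file": payload["file"], "status": "available"}


def export_to_json(payload: dict) -> dict:
    # Placeholder for "export to JSON when the task is completed".
    return {"type": "export", "format": "json", "task_id": payload["task_id"]}


workflow = Workflow(
    name="label-then-export",
    actions=[
        Action("Create labeling task", FILE_UPLOADED, create_labeling_task),
        Action("Export to JSON", TASK_COMPLETED, export_to_json),
    ],
    active=True,
)

results = workflow.handle(FILE_UPLOADED, {"file": "cat.jpg"})
print(results)
```

Only the action whose trigger matches the incoming event runs, which mirrors the example of one action reacting to an upload and a second reacting to task completion.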

Examples of Actions that can be added in Diffgram

  • Create labeling tasks on file uploads.
  • Pre-label your new files before creating tasks.
  • Post your labeled data to a webhook.
  • Trigger an export when all files are labeled.
  • Trigger a model training run when all files are labeled.
  • Send an email when all files are labeled.
  • Run a Vertex AI model to pre-label data.
  • Run an AWS Text Analytics model before labeling your text files.

This is just a small set of the actions that we will be supporting. Eventually Diffgram will allow you to write your own custom actions to take the customization of your training data pipeline to the next level.
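For the webhook action listed above, you need an endpoint on your side that accepts the posted data. The following is a minimal receiver sketch using only the Python standard library. The payload shape is an assumption (any JSON body is accepted), and `WebhookHandler`, `received`, and `make_server` are hypothetical names chosen for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # payloads seen so far, kept in memory for illustration


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender declared.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            payload = json.loads(body)
        except ValueError:
            # Empty, non-UTF-8, or malformed JSON body.
            self.send_response(400)
            self.end_headers()
            return
        received.append(payload)
        self.send_response(204)  # accepted, no response body
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence per-request logging


def make_server(port: int = 8000) -> HTTPServer:
    """Build a local server; call serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), WebhookHandler)
```

To try it locally, run `make_server(8000).serve_forever()` and point the webhook action at a URL that reaches that port. A production receiver would also need authentication and persistent storage.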

Let's see how to create a basic workflow that creates a task when a file is uploaded.

Creating your first workflow

Go to the project menu and select "New Workflow" at the bottom of the menu.


This will take you to the workflow creation tool. From there you can see multiple actions you can select to start creating your workflow.


For this example, select "Human Labeling Task" as the first step.

Now you can see the action configuration wizard. Select the task template and triggers for this action so that each time a file is uploaded, it is added to the existing task template.

Now turn on your workflow by clicking the toggle at the top of the screen. The attached task template now appears on the action step, so you can continuously monitor the status of your tasks as new files are added.


Now try uploading a file! You will see the labeling task being created automatically for you.