Working on Production Deep Learning?

Create the highest quality training data with Diffgram

91% take weeks or months to get to a "Seed" dataset

And most take months to go from seed to Beta.

August 2020 Survey

Virtually everyone plans to iterate

Yet only 14% have dedicated solutions

Introducing Diffgram - Patent Pending Database & Control System

What if your team could get the first version ready in 1/10 the time? And use that same process for Production Systems?
Diffgram does this by making the process "non-blocking". Patent Pending. 3-4x ROI on staff time.*

*Based on an internal study comparing work with and without Diffgram.

Where does it sit in my stack?

Diffgram is software that sits between raw input data and Deep Learning algorithms. Diffgram covers all 5 major subcategories.


What is the value?

  1. Immediate Value
  2. Ongoing Value
  3. Long Term Value

Turns manual processes on the order of weeks into a few days

Blocking manual processes take on the order of weeks to complete. With Diffgram, your data science team can start working on algorithms as soon as a "Seed" set is completed. This turns a multi-week startup process into days, and the benefit applies to every dataset!

Literal Annotation Interface

Diffgram provides a best-in-class, concrete interface to view, curate, and quality-assure data, and to do original annotation work.


Data pipelines

These graphs are dynamically created based on simple user inputs. The core principle is the ability to "watch" a dataset. Task templates are created by your team. The Diffgram task system manages the rest.
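As a rough illustration of the "watch" principle, the sketch below uses plain Python and no Diffgram code. The names `Dataset`, `TaskTemplate`, `watch`, and `create_task` are hypothetical, chosen only for this example, and are not Diffgram's API or SDK. The idea shown is simply that a task template subscribes to a dataset, and each new file then produces a task without anyone creating it by hand.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TaskTemplate:
    """Hypothetical template that your team defines once."""
    name: str
    instructions: str
    tasks: List[Dict[str, str]] = field(default_factory=list)

    def create_task(self, file_name: str) -> None:
        # One task per incoming file, ready for an annotator to pick up.
        self.tasks.append(
            {"template": self.name, "file": file_name, "status": "available"}
        )


@dataclass
class Dataset:
    """Hypothetical dataset that notifies its watchers about new files."""
    name: str
    files: List[str] = field(default_factory=list)
    watchers: List[Callable[[str], None]] = field(default_factory=list)

    def watch(self, callback: Callable[[str], None]) -> None:
        # Register a callback to run for every file added later.
        self.watchers.append(callback)

    def add_file(self, file_name: str) -> None:
        self.files.append(file_name)
        for callback in self.watchers:
            callback(file_name)


incoming = Dataset(name="incoming")
boxes = TaskTemplate(name="boxes", instructions="Draw a box around each sign.")
incoming.watch(boxes.create_task)

incoming.add_file("frame_001.jpg")
incoming.add_file("frame_002.jpg")
```

After the two `add_file` calls, `boxes.tasks` holds two tasks, one per file. A real pipeline would also need persistence, assignment, and review steps, which this sketch leaves out.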

Diffgram includes both the literal annotation interface and pipelines.

1) Data Pipelines

Super easy UI-based setup (API & SDK too).

2) World Class Training Data for Vision and NLP

Customizable tooling supports spatial types including Box, Polygon, Lines, and more. Attribute support for groups of nested labels, free text, and multiple select. Video up to 60 FPS and 4K.

Diffgram partners with leading providers for NLP interfaces. Use the power of Diffgram for your data together with state-of-the-art NLP interfaces.

One click Cloud Integrations

Compatible with your cloud data in AWS, Google, Azure, or a private cloud. Use frameworks like TensorFlow and PyTorch, or commercial AutoML services. Connect your models for Pre Labeling.

Update your data - Stop wasting datasets!

Imagine, for example, that you have a class "sign" and it is performing at ~80% average precision.
How do you improve it?

One approach is to break it into multiple sub classes, e.g. sign_yield, sign_highway, and sign_warning.

Then you can run the data and measure each one:

sign_yield -> 80%
sign_highway -> 50% ouch
sign_warning -> 70%

And repeat.


But how do you actually do that in code? With Training Data? In Production?
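To make the idea concrete, here is a minimal sketch in plain Python. It does not use Diffgram; the helper names (`split_class`, `precision_per_class`, `weakest_class`) and the toy records are made up for illustration. It also uses simple per-class precision (correct predictions divided by all predictions of that class) as a stand-in for average precision, which in practice is computed from a full precision-recall curve.

```python
from collections import defaultdict


def split_class(annotations, parent, chooser):
    """Return new annotations where `parent` is replaced by `parent_<sub>`.

    `chooser` is a function that picks the sub class name for one annotation.
    The input list is not modified.
    """
    relabeled = []
    for ann in annotations:
        if ann["label"] == parent:
            ann = {**ann, "label": f"{parent}_{chooser(ann)}"}
        relabeled.append(ann)
    return relabeled


def precision_per_class(pairs):
    """Compute precision per predicted class from (predicted, truth) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, truth in pairs:
        total[predicted] += 1
        if predicted == truth:
            correct[predicted] += 1
    return {label: correct[label] / total[label] for label in total}


def weakest_class(scores):
    """Return the label with the lowest score, the next one to work on."""
    return min(scores, key=scores.get)


# Toy data: the "kind" field is a hypothetical attribute used to pick sub classes.
annotations = [
    {"id": 1, "label": "sign", "kind": "yield"},
    {"id": 2, "label": "sign", "kind": "highway"},
    {"id": 3, "label": "car", "kind": None},
]
relabeled = split_class(annotations, "sign", lambda ann: ann["kind"])

# Toy (predicted, truth) pairs from a hypothetical evaluation run.
pairs = [
    ("sign_yield", "sign_yield"),
    ("sign_highway", "sign_highway"),
    ("sign_highway", "sign_warning"),
]
scores = precision_per_class(pairs)
worst = weakest_class(scores)
```

The part this sketch skips is the hard one in practice: keeping the relabeled training data, the model, and the evaluation in sync each time the label schema changes.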

A mature product with an AMAZING Roadmap

With over 6,500 commits to the core code base, over 30 million annotations completed, and thousands of projects, Diffgram is the most mature, modern system available.
Our plan of intent for 2021 and 2022 is to:

  • Be the easiest and most powerful data system, including integration, pipelines, and tasks
  • Be further integrated with other providers. Diffgram is a one stop shop for your data.
  • Be the most powerful video labeling studio

Help shape our plan of intent!

Understanding the Total Cost of Ownership

Executive Summary

It's commonly assumed that the literal annotation is the biggest cost center. In fact, there are many other costs involved, such as Administering Datasets, Data Prep, Curation, Set Iteration, and more, that cumulatively far exceed annotation. Annotation is, in fact, one of the ways value is added to the system.


Compatibility and Common Questions

Yes - you can run the software anywhere you desire - just like a database
