Use Cases

🚧
This doc is out of date and pending removal.
See https://github.com/diffgram/diffgram

Create, Update, and Maintain Datasets

Create, Update, and Maintain a single Dataset (optional train/val/test splits).
Create, Update, and Maintain many Datasets.
Create, Update, and Maintain many Datasets with interrelationships.
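As an illustration of the optional train/val/test splits mentioned above, here is a minimal sketch of one way to keep splits stable while a Dataset is updated over time. It is not Diffgram's API: the function names `assign_split` and `split_dataset`, the percentages, and the file IDs are all hypothetical. The idea is to hash each file's ID so that a file keeps its split even when new files are added later.

```python
import hashlib


def assign_split(file_id: str, val_pct: int = 10, test_pct: int = 10) -> str:
    """Return 'train', 'val', or 'test' for a file ID (hypothetical helper).

    The result depends only on the ID, so it is stable across runs and
    does not change when other files are added to or removed from the Dataset.
    """
    digest = hashlib.sha256(file_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # integer bucket in the range 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"


def split_dataset(file_ids):
    """Group file IDs into the three splits (hypothetical helper)."""
    splits = {"train": [], "val": [], "test": []}
    for file_id in file_ids:
        splits[assign_split(file_id)].append(file_id)
    return splits


if __name__ == "__main__":
    # Made-up file IDs, for illustration only.
    example = split_dataset([f"img_{i}.jpg" for i in range(1000)])
    print({name: len(ids) for name, ids in example.items()})
```

Because the assignment is a pure function of the ID, re-running it after an update never moves an existing file from train into test, which is one simple way to avoid leakage between splits.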

Create processes for working with Deep Learning systems

Reduce single points of failure (single data scientist, single annotator, single computer node)
Distribute some human control to many annotation firms (including those accessed through an API)
Create a system of record & version control

Compliance and threat actors

Defeat adversarial approaches by rapidly retraining
Compliance (e.g. Who labeled it? Who can export sets?)
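To make the compliance questions above concrete, here is a minimal sketch of an in-memory audit trail. It is an assumption for illustration, not how Diffgram stores this information: `AuditEvent`, `AuditLog`, the action names, and the user and file names are all hypothetical. It covers "who labeled it" and "who exported it"; permission checks ("who can export") would need a separate access-control layer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One immutable record of who did what to which item (hypothetical)."""
    user: str
    action: str  # e.g. "label" or "export"
    target: str  # e.g. a file or dataset name
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class AuditLog:
    """Append-only list of events with a simple lookup (hypothetical)."""

    def __init__(self):
        self._events = []

    def record(self, user: str, action: str, target: str) -> AuditEvent:
        event = AuditEvent(user, action, target)
        self._events.append(event)
        return event

    def who(self, action: str, target: str):
        """Return the sorted, de-duplicated users who performed `action` on `target`."""
        return sorted(
            {e.user for e in self._events if e.action == action and e.target == target}
        )


if __name__ == "__main__":
    log = AuditLog()
    log.record("alice", "label", "img_1.jpg")
    log.record("bob", "label", "img_1.jpg")
    log.record("alice", "export", "dataset_a")
    print(log.who("label", "img_1.jpg"))
    print(log.who("export", "dataset_a"))
```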

Launch faster

Faster time to market with automation and ready-to-go software
Faster time to market with faster hypothesis testing and iteration
Faster time to market with rapid response to changing distributions

Control costs

Reduce engineering effort (e.g. integration)
Monitor annotation costs, permissions, and the overall workflow, including at its boundaries (Kanban board from import through model triggers / return)
Speed up annotation (curation, literal annotation speedups)

Reduce engineering burden

In the literal connection from data to ML hardware
In the Training Data process itself, e.g. with Prep, Annotation, Datasets
Reduce Total Cost of Ownership (TCO), e.g. formats, integrations, new techniques, reviews, etc.
Monitor dataset effectiveness

Explore more

Explore and iterate on ideas for how to structure Datasets.
Create better datasets by involving Subject Matter Experts earlier and more often

Updated almost 4 years ago