Use Cases

🚧

This doc is out of date and pending removal.

See https://github.com/diffgram/diffgram

Create, Update, and Maintain Datasets

  • Create, Update, and Maintain a single Dataset (optional train/val/test splits).
  • Create, Update, and Maintain many Datasets.
  • Create, Update, and Maintain many Datasets with interrelationships.

Create processes for working with Deep Learning systems

  • Reduce single points of failure (single data scientist, single annotator, single computer node)
  • Distribute some human control across many annotation firms (including those reached by API)
  • Create a system of record & version control

Compliance and threat actors

  • Defeat adversarial approaches by rapidly retraining
  • Compliance (i.e., who labeled it? Who can export sets?)

Launch faster

  • Faster time to market with automation and ready-to-go software
  • Faster time to market with faster hypothesis testing and iteration
  • Faster time to market with rapid response to changing distributions

Control costs

  • Reduce engineering effort (i.e., integration)
  • Monitor annotation costs, permissions, and the overall workflow, including at its boundaries (Kanban board from import through model triggers / return)
  • Speed up annotation (curation, literal annotation speedups)

Reduce engineering burden

  • For the literal connection from data to ML hardware
  • In the Training Data process itself, i.e., with Prep, Annotation, and Datasets
  • Reduce Total Cost of Ownership (TCO),
    i.e., formats, integrations, new techniques, reviews, etc.
  • Monitor dataset effectiveness

Explore more

  • Explore and iterate on ideas for how to structure Datasets.
  • Create better Datasets by involving Subject Matter Experts earlier and more often.