Importing Introduction


Choosing a Strategy

Options include:

Import Wizard
Supports pulling from cloud storage or drag and drop.
SDK and APIs

Import From Cloud Providers Quickstart

A basic example

Import Wizard (PreLabel)

Import Wizard Docs
Import pre-labeled data, including for images and video

Notes

Preparing your Inputs

File formats

Images: no specific preparation required.
Videos: please review the Video Specifications.
Text
3D Point Cloud: see the 3D Annotation Guide.

Visualize Progress and History

The upload section of the UI shows results for all inputs, including those sent through the SDK. It also shows information on files that are still processing (such as video).

Timing

Create your Dataset first. New data can be added and synced after.

From Connections

Connections are the primary recommended way of moving data.
Connections use your existing cloud data sources and provide an easy path to event-driven expansion.

From the API / SDK

See the technical reference.

From Drag and Drop

Project / Import

Image Formats

.jpg, .jpeg, .png

1 or 3 color channels
The alpha channel, if present, is suppressed.
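To see ahead of time whether a PNG carries an alpha channel that would be suppressed, the file header can be inspected locally. The following is a minimal sketch using only the Python standard library; the function names `png_channels` and `check_png_file` are illustrative and not part of Diffgram. It reads the color type from the PNG IHDR chunk and covers PNG only. How Diffgram treats indexed (palette) images is not stated on this page, so the mapping for color type 3 is an assumption.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# PNG color type -> (number of color channels, has alpha channel)
PNG_COLOR_TYPES = {
    0: (1, False),  # grayscale
    2: (3, False),  # RGB
    3: (3, False),  # indexed; palette entries are RGB (assumption: tRNS not inspected)
    4: (1, True),   # grayscale + alpha
    6: (3, True),   # RGB + alpha
}


def png_channels(header: bytes) -> Tuple[int, bool]:
    """Return (color_channels, has_alpha) from the first 26 bytes of a PNG."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width (4), height (4), bit depth (1), color type (1).
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    if color_type not in PNG_COLOR_TYPES:
        raise ValueError(f"unknown PNG color type {color_type}")
    return PNG_COLOR_TYPES[color_type]


def check_png_file(path: str) -> Tuple[int, bool]:
    """Read just enough of a local PNG file to classify its channels."""
    with open(path, "rb") as handle:
        return png_channels(handle.read(26))
```

Calling `check_png_file("example.png")` on a local file (the path is a placeholder) returns a pair such as `(3, True)` for an RGBA image, which signals that transparency information will not be kept after import.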

Video

CSV media (not for pre-label case)

One media URL per line.

Each file must be directly accessible through its URL. If you are using a cloud storage provider, please create a signed URL.

If the URL does not end with a supported image or video file extension, the response must have a "content-type" header set, e.g. "content-type:image/jpg". This applies to some cloud storage providers.

CSV Sample Format
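A minimal sketch of producing such a file, assuming one URL per line and no header row, as described above. The helper names and the example URLs are hypothetical. The image extensions come from the Image Formats section; video extensions are not listed on this page, so the sketch takes them as a parameter rather than guessing. Signed URLs usually carry a query string, so the extension is read from the URL path only.

```python
import csv
import io
from pathlib import PurePosixPath
from urllib.parse import urlparse

# From the "Image Formats" section of this page.
IMAGE_EXTENSIONS = frozenset({".jpg", ".jpeg", ".png"})


def needs_content_type_header(url, extra_extensions=frozenset()):
    """Return True when the URL path does not end with a known extension.

    Such URLs rely on the server sending a "content-type" header.
    extra_extensions lets the caller add video extensions (not listed here).
    """
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix not in (IMAGE_EXTENSIONS | frozenset(extra_extensions))


def build_media_csv(urls):
    """Return CSV text with one media URL per line and no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    urls = [
        "https://example.com/images/cat.jpg",
        "https://example.com/images/dog.PNG?signature=abc",
        "https://example.com/blobs/12345?signature=def",
    ]
    for url in urls:
        if needs_content_type_header(url):
            print("relies on content-type header:", url)
    print(build_media_csv(urls), end="")
```

URLs flagged by `needs_content_type_header` are still valid entries; the flag only marks the ones where the storage provider must return a suitable "content-type" header.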

Benefits of a Strong File concept

Efficient data use. Imagine annotating a massive dataset, and then deciding afterwards you would like to add 1 more class. Since the data is already in Diffgram, you can specify the integer file_id, with no need to re-transfer the data.
See files visually through the UI, including which files are attached to a job.
A file system, including copying, mirroring, and organizing files into datasets.
Add more files to a job over time, and then launch when (and if) ready. This gives you complete control. For example you could default to sending data to Diffgram, and then only choose to launch jobs for new training data that's actually needed.

Updated over 3 years ago