Importing Introduction
Choosing a Strategy
Options include:
- Import Wizard: includes pulling from the cloud or drag and drop.
- SDK and APIs
Import From Cloud Providers Quickstart
A basic example
Import Wizard (PreLabel)
Import Wizard Docs
Import pre-labeled data, including for images and video
Notes
Preparing your Inputs
File formats
- Images: no specific preparation required.
- Videos: please review the Video Specifications.
- Text
- 3D Point Cloud: see the 3D Annotation Guide.
Visualize Progress and History
The upload section of the UI shows the results for all of your inputs, including those sent through the SDK. It also shows information on files that are still processing (such as video).
Timing
- Create your Dataset first. New data can be added and synced after.
From Connections
Connections are the primary recommended way of moving data.
Connections use your existing cloud data sources and provide an easy path for event-driven expansions.
From the API / SDK
From Drag and Drop
Project / Import
Image Formats
.jpg, .jpeg, .png
1 or 3 color channels
The alpha channel, if present, is suppressed.
CSV media (not for pre-label case)
One media URL per line.
Each image must be directly accessible through its URL. If you are using a cloud storage provider, please create a signed URL.
If the URL does not end with a supported file extension, the response must have a "content-type" header set, e.g. "content-type:image/jpg". This applies to some cloud storage providers.
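The CSV rules above can be checked locally before uploading. The following is a minimal sketch, not part of Diffgram: the function names `has_supported_extension` and `check_media_csv` are hypothetical, and it assumes the extension is read from the URL path with any query string (such as a signed-URL signature) ignored. It flags lines that are not absolute http(s) URLs, and lines with no listed image extension, for which the server would need to send a "content-type" header.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions listed under "Image Formats" in this page.
SUPPORTED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def has_supported_extension(url):
    """Return True if the URL path ends in a listed image extension.

    Assumption: the query string (e.g. a signed-URL signature) is ignored.
    """
    path = urlparse(url).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_IMAGE_EXTENSIONS


def check_media_csv(text):
    """Return a list of (line_number, url, problem) for questionable lines.

    Expects one media URL per line; blank lines are skipped.
    """
    problems = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        url = raw.strip()
        if not url:
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append((line_number, url, "not an absolute http(s) URL"))
        elif not has_supported_extension(url):
            problems.append(
                (line_number, url,
                 "no supported extension; a content-type header is required")
            )
    return problems


if __name__ == "__main__":
    sample = (
        "https://example.com/a.jpg\n"
        "https://example.com/download?id=3\n"
    )
    for line_number, url, problem in check_media_csv(sample):
        print(f"line {line_number}: {url}: {problem}")
```

This check only inspects the text of each URL. It does not confirm that the URL is reachable or that the server actually returns the header, which would need a real request against your storage provider.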
Benefits of a Strong File concept
- Efficient data use. Imagine annotating a massive dataset, and then deciding afterwards you would like to add 1 more class. Since the data is already in Diffgram, you can specify the integer file_id, with no need to re-transfer the data.
- See files visually through the UI, including which files are attached to a job.
- A file system, including copying, mirroring, and organizing files into datasets.
- Add more files to a job over time, and then launch when (and if) ready. This gives you complete control. For example, you could default to sending data to Diffgram, and then only launch jobs for new training data that's actually needed.