- You need an existing Diffgram project.
- You need to set up a connection to Labelbox.
Contact Us to prioritize adding your specific data types.
- Spatial Types: bounding boxes, points and polygons. Polylines and segmentation masks are not yet supported.
- Media Types: The migration tool currently migrates images only. We will add support for text, video, and the other media types supported by Diffgram.
- Nested attributes are converted into our tree-view attribute by default. If you need to support nested groups of different types, please Contact Us.
Go to the Menu. Access Project / Settings.
Now click on the new project migration button in green.
Select your Labelbox connection from the dropdown and click Next.
Or set up a new one.
Select your Labelbox project name from the dropdown, then optionally check the box to add both the files and the Labelbox ontology into Diffgram.
Diffgram automatically detects the volume of data to be imported and shows a review page so you can confirm that it matches your expectations.
Confirm the data to migrate and click "Start Project Migration".
Done! Now you have all your project data in Diffgram. Your data will start loading and you can inspect it as it comes in through the input page and the studio / data explorer.
To be respectful of Labelbox servers, we rate limit requests, so large migrations may take time. Contact us if you have any concerns.
This migration tool is provided "as is". Any transformation from one system to another involves differences in data structure, assumptions, and expectations, as well as possible network issues. It is up to you to verify that the data is represented to your satisfaction.
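To get a feel for how rate limiting affects migration time, here is a minimal sketch of a client-side rate limiter. It is not Diffgram's implementation: the function names, the `max_per_second` parameter, and the example numbers are hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def rate_limited(
    items: Iterable[T],
    max_per_second: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than max_per_second (hypothetical limit)."""
    if max_per_second <= 0:
        raise ValueError("max_per_second must be positive")
    min_interval = 1.0 / max_per_second
    next_allowed = clock()
    for item in items:
        now = clock()
        if now < next_allowed:
            # Too early: wait until the next request slot opens.
            sleep(next_allowed - now)
        next_allowed = max(now, next_allowed) + min_interval
        yield item


def estimate_minutes(request_count: int, max_per_second: float) -> float:
    """Lower bound on wall-clock minutes for request_count requests."""
    return request_count / max_per_second / 60.0
```

Under these assumed numbers, 6,000 requests at 10 requests per second need at least `estimate_minutes(6000, 10)` minutes, which is why a large project can take a while even when nothing is wrong.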
Diffgram and Labelbox have similar functions, but they sometimes go by different names.
| Concept | Diffgram | Labelbox |
|---|---|---|
| Set of Labels, Attributes, Templates. A collection of features and their relationships. | Schema | Schema, Ontology, Taxonomy |
| Batch of Human Work. Includes a specified Schema, Datasets, and other definitions of the work. | Tasks | Project |
| Tasks, Datasets, Users, and other Assets | Project | Workspace |
| Raw Data (e.g. Image, Audio) and associated Annotations. | File | Row, Asset, Data Row |
| Annotation | Annotation or Instance | Annotation |
| Multiple Annotations | Annotation List or Instance List | Label |
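If you are adapting scripts or internal documentation during a migration, the table above can be captured as a small lookup. This is only an illustrative sketch: the dictionary and the `diffgram_term` helper are hypothetical names, and the mapping assumes the second column of the table is Diffgram and the third is Labelbox.

```python
# Term mapping transcribed from the table above (Labelbox term -> Diffgram term).
_LABELBOX_TO_DIFFGRAM = {
    "schema": "Schema",
    "ontology": "Schema",
    "taxonomy": "Schema",
    "project": "Tasks",
    "workspace": "Project",
    "row": "File",
    "asset": "File",
    "data row": "File",
    "annotation": "Annotation or Instance",
    "label": "Annotation List or Instance List",
}


def diffgram_term(labelbox_term: str) -> str:
    """Return the Diffgram name for a Labelbox concept (case-insensitive)."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(labelbox_term.lower().split())
    try:
        return _LABELBOX_TO_DIFFGRAM[key]
    except KeyError:
        raise KeyError(
            f"No known Diffgram equivalent for {labelbox_term!r}"
        ) from None
```

Note the two rows that are easy to trip over: a Labelbox "Project" corresponds to Diffgram "Tasks", and a Labelbox "Workspace" corresponds to a Diffgram "Project".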
Labelbox has a bug in the way it treats EXIF data. Diffgram automatically works around this bug by importing data in portrait format and then rotating the Labelbox data to align the geometry / spatial points.
If you are using EXIF data, please verify that your ML process loads data with the correct EXIF orientation. See EXIF Orientation Assumptions for Images.
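To make the geometry alignment concrete, here is a minimal sketch of how annotation points can be mapped from stored-pixel coordinates to displayed coordinates for the rotation-only EXIF orientation values (1, 3, 6, 8). This is not Diffgram's actual code. The function name is hypothetical, and it assumes the points were recorded against the stored (unrotated) pixels, which the text above does not confirm for every export.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def to_display_coords(
    points: Sequence[Point], width: float, height: float, orientation: int
) -> List[Point]:
    """Map (x, y) points from stored-pixel space to displayed space.

    width and height are the dimensions of the image as stored on disk.
    Only the rotation-only EXIF orientations are handled here.
    """
    if orientation == 1:  # already upright
        return [(x, y) for x, y in points]
    if orientation == 3:  # displayed rotated 180 degrees
        return [(width - x, height - y) for x, y in points]
    if orientation == 6:  # displayed rotated 90 degrees clockwise
        return [(height - y, x) for x, y in points]
    if orientation == 8:  # displayed rotated 90 degrees counter-clockwise
        return [(y, width - x) for x, y in points]
    raise ValueError(f"Unsupported EXIF orientation: {orientation}")
```

For orientations 6 and 8 the displayed image swaps width and height, so a landscape file becomes portrait. A quick sanity check for your own pipeline is to transform the four image corners and confirm they land on the corners of the displayed image. The mirrored orientations (2, 4, 5, 7) are deliberately left out of this sketch.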
You can create multiple Diffgram Projects and import the associated Labelbox Projects, or import multiple Labelbox Projects into one Diffgram project.
Yes. We import all datasets attached to the project by default. The names are maintained.
For example, a Project "Alpha" in Labelbox with datasets A, B, and C results in Diffgram datasets A, B, and C in the existing Diffgram Project.
We need an underlying ontology to be able to migrate labels, and in Labelbox ontologies are attached directly to a project.
No, it only adds elements that do not already exist. The migration performs no destructive actions. If a label or attribute already exists, it is skipped. If a file already exists, an error is shown on the input table.
It skips labels and attributes that already exist. Files get enqueued again, but duplicates are automatically blocked because media processing checks whether the filename already exists in that dataset.
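The re-run behavior described above can be pictured as a non-destructive merge. The following is a simplified model for illustration only, not Diffgram's implementation: `ProjectState`, `merge_migration`, and the report keys are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class ProjectState:
    """Toy model of what already exists in the target project."""
    labels: Set[str] = field(default_factory=set)
    files_by_dataset: Dict[str, Set[str]] = field(default_factory=dict)


def merge_migration(
    state: ProjectState,
    incoming_labels: Iterable[str],
    incoming_files: Iterable[Tuple[str, str]],  # (dataset name, filename)
) -> Dict[str, List]:
    """Add only what is missing; never modify or delete existing items."""
    report: Dict[str, List] = {
        "labels_added": [], "labels_skipped": [],
        "files_added": [], "files_blocked": [],
    }
    for name in incoming_labels:
        if name in state.labels:
            report["labels_skipped"].append(name)  # existing label is kept
        else:
            state.labels.add(name)
            report["labels_added"].append(name)
    for dataset, filename in incoming_files:
        existing = state.files_by_dataset.setdefault(dataset, set())
        if filename in existing:
            # Same filename in the same dataset: blocked as a duplicate.
            report["files_blocked"].append((dataset, filename))
        else:
            existing.add(filename)
            report["files_added"].append((dataset, filename))
    return report
```

In this model, running the same migration twice adds nothing the second time: every label lands in `labels_skipped` and every file in `files_blocked`. Note that the duplicate check is per dataset, so the same filename in two different datasets is not treated as a duplicate here.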
At the moment, please run one migration for each Labelbox project.
It's entirely up to you. Examples:
- Create one Diffgram project for each set of Schema in Labelbox, then import all associated Labelbox projects to that one Diffgram project.
- Create one Diffgram project for each Labelbox project
For more context, see the "Getting Familiar with Diffgram Concepts" table, which shows how the scope of a Diffgram Project compares to that of a Labelbox Project, because they are not the same.
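If you have many Labelbox projects, it can help to write the plan down before starting. The sketch below shows the two example layouts as simple groupings. All names are hypothetical, and the input is a hand-written mapping of Labelbox project name to ontology name, not something fetched from either product.

```python
from collections import defaultdict
from typing import Dict, List


def plan_one_project_per_schema(
    labelbox_projects: Dict[str, str]
) -> Dict[str, List[str]]:
    """Group Labelbox projects that share an ontology into one Diffgram project.

    Keys of the result are planned Diffgram projects (named after the
    ontology); values are the Labelbox projects to migrate into each one.
    """
    plan: Dict[str, List[str]] = defaultdict(list)
    for project_name, ontology_name in labelbox_projects.items():
        plan[ontology_name].append(project_name)
    return dict(plan)


def plan_one_project_per_labelbox_project(
    labelbox_projects: Dict[str, str]
) -> Dict[str, List[str]]:
    """Create one Diffgram project for each Labelbox project."""
    return {name: [name] for name in labelbox_projects}
```

Either way, each Labelbox project in a list is still its own migration run into the target Diffgram project.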
Yes, the migration runs on the batch processing service, so there is no need to stay on the page once it's started.
Currently there is not, but you can archive the dataset afterward.