Export Walkthrough
Getting started from the UI
The easiest way to get started is to export from the UI.
When we first arrive we see some options and an empty list:
Here it has defaulted to the directory Luckfrill.
When we click generate we see
Large exports may take several minutes or longer. Small exports are faster. You may need to click the refresh button to see updated status and results.
We must allow pop-ups for the download button to work. When we click download we see this file come up:
When we first load the file it may be hard to read.
We can "un-minify" to expand the JSON. (Your editor may do this automatically)
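If your editor does not expand the file for you, a few lines of Python can do it. This is a minimal sketch using only the standard library; the function name `unminify` and the sample string are placeholders for illustration, not part of the export format.

```python
import json


def unminify(minified_text):
    """Return the same JSON document re-serialized with indentation."""
    parsed = json.loads(minified_text)
    # indent=2 puts each key on its own line; sort_keys is left off so
    # the key order found in the file is kept as-is.
    return json.dumps(parsed, indent=2)


# Placeholder content, not a real export.
sample = '{"a":{"b":[1,2]},"c":"d"}'
print(unminify(sample))
```

To expand a downloaded file, read its text and pass it to `unminify`, then write the result back out or inspect it directly.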
And that's it! Now we have the file ready to review and use.
For details on the meaning of the file please refer to the details here.
Completed Files Only
By default, exports from the UI only export files marked as complete.
This is optional.
Caution: File Completion Status is different from Task Status
Because a file may be worked on by multiple tasks, the file status is separate from the task status. For example, creating new tasks using existing files will reset the file completed status. Use the check values to verify that the export is exporting what you expect.
Check Values
A file count is provided as a check value. We highly recommend making sure the file count matches the value you expect, for example the number of completed tasks or the number of files shown in the studio.
For more general inspection and visibility on work see Reporting.
If you have additional ideas on check values please send feedback.
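The file count check can also be automated once the export is loaded. The sketch below is one possible shape, not an SDK feature: `check_file_count` is a hypothetical helper, and `exported_files` is a stand-in list, since how you collect the files from the export depends on its structure.

```python
def check_file_count(exported_files, expected_count):
    """Raise ValueError if the number of exported files is unexpected."""
    actual_count = len(exported_files)
    if actual_count != expected_count:
        raise ValueError(
            f"Export has {actual_count} files, expected {expected_count}"
        )
    return actual_count


# Hypothetical stand-in for the files found in a loaded export.
exported_files = ["file_1", "file_2", "file_3"]
check_file_count(exported_files, expected_count=3)
```

Failing loudly here is a deliberate choice: a mismatch usually points at a setting such as completed versus all, and is cheaper to catch before the data is used downstream.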
Export Troubleshooting and Alignment - Overview
There are a variety of ways data can be modified, including tasks, data-streams, and connections. There are also a variety of settings and choices for exporting. It's important to be aware of these differences.
Alignment between Engineering, Project Admins and Data Science
The Export is where the "rubber hits the road" in terms of the results. Communication on expectations and understanding of settings here is important to get a good result.
Assumptions
- Do the settings chosen at import align with my export expectations? For example:
  - If a video was split, then global frame numbers will be available, and this will change how the clip is presented at export.
  - Resolution: double check the provided width and height (inside the generated file).
- Was the Job/Tasks etc. completed to my expectations?
  We may visually inspect or spot check tasks.
Export settings
Correct, meaning the right match for the above expectations:
- Is the correct source selected?
  For example, for generated exports, we can go to the job from the source, and then visually inspect the tasks to ensure alignment.
- Is the correct name (sub option) selected?
  Is the {job_name, directory name} a match?
- Check options?
  For example: is {completed, all} as expected?
- Look at check values, file counts.
- Advanced conditions
  Modified files, removed instances, etc.
  By default removed instances are optionally available on the UI but not included in the export.
Data is randomly not lining up.
While bugs are always possible, in general export errors are usually more dramatic than they first seem. Often the above troubleshooting steps can help resolve issues.
In general, anything that is rendering in the studio / tasks can be exported. The system saves automatically on a regular basis, and reloads data for confirmations / inspections.
Export File Considerations
Considerations when reviewing exports:
- Order may not be preserved.
  For example, attribute groups, instance_list, and even files in general. So if for example a flag (such as complete/not complete) causes files to be excluded, then it may appear that data is randomly missing.
- Some resources are unique per project.
  For example, a label "Apple" is created in two projects. It will have a different File ID.
- There are limits on various list sizes.
  For example, it's possible to hit a limit on how many instances are generated per file.
  While this is rare, if you have tried the other troubleshooting items and are still "missing" information please contact support.
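Because order is not guaranteed, comparisons are safer when keyed on an ID rather than on position. The sketch below assumes you already have the file IDs you expect and the file IDs found in the export; `compare_file_ids` and both ID lists are hypothetical placeholders.

```python
def compare_file_ids(expected_ids, exported_ids):
    """Compare two collections of IDs without relying on their order."""
    expected = set(expected_ids)
    exported = set(exported_ids)
    return {
        "missing": sorted(expected - exported),     # expected but absent
        "unexpected": sorted(exported - expected),  # present but not expected
    }


# Hypothetical IDs; the export lists them in a different order.
result = compare_file_ids(expected_ids=[1, 2, 3, 4], exported_ids=[3, 1, 4])
print(result)
```

A non-empty "missing" list is a prompt to revisit the export options (for example completed versus all) before assuming data was lost.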
Other administrative considerations:
- Am I looking at the correct file?
Files are named with the export ID, datetime, and other information to help identify them.
Note in data stream cases, there could be various reasons that the latest file is not pushed/available in your cloud bucket.
- When was the file generated?
For example, if the file was generated a few hours ago, and new instances have been added (or status changes etc), then the export will need to be regenerated.
Note that completed status does not "lock" a file, so a completed file may still be modified if needed.
Export Generation through SDK
SDK Version >= 0.1.7.6
From a Job
First we get the job by the id. Then we create a new export.
job = project.job.get_by_id(id = your_job_id)
data = job.generate_export()
print(data)
This assumes we have defined a project object as defined here.
By default it returns the results as JSON:
This is similar to if we had clicked Generate on the UI, then Download, then loaded the file into our application.
Must store or retrieve job ID
The assumption is that when a job is being created your system will store the job_id to retrieve the job. If the job id is not stored it can be retrieved from the web UI by navigating to the job. The URL is of the format job/job_id.
Return URL
Another option is to generate a URL to the resource. This can be useful for downloading the file to some other system or storage option.
your_job_id = 1
job = project.job.get_by_id(id = your_job_id)
url = job.generate_export(
    return_type = "url")
print(url)
For the return_type we have selected return by url. We print the URL. To test it we can copy the URL into our browser and download the JSON file.
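To pull the file into another system from code rather than from the browser, the returned URL can be fetched with the Python standard library. This is a minimal sketch, assuming the URL can be fetched directly and needs no extra headers; `download_export` and the destination path are hypothetical names, not part of the SDK.

```python
import shutil
import urllib.request


def download_export(url, dest_path):
    """Stream the resource at `url` into a local file and return its path."""
    # Assumption: the URL is directly fetchable with no extra headers.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # copyfileobj streams in chunks, so large exports are not held in memory.
        shutil.copyfileobj(response, out)
    return dest_path
```

For example, `download_export(url, "export.json")` would save the file using the `url` value printed above.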
Trigger Export generation without blocking
Returns meta data, not annotations
It returns an Export object. This is not the actual annotation data.
Recommended approach for large datasets and/or repeated downloads
By default, a call to generate_export() waits until the export generation is complete before returning.
However, for longer exports, or in other batch cases, you may wish to trigger the export generation without blocking processing.
We can do this by setting wait_for_export_generation = False.
# SDK Version >= 0.1.7.6
import time

export = job.generate_export(
    wait_for_export_generation = False)
# Give the export a moment to finish before asking for the data.
time.sleep(2)
data = export.access_data(
    return_type = "url")
Now it will return information regarding the Export object.
We can then call access_data() on the Export object ourselves to get the export.
Returns immediately
Because this method returns before the export is complete, it's possible access_data() will throw Exception: Export not ready yet.
In this example we add a time.sleep(). Alternatively you could check the export status and download upon success.
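Instead of a fixed sleep, the call can be retried until the export is ready. The sketch below is generic: `fetch` is any zero-argument callable, for example `lambda: export.access_data(return_type = "url")`. It assumes the not-ready case surfaces as a plain `Exception`, as the message above suggests; the helper name, attempt count, and delay are illustrative choices and not part of the SDK.

```python
import time


def wait_for_result(fetch, attempts=5, delay_seconds=2.0):
    """Call `fetch` until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:  # assumption: "not ready" raises Exception
            last_error = error
            # Sleep between attempts, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    raise TimeoutError(f"No result after {attempts} attempts") from last_error
```

Catching a broad `Exception` also retries on unrelated failures, so in practice you may want to narrow the check once you know the exact exception the SDK raises.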
Access existing exports
import time
# SDK Version >= 0.1.7.6
export_list = project.export.list()
export = export_list[0]
time.sleep(2)
data = export.access_data( format = "url")