Compound Files Ingestion
A Guide For Uploading Compound Files In Diffgram.
Compound files are a Diffgram concept that lets you combine multiple media assets into a single File entity, making it easier to reason about and group your data and to manage large annotation projects. Some example use cases are:
- Pictures of multi-page documents
- Multi-camera annotation scenes
- Pictures that are related in some way that is meaningful to the business
Compound Files Are Supported For Images Only
Later versions of Diffgram will add support for text, audio, video and combinations of those.
Uploading Compound Files Using The SDK
To upload files with the SDK, you need Diffgram SDK version 0.10.2 or greater. We will start by uploading a two-page document:
from diffgram import Project
from diffgram.file.compound_file import CompoundFile

project = Project(host = "https://diffgram.com",
                  project_string_id = "replace_with_project_string",
                  client_id = "replace_with_client_id",
                  client_secret = "replace_with_client_secret")

parent = CompoundFile(
    project=project,
    name='myFirstCompoundFile',
    directory_id=project.default_directory.id
)

parent.add_child_from_local(path='path/to/your_file.jpg')
parent.add_child_from_local(path='path/to/your_second_file.jpg')
parent.upload()
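When a document has many pages, the children can be added in a loop instead of one call per file. The following is a minimal sketch, assuming the pages are image files in a single local folder and that you want them added in filename order. `collect_page_paths` and `add_pages` are hypothetical helpers, not part of the Diffgram SDK; the only SDK call used is `add_child_from_local(path=...)` from the example above.

```python
from pathlib import Path

# Assumption: these are the image extensions in your dataset; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_page_paths(folder):
    """Return the image file paths in `folder`, sorted by filename."""
    return sorted(
        str(p) for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def add_pages(parent, folder):
    """Add every image in `folder` as a child of `parent`; return how many were added.

    `parent` is expected to be a CompoundFile (or any object that exposes
    add_child_from_local(path=...)). Nothing is transmitted here.
    """
    paths = collect_page_paths(folder)
    for path in paths:
        parent.add_child_from_local(path=path)
    return len(paths)
```

With the `parent` object created earlier, this would be used as `add_pages(parent, 'path/to/pages')` followed by `parent.upload()`.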
Uploading Files With Connections to Storage Providers
You can add files that live in a connected storage provider by passing a connection ID and a blob path to the add_child_from_blob_path method:
# Replace the connection ID with the Diffgram Connection ID of your storage provider.
parent.add_child_from_blob_path(blob_path = 'my/path/to/blob', connection_id = 25)
parent.upload()
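If several blobs from the same connection belong to one compound file, they can be added in one pass. This is a small sketch, assuming all blobs share a single connection ID; `add_blob_children` is a hypothetical helper, and the only SDK call it relies on is `add_child_from_blob_path(blob_path=..., connection_id=...)` as shown above.

```python
def add_blob_children(parent, blob_paths, connection_id):
    """Add each blob path as a child of `parent`; return the paths that were added.

    Empty strings and repeated paths are skipped so the same blob is not
    attached twice. Nothing is transmitted until parent.upload() is called.
    """
    added = []
    for blob_path in blob_paths:
        if not blob_path or blob_path in added:
            continue
        parent.add_child_from_blob_path(blob_path=blob_path,
                                        connection_id=connection_id)
        added.append(blob_path)
    return added
```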
Upload
The add methods only register child files locally on the SDK's compound file object.
Calling upload() triggers the actual API transmission.
So, once you have added all the child files you need, you can start the upload.
# Upload the files.
parent.upload()
The response is a Python dict with the input ID, batch ID, and other metadata.
{
    "ann_is_complete": "None",
    "bucket_name": "None",
    "connection_id": "None",
    "count_instances_changed": "None",
    "created_time": "2023-01-24T18:31:34.129171",
    "hash": "38183bfec2fbed86968ac4d8f7eb305338f9c67318ff016811854463b15f10eb",
    "id": 2967,
    "input": {
        "archived": "False",
        "batch_id": "None",
        "created_time": "Tue, 24 Jan 2023 18:31:34 GMT",
        "description": "None",
        "directory": {
            "id": 106,
            "nickname": "Default"
        },
        "file_id": 2967,
        "id": 2227,
        "instance_list": "None",
        "media_type": "compound",
        "mode": "None",
        "newly_copied_file_id": "None",
        "original_filename": "myFirstCompoundFile",
        "parent_file_id": "None",
        "percent_complete": 100,
        "processing_deferred": "False",
        "raw_data_blob_path": "None",
        "retry_count": 0,
        "retry_log": "None",
        "source": "from_compound",
        "status": "success",
        "status_text": "None",
        "task_id": "None",
        "time_last_attempted": "None",
        "time_updated": "None",
        "total_time": "None",
        "update_log": {
            "error": {},
            "info": {},
            "success": "False"
        },
        "video_split_duration": 30,
        "video_was_split": "None"
    },
    "original_filename": "myFirstCompoundFile",
    "state": "added",
    "time_last_updated": "2023-01-24T18:31:34.149508",
    "type": "compound",
    "video_id": "None",
    "video_parent_file_id": "None"
}
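In the sample above, missing and boolean values appear as the strings "None", "True", and "False". The sketch below assumes the dict arrives in that form and pulls out the fields most likely to be needed afterwards. `normalize` and `summarize_upload` are hypothetical helpers, not SDK functions, and the key names are taken from the sample response only.

```python
# Placeholder strings seen in the sample response, mapped to Python values.
_PLACEHOLDERS = {"None": None, "True": True, "False": False}


def normalize(value):
    """Recursively replace the strings "None"/"True"/"False" with Python values."""
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    if isinstance(value, str):
        return _PLACEHOLDERS.get(value, value)
    return value


def summarize_upload(response):
    """Return the file ID, input ID, batch ID, and status from an upload response."""
    data = normalize(response)
    input_info = data.get("input") or {}
    return {
        "file_id": data.get("id"),
        "input_id": input_info.get("id"),
        "batch_id": input_info.get("batch_id"),
        "status": input_info.get("status"),
    }
```

For example, `summarize_upload(parent.upload())` would give a small dict to log or to check that `status` is `"success"` before continuing.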
Uploading Using Direct API Calls
For direct API calls, the process is divided into two steps:
- First, create the parent file using the Compound File Create Endpoint.
- Once you have an ID for the compound file, upload your files with the usual API calls, passing the compound file ID in the parent_id parameter of the upload endpoint you use.
Parent ID must be a compound file
Using the ID of a file that is not compound may cause unexpected behavior. Make sure to use the ID returned by the Compound File Create Endpoint.
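The two-step flow can be sketched without committing to specific URLs or request shapes. In this sketch, `create_parent` and `upload_child` are hypothetical callables that you supply: the first wraps your call to the Compound File Create Endpoint and returns the new compound file ID, and the second wraps whichever upload endpoint you use. Passing `parent_id` as a top-level field of the request payload is an assumption; check the endpoint reference for where it belongs.

```python
def upload_as_compound(create_parent, upload_child, child_payloads):
    """Create the parent first, then upload each child with parent_id attached.

    create_parent:  callable returning the compound file ID (hypothetical wrapper).
    upload_child:   callable taking one request payload dict (hypothetical wrapper).
    child_payloads: iterable of dicts, one per child file.
    """
    parent_id = create_parent()
    if parent_id is None:
        # Never fall back to some other file ID: the parent must be compound.
        raise ValueError("compound file creation did not return an ID")

    results = []
    for payload in child_payloads:
        # Copy the payload so the caller's dict is not modified.
        request = dict(payload, parent_id=parent_id)
        results.append(upload_child(request))
    return parent_id, results
```

Creating the parent inside the same function keeps the ID passed to every child tied to the create call, which avoids accidentally reusing the ID of a non-compound file.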