add()
Add links for the given file_ids to the dataset.
Does not modify existing links.
file_id_list = [1025104, 1025103]
apples_directory.add(file_id_list = file_id_list)
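As a slightly fuller sketch of the call above (the dataset name "apples" and the file ids are placeholders; project is assumed to be an already-authenticated SDK project object):
apples_directory = project.directory.get("apples")  # "apples" is a placeholder dataset name

file_id_list = [1025104, 1025103]  # placeholder file ids
result = apples_directory.add(file_id_list = file_id_list)
print(result)  # returns a log dict; see "Error handling" below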
Context for Directory Add
Compared to copying the file:
Pros:
- Faster and uses less memory
- New tasks are created (if the dataset is synced with a task template)
- Organizes updated data in a new set (each update may be a new set)
Cons:
- Does not version (beyond relating the files to a new set)
Notes:
- Unlike the move operator, this does not touch existing links. E.g. the "original" dataset will still have the file link.
- (Per file) Errors if an existing link is present, e.g. "1025061": "File already in dataset id: 3298". (See the sketch below.)
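As a rough sketch of the per-file error behavior (the dataset name and file id are placeholders; project is assumed to be an authenticated SDK project object):
destination_dataset = project.directory.get("destination_dataset")  # placeholder name

# First add creates the link; any "original" dataset keeps its own link.
destination_dataset.add(file_id_list = [1025061])

# Adding the same file again reports a per-file error in the returned log rather than raising.
result = destination_dataset.add(file_id_list = [1025061])
print(result)  # e.g. the error key contains {"1025061": "File already in dataset id: ..."}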
Example: Adding existing files while updating
Let's imagine we want to update some files with new data. When the update is complete, we want each file to appear in a new dataset. If the new dataset is watched by a Task Template, then tasks will be created automatically.
The code below uses the frame packet map concept; read more about it here:
Frame Packet Map
id = 1 # replace me
frame_packet_map = None # replace me with a real frame_packet_map.
# Get an existing dataset, may be unrelated to current file
destination_dataset = project.directory.get("destination_dataset") # replace name
# Get existing file and update it
file = project.file.get_by_id(id = id)
file.update(frame_packet_map = frame_packet_map)
# Add link to new dataset
result = destination_dataset.add(file_id_list = [file.id])
Example: Add Files
E.g. for files that were already updated, where we just want to add the links.
file_id_list = [243, 444, 422, ...] # Populate with known list of file ids
destination_dataset = project.directory.get("destination_dataset")
result = destination_dataset.add(file_id_list = file_id_list)
print(result) # your error handling
A more efficient pattern for many files: update each file first, then call add() once with the full list. (your_updates below is a placeholder dict mapping file id to frame_packet_map.)
def your_custom_file_update_function(id, frame_packet_map):
    file = project.file.get_by_id(id = id)
    file.update(frame_packet_map = frame_packet_map)
    # your error handling

# Loop through your files using your_custom_file_update_function()
for id, frame_packet_map in your_updates.items():  # your_updates is a placeholder
    your_custom_file_update_function(id, frame_packet_map)

# After, call add() once with the full list of ids
file_id_list = list(your_updates.keys())
destination_dataset = project.directory.get("destination_dataset")
result = destination_dataset.add(file_id_list = file_id_list)
print(result)  # your error handling
Error handling
Returns the result in the regular log format: the log dict contains error and info keys.
Within error and info, when possible the key corresponds to the file id.
The approach is "greedy": it processes the files it can and returns errors for the ones it can't. In the example below, 3 files had errors and 2 were successful.
For example, you can walk the info dict and cross-reference it against an internal list to confirm the files were added successfully (see the sketch after the example results dict below). The overall 'success' flag will only be true if all files were successful.
Example results dict
"log": {
"error": {
"1025061": "File already in dataset id: 3298",
"Missing file id(s)": "One or more files are missing the id",
"1": "Invalid ID for this project."
},
"info": {
"1025103": true,
"1025104": true
},
"success": false
}
}
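As a minimal sketch of the cross-referencing idea above, assuming result is the dict returned by add() and matches the example shape (expected_ids is a hypothetical list you maintain):
# Hypothetical internal list of file ids you expected to add
expected_ids = [1025103, 1025104, 1025061]

log = result.get("log", {})  # assuming the shape of the example dict above
added_ok = {key for key, value in log.get("info", {}).items() if value is True}

for file_id in expected_ids:
    if str(file_id) in added_ok:
        print(f"{file_id}: added")
    else:
        error = log.get("error", {}).get(str(file_id), "unknown error")
        print(f"{file_id}: not added -> {error}")

# The overall flag is only true when every file succeeded
if log.get("success") is not True:
    print("At least one file failed; see the error entries above.")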
Best practices
- Add 50-100 items per add() call (see the batching sketch below).
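A minimal batching sketch following that guideline (the batch size and file_id_list are placeholders; destination_dataset is obtained as in the examples above):
BATCH_SIZE = 100  # placeholder, within the 50-100 guideline

destination_dataset = project.directory.get("destination_dataset")

results = []
for start in range(0, len(file_id_list), BATCH_SIZE):
    batch = file_id_list[start:start + BATCH_SIZE]
    result = destination_dataset.add(file_id_list = batch)
    results.append(result)  # your error handling per batch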