Uploading & Updating Files With Attributes

Learn how to upload files with attributes using the SDK

This page will show you basic examples for uploading files with instances and attributes (both global attributes and instance attributes).

Note: For updating data, the same principles apply. Check: Updating Files With API calls

SDK Example

Let's take a simple example for uploading an image with bounding boxes:

import random


def upload_image_with_3_boxes(project):
    # `project` is an authenticated diffgram Project (see the full example below)
    signed_url = "https://storage.googleapis.com/diffgram_public/example_data/000000001323.jpg"

    def create_box_instance(
            sequence_number: int = None,
            name: str = None):
        # Random coordinates stand in for real annotation data
        return {
            "name": name,
            "type": "box",
            "x_max": random.randint(180, 220),
            "x_min": random.randint(100, 120),
            "y_max": random.randint(230, 260),
            "y_min": random.randint(130, 160)
        }

    instance_list = []
    for i in range(3):
        instance_list.append(
            create_box_instance(name = "my_label_name"))

    # Upload once, after all three boxes have been collected
    result = project.file.from_url(
        signed_url,
        media_type = "image",
        instance_list = instance_list
    )
    print(result)

In this example we upload a single image URL with 3 bounding boxes. The boxes have no attributes yet, and the file has no global attributes either. Let's see how to add an instance attribute first.

Instance Level Attributes

To add attributes to an instance, you just have to specify the attribute_groups key on the instance object. This is an object with the format <attribute_group_id>:<attribute_group_selected_value_object>.

Let's look at each of these values in detail:

attribute_group_id: The ID of the attribute template group you want to fill. To find the ID of the attribute group you want to use, call the function project.get_attributes(schema_id).
attribute_group_selected_value_object: An object holding the selected value, or the value entered by the annotator. Its format depends on the attribute type:

Radio Buttons Attribute

{
    "display_name": "My selected option",  # The display name of the selected attribute option
    "id": 2,  # The ID of the selected attribute option
    "name": "My selected option"  # The internal name of the selected attribute option
}

You can find the available options to select on the attribute_template_list key of each attribute group when calling the project.get_attributes(schema_id) function.
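
As a minimal sketch of that lookup, the helper below searches one attribute group's attribute_template_list for an option by name and builds the radio button value from it. The helper name build_radio_value, the option fields (id, name, display_name) and the sample group are assumptions made for illustration; check the actual response of project.get_attributes(schema_id) in your project before relying on them.

```python
def build_radio_value(attribute_group: dict, option_name: str) -> dict:
    """Return the radio button value object for the option named `option_name`.

    Assumes each entry of `attribute_template_list` carries `id`, `name`
    and `display_name` keys (an assumption, not confirmed API behaviour).
    """
    for option in attribute_group.get("attribute_template_list", []):
        if option.get("name") == option_name:
            return {
                "display_name": option.get("display_name", option["name"]),
                "id": option["id"],
                "name": option["name"],
            }
    raise ValueError(f"No option named {option_name!r} in attribute group")


# Hypothetical attribute group, shaped like one item returned by get_attributes
sample_group = {
    "id": 1,
    "attribute_template_list": [
        {"id": 2, "name": "My selected option", "display_name": "My selected option"},
        {"id": 3, "name": "Another option", "display_name": "Another option"},
    ],
}

radio_value = build_radio_value(sample_group, "My selected option")
# Key the value by the attribute group ID, ready for the instance payload
attribute_groups = {sample_group["id"]: radio_value}
print(attribute_groups)
```

Raising an error for an unknown option name keeps a typo from silently producing an instance without the attribute.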

Tree View Attribute

{
    5: {"name": "my option", "selected": True}
}

In the tree view case, we have a nested object. The key of this object is the ID of the option (in this case the "my option" attribute option has ID 5). The value is an object holding the name of the option along with a selected flag.
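
The nested structure can be generated from a plain mapping of option IDs to names. This is only a sketch: build_tree_value is a hypothetical helper, and this page does not state whether several tree options may be selected at once, so the usage below selects a single one.

```python
def build_tree_value(selected_options: dict) -> dict:
    """Map {option_id: option_name} to the nested tree view value object."""
    return {
        option_id: {"name": option_name, "selected": True}
        for option_id, option_name in selected_options.items()
    }


# Option 5 is called "my option", as in the format shown above
tree_value = build_tree_value({5: "my option"})
print(tree_value)
```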

Putting this together on the instance, it will look something like this (assuming the radio button attribute has id=1 and the tree view attribute has id=2):

def create_box_instance(
          sequence_number: int = None,
          name: str = None):
      return {
          "name": name,
          "type": "box",
          "x_max": random.randint(180, 220),
          "x_min": random.randint(100, 120),
          "y_max": random.randint(230, 260),
          "y_min": random.randint(130, 160),
          "attribute_groups": {
              # Radio button attribute group (id=1)
              1: {
                  "display_name": "My selected option",  # The display name of the selected attribute option
                  "id": 2,  # The ID of the selected attribute option
                  "name": "My selected option"  # The internal name of the selected attribute option
              },
              # Tree view attribute group (id=2)
              2: {
                  5: {"name": "my option", "selected": True}
              }
          }
      }

Select Attribute

This is the format for setting the value of a select attribute:

{
    2: {"display_name": "apple", "id": 5, "name": 5}
}

Here the ID 2 is the ID of the select attribute group template, and the ID 5 is the ID of the option with the value apple.

Multiple Select Attributes

This is the format for setting the value of a multiple select attribute:

{
    3: [{"display_name": "Lion", "id": 11, "name": 11}, {"display_name": "Cow", "id": 12, "name": 12}]
}

The above format sets the attribute template with ID 3 to have the options with IDs 11 and 12 selected, which have the values Lion and Cow respectively.
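
The select and multiple select values share the same per-option object, so one small helper can produce both. The helper name option_value is hypothetical, and repeating the option ID in the name field simply mirrors the formats shown above rather than a documented rule.

```python
def option_value(option_id: int, display_name: str) -> dict:
    """Value object for one selected option; `name` repeats the option ID."""
    return {"display_name": display_name, "id": option_id, "name": option_id}


attribute_groups = {
    # Select attribute (group ID 2): a single option object
    2: option_value(5, "apple"),
    # Multiple select attribute (group ID 3): a list of option objects
    3: [option_value(11, "Lion"), option_value(12, "Cow")],
}
print(attribute_groups)
```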

Free Text

{
   4: "text attribute value goes here"
}

The above format sets the attribute template with ID 4 to the text value "text attribute value goes here".

Slider Attributes

{
   5: 28
}

The above format sets the attribute template with ID 5 to the slider value 28.

Time Attribute

{
   6: "06:25"
}

The above format sets the time attribute template with ID 6 to the time value "06:25". Notice that the format for time attributes is HH:MM. Timezones are not supported, so UTC can be assumed.

Date Attribute

{
   7: "2023-02-17"
}

The above format sets the date attribute template with ID 7 to the date value "2023-02-17". Notice that the format for date attributes is YYYY-MM-DD.
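
When the values come from Python datetime objects, formatting them explicitly avoids sending a string in the wrong shape. The sketch below assumes the HH:MM time format described above and the dash-separated form of the date example (2023-02-17); the two helper names are hypothetical, and the group IDs 4 to 7 are the ones used in the examples on this page.

```python
from datetime import date, time


def format_time_value(value: time) -> str:
    """Format a time as HH:MM (no timezone information is included)."""
    return value.strftime("%H:%M")


def format_date_value(value: date) -> str:
    """Format a date as YYYY-MM-DD."""
    return value.isoformat()


attribute_groups = {
    4: "text attribute value goes here",      # free text
    5: 28,                                    # slider
    6: format_time_value(time(6, 25)),        # time
    7: format_date_value(date(2023, 2, 17)),  # date
}
print(attribute_groups)
```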

Uploading Files With Global Attributes

Global attributes work in a similar way to instance attributes. The only difference is that you have to declare a special type of instance called global. This instance does not need a label name, so you can leave that field blank, and the same goes for the x and y values.

def create_global_instance(
          sequence_number: int = None,
          name: str = None):
      return {
          "type": "global",
          "attribute_groups": {
              3: {
                  "display_name": "My selected Global Attribute",
                  "id": 10,
                  "name": "My selected Global Attribute"
              }
          }
      }

In the above example, the global attribute group has id=3 and the selected radio button option has id=10.

Full Example with a Global Attribute

import random
from diffgram import Project

project = Project(
    project_string_id = "my_project",
    client_id = "my_client_id",
    client_secret = "my_secret",
)

print(project.client_id)

dataset = project.directory.get('Default')


def mock_box_from_external_format(
        sequence_number: int = None,
        name: str = None):
    return {
        "name": name,
        "number": sequence_number,
        "type": "box",
        "x_max": random.randint(180, 220),
        "x_min": random.randint(100, 120),
        "y_max": random.randint(230, 260),
        "y_min": random.randint(130, 160)
    }


def create_global_attribute():
    return {
        "name": None,
        "number": None,
        "type": "global",
        'attribute_groups': {
            1: {
                'display_name': 'R',
                'id': 1,
                'name': 1
            }
        }
    }


def upload_image_with_instances(project):
    signed_url = "https://storage.googleapis.com/diffgram_public/example_data/000000001323.jpg"

    instance_list = []

    for i in range(3):
        instance_list.append(
            mock_box_from_external_format(name = "my_label"))
    global_attr = create_global_attribute()

    instance_list.append(global_attr)
    result = project.file.from_url(
        signed_url,
        media_type = "image",
        instance_list = instance_list
    )
    print(result)


upload_image_with_instances(project)

Updating Existing Files With New Attributes

The only thing to keep in mind when updating existing files with new attributes is that the instance_list payload must include the existing id of each instance to be updated. Otherwise, an instance without an id is ignored if it is a global instance, or treated as a new instance if it is not. With the IDs in place, call the file.update() method or use the direct API call Input Packet.

Based on the above example:

def update_global_instance(project):
    instance_list = []
    # create_global_attribute() returns an instance without an id, so set the
    # id of the existing global instance before sending the update
    global_attr_instance = create_global_attribute()
    global_attr_instance['id'] = <ID_OF_THE_EXISTING_INSTANCE>
    instance_list.append(global_attr_instance)
    result = project.file.update(
        instance_list = instance_list,
        overwrite = False
    )
    print(result)

You can get the ID of the existing global instance (<ID_OF_THE_EXISTING_INSTANCE> above) by fetching the existing instance list of the child file with this API call:
Get Instances for a File
The response contains an array of instances. One of the items in that array is the global instance, which has type: 'global'. Its id is the value you are looking for.
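
Once the instance list has been fetched, picking out the global instance is a simple filter on the type field. The sketch below works on an already-parsed list of dictionaries; the helper name find_global_instance_id and the sample data are hypothetical, and the real response may carry more fields than shown.

```python
from typing import Optional


def find_global_instance_id(instances: list) -> Optional[int]:
    """Return the id of the first instance with type 'global', or None."""
    for instance in instances:
        if instance.get("type") == "global":
            return instance.get("id")
    return None


# Hypothetical instance list, shaped like the array described above
sample_instances = [
    {"id": 101, "type": "box", "name": "my_label"},
    {"id": 102, "type": "box", "name": "my_label"},
    {"id": 103, "type": "global", "name": None},
]

global_id = find_global_instance_id(sample_instances)
print(global_id)
```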

The only case where you can provide an instance_list without the existing IDs is when you use the update_with_existing mode of the Input Packet API, which is equivalent to overwrite=True in the Python SDK.

📘

Remember: Provide the ID of the instance when updating attributes

If you are using update mode, provide the IDs of the instances for each instance list. If you are using update_with_existing, you can provide the instances without IDs but all existing instances on the file to be updated will be deleted.
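
A quick local check before sending an update can catch instances that are missing an id. The sketch below encodes the rules described in this section: without overwrite, a global instance lacking an id is ignored and any other instance lacking an id is treated as new. The function check_update_payload is a hypothetical helper, not part of the SDK.

```python
def check_update_payload(instance_list: list, overwrite: bool = False) -> list:
    """Return warnings for instances that lack an `id` in update mode."""
    if overwrite:
        # IDs are not required here; existing instances on the file are replaced
        return []
    warnings = []
    for index, instance in enumerate(instance_list):
        if instance.get("id") is not None:
            continue
        if instance.get("type") == "global":
            warnings.append(f"instance {index}: global instance without id will be ignored")
        else:
            warnings.append(f"instance {index}: instance without id will be treated as new")
    return warnings


# Hypothetical payload: one valid global update and two instances missing an id
payload = [
    {"id": 103, "type": "global", "attribute_groups": {1: {"display_name": "R", "id": 1, "name": 1}}},
    {"type": "global", "attribute_groups": {}},
    {"type": "box", "name": "my_label", "x_min": 100, "y_min": 130, "x_max": 200, "y_max": 240},
]

payload_warnings = check_update_payload(payload)
for warning in payload_warnings:
    print(warning)
```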