Query Syntax Reference

Understand the basic concepts of the query syntax for Diffgram.

Optimizing your Query

Queries work similarly to SQL: the more specific the query, the faster it will usually run. An overly broad query may time out.

Main Filter Objects

The query syntax provides four base objects to filter on:

  • Datasets
  • Attributes
  • Labels
  • Files

Logical Operators

You can combine expressions on the base objects above using the following boolean operators.

  • and: Logical AND between two operands. <op1> AND <op2>
  • or: Logical OR between two operands. <op1> OR <op2>
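
As a rough illustration of combining expressions, the Python sketch below joins expression strings with a single boolean operator. The combine helper is hypothetical and not part of Diffgram; it only builds the query text. Precedence between and and or is not described on this page, so the sketch sticks to one operator per call.

```python
def combine(operator, *expressions):
    """Join query expressions with one boolean operator ('and' or 'or').

    Hypothetical helper: it only concatenates strings and does not
    validate the individual expressions.
    """
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    if not expressions:
        raise ValueError("at least one expression is required")
    # A single expression is returned unchanged.
    return f" {operator} ".join(expressions)


query = combine("and", "labels.pedestrian = 1", "labels.car < 5")
print(query)
```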

Operator Details:

1. Datasets

Use the dataset keyword to filter your project data by a specific set of datasets.


  • dataset.id in [<dataset_id_1>, <dataset_id_2>]: Return the datasets whose IDs are in the given list


Get datasets with IDs 5 and 7

dataset.id in [5,7]
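
If dataset IDs come from application code, the expression can be generated instead of typed by hand. The following is a minimal sketch; the dataset_filter function is an illustrative name, not a Diffgram API, and it reproduces the spacing of the example above.

```python
def dataset_filter(dataset_ids):
    """Build a 'dataset.id in [...]' expression from an iterable of IDs.

    Hypothetical helper: IDs are coerced to int so that stray strings
    such as "5" still produce a numeric list.
    """
    ids = [int(dataset_id) for dataset_id in dataset_ids]
    if not ids:
        raise ValueError("at least one dataset ID is required")
    return "dataset.id in [" + ",".join(str(i) for i in ids) + "]"


print(dataset_filter([5, 7]))
```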

2. Attributes

Use the attributes keyword to filter by the value of a given attribute.


  • attributes.<attr_name> = <attr_value_id>


Get All files whose color attribute is "Blue", where the "Blue" option ID is 458

attribute.color = 458

3. Labels

Use the labels keyword to filter based on the existence or count of labels in a file.


  • Equality (=) labels.<label_name> = <count>: Get files with exactly the given count of labels named label_name
  • Greater Than / Greater Than Equals (>, >=) labels.<label_name> > <count>: Get files with more than (>) or at least (>=) the given count of labels named label_name
  • Less Than / Less Than Equals (<, <=) labels.<label_name> < <count>: Get files with fewer than (<) or at most (<=) the given count of labels named label_name


Get All files with at least one "stop sign" label

labels.stop_sign >= 1

Get All files with exactly one pedestrian and fewer than 5 cars

labels.pedestrian = 1 AND labels.car < 5
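
The label comparisons listed above can be sketched as a small builder. The label_filter function and the LABEL_OPERATORS tuple are hypothetical names used only for illustration. The sketch assumes label names are written without spaces, as in the stop_sign example.

```python
# Comparison operators listed for label counts in this section.
LABEL_OPERATORS = ("=", ">", ">=", "<", "<=")


def label_filter(label_name, operator, count):
    """Build a 'labels.<label_name> <operator> <count>' expression.

    Hypothetical helper: it checks only the operator, the count, and
    that the label name contains no whitespace.
    """
    if operator not in LABEL_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    if not isinstance(count, int) or count < 0:
        raise ValueError("count must be a non-negative integer")
    if not label_name or any(ch.isspace() for ch in label_name):
        raise ValueError("label name must be non-empty and contain no whitespace")
    return f"labels.{label_name} {operator} {count}"


print(label_filter("stop_sign", ">=", 1))
print(label_filter("pedestrian", "=", 1) + " AND " + label_filter("car", "<", 5))
```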

4. Files


  • file.created_time [ "<=" | ">=" ] "YYYY-MM-DD": Filter by the time the file was created.
  • file.ann_is_complete = "true" / "false": Filter by whether all of a file's tasks are completed ("true") or some tasks are not completed ("false").
  • file.metadata.<key> = <value>: Get the files with the metadata field key set to value

Reserved values

reserved_columns = ['created_time', 'time_last_updated', 'type', 'ann_is_complete', 'original_filename', 'task_id', 'frame_number', 'parent_id']

Please note that reserved values are directly available on file, whereas file.metadata contains arbitrary customer-supplied values.


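Time defaults to the start of the day, e.g. "2023-06-15" -> "2023-06-15T00:00:00".

Note that operators must be >= or <=, because an equality = is cast to 2023-06-15T00:00:00, matches only that exact instant, and does not return useful results. By the same rule, an upper bound such as <= "2023-05-31" stops at the start of May 31.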


Get All files created in May 2023

file.created_time >= "2023-05-01" and file.created_time <= "2023-05-31"

Get All completed files created in May 2023

file.ann_is_complete = "true" and file.created_time >= "2023-05-01" and file.created_time <= "2023-05-31" 
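
Since a date resolves to the start of its day, one way to cover an entire calendar month, including its last day, is to use the first day of the following month as the upper bound. The Python sketch below does that. The created_in_month function is a hypothetical helper, and the choice of upper bound is an assumption drawn from the start-of-day rule, not a documented recommendation; it differs from the examples above, which end at "2023-05-31".

```python
from datetime import date


def created_in_month(year, month):
    """Build a created_time range expression covering one calendar month.

    Hypothetical helper. The upper bound is the first day of the next
    month, which the query engine is assumed to read as 00:00:00.
    """
    start = date(year, month, 1)
    # Roll over to January of the next year after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return (
        f'file.created_time >= "{start.isoformat()}" '
        f'and file.created_time <= "{end.isoformat()}"'
    )


print(created_in_month(2023, 5))
```

Because only >= and <= are listed for created_time, a file created at exactly midnight on the first day of the next month would also match this range.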


More Operators and Queries Support Coming Soon

Feel free to send us a message or open a GitHub issue if you have ideas for useful queries.

API Usage Considerations

Keep the following in mind when using the raw API:

  • You must include the directory_id in the metadata params, even if the directory is also specified in the query.
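
A client-side guard can catch a missing directory_id before a request is sent. The sketch below is only an assumption about how such a check might look: with_directory is a hypothetical helper, and the "query" key in the usage line is a placeholder, since the full request shape is not described here. Only the directory_id name comes from this page.

```python
def with_directory(metadata, directory_id):
    """Return a copy of the metadata params with directory_id set.

    Hypothetical helper: raises instead of silently sending params
    that lack a directory_id.
    """
    if directory_id is None:
        raise ValueError("directory_id is required in the metadata params")
    params = dict(metadata)  # copy so the caller's dict is not mutated
    params["directory_id"] = directory_id
    return params


# "query" is a placeholder key used for illustration only.
params = with_directory({"query": "labels.car < 5"}, 12)
print(params)
```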