Category Annotations

In this guide, we'll use the Python SDK to construct category annotations for image classification tasks. A category annotation object in Nucleus has four components:

  • Label of the annotation in the class taxonomy, e.g. shirt or dress
  • Reference ID of the item to which to apply the annotation
  • Name of the taxonomy to which the label belongs, optional but recommended
  • Metadata, optional key-value pairs pertaining to the annotation, e.g. color = blue

Nucleus currently supports single-label categorization natively, and can support multi-label classification with a workaround.

For more information, check out our Python SDK reference for CategoryAnnotations.

Taxonomies

To begin, we recommend manually creating and adding your own taxonomies to your datasets. This allows you to apply more than one category annotation (and prediction) to a single image, where each annotation pertains to a different taxonomy.

By default, if taxonomy_name is not provided in the CategoryAnnotation object, Nucleus will create a "default taxonomy" automatically based on the set of all unique labels in the uploaded annotation payload. If you continue to upload more CategoryAnnotations, new labels will automatically be added to the default taxonomy.
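To preview which labels would end up in the default taxonomy, you can collect the unique labels of every annotation that has no taxonomy_name. The sketch below is plain Python and only mirrors the behavior described above on the client side; it is not how Nucleus implements it. The helper preview_default_labels and the dictionary-based payload are hypothetical and used for illustration only.

```python
from typing import Dict, List, Optional

# Hypothetical payload: one dict per annotation, using the same field names
# as the CategoryAnnotation components listed at the top of this guide.
payload: List[Dict[str, Optional[str]]] = [
    {"label": "shirt", "reference_id": "image_1"},
    {"label": "dress", "reference_id": "image_2"},
    {"label": "shirt", "reference_id": "image_3"},
    # This one names a taxonomy, so it does not contribute to the default one.
    {"label": "jacket", "reference_id": "image_4", "taxonomy_name": "clothing_type"},
]


def preview_default_labels(annotations: List[Dict[str, Optional[str]]]) -> List[str]:
    """Return the sorted unique labels of annotations without a taxonomy_name.

    Hypothetical helper (not part of the Nucleus SDK).
    """
    unique = {a["label"] for a in annotations if a.get("taxonomy_name") is None}
    return sorted(unique)


default_labels = preview_default_labels(payload)
```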

You can create a taxonomy and associate it with a dataset using Dataset.add_taxonomy:

import nucleus
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.get_dataset("YOUR_DATASET_ID")

response = dataset.add_taxonomy(
    taxonomy_name="clothing_type",
    taxonomy_type="category",
    labels=["shirt", "trousers", "dress"],
    update=False
)

Once the taxonomy is added, any CategoryAnnotation can use its labels by passing the taxonomy's name as taxonomy_name.

By setting update=True, you can add new labels to an existing taxonomy (e.g. from labels=["shirt", "trousers", "dress"] to labels=["shirt", "trousers", "dress", "jacket"]). We do not yet support changing or deleting existing labels, but this is coming soon! In the meantime, we recommend deleting the taxonomy and recreating it from scratch with the updated/removed labels.
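When extending a taxonomy, it can help to compute the full label list before making the call. The sketch below is plain Python with no SDK calls. merge_labels is a hypothetical helper, not part of the Nucleus SDK, and it assumes you want to keep the existing labels in their original order and append only the labels that are new.

```python
from typing import Iterable, List


def merge_labels(existing: Iterable[str], new: Iterable[str]) -> List[str]:
    """Return existing labels followed by any unseen new labels.

    Hypothetical helper (not part of the Nucleus SDK). Order is preserved
    and duplicates are dropped.
    """
    merged: List[str] = []
    seen = set()
    for label in list(existing) + list(new):
        if label not in seen:
            seen.add(label)
            merged.append(label)
    return merged


labels = merge_labels(["shirt", "trousers", "dress"], ["dress", "jacket"])
```

The resulting list could then be passed as labels to Dataset.add_taxonomy together with update=True.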

Single Category Annotations

For single-label categorization tasks, create one CategoryAnnotation per image, per taxonomy. For instance, if you have 2 taxonomies, clothing_type and designer, and 100 images, you would create 200 annotations. The two annotations for the first image look like this:

from nucleus import CategoryAnnotation

annotations = []
annotations.append(CategoryAnnotation(
    label="shirt",
    reference_id="image_1", 
    taxonomy_name="clothing_type",
    metadata={"color": "off-white"}
))
annotations.append(CategoryAnnotation(
    label="Virgil Abloh",
    reference_id="image_1",
    taxonomy_name="designer",
))
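For more than a handful of images, the one-annotation-per-image-per-taxonomy rule is easier to apply in a loop. The sketch below is a minimal illustration that builds plain keyword-argument dictionaries rather than SDK objects, so it runs without the SDK installed. The labels_by_image mapping and the build_annotation_kwargs helper are hypothetical names, and the label values are made up for the example.

```python
from typing import Dict, List

# Hypothetical input: reference_id -> {taxonomy_name: label}
labels_by_image: Dict[str, Dict[str, str]] = {
    "image_1": {"clothing_type": "shirt", "designer": "Virgil Abloh"},
    "image_2": {"clothing_type": "dress", "designer": "Virgil Abloh"},
}


def build_annotation_kwargs(
    labels: Dict[str, Dict[str, str]]
) -> List[Dict[str, str]]:
    """Return one kwargs dict per (image, taxonomy) pair.

    Hypothetical helper (not part of the Nucleus SDK). The keys match the
    keyword arguments used in the CategoryAnnotation example above.
    """
    kwargs_list: List[Dict[str, str]] = []
    for reference_id, labels_by_taxonomy in labels.items():
        for taxonomy_name, label in labels_by_taxonomy.items():
            kwargs_list.append(
                {
                    "label": label,
                    "reference_id": reference_id,
                    "taxonomy_name": taxonomy_name,
                }
            )
    return kwargs_list


annotation_kwargs = build_annotation_kwargs(labels_by_image)
```

With the SDK installed, each dictionary could be turned into an annotation object with CategoryAnnotation(**kwargs).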

Multi-Category Annotations

Currently, the best way to add multi-category annotations or predictions to an item is to use multiple taxonomies. Rather than a single taxonomy with N classes, you can add N binary taxonomies each with two classes: True or False.

For example, consider a multi-category model predicting three classes: shirt, trousers, and dress. You would first add one True/False taxonomy for each of the three classes. To annotate an item that belongs to both the shirt and dress classes, you would add True annotations for the shirt and dress taxonomies, and a False annotation for the trousers taxonomy.

from nucleus import CategoryAnnotation, NucleusClient

client = NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.get_dataset("YOUR_DATASET_ID")

# add binary taxonomies
for label in ["shirt", "trousers", "dress"]:
    dataset.add_taxonomy(
        taxonomy_name=label,
        taxonomy_type="category",
        labels=["True", "False"]
    )

# create CategoryAnnotations
annotations = []
annotations.append(CategoryAnnotation(
    label="True",
    reference_id="image_1", 
    taxonomy_name="shirt",
))
annotations.append(CategoryAnnotation(
    label="False",
    reference_id="image_1", 
    taxonomy_name="trousers",
))
annotations.append(CategoryAnnotation(
    label="True",
    reference_id="image_1", 
    taxonomy_name="dress",
))

Note: this is a temporary workaround until Nucleus supports multi-category annotations natively.
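The binary-taxonomy encoding can be generated from a set of positive classes instead of being written out by hand. The sketch below is plain Python and produces keyword-argument dictionaries rather than SDK objects. binary_annotation_kwargs is a hypothetical helper, not part of the Nucleus SDK, and it assumes each class name doubles as the name of its True/False taxonomy, as in the example above.

```python
from typing import Dict, Iterable, List


def binary_annotation_kwargs(
    reference_id: str,
    positive_labels: Iterable[str],
    all_labels: List[str],
) -> List[Dict[str, str]]:
    """Return one "True"/"False" kwargs dict per class for a single item.

    Hypothetical helper (not part of the Nucleus SDK).
    """
    positive = set(positive_labels)
    unknown = positive - set(all_labels)
    if unknown:
        raise ValueError(f"Labels without a taxonomy: {sorted(unknown)}")
    return [
        {
            # str(True) is "True" and str(False) is "False", matching the
            # labels of the binary taxonomies.
            "label": str(label in positive),
            "reference_id": reference_id,
            "taxonomy_name": label,
        }
        for label in all_labels
    ]


image_1_kwargs = binary_annotation_kwargs(
    "image_1", {"shirt", "dress"}, ["shirt", "trousers", "dress"]
)
```

With the SDK installed, each dictionary could be passed to CategoryAnnotation(**kwargs) to reproduce the three annotations shown above.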

Scene Category Annotations

Nucleus supports uploading and visualizing category annotations at the scene level. Like regular category annotations, a SceneCategoryAnnotation object contains a Label, Reference ID, Taxonomy Name, and Metadata.

To upload annotations for a scene (a video or a point cloud), begin by uploading all taxonomies present in your dataset as demonstrated earlier in this section.

Then, create a SceneCategoryAnnotation for a scene with a given reference_id the same way you would for a CategoryAnnotation.

from nucleus import SceneCategoryAnnotation

annotations = []
annotations.append(SceneCategoryAnnotation(
    label="running",
    reference_id="scene_1",
    taxonomy_name="action",
    metadata={ "weather": "rainy" },
))
annotations.append(SceneCategoryAnnotation(
    label="shooting a basketball",
    reference_id="scene_2",
    taxonomy_name="action",
    metadata={ "weather": "sunny" },
))

What's Next

Upload your newly constructed CategoryAnnotation and SceneCategoryAnnotation objects!