Category Annotations
In this guide, we'll use the Python SDK to construct category annotations for image classification tasks. A category annotation object in Nucleus has four components:
- Label of the annotation in the class taxonomy, e.g. shirt or dress
- Reference ID of the item to which to apply the annotation
- Name of the taxonomy that the label belongs to, optional but recommended
- Metadata, optional key-value pairs pertaining to the annotation, e.g. color = blue
Nucleus currently supports single-label categorization natively, and can support multi-label classification with a workaround.
For more information, check out our Python SDK reference for CategoryAnnotation.
Taxonomies
To begin, we recommend manually creating and adding your own taxonomies to your datasets. This allows you to apply more than one category annotation (and prediction) to a single image, where each annotation pertains to a different taxonomy.
By default, if taxonomy_name is not provided in the CategoryAnnotation object, Nucleus will automatically create a "default taxonomy" based on the set of all unique labels in the uploaded annotation payload. If you continue to upload more CategoryAnnotations, new labels will automatically be added to the default taxonomy.
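The default-taxonomy behavior described above can be pictured with a small pure-Python sketch. It is only an illustration of the described behavior, not Nucleus's implementation: the helper name default_taxonomy_labels is hypothetical, and keeping labels in first-seen order is an assumption.

```python
def default_taxonomy_labels(uploaded_labels, existing_labels=()):
    """Return the existing labels followed by any new unique labels.

    Hypothetical helper: mimics how a default taxonomy could grow as
    more annotation payloads are uploaded. First-seen order is assumed.
    """
    merged = list(existing_labels)
    seen = set(merged)
    for label in uploaded_labels:
        if label not in seen:
            seen.add(label)
            merged.append(label)
    return merged


# First payload: duplicates collapse into a set of unique labels.
first_upload = default_taxonomy_labels(["shirt", "dress", "shirt"])
# Second payload: only the unseen label is added.
second_upload = default_taxonomy_labels(["jacket", "dress"], existing_labels=first_upload)

print(first_upload)   # ['shirt', 'dress']
print(second_upload)  # ['shirt', 'dress', 'jacket']
```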
You can create a taxonomy and associate it with a dataset using Dataset.add_taxonomy:
import nucleus

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.get_dataset("YOUR_DATASET_ID")

# Create a new "clothing_type" taxonomy with three labels
response = dataset.add_taxonomy(
    taxonomy_name="clothing_type",
    taxonomy_type="category",
    labels=["shirt", "trousers", "dress"],
    update=False,
)
Once added, you can use labels from this taxonomy in any CategoryAnnotation by referencing it with taxonomy_name.
By setting update=True, you can add new labels to an existing taxonomy (e.g. from labels=["shirt", "trousers", "dress"] to labels=["shirt", "trousers", "dress", "jacket"]). We do not yet support changing or deleting existing labels, but this is coming soon! In the meantime, we recommend deleting the taxonomy and recreating it from scratch with the updated/removed labels.
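As a rough illustration of the rule above, the following sketch decides whether a proposed label list only adds labels (so update=True is enough) or renames/removes labels (so the taxonomy has to be recreated). The helper name plan_taxonomy_update and its return values are hypothetical and not part of the SDK.

```python
def plan_taxonomy_update(existing_labels, proposed_labels):
    """Classify a label change as 'noop', 'update', or 'recreate'.

    Hypothetical helper: 'update' means every existing label is kept and
    only new ones are added; 'recreate' means a label was changed or removed.
    """
    existing, proposed = set(existing_labels), set(proposed_labels)
    if proposed == existing:
        return "noop"
    if existing <= proposed:
        return "update"
    return "recreate"


current = ["shirt", "trousers", "dress"]
print(plan_taxonomy_update(current, ["shirt", "trousers", "dress", "jacket"]))  # update
print(plan_taxonomy_update(current, ["shirt", "pants", "dress"]))               # recreate
```

When the result is "update", the new label list can be passed to Dataset.add_taxonomy with update=True; when it is "recreate", the delete-and-recreate path recommended above applies.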
Single Category Annotations
For single-label categorization tasks, simply create one CategoryAnnotation per image, per taxonomy. For instance, if you have 2 taxonomies, clothing_type and designer, and 100 images, you would create 200 annotations as follows:
from nucleus import CategoryAnnotation

annotations = []

# One annotation per image, per taxonomy
annotations.append(CategoryAnnotation(
    label="shirt",
    reference_id="image_1",
    taxonomy_name="clothing_type",
    metadata={"color": "off-white"},
))
annotations.append(CategoryAnnotation(
    label="Virgil Abloh",
    reference_id="image_1",
    taxonomy_name="designer",
))
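To make the "one annotation per image, per taxonomy" count concrete, here is a minimal sketch that builds plain dictionaries using the same keyword names as the CategoryAnnotation calls above and checks each label against its taxonomy first. The helper build_category_records and the input layout are assumptions for illustration; the sketch does not call the SDK.

```python
def build_category_records(labels_by_image, taxonomy_labels):
    """Build one record per (image, taxonomy) pair.

    labels_by_image: {reference_id: {taxonomy_name: label}}
    taxonomy_labels: {taxonomy_name: set of allowed labels}
    Hypothetical helper; raises ValueError on unknown taxonomies or labels.
    """
    records = []
    for reference_id, per_taxonomy in labels_by_image.items():
        for taxonomy_name, label in per_taxonomy.items():
            if taxonomy_name not in taxonomy_labels:
                raise ValueError(f"unknown taxonomy: {taxonomy_name}")
            if label not in taxonomy_labels[taxonomy_name]:
                raise ValueError(f"label {label!r} not in taxonomy {taxonomy_name!r}")
            records.append({
                "label": label,
                "reference_id": reference_id,
                "taxonomy_name": taxonomy_name,
            })
    return records


taxonomy_labels = {
    "clothing_type": {"shirt", "trousers", "dress"},
    "designer": {"Virgil Abloh"},
}
# 100 images, each labeled in both taxonomies
labels_by_image = {
    f"image_{i}": {"clothing_type": "shirt", "designer": "Virgil Abloh"}
    for i in range(1, 101)
}

records = build_category_records(labels_by_image, taxonomy_labels)
print(len(records))  # 200
```

Each record can then be turned into an annotation with CategoryAnnotation(**record), since the keys match the keyword arguments used in the example above.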
Multi-Category Annotations
Currently, the best way to add multi-category annotations or predictions to an item is to use multiple taxonomies. Rather than a single taxonomy with N classes, you can add N binary taxonomies each with two classes: True or False.
For example, consider a multi-category model predicting classes: shirt, trousers, and dress. You would first add three True/False taxonomies, one for each of the three classes. To annotate an item that belongs to both the shirt and dress classes, you would add a True CategoryAnnotation for the shirt and dress taxonomies, and a False CategoryAnnotation for the trousers taxonomy. Predictions follow the same pattern.
from nucleus import CategoryAnnotation, NucleusClient

client = NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.get_dataset("YOUR_DATASET_ID")

# add binary taxonomies, one per class
for label in ["shirt", "trousers", "dress"]:
    dataset.add_taxonomy(
        taxonomy_name=label,
        taxonomy_type="category",
        labels=["True", "False"],
    )

# create CategoryAnnotations: image_1 is both a shirt and a dress
annotations = []
annotations.append(CategoryAnnotation(
    label="True",
    reference_id="image_1",
    taxonomy_name="shirt",
))
annotations.append(CategoryAnnotation(
    label="False",
    reference_id="image_1",
    taxonomy_name="trousers",
))
annotations.append(CategoryAnnotation(
    label="True",
    reference_id="image_1",
    taxonomy_name="dress",
))
Note: this is a temporary workaround until Nucleus supports multi-category annotations natively.
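The binary-taxonomy workaround can be generated mechanically from a set of active classes. The sketch below is one possible way to do this with plain dictionaries; the helper name to_binary_records is hypothetical, and it assumes one True/False taxonomy named after each class, as in the example above.

```python
def to_binary_records(reference_id, active_labels, all_labels):
    """Expand a multi-label assignment into one True/False record per class.

    Hypothetical helper: assumes a binary taxonomy named after each class.
    Raises ValueError if an active label has no corresponding taxonomy.
    """
    active = set(active_labels)
    unknown = active - set(all_labels)
    if unknown:
        raise ValueError(f"labels without a binary taxonomy: {sorted(unknown)}")
    return [
        {
            "label": "True" if label in active else "False",
            "reference_id": reference_id,
            "taxonomy_name": label,
        }
        for label in all_labels
    ]


records = to_binary_records("image_1", {"shirt", "dress"}, ["shirt", "trousers", "dress"])
for record in records:
    print(record["taxonomy_name"], record["label"])
# shirt True
# trousers False
# dress True
```

Emitting an explicit False record for every inactive class keeps each item annotated in every binary taxonomy, which matches the pattern used in the example above.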
Scene Category Annotations
Nucleus supports uploading and visualizing category annotations at the scene level. Like regular category annotations, a SceneCategoryAnnotation object contains a Label, Reference ID, Taxonomy Name, and Metadata.
To upload annotations for a scene (a video or a point cloud), begin by uploading all taxonomies present in your dataset as demonstrated earlier in this section.
Then, create a SceneCategoryAnnotation for a scene with a given reference_id the same way you would for a CategoryAnnotation.
from nucleus import SceneCategoryAnnotation

annotations = []

# One annotation per scene, referencing the "action" taxonomy
annotations.append(SceneCategoryAnnotation(
    label="running",
    reference_id="scene_1",
    taxonomy_name="action",
    metadata={"weather": "rainy"},
))
annotations.append(SceneCategoryAnnotation(
    label="shooting a basketball",
    reference_id="scene_2",
    taxonomy_name="action",
    metadata={"weather": "sunny"},
))
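Because every taxonomy referenced by a scene annotation should exist before upload, it can help to derive the required taxonomies from the annotation data itself. The sketch below groups labels by taxonomy name using plain dictionaries; the helper labels_by_taxonomy and the extra scene_3 record are hypothetical and only serve the illustration.

```python
from collections import defaultdict


def labels_by_taxonomy(records):
    """Group unique labels by taxonomy name, preserving first-seen order.

    Hypothetical helper: each record needs 'label' and 'taxonomy_name' keys.
    """
    grouped = defaultdict(list)
    for record in records:
        labels = grouped[record["taxonomy_name"]]
        if record["label"] not in labels:
            labels.append(record["label"])
    return dict(grouped)


scene_records = [
    {"label": "running", "reference_id": "scene_1", "taxonomy_name": "action"},
    {"label": "shooting a basketball", "reference_id": "scene_2", "taxonomy_name": "action"},
    {"label": "running", "reference_id": "scene_3", "taxonomy_name": "action"},
]

taxonomies = labels_by_taxonomy(scene_records)
print(taxonomies)  # {'action': ['running', 'shooting a basketball']}
```

Each entry of the result maps onto the taxonomy_name and labels arguments of Dataset.add_taxonomy, with taxonomy_type="category" as in the earlier taxonomy example.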
Upload your newly constructed CategoryAnnotation and SceneCategoryAnnotation objects!