Uploading Model Predictions
By uploading model predictions to Nucleus, you can compare your predictions to ground truth annotations and discover problems with your models or dataset.
You can also upload predictions for unannotated data to enable curation and querying workflows. For instance, this can help you identify the most effective subset of unlabeled data to label next.
Within Nucleus, models work as follows:
- Create a Model. You can do this just once and reuse the model on multiple datasets.
- Upload predictions to a dataset.
- Trigger calculation of model metrics.
You'll then be able to debug your models against your ground truth qualitatively with queries and visualizations, or quantitatively with metrics, plots, and other insights. You can also compare multiple models that have been run on the same dataset.
In terms of payload construction, the schema is largely shared between annotations and predictions, with additional optional attributes (confidence, class_pdf) for predictions. We've provided an example just for bounding box predictions, but it generalizes well to other prediction types.
from nucleus import NucleusClient, BoxPrediction

client = NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.get_dataset(dataset_id="YOUR_DATASET_ID")
model = client.create_model(
name="My Model",
reference_id="My-CNN",
metadata={"timestamp": "121012401"}
)
box_pred_1 = BoxPrediction(
label="car",
x=0,
y=0,
width=10,
height=10,
reference_id="image_1",
annotation_id="image_1_car_box_1",
confidence=0.6,
class_pdf={"car": 0.6, "truck": 0.4},
metadata={"vehicle_color": "red"}
)
box_pred_2 = BoxPrediction(
label="car",
x=4,
y=6,
width=12,
height=18,
reference_id="image_1",
annotation_id="image_1_car_box_2",
confidence=0.9,
class_pdf={"car": 0.9, "truck": 0.1},
metadata={"vehicle_color": "blue"}
)
job = dataset.upload_predictions(
model=model,
predictions=[box_pred_1, box_pred_2],
update=True,
asynchronous=True # async is recommended, but sync jobs can be easier to debug
)
# poll current status
job.status()
# block until upload completes
job.sleep_until_complete()
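If you prefer the synchronous path mentioned in the comment above, the same method can be called with asynchronous=False. This is a minimal sketch using the objects defined above; a synchronous upload blocks until the predictions have been processed:
# synchronous upload: blocks until the predictions have been processed
dataset.upload_predictions(
    model=model,
    predictions=[box_pred_1, box_pred_2],
    update=True,
    asynchronous=False
)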
Calculate Model Metrics
After creating a model and uploading its predictions, you'll need to call calculate_evaluation_metrics to match predictions against ground truth annotations and compute metrics such as IOU. This enables sorting by metrics, filtering down to false positives or false negatives, and the evaluation plots and metrics on the Insights page.
You can continue to add model predictions to a dataset even after running the metrics calculation. However, you will need to retrigger the calculation for the new predictions to be matched with ground truth and reflected in sorts, false positive/negative filters, and the metrics on the Insights page.
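Assuming the same client, dataset, and model objects as in the upload example above, triggering (or retriggering) the calculation with default options is a single call:
# trigger matching against ground truth and metric calculation for this model
dataset.calculate_evaluation_metrics(model)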
How Nucleus matches predictions to ground truth
During IOU calculation, predictions are greedily matched to ground truth by taking highest IOU pairs first. By default the matching algorithm is class-sensitive: it will treat a match as a true positive if and only if the labels are the same.
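The matching logic can be pictured with a purely illustrative sketch (this is not the Nucleus implementation; the box fields mirror the BoxPrediction attributes above):
# Illustrative sketch of greedy, class-sensitive matching (not Nucleus's actual code).
# Boxes are dicts with "label", "x", "y", "width", "height".

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1 = max(a["x"], b["x"])
    y1 = max(a["y"], b["y"])
    x2 = min(a["x"] + a["width"], b["x"] + b["width"])
    y2 = min(a["y"] + a["height"], b["y"] + b["height"])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a["width"] * a["height"] + b["width"] * b["height"] - inter
    return inter / union if union else 0.0

def greedy_match(predictions, ground_truths):
    """Pair predictions with ground truth, taking highest-IOU pairs first."""
    pairs = sorted(
        ((iou(p, g), i, j)
         for i, p in enumerate(predictions)
         for j, g in enumerate(ground_truths)),
        reverse=True,
    )
    matched_preds, matched_gts, matches = set(), set(), []
    for score, i, j in pairs:
        if score == 0 or i in matched_preds or j in matched_gts:
            continue
        # class-sensitive by default: only same-label pairs count as true positives
        if predictions[i]["label"] != ground_truths[j]["label"]:
            continue
        matched_preds.add(i)
        matched_gts.add(j)
        matches.append((i, j, score))
    return matches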
If you'd like to compute IOU while allowing associations between ground truth labels and prediction labels that don't have the same name, you can specify them using a list of allowed_label_matches (shown in the example below).
from nucleus import NucleusClient
client = NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.get_dataset(dataset_id="YOUR_DATASET_ID")
model = client.get_model(model_id="YOUR_MODEL_ID", dataset_id="YOUR_DATASET_ID")
"""
associate car and bus bounding boxes for IOU computation,
but otherwise force associations to have the same class (default)
"""
dataset.calculate_evaluation_metrics(model, options={
"allowed_label_matches": [
{
"ground_truth_label": "car",
"model_prediction_label": "bus"
},
{
"ground_truth_label": "bus",
"model_prediction_label": "car"
}
]
})
Once your predictions' metrics have finished processing, you can check out the Objects tab or Insights page to explore, visualize, and debug your models and data!