Obtain Model Performance On a Subset

Get model performance over specific evaluation metrics on a specific subset of data

Note: This feature is in beta. If you would like access, please reach out to us via the in-app chat.


With Nucleus, you can measure model performance with the metrics of your choice on a selected subset of data. This makes it easy to evaluate a split of the original dataset (e.g. the validation set) or a specific set of scenarios (e.g. edge or rare cases that are critical to pass).

Pre-reqs: Private Dataset, Annotations, Predictions


  1. If you have a private dataset, i.e. one you created, skip to step 5. If not, you can use Nucleus’ dataset cloning on an open dataset, as described in steps 2 to 4
  2. Click on “Datasets” in the sidebar to load the dataset overview page
  3. Scroll down to find “Pandaset” and click on the copy icon
  4. Wait for the dataset to be cloned. Once completed, it will appear under “Your Datasets”
  5. Open the dataset and create a new slice containing items of interest
  6. Once created, open the slice from the sidebar and click “Create Scenario Test”
  7. In the modal, type the “Name”, select a “Metric” and define a passing “Threshold”
  8. Click “Confirm” to create the test. You will be redirected to the scenario test page
  9. In the top right, click “Evaluate” and select a model to evaluate on this subset
  10. This will take a few seconds to complete. You can check progress on the jobs page, accessible via the sidebar
  11. Once completed, the model will show up as a new row under the graph in the scenario test evaluation tab
  12. The row will show the aggregate performance, comparison with the threshold and a pass/fail state
  13. Click on the row to view the items. Each item will have an aggregate score and a pass/fail state
  14. Click on any item of interest to open it in the detail view
  15. To evaluate more models, repeat the process from step 9 onwards
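
To make the pass/fail logic in steps 12 and 13 concrete, here is a minimal Python sketch of how per-item scores could roll up into an aggregate that is compared with a threshold. It is illustrative only and does not call Nucleus: the names `ItemResult` and `evaluate_subset` are hypothetical, and the choice of a plain mean as the aggregate and `>=` as the passing rule are assumptions, since the actual aggregation depends on the metric you select.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ItemResult:
    item_id: str   # hypothetical identifier for an item in the slice
    score: float   # per-item metric value, e.g. an IoU in [0, 1]


def evaluate_subset(results, threshold):
    """Summarize per-item scores against a passing threshold.

    Assumption: the aggregate is the plain mean of the item scores and
    "pass" means score >= threshold. The product may define both differently.
    """
    if not results:
        raise ValueError("the subset has no evaluated items")
    aggregate = mean(r.score for r in results)
    return {
        "aggregate": aggregate,
        "passed": aggregate >= threshold,
        # Per-item pass/fail state, keyed by item id
        "items": {r.item_id: r.score >= threshold for r in results},
    }


# Made-up scores for three items in a slice
results = [
    ItemResult("frame_001", 0.75),
    ItemResult("frame_002", 0.5),
    ItemResult("frame_003", 1.0),
]
summary = evaluate_subset(results, threshold=0.7)
print(summary["aggregate"], summary["passed"])  # 0.75 True
print(summary["items"])
```

Note that an individual item can fail (here `frame_002`) while the subset as a whole still passes, which is why it is worth opening the row and inspecting the items rather than relying on the aggregate alone.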