Obtain Model Performance On a Subset
Get model performance on specific evaluation metrics over a chosen subset of data
Note: This feature is in beta. If you want it to be available to you, please reach out to us via the in-app chat!
Description
With Nucleus, you can measure model performance on metrics of your choice over a selected subset of data. This makes it easy to evaluate performance on a split of the original dataset (e.g. a validation set) or on a specific set of scenarios (e.g. edge / rare cases that are critical to pass).
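All of the steps below go through the Nucleus web app, but the same flow can also be scripted with the Nucleus Python client (`scale-nucleus` on PyPI). The sketches later in this guide assume a client connected as follows, with the API key as a placeholder:

```python
# pip install scale-nucleus
import nucleus

# Placeholder key; use your own API key from the Nucleus dashboard.
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
```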
Pre-reqs
Private Dataset, Annotations, Predictions
Steps
1. If you have a private dataset (i.e. one you created), skip to step 5. If not, you can use Nucleus’ dataset cloning on an open dataset
2. Click on “Datasets” in the sidebar to load the dataset overview page
3. Scroll down to find “Pandaset” and click on the copy icon
4. Wait for the dataset to be cloned; once completed, it will appear under “Your Datasets”
5. Open the dataset and create a new slice containing the items of interest (steps 5-8 can also be scripted; see the first sketch after this list)
6. Once created, open the slice from the sidebar and click “Create Scenario Test”
7. In the modal, enter a “Name”, select a “Metric”, and define a passing “Threshold”
8. Click “Confirm” to create the test. You will be redirected to the scenario test page
9. On the top right, click “Evaluate” and select a model to evaluate on this subset (this step can also be scripted; see the second sketch after this list)
10. This will take a few seconds to complete; you can check progress on the jobs page, accessible via the sidebar
11. Once completed, the model will show up as a new row under the graph in the scenario test evaluation tab
12. The row will show the aggregate performance, a comparison against the threshold, and a pass/fail state
13. Click on the row to view the items. Each item will have an aggregate score and a pass/fail state
14. Click on any item of interest to open it in the detail view
15. You can repeat the same process from step 9 onwards to evaluate more models
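If you prefer to drive steps 5-8 from code, they map onto the client’s slice and Validate APIs. A minimal sketch, assuming the `client` from the snippet above; the dataset ID, reference IDs, and names are placeholders, and the `validate` method names follow the SDK docs at the time of writing, so verify them against your client version:

```python
# Placeholder dataset ID; find yours on the dataset overview page.
dataset = client.get_dataset("YOUR_DATASET_ID")

# Step 5: create a slice containing the items of interest.
slc = dataset.create_slice(
    name="edge-cases",
    reference_ids=["item_ref_1", "item_ref_2"],  # illustrative reference IDs
)

# Steps 6-8: create a scenario test on the slice with a metric of choice
# (bounding-box IoU here); the passing threshold can be set in the UI modal.
scenario_test = client.validate.create_scenario_test(
    name="edge-cases-bbox-iou",
    slice_id=slc.id,
    evaluation_functions=[client.validate.eval_functions.bbox_iou()],
)
```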
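Step 9 onwards can be kicked off programmatically as well. Another hedged sketch: `YOUR_MODEL_ID` is a placeholder, and the evaluation call below is taken from the SDK docs at the time of writing:

```python
# Step 9: evaluate a model against the scenario test; returns an async job.
job = client.validate.evaluate_model_on_scenario_tests(
    model_id="YOUR_MODEL_ID",
    scenario_test_names=["edge-cases-bbox-iou"],
)

# Steps 10-12 run server-side: block until the job finishes (or watch the
# jobs page in the UI), then inspect the new row in the evaluation tab.
job.sleep_until_complete()
```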