ML engineers often don't know a model's failure modes upfront, for example when developing for a new application or working with an entirely new architecture. In these cases, it is important to quickly find the subsets of data that need the most improvement. Nucleus helps you find these underperforming slices automatically by clustering visually and semantically coherent items.
To achieve this, Nucleus uses embeddings and an evaluation metric, mean average precision (mAP), under the hood to identify groups of data where performance on that metric is lower than on the entire dataset. All you need to do is specify the dataset and the model version you want to evaluate.
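The general idea can be illustrated with a minimal sketch. This is not Nucleus's implementation, which is not described here. It assumes a plain k-means clustering of the embeddings and uses the mean of a hypothetical per-item score as a stand-in for mAP. The function names `kmeans` and `underperforming_clusters` and the toy data are made up for illustration.

```python
import random
from statistics import mean


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means (assumed clustering method); returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it alone if the cluster is empty.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [mean(dim) for dim in zip(*members)]
    return labels


def underperforming_clusters(embeddings, scores, k):
    """Return (cluster_id, size, cluster_mean, gap) for clusters scoring below the dataset mean."""
    labels = kmeans(embeddings, k)
    overall = mean(scores)
    rows = []
    for c in range(k):
        cluster_scores = [s for s, label in zip(scores, labels) if label == c]
        if cluster_scores and mean(cluster_scores) < overall:
            cluster_mean = mean(cluster_scores)
            rows.append((c, len(cluster_scores), cluster_mean, overall - cluster_mean))
    # Largest gap to the dataset-wide score first.
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Toy data: two well-separated groups of 2-D "embeddings".
# The second group has lower (hypothetical) per-item scores.
embeddings = [[0.1 * i, 0.0] for i in range(5)] + [[10.0 + 0.1 * i, 10.0] for i in range(5)]
scores = [0.9] * 5 + [0.4] * 5

result = underperforming_clusters(embeddings, scores, k=2)
for cluster_id, size, cluster_mean, gap in result:
    print(f"cluster {cluster_id}: {size} items, mean score {cluster_mean:.2f}, gap {gap:.2f}")
```

In this toy setup only the low-scoring group is reported, because its mean score falls below the dataset-wide mean. A real evaluation would compute mAP over each cluster's items rather than averaging per-item scores.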
To obtain these clusters, navigate to "Underperforming Slices" in the sidebar and specify a dataset and a model. Optionally, you can also specify the number of clusters.
For each cluster, you can see how its performance distribution compares with that of the entire dataset. You can also inspect the items to see what they have in common visually. If you find a cluster of interest, you can create a slice from it and use that slice for further exploration, e.g. querying or visualization.