Find Rare Edge Cases
Gathering rare items based on adjustable visual similarity
Description
With this workflow, Nucleus lets you find rare items starting from a single manually identified positive. This is an efficient way to mine large numbers of rare edge cases, which can substantially improve model performance when added to the training data.
Pre-reqs: Dataset with embeddings
Steps
- Open an indexed dataset
- Select any image of interest, e.g. one containing a police car
- Click the “autotag” button & select “create new image autotag”
- Enter a name and press “create tag” to proceed
- From the grid, select more samples which are similar to the original image
- Nucleus will use these samples to further refine the results being shown in the grid
- Continue refinement until you are satisfied with the items being shown on the grid
- Once you are satisfied, press “review” and “commit” the autotag
- Nucleus will use these manual positives to calculate a similarity score for every item in the dataset
- Click on the “Autotag” button in the top navigation and select “Manage Autotags”
- Select your newly created autotag to view the similarity score distribution
- Similarity is normalized to the range -1 to 1. The higher the score, the more similar the image
- Click on “Query Autotag” and adjust the query threshold to your liking to retrieve similar images
The final results will show the samples whose similarity scores fall within the query thresholds. You can find out about more advanced Autotag usage here.
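To build intuition for what the similarity score and query threshold do, the idea can be sketched in a few lines of Python. This is only an illustration, not Nucleus's actual implementation: it assumes the score behaves like a cosine similarity between each item's embedding and the average of the manually selected positives, which is one common way to get a score in the range -1 to 1. The embeddings, item names, and helper functions below are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def score_items(embeddings, positive_ids):
    """Score every item against the average of the manual positives.

    Assumption: a simple stand-in for the autotag scoring step.
    """
    prototype = mean_vector([embeddings[i] for i in positive_ids])
    return {
        item_id: cosine_similarity(vector, prototype)
        for item_id, vector in embeddings.items()
    }


def query(scores, min_score, max_score=1.0):
    """Return (item_id, score) pairs inside the threshold, best first."""
    hits = [
        (item_id, score)
        for item_id, score in scores.items()
        if min_score <= score <= max_score
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical 3-dimensional embeddings; real embeddings are much larger.
embeddings = {
    "police_car_1": [0.9, 0.1, 0.0],
    "police_car_2": [0.8, 0.2, 0.1],
    "taxi": [0.5, 0.5, 0.0],
    "tree": [0.0, 0.1, 0.9],
    "inverse": [-0.9, -0.1, 0.0],
}

scores = score_items(embeddings, ["police_car_1", "police_car_2"])
hits = query(scores, min_score=0.9)
for item_id, score in hits:
    print(f"{item_id}: {score:.3f}")
```

With a threshold of 0.9, only the two police car items are returned. Lowering the threshold to 0.8 would also pull in the taxi, which mirrors the trade-off you make when adjusting the query threshold: a lower value returns more candidates at the cost of more false positives.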
Updated almost 3 years ago