We provide a batch-export endpoint for slices (and datasets) that pulls down all the data required for training a model:
- all DatasetItems
- all Annotations
To speed up the export, the following pieces of annotation data are not currently pulled down, although they may be in the future based on user needs:
- Annotation IDs
- Annotation-level metadata
This endpoint will time out for very large datasets or slices (>200k items) for now, but an async endpoint is coming soon. For item-only export, we support paginated export of dataset items for datasets or slices of any size via the `items_generator` method.
The ID of a slice or a dataset can be retrieved by inspecting the URL while using Nucleus. Dataset IDs begin with ds, and slice IDs begin with slc.
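As a quick sketch of the prefix convention above (these IDs are hypothetical, not real resources):

```python
# Hypothetical IDs illustrating the prefixes described above.
dataset_id = "ds_bw6de8s84pe0vbn6p5zg"  # dataset IDs begin with "ds"
slice_id = "slc_bx86ea222a6g057x4380"   # slice IDs begin with "slc"

# A simple sanity check before passing an ID to the client:
assert dataset_id.startswith("ds")
assert slice_id.startswith("slc")
```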
This endpoint does not take a payload.
The response will be a list of dictionaries, one per dataset item. Each dictionary has the same format as the one returned when getting a single dataset item:

| Key | Type | Description |
| --- | --- | --- |
| `item` | `DatasetItem` | The exported dataset item. |
| `annotations` | `dict` | A dict where the keys represent annotation type, and the values are arrays of annotations of that type. |
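As an illustrative sketch (plain dictionaries standing in for client objects, with made-up values), one exported row has this shape:

```python
# Illustrative sketch only - not a live API call. The field names follow the
# response format described above; the values are invented for the example.
exported_row = {
    "item": {
        "reference_id": "img_001",
        "image_location": "s3://bucket/img_001.jpg",
    },
    "annotations": {
        # keys are annotation types; values are lists of annotations
        "box": [{"label": "car", "x": 10, "y": 20, "width": 50, "height": 30}],
        "polygon": [],
    },
}

image_url = exported_row["item"]["image_location"]
box_annotations = exported_row["annotations"]["box"]
print(image_url)             # s3://bucket/img_001.jpg
print(len(box_annotations))  # 1
```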
```python
import nucleus

client = nucleus.NucleusClient("YOUR_API_TOKEN")
example_slice = client.get_slice("YOUR_SLICE_ID")
example_dataset = client.get_dataset("YOUR_DATASET_ID")

# Batch export: returns a list of dicts, one per dataset item.
exported_rows_from_slice = example_slice.items_and_annotations()
exported_rows_from_dataset = example_dataset.items_and_annotations()

# Each row holds the item plus its annotations, keyed by annotation type.
first_row = exported_rows_from_dataset[0]
image_url = first_row["item"].image_location
box_annotations = first_row["annotations"]["box"]

# Item-only export: paginate through items of a dataset of any size.
for item in example_dataset.items_generator():
    print(item.reference_id)
```