Export Embeddings
Use the Dataset.export_embeddings method to export the embedding vectors of a Dataset's primary search index. The method starts an asynchronous job that uploads the embedding vectors in batches, as JSON files, to Scale blob storage, from where they can be downloaded.
The export_embeddings method returns an EmbeddingsExportJob object, which you can use to monitor the status of the job and retrieve its result.
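import nucleus
# Authenticate and look up the dataset whose embeddings should be exported.
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.get_dataset("YOUR_DATASET_ID")
# Start the asynchronous export job.
embeddings_job = dataset.export_embeddings(asynchronous=True)
# Wait for the job to complete, then fetch the URLs of the exported JSON files.
result_urls = embeddings_job.result_urls(wait_for_completion=True)
print(result_urls)  # e.g. ['https://scale-temp.s3.dualstack.us-west-2.amazonaws.com/dataset_id_embedding_1.json', ...]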
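Once the job has produced its result URLs, the JSON files still need to be downloaded. The following is a minimal sketch of one way to do that using only the Python standard library. The helper names download_embedding_files and load_embedding_batches are hypothetical and are not part of the nucleus package. The sketch also makes no assumption about the layout of the JSON inside each file, so it returns the parsed content as-is.

```python
import json
import os
import urllib.request
from urllib.parse import urlparse


def download_embedding_files(result_urls, dest_dir):
    """Download each exported batch file and return the local file paths.

    Hypothetical helper for illustration; not part of the nucleus package.
    """
    os.makedirs(dest_dir, exist_ok=True)
    local_paths = []
    for index, url in enumerate(result_urls):
        # Name the local file after the last path component of the URL
        # (ignoring any query string), or use a numbered fallback name.
        name = os.path.basename(urlparse(url).path) or f"embeddings_{index}.json"
        local_path = os.path.join(dest_dir, name)
        with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
            out.write(response.read())
        local_paths.append(local_path)
    return local_paths


def load_embedding_batches(local_paths):
    """Yield the parsed JSON content of each downloaded file, one file at a time."""
    for path in local_paths:
        with open(path, "r", encoding="utf-8") as handle:
            # The structure of each batch is not assumed here; inspect it
            # before relying on specific keys.
            yield json.load(handle)
```

With the result_urls list from the example above, calling download_embedding_files(result_urls, "embeddings") would save every batch into a local embeddings directory, and load_embedding_batches can then iterate over the files without holding all batches in memory at once.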