Add Images

📘 Python SDK

Dataset.append(
    items: List[DatasetItem],
    asynchronous: bool,
    update: bool
) -> Union[dict, AsyncJob]

A Dataset can be populated with labeled and unlabeled data. Using Nucleus, you can filter the data in your dataset by custom metadata attached to your images.

For instance, your local dataset may contain Sunny, Foggy, and Rainy folders of images. All of these images can be uploaded into a single Nucleus Dataset, with added metadata like {"weather": "Sunny"}.
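As a sketch of that workflow (the folder names and reference-ID scheme below are illustrative, not prescribed by the API), the per-item payloads for such an upload might be assembled like this:

```python
import json

# Illustrative local folders of images, each tagged with weather metadata.
FOLDERS = {"Sunny": ["0001.jpg"], "Foggy": ["0002.jpg"], "Rainy": ["0003.jpg"]}

def build_items(folders):
    """Build one append payload item per image, tagging each with its folder's weather."""
    items = []
    for weather, filenames in folders.items():
        for name in filenames:
            items.append({
                "reference_id": f"{weather.lower()}_{name}",  # self-defined ID, illustrative scheme
                "metadata": {"weather": weather},
            })
    return items

items = build_items(FOLDERS)
print(json.dumps(items[0]))
```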

To update an item's metadata, you can re-ingest the same items with the update argument set to true. Existing metadata will be overwritten for DatasetItems in the payload that share a reference_id with a previously uploaded DatasetItem. To retrieve your existing reference_ids, see Get Dataset Items.
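The overwrite rule can be illustrated with a plain dict keyed by reference_id (a simulation of the server-side behavior for illustration, not the actual implementation):

```python
def append(store, items, update=False):
    """Simulate append semantics: items whose reference_id already exists in the
    store overwrite the stored metadata only when update=True; otherwise they
    are left untouched."""
    for item in items:
        ref = item["reference_id"]
        if ref in store and not update:
            continue  # existing item kept as-is when update is not set
        store[ref] = item["metadata"]
    return store

store = {}
append(store, [{"reference_id": "r1", "metadata": {"weather": "Sunny"}}])
# Re-ingest the same reference_id with update=True: metadata is overwritten.
append(store, [{"reference_id": "r1", "metadata": {"weather": "Foggy"}}], update=True)
print(store["r1"])
```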

Remote vs. Local Data Upload

Remote uploads take in a list of DatasetItems whereas local uploads must occur item-by-item, one API call at a time.

Uploading remotely hosted data

  1. Keep the content-type of the request as application/json.
  2. Specify the URL of the image location. Make sure Scale can access this URL.
  3. We currently support remote URLs with prefixes: gs:, s3:, http:, or https:.
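A remote append item can be composed with the standard library; the scheme check below mirrors the supported prefixes. The bucket URL is illustrative, and the field name `image_url` is an assumption for this sketch, not a confirmed API field name:

```python
import json
from urllib.parse import urlparse

# Supported remote URL prefixes, per the list above.
SUPPORTED_SCHEMES = {"gs", "s3", "http", "https"}

def remote_item(image_url, reference_id, metadata):
    """Validate the URL prefix and build one append item.
    The "image_url" key is an assumed field name for illustration."""
    if urlparse(image_url).scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported URL prefix: {image_url}")
    return {"image_url": image_url, "reference_id": reference_id, "metadata": metadata}

# application/json request body with a list of items.
body = json.dumps({"items": [
    remote_item("s3://my-bucket/sunny/0001.jpg", "image_ref_1", {"weather": "Sunny"}),
]})
```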

Uploading from local storage

  1. Change the content-type of the request to multipart/form-data.
  2. In the image field, provide a local path on disk to the image.
  3. In the item field, provide information about the image such as your self-defined reference_id and any associated metadata.
curl "https://api.scale.com/v1/nucleus/dataset/ds_bw6de8s84pe0vbn6p5zg/append" \
-u "YOUR_SCALE_API_KEY:" \
-H "Content-Type:multipart/form-data" \
-X POST \
-F "item={
    \"reference_id\": \"image_ref_300000\",
    \"metadata\": {
      \"License Plate\": \"ZPH-J27\",
      \"Recording Date\": \"2019-09-15\",
      \"Recording Time\": \"14:24:21\",
      \"weather\": \"sunny\",
      \"camera\": \"back\"
    }
  };type=application/json" \
-F "image=@{PATH_TO_IMAGE}"
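In Python, the same item field can be produced with json.dumps and paired with the file under the image key. This sketch uses only the standard library; a multipart-capable HTTP client (requests or similar) would perform the actual POST. The values mirror the curl example above:

```python
import json

# Mirror of the `item` form field from the curl example above.
item_field = json.dumps({
    "reference_id": "image_ref_300000",
    "metadata": {
        "License Plate": "ZPH-J27",
        "Recording Date": "2019-09-15",
        "Recording Time": "14:24:21",
        "weather": "sunny",
        "camera": "back",
    },
})

# With a multipart-capable HTTP client, this string is sent as the `item`
# part (content type application/json) alongside the binary `image` part
# read from a local path on disk.
```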

The asynchronous endpoint returns an AsyncJob object that can sleep until complete, report its current status, or return errors. Processing happens in two stages: first, the metadata for the images is pulled into an upload queue; then the images are processed in batches of 3000, with the status updated after each batch.
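The batching arithmetic is simple ceiling division; for example, an append of 10,000 images is processed in four batches of up to 3,000:

```python
import math

BATCH_SIZE = 3000  # images per processing batch, per the description above

def batch_count(n_images):
    """Number of processing batches (and status updates) for an append of n images."""
    return math.ceil(n_images / BATCH_SIZE)

print(batch_count(10_000))  # → 4
```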
