Privacy Mode lets customers use Nucleus without sensitive raw data ever leaving their servers. With Privacy Mode, you can submit URLs to Nucleus that link to raw data assets like images or point clouds, instead of transferring that data to Scale. Access control is then completely in the hands of users: URLs may optionally be protected behind your corporate VPN or an IP whitelist. When you load a Nucleus web page, your browser will directly fetch the raw data from your servers without it ever being accessible to Scale.
To use Privacy Mode on image data, set `upload_to_scale=False` on a `DatasetItem`. The flag is set per item, so a single dataset can mix data shared with Scale and data not shared with Scale.
```python
from nucleus import DatasetItem

my_private_image = DatasetItem(
    image_location="https://link_to_url/that/only/i/can_access.jpeg",
    reference_id="image1",
    upload_to_scale=False,
)
```
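One way to build a mixed dataset is to decide, record by record, whether an item stays private. The sketch below uses plain dictionaries so that it runs without the Nucleus client installed. The `sensitive` field and the `to_item_kwargs` helper are hypothetical names introduced for illustration, not part of Nucleus. Each resulting dictionary uses only the three `DatasetItem` arguments shown above and could be unpacked as `DatasetItem(**kwargs)`.

```python
from typing import Dict, List, Union

ItemKwargs = Dict[str, Union[str, bool]]


def to_item_kwargs(record: Dict[str, Union[str, bool]]) -> ItemKwargs:
    """Map one of our own records to DatasetItem keyword arguments.

    `sensitive` is a hypothetical field on our records, not a Nucleus concept.
    """
    return {
        "image_location": record["url"],
        "reference_id": record["reference_id"],
        # Sensitive records stay in Privacy Mode; the rest are shared with Scale.
        "upload_to_scale": not record["sensitive"],
    }


records = [
    {
        "url": "https://link_to_url/that/only/i/can_access.jpeg",
        "reference_id": "image1",
        "sensitive": True,
    },
    {
        "url": "https://link_to_url/that/scale/can_access.jpeg",
        "reference_id": "image2",
        "sensitive": False,
    },
]

item_kwargs: List[ItemKwargs] = [to_item_kwargs(r) for r in records]

# Reference IDs of the items that will remain in Privacy Mode.
private_ids = [k["reference_id"] for k in item_kwargs if not k["upload_to_scale"]]
```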
To use Privacy Mode on video data, set `upload_to_scale=False` on a `VideoScene`. You must also provide a `frame_rate` and an array of `DatasetItem`s with URLs to each frame of your video.
```python
from nucleus import DatasetItem, VideoScene

frame_urls = [
    "https://link_to_url_of_frame_0/that/only/i/can_access.jpeg",
    "https://link_to_url_of_frame_1/that/only/i/can_access.jpeg",
    "https://link_to_url_of_frame_2/that/only/i/can_access.jpeg",
    "https://link_to_url_of_frame_3/that/only/i/can_access.jpeg",
]
reference_ids = ["video-1-frame-0", "video-1-frame-1", "video-1-frame-2", "video-1-frame-3"]

# Build one DatasetItem per frame, pairing each URL with its reference ID.
dataset_items = []
for url, ref_id in zip(frame_urls, reference_ids):
    item = DatasetItem(image_location=url, reference_id=ref_id)
    dataset_items.append(item)

scene = VideoScene(
    reference_id="video-1",
    video_location="https://link_to_url_of_video/that/only/i/can_access.mp4",
    frame_rate=5,
    items=dataset_items,
    upload_to_scale=False,
)
```
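The example above pairs frame URLs with reference IDs using `zip`, which silently drops trailing entries when the two lists differ in length. A small pre-flight helper can derive the reference IDs from the scene ID instead, so the two lists cannot drift apart. This is only a sketch: `build_frame_specs` is a hypothetical helper, and the `<scene>-frame-<n>` naming simply mirrors the example above.

```python
from typing import List, Tuple


def build_frame_specs(scene_id: str, frame_urls: List[str]) -> List[Tuple[str, str]]:
    """Return (url, reference_id) pairs, one per frame, in the order given."""
    if not frame_urls:
        raise ValueError("a scene needs at least one frame URL")
    # enumerate() numbers frames from 0, matching the IDs used in the example.
    return [(url, f"{scene_id}-frame-{i}") for i, url in enumerate(frame_urls)]


frame_urls = [
    "https://link_to_url_of_frame_0/that/only/i/can_access.jpeg",
    "https://link_to_url_of_frame_1/that/only/i/can_access.jpeg",
]
specs = build_frame_specs("video-1", frame_urls)
```

Each pair can then be passed on as `DatasetItem(image_location=url, reference_id=ref_id)`.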
Certain Nucleus features, like similarity search and Autotag, depend on having model embeddings for the data in your Nucleus datasets. To support these features in conjunction with Privacy Mode, Nucleus offers two options:
- Custom embedding upload: You provide model embeddings for your `DatasetItem`s. In this case, Scale never needs access to your raw data. For details, see the Nucleus documentation on custom embeddings.
- One-time Scale embedding generation: We use our pretrained models to generate embeddings on your data once, then ensure your raw data is permanently deleted from Scale’s servers, and set items to be in Privacy Mode.
In both cases, Scale never retains raw customer data. Scale will only store and index metadata—labels, model predictions and optional metadata attributes that you upload—while avoiding sensitive raw data. Note: we do not currently support scene-level embeddings (as opposed to frame image-level embeddings, which are supported).
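To see why embeddings alone are enough for a feature like similarity search, consider the minimal sketch below. It ranks items by cosine similarity using only vectors keyed by reference ID, with no image bytes involved. This illustrates the general technique and is not Nucleus's actual implementation; the vectors and the `most_similar` helper are made up for the example.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (assumes neither is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(
    query_id: str, embeddings: Dict[str, Sequence[float]], k: int = 1
) -> List[str]:
    """Return the k reference IDs whose embeddings are closest to the query's."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vec), ref_id)
        for ref_id, vec in embeddings.items()
        if ref_id != query_id
    ]
    # Highest similarity first.
    scored.sort(reverse=True)
    return [ref_id for _, ref_id in scored[:k]]


# Toy three-dimensional embeddings keyed by reference ID.
embeddings = {
    "image1": [1.0, 0.0, 0.0],
    "image2": [0.9, 0.1, 0.0],
    "image3": [0.0, 1.0, 0.0],
}
nearest = most_similar("image1", embeddings)
```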
If, after using Nucleus, you have identified a slice of data that you would like to send for labeling, you can pass `update=True` to `Dataset.append` to change those items from private to shared with Scale.
In order to send your Privacy Mode data for labeling, you will need to upload it to Scale servers. You can update your existing Privacy Mode `DatasetItem`s so that they are uploaded to Scale as follows:
```python
from nucleus import DatasetItem, NucleusClient

my_private_image = DatasetItem(
    # assume the below link is now updated such that Scale has access
    image_location="https://link_to_url/that/only/i/can_access.jpeg",
    # this reference_id already exists
    reference_id="image1",
    upload_to_scale=True,
)

dataset = NucleusClient("YOUR_SCALE_API_KEY").get_dataset("YOUR_DATASET_ID")
dataset.append([my_private_image], update=True)
```
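When a slice contains many items, the updated items can be generated from the slice's reference IDs rather than written by hand. The sketch below assumes you keep your own mapping from reference ID to a URL that Scale can now reach. `shared_urls` and `promote_slice` are hypothetical names, and the code deliberately stops short of calling Nucleus so that it runs on its own.

```python
from typing import Dict, List, Tuple


def promote_slice(
    slice_reference_ids: List[str], shared_urls: Dict[str, str]
) -> Tuple[List[Dict[str, object]], List[str]]:
    """Build DatasetItem kwargs for a slice and report IDs with no shared URL."""
    updates: List[Dict[str, object]] = []
    missing: List[str] = []
    for ref_id in slice_reference_ids:
        url = shared_urls.get(ref_id)
        if url is None:
            # No Scale-accessible URL is known for this item yet.
            missing.append(ref_id)
            continue
        updates.append(
            {"image_location": url, "reference_id": ref_id, "upload_to_scale": True}
        )
    return updates, missing


shared_urls = {"image1": "https://link_to_url/that/only/i/can_access.jpeg"}
updates, missing = promote_slice(["image1", "image7"], shared_urls)
```

Each entry in `updates` could be unpacked as `DatasetItem(**kwargs)` and the resulting list passed to `dataset.append(..., update=True)` as in the example above. `missing` lists the reference IDs that still need a URL Scale can access.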