External Evaluation Functions
External evaluation functions let you upload model evaluation results computed by functions you define. You do not have to upload the evaluation function's source code: just give it a name, add it to a scenario test, and upload a result for each item.
Creating an External Function
External functions are created based on a unique name:
import nucleus
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
client.validate.create_external_eval_function("custom_mAP_fn")
# fetch the newly created function
all_external_fns = client.validate.eval_functions.external_functions
eval_fn = all_external_fns["custom_mAP_fn"]
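On later runs the function does not need to be created again; it can simply be looked up by name. A minimal sketch, assuming you only want to call create_external_eval_function when the name is missing (the behavior of creating a duplicate name is not covered here):

# Create the external function only if it does not exist yet
all_external_fns = client.validate.eval_functions.external_functions
if "custom_mAP_fn" not in all_external_fns:
    client.validate.create_external_eval_function("custom_mAP_fn")
    all_external_fns = client.validate.eval_functions.external_functions
eval_fn = all_external_fns["custom_mAP_fn"]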
Creating a Scenario Test with an External Function
As described on the Create Scenario Test page, scenario tests are created by calling the create_scenario_test method.
Note: at the moment a scenario test can contain either external or non-external functions; mixing the two in one test is not allowed.
# Select a slice ID for the basis of the scenario test (you must be the owner of this slice)
slice_id = 'slc_c7wwfmphxdeg048wq6hg'
sc_test = client.validate.create_scenario_test('my_scenario_test', slice_id, [eval_fn])
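The results uploaded in the next step need the reference IDs of the items in the slice. If you do not have them at hand, a sketch like the following can list them; it assumes the slice exposes its items through Slice.items, each carrying a reference_id attribute:

# Hypothetical sketch: collect the reference IDs of the slice items you need to score
slc = client.get_slice(slice_id)
ref_ids = [item.reference_id for item in slc.items]
print(f"{len(ref_ids)} items to evaluate, e.g. {ref_ids[:3]}")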
Upload Result to the Scenario Test
The last step is to aggregate a result for each item in your slice and upload them to the scenario test. Each result consists of the ref_id of an item, a score, and a weight. The score and weight must be normalized to the range [0, 1]; the weight may be left out and defaults to 1.
from nucleus.validate import EvaluationResult
# Model on which the evaluation function was run
model_id = 'prj_bybpa3gjmjc30es761y0' # Scale Efficient Det Model Zoo
# Aggregate the results on a per item basis
eval_results = [
EvaluationResult(item_ref_id='aec019e2-93ce64eb.jpg', score=0.9, weight=0.5),
EvaluationResult(item_ref_id='afa7cd59-e22c1803.jpg', score=0.8, weight=0.7),
EvaluationResult(item_ref_id='45259f44-68ab3846.jpg', score=0.3),
EvaluationResult(item_ref_id='4f9b45d6-93651617.jpg', score=0.5),
]
# Upload the per-item results to the scenario test
sc_test.upload_external_evaluation_results(eval_fn, eval_results, model_id)
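In practice the scores come out of your own evaluation code rather than being typed in by hand. A minimal sketch, assuming the per-item scores and weights live in dictionaries keyed by reference ID (per_item_scores and per_item_weights are hypothetical names), could build the result list and clamp the values into [0, 1] before uploading:

from nucleus.validate import EvaluationResult

def clamp01(x):
    # Scores and weights must be normalized to [0, 1]
    return max(0.0, min(1.0, float(x)))

# Hypothetical per-item outputs of your own evaluation code
per_item_scores = {'aec019e2-93ce64eb.jpg': 0.9, 'afa7cd59-e22c1803.jpg': 0.8}
per_item_weights = {'aec019e2-93ce64eb.jpg': 0.5}  # items without a weight default to 1

eval_results = [
    EvaluationResult(
        item_ref_id=ref_id,
        score=clamp01(score),
        weight=clamp01(per_item_weights.get(ref_id, 1.0)),
    )
    for ref_id, score in per_item_scores.items()
]
sc_test.upload_external_evaluation_results(eval_fn, eval_results, model_id)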
Now head over to the Scenario Tests Dashboard to see the scenario test with the uploaded results.