External Evaluation Functions
External evaluation functions let you upload model evaluation results computed by functions you define. You do not have to upload the evaluation function's source code: just give it a name, add it to a scenario test, and upload a result for each item.
Creating an External Function
External functions are created based on a unique name:
import nucleus
client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
client.validate.create_external_eval_function("custom_mAP_fn")
# fetch the newly created function
all_external_fns = client.validate.eval_functions.external_functions
eval_fn = all_external_fns["custom_mAP_fn"]
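On later runs the function does not need to be created again; it can simply be looked up by name. A minimal sketch, assuming you only want to call create_external_eval_function when the name is missing (the behavior of creating a duplicate name is not covered here):

# Create the external function only if it does not exist yet
all_external_fns = client.validate.eval_functions.external_functions
if "custom_mAP_fn" not in all_external_fns:
    client.validate.create_external_eval_function("custom_mAP_fn")
    all_external_fns = client.validate.eval_functions.external_functions
eval_fn = all_external_fns["custom_mAP_fn"]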
Creating a Scenario Test with an External Function
As described on the Create Scenario Test page, scenario tests are created by calling the create_scenario_test method.
Note: at the moment a scenario test can contain either external or non-external functions; mixing the two in one test is not allowed.
# Select a slice ID for the basis of the scenario test (you must be the owner of this slice)
slice_id = 'slc_c7wwfmphxdeg048wq6hg'
sc_test = client.validate.create_scenario_test('my_scenario_test', slice_id, [eval_fn])
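The results uploaded in the next step need the reference IDs of the items in the slice. If you do not have them at hand, a sketch like the following can list them; it assumes the slice exposes its items through Slice.items, each carrying a reference_id attribute:

# Hypothetical sketch: collect the reference IDs of the slice items you need to score
slc = client.get_slice(slice_id)
ref_ids = [item.reference_id for item in slc.items]
print(f"{len(ref_ids)} items to evaluate, e.g. {ref_ids[:3]}")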
Upload Result to the Scenario Test
The last step is to aggregate a result for each item in your slice and upload them to the scenario test. Each result consists of the ref_id of an item, a score, and a weight. The score and weight must be normalized to the range [0, 1]; the weight may be left out and defaults to 1.
from nucleus.validate import EvaluationResult
# Model on which the evaluation function was run
model_id = 'prj_bybpa3gjmjc30es761y0' # Scale Efficient Det Model Zoo
# Aggregate the results on a per item basis
eval_results = [
EvaluationResult(item_ref_id='aec019e2-93ce64eb.jpg', score=0.9, weight=0.5),
EvaluationResult(item_ref_id='afa7cd59-e22c1803.jpg', score=0.8, weight=0.7),
EvaluationResult(item_ref_id='45259f44-68ab3846.jpg', score=0.3),
EvaluationResult(item_ref_id='4f9b45d6-93651617.jpg', score=0.5),
]
# Upload the per-item results to the scenario test
sc_test.upload_external_evaluation_results(eval_fn, eval_results, model_id)
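In practice the scores come out of your own evaluation code rather than being typed in by hand. A minimal sketch, assuming the per-item scores and weights live in dictionaries keyed by reference ID (per_item_scores and per_item_weights are hypothetical names), could build the result list and clamp the values into [0, 1] before uploading:

from nucleus.validate import EvaluationResult

def clamp01(x):
    # Scores and weights must be normalized to [0, 1]
    return max(0.0, min(1.0, float(x)))

# Hypothetical per-item outputs of your own evaluation code
per_item_scores = {'aec019e2-93ce64eb.jpg': 0.9, 'afa7cd59-e22c1803.jpg': 0.8}
per_item_weights = {'aec019e2-93ce64eb.jpg': 0.5}  # items without a weight default to 1

eval_results = [
    EvaluationResult(
        item_ref_id=ref_id,
        score=clamp01(score),
        weight=clamp01(per_item_weights.get(ref_id, 1.0)),
    )
    for ref_id, score in per_item_scores.items()
]
sc_test.upload_external_evaluation_results(eval_fn, eval_results, model_id)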
Now head over to the Scenario Tests Dashboard to see the scenario test with the uploaded results.