Monitor

Once our model is deployed, we want to make sure that it performs as well in production as it did in training. We can opt in to logging by calling logPrediction. Later on, as we get official diagnoses for patients, we can call logTrueValue and use the same identifier as we used in the call to logPrediction.

Elixir:

    # Log the prediction.
    Tangram.log_prediction(model, %Tangram.LogPredictionArgs{
      identifier: "John Doe",
      options: predict_options,
      input: input,
      output: output,
    })

    # Later on, if we get an official diagnosis for the patient, log the true value. Make sure to match the `identifier`.
    Tangram.log_true_value(model, %Tangram.LogTrueValueArgs{
      identifier: "John Doe",
      true_value: "Positive",
    })

Go:

    // Log the prediction.
    err = model.LogPrediction(tangram.LogPredictionArgs{
        Identifier: "John Doe",
        Input:      input,
        Options:    predictOptions,
        Output:     output,
    })
    if err != nil {
        log.Fatal(err)
    }

    // Later on, if we get an official diagnosis for the patient, log the true value. Make sure to match the `identifier`.
    err = model.LogTrueValue(tangram.LogTrueValueArgs{
        Identifier: "John Doe",
        TrueValue:  "Positive",
    })
    if err != nil {
        log.Fatal(err)
    }

JavaScript:

    // Log the prediction.
    model.logPrediction({
      identifier: "6c955d4f-be61-4ca7-bba9-8fe32d03f801",
      input,
      options,
      output,
    })

    // Later on, if we get an official diagnosis for the patient, log the true value. Make sure to match the `identifier`.
    model.logTrueValue({
      identifier: "6c955d4f-be61-4ca7-bba9-8fe32d03f801",
      trueValue: "Positive",
    })

PHP:

    // Log the prediction.
    $model->log_prediction('71762b29-2296-4bf9-a1d4-59144d74c9d9', $input, $output, $options);

    // Later on, if we get an official diagnosis for the patient, log the true value. Make sure to match the `identifier`.
    $model->log_true_value('71762b29-2296-4bf9-a1d4-59144d74c9d9', 'Positive');

Python:

    # Log the prediction.
    model.log_prediction(
        identifier="John Doe",
        input=input,
        output=output,
        options=predict_options,
    )

    # Later on, if we get an official diagnosis for the patient, log the true value. Make sure to match the `identifier`.
    model.log_true_value(
        identifier="John Doe",
        true_value="Positive",
    )

Ruby:

    # Log the prediction.
    model.log_prediction(
      identifier: 'John Doe',
      input: input,
      output: output,
      options: options
    )

    # Later on, if we get an official diagnosis for the patient, log the true value. Make sure to match the `identifier`.
    model.log_true_value(
      identifier: 'John Doe',
      true_value: 'Positive'
    )

Rust:

    // Log the prediction.
    model.log_prediction(tangram::LogPredictionArgs {
        identifier: "John Doe".into(),
        input,
        options: Some(options),
        output,
    })?;

    // Later on, if we get an official diagnosis for the patient, log the true value. Make sure to match the `identifier`.
    model.log_true_value(tangram::LogTrueValueArgs {
        identifier: "John Doe".into(),
        true_value: "Positive".into(),
    })?;
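
The snippets above hard-code the identifier. In practice, each prediction needs its own identifier that can be recovered when the diagnosis arrives. The Python sketch below shows one possible way to do that, assuming the `log_prediction` and `log_true_value` keyword arguments shown in the Python snippet above. `FakeModel`, `predict_and_log`, and `record_diagnosis` are hypothetical names used only for illustration, and `FakeModel` stands in for a real Tangram model so that the example runs on its own.

```python
import uuid


class FakeModel:
    """Hypothetical stand-in for a Tangram model; it only records calls."""

    def __init__(self):
        self.predictions = {}
        self.true_values = {}

    def log_prediction(self, identifier, input, output, options=None):
        self.predictions[identifier] = (input, output)

    def log_true_value(self, identifier, true_value):
        self.true_values[identifier] = true_value


def predict_and_log(model, pending, patient_key, input, output):
    """Log a prediction under a fresh UUID and remember it for this patient."""
    identifier = str(uuid.uuid4())
    model.log_prediction(identifier=identifier, input=input, output=output)
    pending[patient_key] = identifier
    return identifier


def record_diagnosis(model, pending, patient_key, true_value):
    """Log the true value under the identifier used for the prediction."""
    identifier = pending.pop(patient_key)
    model.log_true_value(identifier=identifier, true_value=true_value)
    return identifier


model = FakeModel()
pending = {}  # patient key -> identifier; a real app would persist this
logged_id = predict_and_log(model, pending, "patient-1", {"age": 63}, "Positive")
matched_id = record_diagnosis(model, pending, "patient-1", "Positive")
```

Keeping the identifier in a lookup keyed by something the application already stores means the true value can be logged days later without guessing which prediction it belongs to.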

Back in the app, we can look up a prediction by its identifier, and get an explanation that shows how each feature affects the output.

Production Predictions

Predicted Class: Positive
Probability: 97.48%

Now let's see how accurate our model has been in production. Let's open the app and choose Production Metrics in the sidebar.

Production Metrics

Accuracy
Training: 83.33%
Production: 78.87%
Change: -4.47%
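
Production accuracy can only be computed for predictions whose true value has been logged under the same identifier. As a rough sketch of that calculation (not Tangram's actual implementation), the Python below joins predictions and true values by identifier and compares the result with a training accuracy. The function name and the sample records are made up for illustration.

```python
def production_accuracy(predictions, true_values):
    """Return the fraction of matched predictions that equal the true value.

    Both arguments map identifier -> class label. Predictions without a
    logged true value are ignored.
    """
    matched = [
        (predicted, true_values[identifier])
        for identifier, predicted in predictions.items()
        if identifier in true_values
    ]
    if not matched:
        return None  # nothing to score yet
    correct = sum(1 for predicted, actual in matched if predicted == actual)
    return correct / len(matched)


# Illustrative records only; these are not the guide's data.
predictions = {"a": "Positive", "b": "Negative", "c": "Positive", "d": "Negative"}
true_values = {"a": "Positive", "b": "Positive", "c": "Positive"}

accuracy = production_accuracy(predictions, true_values)
training_accuracy = 0.8333  # training accuracy reported in this guide
difference = accuracy - training_accuracy
```

A negative `difference` means the model is doing worse on the matched production records than it did in training.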

Uh oh! It's a bit lower than we expected. Let's try to find the cause. Under "Production Stats", we see that the "chest_pain" column has an alert and a high invalid values count. Click on the column to view more details.

Production Stats

Column Stats

Status                     Column                                Type    Absent Count  Invalid Count
All good                   age                                   Number  0             0
All good                   gender                                Enum    0             0
High Invalid Values Count  chest_pain                            Enum    0             0
All good                   resting_blood_pressure                Number  0             0
All good                   cholesterol                           Number  0             0
All good                   fasting_blood_sugar_greater_than_120  Enum    0             0
All good                   resting_ecg_result                    Enum    0             0
All good                   exercise_max_heart_rate               Number  0             0
All good                   exercise_induced_angina               Enum    0             0
All good                   exercise_st_depression                Number  0             0
All good                   exercise_st_slope                     Enum    0             0
All good                   fluoroscopy_vessels_colored           Enum    0             0
All good                   thallium_stress_test                  Enum    0             0

The value "asymptomatic" is far less common in production than in training: it makes up 48.72% of training values but only 23.11% of production values. In the table below, we also see a large number of invalid values with the string "asx", which account for 22.63% of production values. It looks like our code is accidentally sending "asx" instead of "asymptomatic" for the chest pain column. We can update our code to use the correct value and follow the metrics going forward to confirm they recover.
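
One way to catch this kind of mistake before it reaches production is to check enum inputs against the values seen in training. The Python sketch below does this for `chest_pain`, using the four training values listed in this guide. The `ALIASES` mapping and the `normalize_chest_pain` function are hypothetical and not part of Tangram.

```python
# Values of chest_pain seen in training, taken from this guide.
VALID_CHEST_PAIN = {
    "asymptomatic",
    "atypical angina",
    "non-angina pain",
    "typical angina",
}

# Hypothetical mapping from abbreviations used in application code
# to the values the model was trained on.
ALIASES = {"asx": "asymptomatic"}


def normalize_chest_pain(value):
    """Return a valid chest_pain value or raise ValueError."""
    value = ALIASES.get(value, value)
    if value not in VALID_CHEST_PAIN:
        raise ValueError(f"invalid chest_pain value: {value!r}")
    return value


fixed = normalize_chest_pain("asx")
unchanged = normalize_chest_pain("typical angina")
```

Raising on unknown values, rather than passing them through, makes a mismatch visible at the call site instead of only in the production stats.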

High Invalid Values Count

chest_pain

[Chart: This Month's Distribution of Unique Values for chest_pain, comparing Training and Production]

Unique Values

Value            Training Count  Production Count  Training Fraction  Production Fraction
asymptomatic     133             95                48.72%             23.11%
atypical angina  43              76                15.75%             18.49%
non-angina pain  76              122               27.84%             29.68%
typical angina   21              25                7.69%              6.08%

Invalid Values

Value  Count  Production Fraction
asx    93     22.63%
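
The production fractions in these tables follow directly from the counts. As a quick check, the Python below recomputes them from the counts shown above and also works out what the "asymptomatic" share would be if every "asx" value had been sent correctly. The variable names are illustrative.

```python
# Production counts for chest_pain, copied from the tables above.
valid_counts = {
    "asymptomatic": 95,
    "atypical angina": 76,
    "non-angina pain": 122,
    "typical angina": 25,
}
invalid_counts = {"asx": 93}

total = sum(valid_counts.values()) + sum(invalid_counts.values())

# Share of all production values, as a percentage rounded to two places.
fractions = {
    value: round(100 * count / total, 2)
    for value, count in {**valid_counts, **invalid_counts}.items()
}

# Share of "asymptomatic" if every "asx" had been sent as "asymptomatic".
corrected_asymptomatic = round(
    100 * (valid_counts["asymptomatic"] + invalid_counts["asx"]) / total, 2
)
```

With the 93 "asx" values counted as "asymptomatic", the production share would be 45.74%, much closer to the 48.72% seen in training.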

Hooray! You made it to the end! In this guide, we learned how to train a model, make predictions from our code, tune our model, and monitor it in production. If you want help using Tangram with your own data, send us an email at hello@tangram.dev or ask a question on GitHub Discussions.