Customizing Confusion Matrix Weights
Why customize the confusion matrix weights?
When training a model that contains a Boolean goal variable, the system uses an iterative process to select a true/false threshold that will generate the most accurate model. That is a model that finds the most true positive and true negative results.
However, in some scenarios, a straight forward classification of the data does not yield the most useful model. By customizing the confusion matrix weights, you can optimize the selection of the Boolean threshold in a way that generates a model more suited to a given business scenario. Consider the scenarios below.
Maximize classification
Suppose you are trying to identify patients with a specific diagnosis from large dataset. The number of patients who actually have the disease may be quite low, but the consequences for missing the diagnosis in a single patient could be serious. By setting a higher True Positive weight during model training, you can influence where the Boolean threshold is set to optimize the classification of true positive cases.
Minimize cost of failure
Suppose you are trying to predict a machine failure. You estimate that the cost of not predicting a failure (false negative) is about $90 in unplanned downtime and service charges. The cost for incorrectly predicting a failure (false positive) is a $20 service charge. You can customize the matrix weights to act as a cost matrix that will optimize the model based on the costs of false positive or false negative results.
Was this helpful?