Confusion Matrix for Classification in ML
As ML and Data Science enthusiasts, we feel it is our duty to create a model with accuracy somewhere near a hundred per cent (HaHa!!!), which is definitely not possible for real-world use cases.
A model with roughly 80 to 95% accuracy is generally considered a good model, but even that is not easy to attain. And accuracy alone doesn't tell us where a model goes wrong; to analyse that, we need the Confusion Matrix. Excited to know about the Confusion Matrix? Let's see.
What is a CONFUSION MATRIX?
A confusion matrix is a tabular way of visualizing the performance of your prediction model. Each entry in it counts the predictions the model made for one combination of actual and predicted class, so you can see exactly where it classified correctly or incorrectly.
The matrix compares the actual target values with those predicted by the machine learning model.
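To make this concrete, here is a minimal sketch using scikit-learn with made-up toy labels (the data is invented purely for illustration):

```python
from sklearn.metrics import confusion_matrix

# Toy binary labels: 1 = Positive class, 0 = Negative class
# (values are invented purely for illustration)
y_actual    = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Rows are the actual classes, columns are the predicted classes
cm = confusion_matrix(y_actual, y_predicted)
print(cm)
# [[4 1]
#  [1 4]]
```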
Confusion Matrix for Binary Classification.
First, let’s learn what Binary Classification is.
Binary classification refers to predicting one of two classes, i.e. a dataset in which only 2 outcomes (say True or False, Pass or Fail, etc.) are possible. The outcome which is in our favour is known as the Positive class, and the one which is not is known as the Negative class.
A Confusion Matrix for Binary Classification comprises 4 terms (a short code sketch for extracting them follows this list):
1. TP (True Positive): The model correctly predicted that the outcome is in our favour.
2. TN (True Negative): The model correctly predicted that the outcome is not in our favour.
3. FP (False Positive, a.k.a. Type-1 Error): The model predicted that the outcome is in our favour but in reality, it is not.
4. FN (False Negative, a.k.a. Type-2 Error): The model predicted that the outcome is not in our favour but in reality, it is.
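Here is that sketch. For binary 0/1 labels, scikit-learn's confusion matrix can be flattened with `ravel()`, which returns the 4 counts in the order TN, FP, FN, TP (the labels below are the same made-up toy data as before):

```python
from sklearn.metrics import confusion_matrix

# Same invented toy labels as above (1 = favourable, 0 = unfavourable)
y_actual    = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# For binary 0/1 labels, ravel() flattens the matrix to TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=4, TN=4, FP=1, FN=1

# Accuracy follows directly from these 4 terms
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"Accuracy = {accuracy:.2f}")  # (4 + 4) / 10 = 0.80
```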
Let us understand these terms with an example.
Let’s say there is a Web Service provider that deploys a detector to analyse whether a client is safe or not. Here the model gives 2 outcomes: “Safe” and “Not Safe”.
Our outcome is a True Positive if our model predicts “Safe” and the client actually is “Safe”. With this prediction the service provider serves the client and makes a profit, so it is in our favour.
Our outcome is a True Negative if our model predicts “Not Safe” and in reality the client is “Not Safe”; this prediction saves the service provider from getting hacked.
Now come the interesting terms of the Confusion Matrix: the 2 errors (FP and FN).
Let’s say our model predicts that the client is “Not Safe” for the server but the actual outcome is “Safe”. In this case our model’s prediction is wrong (False), and the service provider misses out on a profit, so it is not in our favour. This is a False Negative (Type-2 Error).
Now let’s say the model predicts that the client is “Safe” for the server, and the service provider allows the client to access the web pages. But the actual outcome is “Not Safe”. Since this client is not safe, it can hack the server and carry out malicious activity inside it, which can prove very harmful to the service provider. This is a False Positive (Type-1 Error).
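We can play this scenario out in code. Here is a minimal sketch with invented “Safe”/“Not Safe” labels (the `labels=` argument pins the row/column order so we know which cell is which):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical detector outputs for 8 clients (data invented for illustration)
actual    = ["Safe", "Safe", "Not Safe", "Safe", "Not Safe", "Safe", "Not Safe", "Safe"]
predicted = ["Safe", "Safe", "Not Safe", "Not Safe", "Safe", "Safe", "Not Safe", "Safe"]

# Fix the label order explicitly: index 0 = "Not Safe", index 1 = "Safe"
cm = confusion_matrix(actual, predicted, labels=["Not Safe", "Safe"])
tn, fp, fn, tp = cm.ravel()

print(f"TP (predicted Safe, actually Safe):         {tp}")  # 4
print(f"TN (predicted Not Safe, actually Not Safe): {tn}")  # 2
print(f"FP (predicted Safe, actually Not Safe):     {fp}")  # 1 -> Type-1 Error: server at risk!
print(f"FN (predicted Not Safe, actually Safe):     {fn}")  # 1 -> Type-2 Error: lost profit
```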
This is why a False Positive (Type-1 Error) in the Confusion Matrix can be very damaging to the service provider, while a False Negative (Type-2 Error) is something they can still bear.
Therefore, in a scenario like this, the Type-1 Error is the dangerous one and should have near-zero tolerance (in other problems, such as disease screening, the Type-2 Error may be the costlier one). And this is how the Confusion Matrix helps us find out whether the model is giving any False Positive predictions.
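Since overall accuracy can hide False Positives, the metric to watch in this scenario is precision, TP / (TP + FP): if False Positives must be (near) zero, precision must be (near) 1.0. A minimal sketch, reusing the invented labels from above:

```python
from sklearn.metrics import precision_score

# Same invented labels as in the previous sketch
actual    = ["Safe", "Safe", "Not Safe", "Safe", "Not Safe", "Safe", "Not Safe", "Safe"]
predicted = ["Safe", "Safe", "Not Safe", "Not Safe", "Safe", "Safe", "Not Safe", "Safe"]

# Precision = TP / (TP + FP); it drops whenever the model wrongly predicts "Safe"
precision = precision_score(actual, predicted, pos_label="Safe")
print(f"Precision = {precision:.2f}")  # 4 / (4 + 1) = 0.80, so FP > 0: not good enough here
```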
I hope you got to learn something new about the Confusion Matrix by reading this article and that you understood every bit of it.