Confusion Matrix and Cyber Security
Confusion Matrix: In machine learning, and specifically in statistical classification, a confusion matrix (also known as an error matrix) is a table layout that lets us visualize the performance of an algorithm, typically a supervised learning one. Each row of the matrix represents the instances in an actual class while each column represents the instances in a predicted class, or vice versa; both variants are found in the literature.
Confusion Matrix Case Study
Let’s pretend we have a two-class classification problem of predicting whether a photograph contains a boy or a girl.
We have a test dataset of 10 records with expected outcomes and a set of predictions from our classification algorithm.
Expected | Predicted |
Boy | Girl |
Boy | Boy |
Girl | Girl |
Boy | Boy |
Girl | Boy |
Girl | Girl |
Girl | Girl |
Boy | Boy |
Boy | Girl |
Girl | Girl |
The algorithm got 7 of the 10 predictions right, for an accuracy of 70%:
accuracy = total correct predictions / total predictions * 100
accuracy = 7 / 10 * 100 = 70%
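As a quick check of the arithmetic, here is a minimal Python sketch that recomputes the accuracy from the ten records in the table. The list names and the function name accuracy_percent are just illustrative choices, not part of any library.

```python
# The ten test records from the table, in the same order.
expected = ["Boy", "Boy", "Girl", "Boy", "Girl", "Girl", "Girl", "Boy", "Boy", "Girl"]
predicted = ["Girl", "Boy", "Girl", "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Girl"]


def accuracy_percent(expected, predicted):
    """Return the percentage of predictions that match the expected label."""
    correct = sum(e == p for e, p in zip(expected, predicted))
    return 100 * correct / len(expected)


print(accuracy_percent(expected, predicted))  # 70.0
```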
Accuracy alone does not tell us what kinds of errors were made, so let's break the results down by class.
First, we count the number of correct predictions for each class:
Boys classified as boys: 3
Girls classified as girls: 4
Now, we can calculate the number of incorrect predictions for each class, organized by the predicted value.
Boys classified as girls: 2
Girls classified as boys: 1
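The four counts above can be reproduced by tallying (expected, predicted) pairs. This is only a sketch using Python's standard collections.Counter; the variable names are made up for the example.

```python
from collections import Counter

# The ten test records from the table, in the same order.
expected = ["Boy", "Boy", "Girl", "Boy", "Girl", "Girl", "Girl", "Boy", "Boy", "Girl"]
predicted = ["Girl", "Boy", "Girl", "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Girl"]

# Count how often each (expected, predicted) combination occurs.
pair_counts = Counter(zip(expected, predicted))

print("Boys classified as boys:", pair_counts[("Boy", "Boy")])      # 3
print("Girls classified as girls:", pair_counts[("Girl", "Girl")])  # 4
print("Boys classified as girls:", pair_counts[("Boy", "Girl")])    # 2
print("Girls classified as boys:", pair_counts[("Girl", "Boy")])    # 1
```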
We can now arrange these values into the 2-class confusion matrix:
Predicted \ Actual | Boys | Girls |
Boys | 3 | 1 |
Girls | 2 | 4 |
- The total number of actual boys in the dataset is the sum of the values in the Boys column (3 + 2).
- The total number of actual girls in the dataset is the sum of the values in the Girls column (1 + 4).
- The correct values are organized in a diagonal line from top left to bottom-right of the matrix (3 + 4).
- More errors were made by predicting boys as girls than predicting girls as boys.
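To see how the matrix and the observations above fit together, here is a small sketch that builds the same 2x2 table from the raw records. It assumes the layout used in this post (rows are predicted classes, columns are actual classes); the names labels and matrix are illustrative.

```python
# The ten test records from the table, in the same order.
expected = ["Boy", "Boy", "Girl", "Boy", "Girl", "Girl", "Girl", "Boy", "Boy", "Girl"]
predicted = ["Girl", "Boy", "Girl", "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Girl"]

labels = ["Boy", "Girl"]

# matrix[row][col]: row = predicted class, col = actual class.
matrix = [[0, 0], [0, 0]]
for actual, guess in zip(expected, predicted):
    matrix[labels.index(guess)][labels.index(actual)] += 1

print(matrix)  # [[3, 1], [2, 4]]

# Column sums give the number of actual boys and actual girls.
actual_totals = [sum(column) for column in zip(*matrix)]
print(actual_totals)  # [5, 5]

# The diagonal holds the correct predictions.
correct = sum(matrix[i][i] for i in range(len(labels)))
print(correct)  # 7
```

Note that some libraries use the opposite orientation (rows for actual classes, columns for predicted ones), so it is worth checking before reading values off a matrix.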
Now we can summarize the confusion matrix as follows:
- TP: True Positive: positive values correctly predicted as positive
- FP: False Positive: negative values incorrectly predicted as positive
- FN: False Negative: positive values incorrectly predicted as negative
- TN: True Negative: negative values correctly predicted as negative
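The post does not say which class counts as "positive", so the sketch below assumes, purely for illustration, that Boy is the positive class and Girl the negative one. Swapping the assumption would swap TP with TN and FP with FN.

```python
# The ten test records from the table, in the same order.
expected = ["Boy", "Boy", "Girl", "Boy", "Girl", "Girl", "Girl", "Boy", "Boy", "Girl"]
predicted = ["Girl", "Boy", "Girl", "Boy", "Boy", "Girl", "Girl", "Boy", "Girl", "Girl"]

POSITIVE = "Boy"  # assumption made for this example only

tp = fp = fn = tn = 0
for actual, guess in zip(expected, predicted):
    if guess == POSITIVE and actual == POSITIVE:
        tp += 1  # positive correctly predicted as positive
    elif guess == POSITIVE and actual != POSITIVE:
        fp += 1  # negative wrongly predicted as positive
    elif guess != POSITIVE and actual == POSITIVE:
        fn += 1  # positive wrongly predicted as negative
    else:
        tn += 1  # negative correctly predicted as negative

print(tp, fp, fn, tn)  # 3 1 2 4
```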
What is the accuracy of the machine learning model for this classification task?
Accuracy represents the number of correctly classified data instances over the total number of data instances.
In this example, Accuracy = (3 + 4)/(3 + 4 + 1 + 2) = 0.7, or 70% as a percentage.
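The same calculation can be written in terms of the four cells, as (TP + TN) / (TP + TN + FP + FN). A minimal sketch, again assuming Boy is the positive class (so TP = 3, TN = 4, FP = 1, FN = 2); the function name accuracy is hypothetical:

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all instances that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Cell values from the example, with Boy assumed to be the positive class.
print(accuracy(tp=3, tn=4, fp=1, fn=2))  # 0.7
```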
Cyber Security
· Network security is the practice of securing a computer network from intruders, whether targeted attackers or opportunistic malware.
· Application security focuses on keeping software and devices free of threats. A compromised application could provide access to the data it's designed to protect. Successful security begins in the design stage, well before a program or device is deployed.
· Information security protects the integrity and privacy of data, both in storage and in transit.
· Operational security includes the processes and decisions for handling and protecting data assets. The permissions users have when accessing a network and the procedures that determine how and where data may be stored or shared all fall under this umbrella.
· Disaster recovery and business continuity define how an organization responds to a cyber-security incident or any other event that causes the loss of operations or data. Disaster recovery policies dictate how the organization restores its operations and information to return to the same operating capacity as before the event. Business continuity is the plan the organization falls back on while trying to operate without certain resources.
Thank you for visiting my blog😊