*R : Insights From Confusion Matrix Of A Classifier*

Monday, March 21, 2016

When evaluating a classifier, generating a confusion matrix for the model gives an indication of its performance. The confusion matrix provides a statistical view of the model's Accuracy, Precision and Recall capabilities, based on the True/False (T/F) and Positive/Negative (P/N) classification of the results.

High precision indicates that the model returns substantially more relevant results than irrelevant ones, while high recall indicates that it returns most of the relevant results.

The mathematical notation for these metrics is as follows:

Accuracy : (TP+TN)/(TP+FP+FN+TN)

Precision : TP/(TP + FP)

Recall : TP/(TP + FN)
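To make the formulas concrete, here is a small worked example with hypothetical counts (the numbers are illustrative, not from any real classifier):

```r
# Hypothetical confusion-matrix counts for illustration
TP <- 40  # true positives
FN <- 10  # false negatives
FP <- 5   # false positives
TN <- 45  # true negatives

acc  <- (TP + TN) / (TP + FN + FP + TN)  # (40 + 45) / 100 = 0.85
prec <- TP / (TP + FP)                   # 40 / 45  ~ 0.889
rec  <- TP / (TP + FN)                   # 40 / 50  = 0.80
```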

As an example, the code below takes the output of an attack classifier, which classifies an attack as suspicious or compromised based on attack attributes, and produces statistics for Accuracy, Precision and Recall.

# Make the confusion matrix

confmat <- table(attack$classify, classpred) # where attack$classify holds the actual class labels and classpred holds the model's predictions

# Assign TP, FN, FP and TN using confmat

TP <- confmat[1,1]

FN <- confmat[1,2]

FP <- confmat[2,1]

TN <- confmat[2,2]

# Calculate the Accuracy

acc <- (TP + TN) / (TP + FN + FP + TN)

# Calculate the Precision

prec <- TP / (TP + FP)

# Calculate the Recall

rec <- TP / (TP + FN)
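The snippet above assumes the `attack` data frame and the `classpred` predictions already exist. A self-contained sketch of the same flow, using simulated labels in place of the original attack data (the simulated `actual`/`predicted` vectors are stand-ins, not the author's data), might look like:

```r
set.seed(42)

# Simulated ground truth: 100 attacks labelled compromise or suspicious
actual <- sample(c("compromise", "suspicious"), 100, replace = TRUE)

# Simulated classifier output: copy the truth, then mislabel 15 cases
predicted <- actual
flip <- sample(100, 15)
predicted[flip] <- ifelse(actual[flip] == "compromise",
                          "suspicious", "compromise")

# Make the confusion matrix (rows = actual, columns = predicted)
confmat <- table(actual, predicted)

# Assign TP, FN, FP and TN, treating the first level as the positive class
TP <- confmat[1, 1]
FN <- confmat[1, 2]
FP <- confmat[2, 1]
TN <- confmat[2, 2]

acc  <- (TP + TN) / (TP + FN + FP + TN)  # 85 correct out of 100 = 0.85
prec <- TP / (TP + FP)
rec  <- TP / (TP + FN)
```

Because exactly 15 of the 100 simulated predictions are flipped, the accuracy here comes out to 0.85; precision and recall depend on how the flips fall across the two classes.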