###### Who Am I?

Hi, I am Sanjay Menon, a data science enthusiast who loves exploring the application of data science principles in security analytics. I have rich experience across large enterprises such as Symantec, Deutsche Bank, J.P. Morgan, HP, and Mercedes-Benz.

I hold a Master's degree in Information Security from Royal Holloway, University of London, a Certificate in Data Science from the University of Washington, and a Certificate in Statistical Inference and Regression Techniques from Duke University. I am a CISSP and also hold the Hortonworks Certified Hadoop Developer certification, a certificate from Elastic for the ELK stack, and many other technical and security domain certifications.

14-Nov-2017

The summary description of an incident ticket contains the major information about the incident. The data, however, is mostly unstructured, and using it in analytics requires significant effort in the ETL process. Once we have the data in a structured for...

21-Mar-2016

When evaluating a classifier, generating a confusion matrix for the model gives an indication of its performance. The confusion matrix provides a statistical view of the model's accuracy, precision, and recall, considering the True/False...
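As a minimal sketch of how accuracy, precision, and recall fall out of a confusion matrix, the Python snippet below counts the four cells for a binary classifier. The function names and the sample labels are illustrative assumptions, not taken from the original post.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary classifier."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # true positive
            else:
                fp += 1  # false positive
        else:
            if actual == positive:
                fn += 1  # false negative
            else:
                tn += 1  # true negative
    return tp, fp, fn, tn


def summarize(y_true, y_pred, positive=1):
    """Derive accuracy, precision, and recall from the cell counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels: 1 = malicious event, 0 = benign event.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))
print(summarize(y_true, y_pred))
```

Precision and recall default to 0.0 when their denominators are empty, which is a design choice for this sketch rather than a universal convention.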

29-Feb-2016

K-means is the most popular clustering technique, but it comes with the overhead of selecting an optimal number of clusters for effective output. Hierarchical clustering is a good option in such scenarios since there is no need to provide the cluster informat...

28-Feb-2016

K-means is widely used to cluster data into groups with similar attributes and can identify outliers within a set of events. This becomes particularly relevant for the security domain, where the focus is to prioritize investigation from a large number of security e...

26-Feb-2016

igraph is a collection of network analysis tools available with R and is quite useful in creating linked graphs.

In the examples below, z is the subset of data with attacks and sourceaddress from a sample dataset ac, and a linked graph is created between these 2 entiti...

26-Feb-2016

ggplot2 is a plotting system for R based on the grammar of graphics. It provides a powerful model of graphics that makes it easy to produce complex multi-layered graphics. ggplot2 is a good tool for plotting time series.

In our example, we consider with data fro...

14-Feb-2016

A common investigative method in intrusion analysis is to identify outliers and focus the investigation on them. Common approaches use the standard deviation (SD) or the interquartile range (IQR). Since IQR is considered to be more robust in handling outliers, this statistical method is more commonly used...
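The IQR approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1.5 × IQR fence multiplier is the conventional Tukey rule rather than something the original post specifies, and the function name and event counts are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical counts of security events per source address.
events_per_source = [12, 15, 14, 13, 16, 15, 14, 120]
print(iqr_outliers(events_per_source))
```

Because the quartiles barely move when a single extreme value is added, the fences stay tight around the bulk of the data, which is why IQR is regarded as more robust than an SD-based threshold.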