Comparative analysis of anomaly detection algorithms
Attacks with no previously known signatures are difficult to detect. These attacks, commonly referred to as zero-day attacks, have not been seen before and exploit previously unknown vulnerabilities. However, they have characteristics that differ from those of normal, attack-free packets. These differences in network traffic can be detected by comparing anomalous packets with attack-free packets. We selected algorithms that detect anomalies based on packet headers and evaluated them on three metrics: false positive ratio, accuracy, and detection rate. The evaluation used two sets of tcpdump data. The first set was attack-free training data, used to train the algorithms and establish a baseline for comparison with the test data. The second set contained labeled attacks that had been previously identified; their locations in the dataset were known and documented. The algorithms were trained on the training dataset and then attempted to detect the attacks in the test dataset. Once an anomaly was identified, the algorithms produced output containing the IP address of the victim, the date of the attack, an anomaly score, and the field contributing most to the anomaly. The dataset also includes an evaluation truth table listing all the attacks in the dataset, with the date and time each attack occurred and the IP address of the victim computer. Data analysis began with cleaning the algorithm results in MS Excel to remove unnecessary fields. The cleaned data were uploaded to MS SQL Server, and a column labeled status was added to the table containing the algorithm results. We compared the results of the algorithms with the evaluation truth table detection list. 
This comparison was done using a program written in Visual Basic: any matching record was updated with a true positive entry in the status column, while non-matching records were marked with a false positive entry. The results indicated that the algorithms have a high false positive ratio and very low accuracy, with the Packet Header Anomaly Detection algorithm being the best-performing of the algorithms evaluated.
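
The labeling step described above can be sketched in a few lines of Python. This is only an illustration, not the Visual Basic program used in the study. It assumes that an alarm matches the truth table when the date and victim IP address are equal, and it assumes two metric definitions the text does not spell out: detection rate as the share of listed attacks that were matched, and false positive ratio as the share of alarms that matched nothing. Accuracy is left out because the text does not say how it was computed. The names Alarm, label_alarms and summarize, and all sample values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """One row of algorithm output (field names are assumptions)."""
    date: str
    victim_ip: str
    score: float
    field: str


def label_alarms(alarms, truth):
    """Pair each alarm with a status, as in the 'status' column.

    truth is a set of (date, victim_ip) pairs taken from the
    evaluation truth table. Matching on date and IP only is an
    assumption; the study may also have used the attack time.
    """
    labeled = []
    for alarm in alarms:
        matched = (alarm.date, alarm.victim_ip) in truth
        labeled.append((alarm, "true positive" if matched else "false positive"))
    return labeled


def summarize(labeled, truth):
    """Compute two metrics under the assumed definitions in the lead-in."""
    detected = {(a.date, a.victim_ip) for a, s in labeled if s == "true positive"}
    false_alarms = sum(1 for _, s in labeled if s == "false positive")
    total_alarms = len(labeled)
    return {
        "detection_rate": len(detected) / len(truth) if truth else 0.0,
        "false_positive_ratio": false_alarms / total_alarms if total_alarms else 0.0,
    }


# Made-up sample data using documentation-range IP addresses.
truth = {("2000-01-03", "192.0.2.10"), ("2000-01-04", "192.0.2.20")}
alarms = [
    Alarm("2000-01-03", "192.0.2.10", 0.91, "TCP flags"),
    Alarm("2000-01-03", "192.0.2.30", 0.75, "IP TTL"),
    Alarm("2000-01-05", "192.0.2.20", 0.62, "TCP window"),
]
labeled = label_alarms(alarms, truth)
summary = summarize(labeled, truth)
```

With this sample, one of the two listed attacks is matched and two of the three alarms match nothing, so a detector can score a moderate detection rate while most of its alarms are false.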