Support Vector And Relevance Vector Machines: A Comparison And Their Use For Data Reduction
Abstract
Most machine learning algorithms suffer from over-fitting and from high computational time and memory costs caused by large training set sizes. Two recent kernel methods developed to address these issues are Support Vector Machines (SVMs) and Relevance Vector Machines (RVMs). The key feature of these two kernel methods is that the trained model is built from only a small subset of kernel functions. SVMs have desirable properties that make them a very powerful machine learning technique, and they have already been used successfully for a wide variety of problems, such as fraud detection, bio-informatics (e.g. the protein folding problem), data mining, and natural language learning. RVMs have shown improved performance over SVMs in both computational complexity and accuracy while utilizing even fewer kernel functions ('Relevance Vectors'), which implies a considerable saving in memory and computation in a practical implementation. In this research project we first compare SVMs and RVMs empirically. We then investigate how the resulting Support Vectors and/or Relevance Vectors can be used as a reduced training set for other machine learning algorithms, preserving generalization while reducing the computational cost of those algorithms. At present we consider only K-Nearest Neighbour, Decision Trees, and Naive Bayes.
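As an illustration of the data-reduction idea, the sketch below trains an SVM, keeps only its support vectors, and fits K-Nearest Neighbour on that reduced set. It is a minimal sketch assuming scikit-learn is available; the dataset, kernel, and hyperparameters are illustrative choices, not taken from the thesis.

    # A minimal sketch of the data-reduction idea, assuming scikit-learn;
    # the dataset and hyperparameters here are illustrative only.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit an SVM; its support vectors form a sparse summary of the training set.
    svm = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
    sv = svm.support_  # indices of the support vectors in X_train

    # Train K-Nearest Neighbour on the support vectors only.
    knn = KNeighborsClassifier(n_neighbors=3).fit(X_train[sv], y_train[sv])

    print(f"training set: {len(X_train)} points; reduced set: {len(sv)} support vectors")
    print(f"KNN accuracy on the reduced set: {knn.score(X_test, y_test):.3f}")

The same reduced set could equally be fed to a Decision Tree or Naive Bayes classifier in place of KNN.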
Citation
Master of Science in Information Systems
Publisher
University of Nairobi School of Computing and Informatics