An intelligent model for detecting fraud in the non-technical loss of commercial power: the case of Kenya Power, Ruiru area.
Financial fraud is on the rise, and as a result institutions lose large amounts of money to fraudsters. The challenge of fraud detection lies in recognizing the losses and identifying the areas of suspicious behavior. Applying data mining techniques to financial statements can help point out fraudulent usage, but applying them effectively requires understanding the underlying business objectives. Electricity consumer dishonesty is the main problem faced by power utilities worldwide that are managed through a financial billing system, and finding efficient measures for detecting fraudulent electricity consumption has been an active research area in recent years. This research report presents a proposed model for detecting non-technical loss (NTL) of commercial power in an electricity utility using data mining techniques: support vector machine (SVM), artificial neural network (ANN), K-Nearest Neighbor and Naïve Bayes. The work applies these techniques to data from the customer information and billing system for electricity consumption in selected accounts at Kenya Power Limited. The selected techniques are used in the design and development of a fraud detection model. The efficiency and accuracy of the model were tested and evaluated in order to identify one technique to be adopted by Kenya Power Limited. In the test results, the highest fraud detection hit rate was achieved by the SVM classifier at 86.44%, followed by K-Nearest Neighbor at 84.75%; the classifier with the lowest fraud detection rate was Naïve Bayes at 74.58%. This study therefore adopted the SVM classifier, for the following reasons. First, the balancing technique used for the ANN and K-Nearest Neighbor classifiers depends on random sampling, which reduces the number of instances in the training data set by more than half.
Second, the SVM classifier relies on a class-weighting technique to balance the data set without omitting any instance. Finally, SVM achieved the highest accuracy score on the balanced data sets.
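The difference between the two balancing approaches can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy labels, the 80/20 class split, and the function names are hypothetical and are not the study's data or code. The weighting formula n / (k * count) is one common "balanced" heuristic and may differ from the scheme actually used.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def random_undersample(labels, seed=0):
    """Return the indices kept after shrinking every class to the minority size."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    minority = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, minority))
    return sorted(kept)


# Hypothetical toy labels: 1 = fraudulent account, 0 = normal account.
labels = [0] * 80 + [1] * 20

weights = balanced_class_weights(labels)  # every instance is retained
kept = random_undersample(labels)         # majority instances are discarded

print("class weights:", weights)
print("instances kept by undersampling:", len(kept), "of", len(labels))
```

With this toy split, class weighting keeps all 100 instances and gives the fraud class a larger weight, while random undersampling keeps only 40 instances, discarding more than half of the training data.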