Semi-Supervised Machine Learning and Adaptive Data Clustering Approach for Software Defect Prediction

Sumangala Patil (JNTU, Anantapur, India); A. Nagaraja Rao (SJT, Vellore Institute of Technology, India); Chigarapalle Shoba Bindu (Jawaharlal Nehru Technological University, India)

Software defect prediction is an important research area because of its wide use in real-time applications. The ever-increasing demand for software applications has accelerated the development of software modules, which in turn has raised the likelihood of producing faulty software. Software testing is recommended to address this issue. Because manual testing requires considerable human effort and time, automated testing has gained attention from industry and researchers. In this work, we present an automated software defect prediction strategy based on a semi-supervised machine learning approach. The process is divided into four stages: pre-processing, where we apply KNN-based missing-value imputation; feature selection, where we apply a PCA-based optimal feature selection method; clustering, where we apply a SOM-based clustering model; and classification, where a Naive Bayes classifier is applied. An experimental study shows that the proposed approach achieves promising performance compared with existing software defect prediction techniques.
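
The four stages named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the SOM clusters feed the Naive Bayes classifier, so the sketch assumes one common semi-supervised scheme in which unlabeled modules (marked -1) inherit the majority label of the labeled modules mapped to the same SOM node. All function names, the 3x3 grid, the Gaussian form of Naive Bayes, and every hyperparameter are hypothetical choices made for illustration.

```python
import numpy as np

def knn_impute(X, k=3):
    """Fill NaNs with the mean of the k nearest complete rows (simplified KNN imputation)."""
    X = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]          # donor rows: no missing values
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        miss = np.isnan(X[i])
        # distance is measured only over the columns observed in row i
        d = np.linalg.norm(complete[:, ~miss] - X[i, ~miss], axis=1)
        X[i, miss] = complete[np.argsort(d)[:k]][:, miss].mean(axis=0)
    return X

def pca(X, n_components):
    """Project centered data onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def train_som(X, grid=(3, 3), epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small self-organizing map; returns one weight vector per grid node."""
    rng = np.random.default_rng(seed)
    coords = np.array([(r, c) for r in range(grid[0]) for c in range(grid[1])], dtype=float)
    W = X[rng.choice(len(X), len(coords))].astype(float)   # init nodes from data samples
    n_steps, step = epochs * len(X), 0
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            frac = step / n_steps
            lr = lr0 * (1 - frac)                       # linearly decaying learning rate
            sigma = sigma0 * (1 - frac) + 1e-2          # shrinking neighborhood radius
            bmu = np.argmin(np.linalg.norm(W - x, axis=1))
            h = np.exp(-np.sum((coords - coords[bmu]) ** 2, axis=1) / (2 * sigma ** 2))
            W += lr * h[:, None] * (x - W)
            step += 1
    return W

def som_assign(X, W):
    """Index of the best-matching SOM node for every sample."""
    return np.argmin(np.linalg.norm(X[:, None, :] - W[None], axis=2), axis=1)

def pseudo_label(y, nodes):
    """ASSUMPTION: unlabeled samples (-1) take the majority label of their SOM node."""
    y_out = y.copy()
    for n in np.unique(nodes):
        known = y[(nodes == n) & (y != -1)]
        if len(known):
            y_out[(nodes == n) & (y == -1)] = np.bincount(known).argmax()
    return y_out

def fit_gnb(X, y):
    """Fit a Gaussian Naive Bayes model: per-class means, variances, and priors."""
    classes = np.unique(y)
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    var = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-6
    prior = np.array([(y == c).mean() for c in classes])
    return classes, mu, var, prior

def predict_gnb(model, X):
    """Predict the class with the highest log-posterior under the fitted model."""
    classes, mu, var, prior = model
    ll = -0.5 * (np.log(2 * np.pi * var)[None]
                 + (X[:, None, :] - mu[None]) ** 2 / var[None]).sum(axis=2)
    return classes[np.argmax(ll + np.log(prior), axis=1)]

def run_pipeline(X, y, n_components=2):
    """Impute -> PCA -> SOM clustering -> pseudo-labeling -> Naive Bayes."""
    Z = pca(knn_impute(X), n_components)
    y_full = pseudo_label(y, som_assign(Z, train_som(Z)))
    mask = y_full != -1                                 # train only on (pseudo-)labeled rows
    return predict_gnb(fit_gnb(Z[mask], y_full[mask]), Z), y_full
```

Calling run_pipeline(X, y) with a metric matrix X (NaN for missing values) and an integer label vector y (-1 for unlabeled modules) returns predicted labels for all modules together with the pseudo-labeled vector. Real defect datasets would usually also need feature scaling before the distance-based steps, which this sketch omits.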

Journal: International Journal of Simulation: Systems, Science and Technology (IJSSST), Vol. 20

Published: Feb 27, 2019

DOI: 10.5013/IJSSST.a.20.01.19