Software Fault Prediction using Machine Learning Techniques

Kumari, Rupam (2018) Software Fault Prediction using Machine Learning Techniques. MTech thesis.

PDF (Restricted until 26/04/2021)
Restricted to Repository staff only

1376Kb

Abstract

Researchers have previously explored various classification techniques for software fault prediction. The results of a given technique vary from one software system to another, and no single technique performs well consistently across domains. Ensemble methods combine the strengths of different individual prediction techniques and can outperform a single technique. Most existing work classifies a software system as fault-prone or non-fault-prone, but very few efforts use ensemble techniques to predict the number of faults. The main objective of the presented system is to predict how many faults are present in software. We have implemented popular and widely used machine learning algorithms: Linear Regression, Decision Tree, Random Forest, AdaBoost, Extra Tree, k-Nearest Neighbour, Gradient Boosting, Multi-Layer Perceptron, Bagging Regression, Bayesian Ridge Regression, Stochastic Gradient Descent and Support Vector Machine. These techniques were implemented to identify the best base learners for heterogeneous ensemble learning. By analysing the results of all techniques on the various data sets, we selected the four best base learners (Support Vector Machine, Bagging Regression, Random Forest and k-Nearest Neighbour) for Stacking Regression to improve the performance of the model. The fifteen data sets used in this work were collected from the publicly available PROMISE repository. The presented work gives consistent results on all the data sets. Several performance evaluation measures, including AAE, ARE, MSE, RMSE and accuracy, have been considered to evaluate the presented heterogeneous technique. We have observed that on all data sets Stacking Regression gives better results than each of the four selected base learners. The AAE, ARE, MSE, RMSE and accuracy values all indicate better performance for the presented system.
A comparison between the homogeneous regression techniques and the heterogeneous ensemble technique shows that the ensemble improves prediction of the number of faults in the software studied. The principal benefit of this work is better use of limited testing resources, which helps in the timely detection of the majority of faults in software modules.
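
The abstract names AAE, ARE, MSE and RMSE without defining them. The sketch below shows one common way these error measures are computed for fault-count predictions. It is an illustration under stated assumptions, not the thesis's own code: the function name `evaluate_fault_predictions` and the sample values are hypothetical, and the ARE denominator `actual + 1` is an assumed convention (often used in fault-count studies so that modules with zero faults do not cause division by zero). The "accuracy" measure is left out because the abstract does not say how it is defined for a regression task.

```python
import math


def evaluate_fault_predictions(actual, predicted):
    """Return AAE, ARE, MSE and RMSE for predicted fault counts.

    Assumed definitions (not taken from the thesis):
      AAE  = mean of |predicted - actual|
      ARE  = mean of |predicted - actual| / (actual + 1)
      MSE  = mean of (predicted - actual)^2
      RMSE = sqrt(MSE)
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    abs_errors = [abs(p - a) for a, p in zip(actual, predicted)]

    aae = sum(abs_errors) / n
    # The +1 keeps the ratio defined for modules with zero actual faults.
    are = sum(e / (a + 1) for e, a in zip(abs_errors, actual)) / n
    mse = sum(e * e for e in abs_errors) / n
    rmse = math.sqrt(mse)

    return {"AAE": aae, "ARE": are, "MSE": mse, "RMSE": rmse}


# Hypothetical fault counts for four modules (illustrative only).
actual_faults = [0, 2, 1, 4]
predicted_faults = [1, 2, 0, 2]
scores = evaluate_fault_predictions(actual_faults, predicted_faults)
```

Lower values are better for all four measures, so comparing a stacked ensemble against its base learners amounts to computing these scores for each model on the same held-out modules.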

Item Type:Thesis (MTech)
Uncontrolled Keywords:Number of faults; Ensemble methods; Stacking Regression; k-NN; SVM; RF; BR; ARE; AAE; Accuracy; MSE; RMSE
Subjects:Engineering and Technology > Computer and Information Science > Information Security
Divisions: Engineering and Technology > Department of Computer Science Engineering
ID Code:9875
Deposited By:IR Staff BPCL
Deposited On:26 Apr 2019 15:34
Last Modified:26 Apr 2019 15:34
Supervisor(s):Mohapatra, Ramesh Kumar
