Funding: This research is funded by Researchers Supporting Project Number (RSPD2024R947), King Saud University, Riyadh, Saudi Arabia.
Abstract: Software project outcomes depend heavily on natural language requirements, which often cause diverse interpretations and issues such as ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies do not generalize well and are not efficient when extended to other datasets. Therefore, this paper proposes a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The approach involves three methods: feature selection, which reduces the dimensionality and redundancy of features and selects only the relevant ones; transfer learning, which trains and tests the model on different datasets to analyze how much of the learning carries over to other datasets; and an ensemble method, which explores the performance gain from combining multiple classifiers in one model. Four National Aeronautics and Space Administration (NASA) datasets and four Promise datasets are used in the study, which shows an increase in the model’s performance, in the form of better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values, when different classifiers are combined. The results reveal that an amalgam of techniques such as those used in this study (feature selection, transfer learning, and ensemble methods) proves helpful in optimizing software bug prediction models and providing a high-performing, useful end model.
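To make the evaluation metric concrete, the following is a minimal sketch, not the paper's implementation, of how AUC-ROC can be computed from classifier scores using the pairwise-ranking formulation, and of how a simple score-averaging (soft-voting) ensemble can be evaluated with it. The ensemble rule, the function names, and the toy labels and scores are all assumptions made for illustration; they are not taken from the NASA or Promise datasets and say nothing about the paper's results.

```python
from typing import List, Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random buggy module (label 1)
    is scored higher than a random clean module (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_scores(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Soft-voting ensemble (an assumed combination rule):
    average the scores the classifiers give to each module."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


# Hypothetical toy data: 1 = buggy module, 0 = clean module.
labels = [1, 0, 1, 0, 1, 0]
clf_a = [0.9, 0.6, 0.4, 0.3, 0.8, 0.2]  # scores from a first classifier
clf_b = [0.6, 0.2, 0.9, 0.7, 0.8, 0.1]  # scores from a second classifier
ensemble = average_scores([clf_a, clf_b])

print(round(auc_roc(labels, clf_a), 3))     # 0.889
print(round(auc_roc(labels, clf_b), 3))     # 0.889
print(round(auc_roc(labels, ensemble), 3))  # 1.0
```

In this toy case each classifier misranks a different pair of modules, so averaging their scores removes both errors. Whether combining classifiers helps on real data depends on how correlated their errors are.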
Abstract: Software engineering techniques make it possible to create high-quality software. One of the most significant qualities of good software is that it is devoid of bugs. Finding and fixing bugs is one of the most time-consuming and costly software procedures. Although it is impossible to eradicate all bugs, it is feasible to reduce the number of bugs and their negative effects. To broaden the scope of bug prediction techniques and increase software quality, the numerous causes of software problems must be identified, and successful bug prediction models must be implemented. This study employs a hybrid of a Faster Convolution Neural Network (FCNN) and the Moth Flame Optimization (MFO) algorithm to forecast the number of bugs in software based on the program data itself, such as the number of lines of code, method characteristics, and other essential software aspects. Here, the MFO method is used to train the neural network by identifying optimal weights. The proposed MFO-FCNN technique is compared with existing machine learning (ML) techniques such as AdaBoost (AB), Random Forest (RF), K-Nearest Neighbour (KNN), K-Means Clustering (KMC), Support Vector Machine (SVM), and Bagging Classifier (BC). The assessment revealed that machine learning techniques can be employed successfully and with a high level of accuracy. The obtained data revealed that the proposed strategy outperforms the traditional approaches.
Abstract: Software bugs are unavoidable in software development and maintenance. Many methods discussed in the literature fail to achieve efficient software bug detection and classification. In this paper, an efficient Adaptive Deep Learning Model (ADLM) is developed for automatic duplicate bug report detection and classification. The proposed ADLM combines Conditional Random Fields decoding with Long Short-Term Memory (CRF-LSTM) and the Dingo Optimizer (DO). In the CRF, the DO is used to choose efficient weight values for the network. The proposed automatic bug report detection proceeds in three stages: pre-processing, feature extraction, and bug detection with classification. Initially, the bug report input dataset is gathered from an online source system. In the pre-processing phase, unwanted information is removed from the input data by cleaning the text, converting data types, and replacing null values. The pre-processed data is then sent to the feature extraction phase, where four types of features are extracted: contextual, categorical, temporal, and textual. Finally, the features are sent to the proposed ADLM for automatic duplicate bug report detection and classification. The methodology proceeds in two phases, training and testing. Based on this process, the bugs are detected and classified from the input data. The proposed technique is assessed using performance metrics such as accuracy, precision, recall, F-measure, and kappa.
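The performance metrics named in the abstract above can be illustrated with a short sketch. It assumes a binary task (1 = duplicate report, 0 = not duplicate), the usual confusion-matrix definitions, and Cohen's kappa as the "kappa" statistic; the abstract does not spell out these definitions, and the labels below are invented for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, F-measure and Cohen's kappa
    for binary labels (1 = duplicate, 0 = not duplicate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the marginal label frequencies.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f_measure": f_measure, "kappa": kappa}


# Hypothetical predictions for ten bug reports.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

Kappa is lower than accuracy here (about 0.583 against 0.8) because part of the agreement between predictions and ground truth would be expected by chance alone.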
Funding: Supported by the National Key R&D Program of China (2018YFB1702700).
Abstract: Existing software bug localization models treat the source file as natural language, which leads to the loss of the syntactic and structural information of the source file. A bug localization model based on the syntactic and semantic information of source code is proposed. First, the abstract syntax tree (AST) is divided by node category to obtain a statement sequence. Each statement tree is encoded into a vector to capture lexical and syntactic knowledge at the statement level. Second, the source code is transformed into a vector representation using the sequence naturalness of the statements. This avoids the vanishing and exploding gradient problems caused by a large AST when the AST is used to represent the source code. Finally, the correlation between bug reports and source files is comprehensively analyzed from three aspects, syntax, semantics, and text, to locate the buggy code. Experiments show that, compared with other standard models, the proposed model improves bug localization performance and has advantages in mean reciprocal rank (MRR), mean average precision (MAP), and Top N Rank.
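The three ranking metrics mentioned above can be sketched as follows. This is an illustrative implementation under common conventions, not code from the paper: each bug report yields a ranked list of source files, average precision is divided by the number of truly buggy files, and Top N counts a report as a hit if any buggy file appears in the first N positions. The file names and rankings are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], buggy: Set[str]) -> float:
    """1 / rank of the first buggy file, or 0 if none is ranked."""
    for rank, name in enumerate(ranked, start=1):
        if name in buggy:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[str], buggy: Set[str]) -> float:
    """Mean of precision@k taken at each position k holding a buggy file."""
    hits, total = 0, 0.0
    for rank, name in enumerate(ranked, start=1):
        if name in buggy:
            hits += 1
            total += hits / rank
    return total / len(buggy) if buggy else 0.0


def top_n_hit(ranked: Sequence[str], buggy: Set[str], n: int) -> bool:
    """True if at least one buggy file is among the first n results."""
    return any(name in buggy for name in ranked[:n])


# Hypothetical results: (ranked files, truly buggy files) per bug report.
results: List[Tuple[List[str], Set[str]]] = [
    (["a.java", "b.java", "c.java", "d.java"], {"b.java", "d.java"}),
    (["x.java", "y.java", "z.java"], {"x.java"}),
]

mrr = sum(reciprocal_rank(r, b) for r, b in results) / len(results)
mean_ap = sum(average_precision(r, b) for r, b in results) / len(results)
top1 = sum(top_n_hit(r, b, 1) for r, b in results) / len(results)

print(mrr, mean_ap, top1)  # 0.75 0.75 0.5
```

MRR rewards placing the first buggy file early, while MAP also accounts for where the remaining buggy files land, which is why the two are usually reported together.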
Funding: The National High-Tech Research and Development (863) Program of China (No. 2004AA1Z2390); the Science and Technology Commission of Shanghai Municipality (No. 06dz15004).
Abstract: This paper introduces strategies to detect software bugs at an earlier life cycle stage in order to improve test efficiency. Static analysis tools are one of the effective means of revealing software bugs during software development. Three popular static analysis tools are introduced, two of which, PolySpace and Splint, are compared by analyzing a set of test cases generated by the authors. PolySpace reveals 60% of the bugs with a 100% R/W ratio (the ratio of real bugs to total warnings), while Splint reveals 73.3% of the bugs with a 44% R/W ratio. The two tools are good at finding different categories of bugs. Two strategies are derived to improve test efficiency, under the guideline that static analysis tools should be used to find different categories of bugs according to their features. The first aims at finding as many bugs as possible, while the second concentrates on reducing the average time spent on bug revelation. Experimental data show that the first strategy finds 100% of the bugs with a 60% R/W ratio, and the second finds 80% of the bugs with a 66.7% R/W ratio. The experimental results show that these two strategies can improve test efficiency in both fault coverage and testing time.
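The two figures reported for each tool measure different things: the detection rate is the share of seeded bugs a tool finds, while the R/W ratio is the share of its warnings that correspond to real bugs. The sketch below shows the arithmetic. The raw counts (15 bugs in the test set, 9 and 25 warnings) are hypothetical; the abstract does not state them, and they were chosen only because they reproduce the reported percentages.

```python
def detection_rate(real_bugs_found: int, total_bugs: int) -> float:
    """Fraction of the bugs in the test set that the tool revealed."""
    return real_bugs_found / total_bugs


def rw_ratio(real_bugs_found: int, total_warnings: int) -> float:
    """R/W ratio: real bugs reported divided by all warnings issued."""
    return real_bugs_found / total_warnings


TOTAL_BUGS = 15  # hypothetical size of the seeded-bug test set

# Hypothetical counts: (real bugs found, total warnings issued).
tools = {
    "PolySpace": (9, 9),
    "Splint": (11, 25),
}

for name, (found, warnings) in tools.items():
    rate = round(100 * detection_rate(found, TOTAL_BUGS), 1)
    ratio = round(100 * rw_ratio(found, warnings), 1)
    print(f"{name}: reveals {rate}% of bugs, R/W ratio {ratio}%")
```

Under these assumed counts, Splint finds more bugs but more than half of its warnings are false alarms, which is the trade-off the two strategies in the abstract try to balance.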