Abstract: Test Case Prioritization (TCP) techniques perform better than other regression test optimization techniques, including Test Suite Reduction (TSR) and Test Case Selection (TCS). Many TCP techniques are available, and their performance is usually measured with the Average Percentage of Fault Detection (APFD) metric. This metric is value-neutral: it works well only when all test cases have the same cost and all faults have the same severity. Using APFD to evaluate test case orders where test case cost or fault severity varies is prone to produce false results. Therefore, using the right metric for performance evaluation of TCP techniques is very important for obtaining reliable and correct results. In this paper, two value-based TCP techniques are introduced using a Genetic Algorithm (GA): Value-Cognizant Fault Detection-Based TCP (VCFDB-TCP) and Value-Cognizant Requirements Coverage-Based TCP (VCRCB-TCP). Two novel value-based performance evaluation metrics are also introduced for value-based TCP: Average Percentage of Fault Detection per value (APFDv) and Average Percentage of Requirements Coverage per value (APRCv). Two case studies are performed to validate the proposed techniques and performance evaluation metrics. The proposed GA-based techniques outperformed existing state-of-the-art TCP techniques, including Original Order (OO), Reverse Order (REV-O), Random Order (RO), and the Greedy algorithm.
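To make the "value-neutral" point concrete, here is a minimal sketch of the standard APFD computation. The function name `apfd` and the toy fault matrix are illustrative assumptions, not taken from the paper; the paper's value-based metrics APFDv and APRCv are not defined in the abstract, so they are not shown.

```python
from typing import Dict, List, Set


def apfd(order: List[str], detects: Dict[str, Set[str]]) -> float:
    """Average Percentage of Fault Detection for one test order.

    order   -- test case ids in execution order
    detects -- maps each test id to the set of fault ids it reveals
    Assumes at least one fault exists and every fault is revealed by
    some test in `order` (illustrative simplification).
    """
    n = len(order)
    faults = set().union(*detects.values())
    m = len(faults)
    # TF_i: 1-based position of the first test that reveals fault i.
    first_pos: Dict[str, int] = {}
    for pos, test in enumerate(order, start=1):
        for fault in detects.get(test, set()):
            first_pos.setdefault(fault, pos)
    return 1 - sum(first_pos[f] for f in faults) / (n * m) + 1 / (2 * n)


# Hypothetical toy data: three tests, three faults.
toy = {"t1": {"f1"}, "t2": {"f2", "f3"}, "t3": set()}
early = apfd(["t2", "t1", "t3"], toy)  # faults found early in the run
late = apfd(["t3", "t1", "t2"], toy)   # faults found late in the run
```

Every test position and every fault carries the same weight in this formula, so an order that finds a severe fault with a cheap test scores no better than one that finds a trivial fault with an expensive test.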
Funding: This research is funded by the Deanship of Scientific Research at Umm Al-Qura University, Grant Code: 22UQU4281755DSR02.
Abstract: Software needs modifications and requires revisions regularly. Owing to these revisions, retesting software becomes essential to ensure that the enhancements made have not affected its bug-free functioning. The time and cost incurred in this process need to be reduced through test case selection and prioritization. Many nature-inspired techniques have been applied in this area. African Buffalo Optimization is one such approach, applied to regression test selection and prioritization. In this paper, the proposed work explains and proves the applicability of the African Buffalo Optimization approach to test case selection and prioritization. The proposed algorithm converges in polynomial time (O(n^2)). The empirical evaluation of applying African Buffalo Optimization to test case prioritization is done on a sample data set with multiple iterations. A 62.5% drop in size and a 48.57% drop in the runtime of the original test suite were recorded. The obtained results are compared with Ant Colony Optimization. The comparative analysis indicates that African Buffalo Optimization and Ant Colony Optimization exhibit similar fault detection capabilities (80%), and a reduction in the overall execution time and size of the resultant test suite. The results and analysis hence advocate and encourage the use of African Buffalo Optimization in the area of test case selection and prioritization.
Abstract: Both unit and integration testing are crucial for almost any software application because each of them uses a distinct process to examine the product. Due to resource constraints, when software is subjected to modifications, the drastic increase in the number of test cases forces testers to opt for a test optimization strategy. One such strategy is test case prioritization (TCP). Existing works have proposed various methodologies that re-order system-level test cases with the aim of boosting either fault detection capability or coverage efficacy as early as possible. Nonetheless, single-objective functions and the lack of diversity among the re-ordered test sequences have weakened the cogency of these approaches. Considering such gaps, and scenarios in which rapid and continuous updates to the software make the intensive unit and integration testing process more fragile, this study introduces a memetics-inspired methodology for TCP. The proposed structure is first embedded with diverse parameters, and then the traditional steps of the shuffled-frog-leaping approach (SFLA) are followed to prioritize the test cases at unit and integration levels. On 5 standard test functions, a comparative analysis is conducted between the established algorithms and the proposed approach, where the latter enhances the coverage rate and fault detection of re-ordered test sets. Results for the mean average percentage of fault detection (APFD) confirmed that the proposed approach exceeds the memetic, basic multi-walk, PSO, and optimized multi-walk approaches by 21.7%, 13.99%, 12.24%, and 11.51%, respectively.
Funding: The Natural Science Foundation of Education Ministry of Shaanxi Province (No. 15JK1672); the Industrial Research Project of Shaanxi Province (No. 2017GY-092); Special Fund for Key Discipline Construction of General Institutions of Higher Education in Shaanxi Province.
Abstract: To solve the problem of time-aware test case prioritization, a hybrid algorithm composed of integer linear programming and the genetic algorithm (ILP-GA) is proposed. First, the test case suite which can maximize the number of covered program entities and satisfy time constraints is selected by integer linear programming. Second, the individual is encoded according to the cover matrices of entities, the coverage rate of program entities is used as the fitness function, and the genetic algorithm is used to prioritize the selected test cases. Five typical open source projects are selected as benchmark programs. Branch and method are selected as program entities, and the time constraint percentages are 25% and 75%. The experimental results show that ILP-GA converges faster and with better stability than ILP-additional and ILP-total in most cases, which contributes to the detection of software defects as early as possible and reduces software testing costs.
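As a rough illustration of the coverage-based ordering behind the two baselines named above, the sketch below shows the generic "total" and "additional" greedy strategies that such suffixes conventionally refer to. It is not the paper's ILP-GA: the ILP selection step under a time budget and the genetic algorithm are omitted, and the function names and toy coverage map are hypothetical.

```python
from typing import Dict, List, Set


def total_order(cov: Dict[str, Set[str]]) -> List[str]:
    """'Total' strategy: sort tests by how many entities each covers."""
    return sorted(cov, key=lambda t: (-len(cov[t]), t))


def additional_order(cov: Dict[str, Set[str]]) -> List[str]:
    """'Additional' strategy: repeatedly pick the test that covers the
    most entities not yet covered by the tests already chosen.

    Simplification: coverage is never reset once all entities are covered.
    """
    remaining = dict(cov)
    covered: Set[str] = set()
    order: List[str] = []
    while remaining:
        # Ties are broken by test id because max() keeps the first maximum.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical coverage map: test id -> covered branches or methods.
toy_cov = {"a": {"e1", "e2", "e3"}, "b": {"e1", "e2"}, "c": {"e4"}}
by_total = total_order(toy_cov)
by_additional = additional_order(toy_cov)
```

On the toy data the two strategies disagree: "total" runs `b` second because it covers two entities, while "additional" runs `c` second because `b` adds nothing new after `a`.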
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under Grant Number (RGP.1/127/42); Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R237), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Software testing is generally considered an effective technique for improving the quality and reliability of software. However, the quality of test cases has a considerable influence on the fault-revealing capability of the testing activity. Test Case Prioritization (TCP) remains a challenging issue, since prioritized test cases are still unsatisfactory in terms of Average Percentage of Faults Detected (APFD) and execution time. TCP is mainly intended to design a collection of test cases that can accomplish early optimization using preferred characteristics. Earlier studies focused on prioritizing the available test cases to accelerate the fault detection rate during software testing. In this aspect, the current study designs a Modified Harris Hawks Optimization based TCP (MHHO-TCP) technique for software testing. The aim of the proposed MHHO-TCP technique is to maximize APFD and minimize the overall execution time. In addition, the MHHO algorithm is designed to boost the exploration and exploitation abilities of the conventional HHO algorithm. To validate the enhanced efficiency of the MHHO-TCP technique, a wide range of simulations was conducted on different benchmark programs and the results were examined under several aspects. The experimental outcomes highlight the improved efficiency of the MHHO-TCP technique over recent approaches under different measures.
Abstract: Edge devices, due to their limited computational and storage resources, often require the use of compilers for program optimization. Therefore, ensuring the security and reliability of these compilers is of paramount importance in the emerging field of edge AI. One widely used testing method for this purpose is fuzz testing, which detects bugs by inputting random test cases into the target program. However, this process consumes significant time and resources. To improve the efficiency of compiler fuzz testing, it is common practice to use test case prioritization techniques. Some researchers use machine learning to predict the code coverage of test cases, aiming to maximize the test capability for the target compiler by increasing the overall predicted coverage of the test cases. Nevertheless, these methods can only forecast the code coverage of the compiler at a specific optimization level, potentially missing many optimization-related bugs. In this paper, we introduce C-CORE (short for Clustering by Code Representation), the first framework to prioritize test cases according to their code representations, which are derived directly from the source code. This approach avoids being limited to specific compiler states and extends to a broader range of compiler bugs. Specifically, we first train a scaled pre-trained programming language model to capture as many common features as possible from the test cases generated by a fuzzer. Using this pre-trained model, we then train two downstream models: one for predicting the likelihood of triggering a bug and another for identifying code representations associated with bugs. Subsequently, we cluster the test cases according to their code representations and select the highest-scoring test case from each cluster as the high-quality test case. This reduction in redundant test cases leads to time savings. Comprehensive evaluation results reveal that code representations are better at distinguishing test capabilities, and C-CORE significantly enhances testing efficiency. Across four datasets, C-CORE increases the average percentage of faults detected (APFD) value by 0.16 to 0.31 and reduces test time by over 50% in 46% of cases. When compared to the best results from approaches using predicted code coverage, C-CORE improves the APFD value by 1.1% to 12.3% and achieves an overall time-saving of 159.1%.
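The final selection step described above (one highest-scoring test case per cluster) can be sketched as follows. This assumes cluster labels and bug-likelihood scores have already been produced by some upstream models; the function name, the inputs, and the choice to order the representatives by descending score are illustrative assumptions, not details from the paper.

```python
from typing import Dict, List, Sequence


def pick_representatives(
    cluster_ids: Sequence[int], scores: Sequence[float]
) -> List[int]:
    """Return the index of the highest-scoring test case in each cluster,
    ordered by descending score (ordering is an assumption)."""
    best: Dict[int, int] = {}
    for idx, (cid, score) in enumerate(zip(cluster_ids, scores)):
        # Keep the first test seen per cluster unless a later one scores higher.
        if cid not in best or score > scores[best[cid]]:
            best[cid] = idx
    return sorted(best.values(), key=lambda i: -scores[i])


# Hypothetical inputs: five test cases in three clusters.
labels = [0, 0, 1, 1, 2]
likelihood = [0.2, 0.9, 0.4, 0.1, 0.7]
chosen = pick_representatives(labels, likelihood)
```

Only three of the five test cases are kept here, which is where the reduction in redundant test executions comes from.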
Funding: The National Natural Science Foundation of China (No. 61300054); the Natural Science Foundation of Jiangsu Province (No. BK2011190, BK20130879); the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 13KJB520018); the Science Foundation of Nanjing University of Posts & Telecommunications (No. NY212023).
Abstract: By analyzing the average percent of faults detected (APFD) metric and its variants, which are widely used to evaluate the fault detection efficiency of a test suite, this paper points out some limitations of the APFD series of metrics. These limitations include inaccurate physical explanations and an inability to precisely describe the process of fault detection. To avoid the limitations of existing metrics, this paper proposes two improved metrics for evaluating the fault detection efficiency of a test suite: relative-APFD and relative-APFDc. The proposed metrics refer to both the speed of fault detection and the constraint of the testing source. The case study shows that the two proposed metrics can provide much more precise descriptions of the fault detection process and the fault detection efficiency of the test suite.
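For orientation, the commonly cited definitions of APFD and its cost-cognizant variant APFDc are reproduced below. The symbols are assumptions introduced here (n tests, m faults, TF_i the position of the first test revealing fault i, t_j the cost of test j, f_i the severity of fault i). The relative-APFD and relative-APFDc metrics proposed in the paper are not defined in the abstract and are therefore not shown.

```latex
\mathrm{APFD} = 1 - \frac{\sum_{i=1}^{m} TF_i}{n\,m} + \frac{1}{2n}
\qquad
\mathrm{APFD_c} =
  \frac{\sum_{i=1}^{m} f_i \left( \sum_{j=TF_i}^{n} t_j - \tfrac{1}{2}\, t_{TF_i} \right)}
       {\sum_{j=1}^{n} t_j \;\cdot\; \sum_{i=1}^{m} f_i}
```

Setting every t_j = 1 and every f_i = 1 turns the numerator into \sum_i (n - TF_i + \tfrac{1}{2}) and the denominator into n m, which recovers the APFD expression on the left.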
Abstract: Software systems have become complex and challenging to develop and maintain because of large test suites with growing scalability issues. Test case prioritization methods have been successfully used in test case management. However, the prohibitive cost of large test suites is now common in the software industry. The growth of agile test-driven development has increased the expectations for software quality. Yet our knowledge of when to use various path testing criteria for cost-effectiveness is inadequate, due to the inherent complexity of software testing. Existing research has attempted to address the issue without effectively tackling the scalability of large test suites to reduce time in regression testing. To provide a more accurate way of detecting faults in software projects, we introduced a novel coverage criterion, called Incremental Cluster-based test case Prioritization (ICP), and investigated its potential through a comparative evaluation with three un-clustered traditional coverage-based criteria: Prime-Path Coverage (PPC), Edge-Pair Coverage (EPC) and Edge Coverage (EC), based on mutation analysis. By clustering test suites based on their dynamic run-time behavior, the number of pair-wise comparisons is reduced significantly. For the comparison, we analyzed 20 functions from 25 C programs, instrumented faults into the programs, and used the Mull mutation tool to generate mutants and perform a statistical analysis of the results. The experimental results show that ICP can lead to cost-effective improvements in fault detection.
Abstract: Mobile applications can usually access only a limited amount of memory. Improper use of memory can cause memory leaks, which may lead to performance slowdowns or even cause applications to be unexpectedly killed. Although a large body of research has been devoted to techniques for diagnosing memory leaks after they have been discovered, it is still challenging to discover the leaks in the first place. Testing is the most widely used technique for failure discovery. However, traditional testing techniques are not directed at the discovery of memory leaks. They may spend a lot of time testing executions that are unlikely to leak and can therefore be inefficient. To address the problem, we propose a novel approach to prioritize the test cases in a given test suite according to their likelihood of causing memory leaks. It first builds a prediction model, based on machine learning over selected code features, to determine whether each test can potentially lead to memory leaks. Then, for each input test case, we partly run it to get its code features and predict its likelihood of causing leaks. The most suspicious test cases are suggested to run first in order to reveal memory leak faults as soon as possible. Experimental evaluation on several Android applications shows that our approach is effective.
Funding: The Cultivation Programme for Young Backbone Teachers in Henan University of Technology; the Key Scientific Research Project of Colleges and Universities in Henan Province (No. 22A520024); the Major Public Welfare Project of Henan Province (No. 201300311200); the National Natural Science Foundation of China (Nos. 61602154 and 61340037).
Abstract: In recent years, automatic program repair approaches have developed rapidly in the field of software engineering. However, existing program repair techniques based on genetic programming require verification of a large number of candidate patches, which consumes a lot of computational resources. In this paper, we propose a random search and code similarity based automatic program repair approach (RSCSRepair). First, to reduce the verification effort for candidate patches, we introduce test filtering to reduce the number of test cases and use test case prioritization techniques to reconstruct a new set of test cases. Second, we use a combination of code similarity and random search for patch generation. Finally, we use a patch overfitting detection method to improve the quality of patches. To verify the performance of our approach, we conducted experiments on the Defects4J benchmark. The experimental results show that RSCSRepair correctly repairs up to 54 bugs, with improvements of 14.3%, 8.5%, 14.3% and 10.3% compared with jKali, Nopol, CapGen and SimFix, respectively.