Abstract: Software testing is an integral part of software development. Testing not only occurs in every software iteration cycle but also consumes a considerable amount of resources. Because resources such as machinery and manpower are often limited, it is crucial to decide where and how much effort to put into testing. One way to address this problem is to identify which components of the system under test are more error-prone and thus demand more testing effort. Recent developments in machine learning techniques show promising potential for predicting faults in different components of a software system. This work conducts an empirical study to explore the feasibility of using static software metrics to predict software faults. We apply four machine learning techniques to construct fault prediction models from the PROMISE data set and evaluate the effectiveness of static software metrics for building fault prediction models on four consecutive versions of Apache Ant. The empirical results show that the combined software metrics produce the fewest misclassification errors. The fault prediction results vary significantly across machine learning techniques and data sets. Overall, fault prediction models built with the support vector machine (SVM) have the lowest misclassification errors.
Funding: Sponsored by the Science and Technology Department Term of Education of Heilongjiang Province (Grant No. 10541098)
Abstract: To better evaluate the quality of software architecture, a metrics suite is proposed to measure the coupling of software architecture models. CBC measures the coupling between components; CBCC measures the coupling of message transfer between components; CBCCT measures the coupling of the software architecture; WCBCC measures the weighted coupling of message transfer between components; and WCBCCT measures the weighted coupling of message transmission across the whole software architecture. The proposed algorithm for the coupling metrics is applied to the design of serve software architecture. Analysis of an example validates the feasibility of this metrics suite.
Funding: Science and Technology Department Term of Education of Heilongjiang Province (Grant No. 11511127)
Abstract: To evaluate the structural complexity of class diagrams systematically and in depth, a new guiding framework of structural complexity is presented, together with an index system of structural complexity for class diagrams. This article discusses the formal description of class diagrams and presents a method for formally defining structural complexity metrics for class diagrams in terms of associations, dependencies, aggregations, generalizations, and so on. An applied example demonstrates the feasibility of the presented method.
Funding: This work was supported by the National Natural Science Foundation of China (61572167), the National Key Research and Development Program of China (2016YFC0801804), and the Natural Science Foundation for Anhui Higher Education Institutions of China (KJ2019A0482).
Abstract: Reliability engineering implemented early in the development process has a significant impact on improving software quality. It can assist in the design of architecture and guide later testing, which is beyond the scope of traditional reliability analysis methods. Structural reliability models serve this purpose, but most of them have been tested only in simulation case studies because actual data are lacking. Here we use software metrics for reliability modeling, which are collected from the source code of post versions. Through the proposed strategy, redundant metric elements are filtered out and the rest are aggregated to represent the module reliability. We further propose a framework that automatically applies the module value and calculates overall reliability by introducing formal methods. The experimental results from an actual project show that, through reasonable application of metric data, reliability analysis at the design and development stage can approach the validity of analysis at the test stage. The study also demonstrates that the proposed methods have good applicability.
Abstract: Due to rapid development in the software industry, it has become necessary to reduce the time and effort spent in the software development process. Software reusability is an important measure that can be applied to improve software development and software quality. Reusability reduces time, effort, errors, and hence the overall cost of the development process. Reusability prediction models are established in the early stage of the system development cycle to support an early reusability assessment. In object-oriented systems, the reusability of software components (classes) can be assessed by investigating their metric values. Analyzing software metric values can help to avoid developing components from scratch. In this paper, we use the Chidamber and Kemerer (CK) metrics suite to identify the reuse level of object-oriented classes. A Self-Organizing Map (SOM) was used to cluster datasets of CK metric values extracted from three different Java-based systems. The goal was to find the relationship between CK metric values and the reusability level of a class. The reusability level was classified into three main categories (High Reusable, Medium Reusable and Low Reusable). The clustering was based on metric threshold values that were used in the experiments. The proposed methodology succeeds in classifying classes by their reusability level. The experiments show how SOM can be applied to software CK metrics with different SOM grid sizes to provide different levels of metric detail. The results show that the Depth of Inheritance Tree (DIT) and Number of Children (NOC) metrics dominated the clustering process, so these two metrics were discarded from the experiments to achieve a successful clustering. The most efficient SOM topology, a [2 × 2] grid, is used to predict the reusability of classes.
Abstract: In the software engineering literature, it is commonly believed that economies of scale do not occur in software Development and Enhancement Projects (D&EP). Their per-unit cost does not decrease but increases as the product size of such projects grows. Thus, it is diseconomies of scale that occur in them. The significance of this phenomenon results from the fact that it is commonly considered to be one of the fundamental objective causes of their low effectiveness. This is of particular significance for Business Software Systems (BSS) D&EP, which are characterized by exceptionally low effectiveness compared to other software D&EP. The paper therefore aims to answer the following two questions: (1) Do economies of scale really not occur in BSS D&EP? (2) If economies of scale may occur in BSS D&EP, what factors promote them? These issues belong to the economic problems of software engineering research and practice.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2002CB312101), the National Natural Science Foundation of China (No. 60272031), the Doctorate Research Foundation of the State Education Commission of China (No. 20010335049), and the Zhejiang Provincial Natural Science Foundation of China (No. ZD0212)
Abstract: Classes are key software components in an object-oriented (OO) software system. In many industrial OO software systems, some classes have complicated structures and relationships. During software maintenance, testing, reengineering, reuse, and restructuring, it is therefore a challenge for software engineers to understand these classes thoroughly. This paper proposes a class comprehension model based on constructivist learning theory and implements a software visualization tool (MFV-Class) to help in the comprehension of a class. The tool provides multiple views of a class to uncover the manifold facets of its contents. It also visualizes three object-oriented metrics of classes to help users focus on the understanding process. A case study was conducted to evaluate our approach and the toolkit.
Abstract: Software systems have been employed in many fields as a means to reduce human effort; consequently, stakeholders are interested in more updates to their capabilities. Code smells arise as one of the obstacles in the software industry. They are characteristics of software source code that indicate a deeper problem in design. These smells appear not only in the design but also in the software implementation. Code smells introduce bugs, affect software maintainability, and lead to higher maintenance costs. Uncovering code smells can be formulated as an optimization problem of finding the best detection rules. Although researchers have recommended different techniques to improve the accuracy of code smell detection, these methods are still unstable and need to be improved. Previous research has sought to discover only a few smell types at a time (three or five) and did not set rules for detecting their types. Our research improves code smell detection by applying a search-based technique; we use the Whale Optimization Algorithm as a classifier to find ideal detection rules. In applying this algorithm, the Fisher criterion is used as a fitness function to maximize the between-class distance over the within-class variance. The proposed framework adopts if-then detection rules during the software development life cycle. Those rules identify the types for both medium and large projects. Experiments are conducted on five open-source software projects to discover the nine smell types that appear most often in code. The proposed detection framework achieves an average precision of 94.24% and an average recall of 93.4%. These values are better than those of other search-based algorithms in the same field. The proposed framework improves code smell detection, which increases software quality while minimizing maintenance effort, time, and cost. Additionally, the resulting classification rules are analyzed to find the software metrics that differentiate the nine code smells.
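The fitness function named in this abstract can be illustrated with a minimal sketch. It assumes the simplest two-class, single-metric form of the Fisher criterion, J = (m1 - m2)^2 / (v1 + v2); the abstract does not state which form the authors use, and the function name and sample values below are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(smelly, clean):
    """Two-class Fisher criterion for one metric.

    Returns (difference of class means)^2 divided by the sum of the
    class variances. Larger values mean the metric separates the two
    classes better. Assumed form, not taken from the paper.
    """
    between = (mean(smelly) - mean(clean)) ** 2
    within = pvariance(smelly) + pvariance(clean)
    if within == 0:
        # Both classes are constant: perfectly separated or identical.
        return float("inf") if between > 0 else 0.0
    return between / within


# Hypothetical metric values (e.g. method length) for two groups of classes.
well_separated = fisher_score([10, 12, 14], [2, 4, 6])
overlapping = fisher_score([5, 9, 13], [4, 8, 12])
print(well_separated, overlapping)
```

A search-based optimizer would evaluate candidate detection rules with a score of this kind and keep the rules that score highest.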
Abstract: Regression testing is a widely studied research area that aims to meet the quality challenges of software systems. Achieving a software system of good quality entails high resource consumption during testing. To overcome this challenge, test case prioritization (TCP), a sub-type of regression testing, is continuously investigated to achieve the testing objectives. This study proposes an ontology-based TCP (OTCP) approach aimed at reducing the consumption of resources for the quality improvement and maintenance of software systems. The proposed approach uses software metrics to examine the behavior of classes of software systems. It uses Binary Logistic Regression (BLR) and AdaBoostM1 classifiers to verify correct predictions of the faulty and non-faulty classes of software systems. A reference ontology is used to match the code metrics and class attributes. We investigated five Java programs to evaluate the proposed approach, which was used to obtain code metrics. This study resulted in an average percentage of faults detected (APFD) value of 94.80%, which is higher than that of other TCP approaches. In future work, large programs in different languages can be used to evaluate the scalability of the proposed OTCP approach.
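For readers unfamiliar with the APFD figure quoted here, the following is a minimal sketch of the standard definition, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n). Here n is the number of test cases, m is the number of faults, and TF_i is the 1-based position of the first test that reveals fault i. The test names and fault matrix are hypothetical and are not data from the study.

```python
def apfd(order, detects):
    """Average Percentage of Faults Detected for one test ordering.

    order   -- list of test ids in execution order
    detects -- dict mapping test id -> set of fault ids it reveals
    Assumes every fault is revealed by at least one test in `order`.
    """
    faults = set().union(*detects.values())
    n, m = len(order), len(faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one fault")
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            first_position.setdefault(fault, position)
    if set(first_position) != faults:
        raise ValueError("some faults are never detected by this ordering")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Hypothetical fault matrix: which test reveals which fault.
detects = {"t1": {"f1"}, "t2": {"f2", "f3"}, "t3": set(), "t4": {"f1", "f3"}}
prioritized = apfd(["t2", "t4", "t1", "t3"], detects)
unprioritized = apfd(["t3", "t1", "t2", "t4"], detects)
print(prioritized, unprioritized)
```

An ordering that reveals faults earlier yields a higher APFD, which is why the value serves as the comparison metric between prioritization approaches.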
Abstract: Technical debt is considered detrimental to the long-term success of software development, but despite numerous studies in the literature, many aspects still need to be investigated to understand it better. In particular, the main problems that hinder its complete understanding are the absence of a clear definition and of a model for its identification, management, and forecasting. With regard to forecasting, there is a growing notion that preventing technical debt build-up makes it possible to identify and address the riskiest debt items for a project before they can permanently compromise it. However, despite this high relevance, the forecasting of technical debt is still little explored. To this end, this study evaluates whether the quality metrics of a software system can be useful for correctly predicting technical debt. The data related to the quality metrics of 8 different open-source software systems were analyzed and supplied as input to multiple machine learning algorithms to predict technical debt. In addition, several partitions of the initial dataset were evaluated to assess whether prediction performance could be improved by data selection. The results show good forecasting performance, and this work provides a useful approach to understanding the overall phenomenon of technical debt for practical purposes.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61472175, 61472178, 61272082, 61272080, 91418202), the Natural Science Foundation of Jiangsu Province (Grant No. BK20130014), and the Natural Science Foundation of Colleges in Jiangsu Province (Grant No. 13KJB520018)
Abstract: Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high-severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high-severity fault-proneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, and evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high-severity fault-proneness; (2) network measures generally have explanatory abilities and predictive powers comparable to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high-severity fault-proneness prediction.
Funding: The National Natural Science Foundation of China (No. 60473033)
Abstract: This paper develops an improved structural software complexity metric, named information flow complexity, which is closely related to the reliability of software. Together with the three software complexity metrics, the total software complexity is measured, and some rules for reducing the complexity are presented. Several examples and experiments illustrate and explain the process of measuring and reducing software complexity. It is proposed that software complexity metrics can be measured earlier in software development and can provide substantial information about software systems, whose reliability can then be modeled and used to determine initial parameter estimates.
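The abstract does not give the improved formula. For orientation only, the sketch below computes the classic Henry and Kafura information flow measure, length × (fan-in × fan-out)², which is an assumption about the general family of metric being discussed and not the paper's own definition. The module names and numbers are hypothetical.

```python
def information_flow_complexity(length, fan_in, fan_out):
    """Classic Henry-Kafura measure: length * (fan_in * fan_out) ** 2.

    length  -- size of the module (e.g. lines of code)
    fan_in  -- number of information flows into the module
    fan_out -- number of information flows out of the module
    This is NOT the improved metric from the paper, whose formula
    is not stated in the abstract.
    """
    if min(length, fan_in, fan_out) < 0:
        raise ValueError("inputs must be non-negative")
    return length * (fan_in * fan_out) ** 2


# Hypothetical modules: name -> (lines of code, fan-in, fan-out)
modules = {"parser": (120, 2, 3), "logger": (40, 5, 1)}
scores = {name: information_flow_complexity(*v) for name, v in modules.items()}
total = sum(scores.values())
print(scores, total)
```

Because the flow term is squared, reducing a module's fan-in or fan-out lowers its score far more than shortening it, which matches the kind of complexity-reduction rules the abstract mentions.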