Structural development defects are code structures that violate object-oriented design principles. They make program maintenance difficult and degrade software quality over time. Detection approaches range from traditional heuristic algorithms to machine learning methods, and ensemble learning methods have further improved detection. However, existing approaches do not simultaneously exploit the ability of pre-trained models to extract relevant features and the performance of neural networks on the classification task. Our goal was therefore to design a model that combines a pre-trained model, which extracts relevant features from code excerpts through transfer learning, with a bagging method whose base estimator is a dense neural network, for defect classification. To this end, we drew multiple samples of the same size, with replacement, from the imbalanced MLCQ dataset. For every sample, we used the CodeT5-small variant to extract features and trained a bagging method with the neural network Roberta Classification Head to classify defects based on these features. We then compared this model with RandomForest, an ensemble method that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. We also observed that a data balancing technique was not necessary with our model when the imbalance rate was 23%. Finally, for Blob detection, RandomForest had a median MCC of 0.36 compared with 0.12 for our method, whereas our method performed better on Long Method detection, with a median MCC of 0.53 compared with 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects depends on the specific defect.
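The two generic ingredients named in this abstract, bootstrap sampling with replacement for bagging and the Matthews correlation coefficient (MCC) used for evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the CodeT5 feature extraction and the neural base estimator are not reproduced, and `fit_stump` is a hypothetical one-feature threshold learner used only as a stand-in base estimator. All function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def bootstrap_samples(rows, n_samples, rng):
    """Draw n_samples samples, each the same size as rows, with replacement."""
    return [rng.choices(rows, k=len(rows)) for _ in range(n_samples)]


def fit_stump(sample):
    """Hypothetical stand-in base estimator (NOT a neural network).

    Each row is (feature_value, label). The stump thresholds the feature
    at the midpoint between the two class means.
    """
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # Degenerate bootstrap sample: always predict the only class seen.
        only = 1 if pos else 0
        return lambda x: only
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    if mean_pos >= mean_neg:
        return lambda x: 1 if x > threshold else 0
    return lambda x: 1 if x < threshold else 0


def bagging_predict(models, x):
    """Aggregate base-estimator predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]
```

A typical use fits one base estimator per bootstrap sample and scores the voted predictions with `mcc`. Using an odd number of estimators avoids ties in the vote.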
To address the dependency between application components and the application server platform, this paper proposes a rejuvenation strategy with two levels of rejuvenation granularity: application component rejuvenation and application server system rejuvenation. The availability and maintenance cost functions are derived from an aging model of the application server, and the boundary condition for the optimal rejuvenation time is analyzed. Theoretical analysis indicates that the two-level rejuvenation strategy is superior to the traditional single-level one. Finally, evaluation experiments are carried out, and the numerical results show that, compared with the traditional rejuvenation policy, the proposed strategy further increases the availability of the application server and reduces maintenance cost.
Understanding control flow in a computer program is essential for many software engineering tasks such as testing, debugging, reverse engineering, and maintenance. In this paper, we present a technique for analyzing control flow in Java bytecode. To perform the analysis, we construct a control flow graph (CFG) for Java bytecode at both the intraprocedural and the interprocedural level. We also discuss some applications of a CFG in a maintenance environment for Java bytecode.
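As an illustration of intraprocedural CFG construction, the following minimal sketch applies the textbook leader-based basic-block algorithm to a simplified instruction list. It is not the technique of the paper above: the instruction format `(opcode, target)` and the opcodes `"goto"`, `"if"`, and `"return"` are hypothetical simplifications, not real JVM bytecode, and interprocedural call edges are not modeled.

```python
def build_cfg(instructions):
    """Build a control flow graph from a simplified instruction list.

    instructions: list of (opcode, target) tuples, where target is an
    instruction index for "goto"/"if" and None otherwise (hypothetical
    format). Returns (blocks, edges): blocks maps each leader index to
    the instruction indices in its basic block, and edges is a set of
    (from_leader, to_leader) pairs.
    """
    n = len(instructions)
    if n == 0:
        return {}, set()

    # Step 1: find leaders (first instruction, branch targets, and
    # instructions that follow a branch or a return).
    leaders = {0}
    for i, (op, target) in enumerate(instructions):
        if op in ("goto", "if"):
            leaders.add(target)
            if i + 1 < n:
                leaders.add(i + 1)
        elif op == "return" and i + 1 < n:
            leaders.add(i + 1)
    ordered = sorted(leaders)

    # Step 2: each basic block runs from a leader up to the next leader.
    blocks = {}
    for k, start in enumerate(ordered):
        end = ordered[k + 1] if k + 1 < len(ordered) else n
        blocks[start] = list(range(start, end))

    # Step 3: add edges based on the last instruction of each block.
    edges = set()
    for start, body in blocks.items():
        last = body[-1]
        op, target = instructions[last]
        if op == "goto":
            edges.add((start, target))
        elif op == "if":
            edges.add((start, target))        # branch taken
            if last + 1 < n:
                edges.add((start, last + 1))  # fall-through
        elif op != "return" and last + 1 < n:
            edges.add((start, last + 1))      # plain fall-through
    return blocks, edges


# An if/else shape: block 0 branches to block 4 or falls through to block 2.
program = [
    ("load", None),    # 0
    ("if", 4),         # 1
    ("load", None),    # 2
    ("goto", 5),       # 3
    ("load", None),    # 4
    ("return", None),  # 5
]
blocks, edges = build_cfg(program)
```

An interprocedural graph would additionally connect call instructions to the entry blocks of the called methods and their returns back to the call sites. That step needs call resolution and is left out here.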
There is a consensus that software architecture (SA) plays a central role in software development and an important role in the lifecycle phases after software delivery. In particular, SA can be used to reduce the difficulty and cost of software maintenance and evolution. In this paper, runtime software architecture (RSA) based on reflective middleware is proposed to support architecture-based software maintenance and evolution. In this approach, the actual states and behaviors of the runtime system can be observed and manipulated in a consistent and understandable way through its architectural view. As an accurate, up-to-date, semantic, and operable view of SA, RSA treats components and connectors as white-box entities in order to describe the runtime system accurately and thoroughly, extends traditional architecture description languages to describe itself formally and to inherit the rich semantics of traditional views of SA, and uses reflective middleware to observe and manipulate the runtime system. To demonstrate the feasibility of this approach, a reflective J2EE application server called PKUAS is implemented to observe and manipulate the components, connectors, and constraints in the runtime system. Finally, the performance evaluation shows that making RSA explicit and operable at runtime has little effect on the runtime system.
Code smells are the product of improper design and operation and may be introduced in many situations. They cause serious problems for further software development and maintenance. Currently, most code smell detection methods rely on a single type of software data, which limits their ability to detect code smells with complex definitions and characteristics. In this paper, an approach that applies multi-dimensional software data is proposed. A complex network was built from structural data and historical version data, and code smell instances were determined by searching the network. We designed two smell detection strategies and evaluated them on four open-source projects. The results demonstrate that the proposed method achieves 23% and 15% higher F-measures on Shotgun Surgery and Parallel Inheritance Hierarchy, respectively, than existing mainstream detection methods. Code smell detection based on multi-dimensional software data and a complex network is effective, and this way of processing multi-dimensional software data is also applicable to data-driven software research.
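One way to picture combining historical and structural data in a single network is the minimal sketch below. It is an illustrative assumption, not the detection strategy of the paper above: it builds a co-change network from commit history, keeps only edges that are also backed by a structural dependency, and flags classes with many such neighbors as Shotgun Surgery candidates. The function names and the thresholds `min_support` and `min_neighbors` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cochange_network(commits):
    """Count, for each pair of classes, the commits that changed both.

    commits: iterable of sets of class names changed together (historical
    version data). Returns {(a, b): weight} with a < b.
    """
    weights = defaultdict(int)
    for changed in commits:
        for a, b in combinations(sorted(set(changed)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def shotgun_surgery_candidates(commits, dependencies,
                               min_support=2, min_neighbors=3):
    """Flag classes whose changes ripple to many structurally linked classes.

    dependencies: set of (client, supplier) pairs (structural data).
    An edge is kept only if the two classes co-changed in at least
    min_support commits AND are structurally dependent in either
    direction. Both thresholds are hypothetical illustration values.
    """
    neighbors = defaultdict(set)
    for (a, b), weight in build_cochange_network(commits).items():
        if weight < min_support:
            continue
        if (a, b) in dependencies or (b, a) in dependencies:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return sorted(c for c, ns in neighbors.items() if len(ns) >= min_neighbors)


# Toy history: every change to Core drags A, B, and C along with it.
commits = [
    {"Core", "A", "B", "C"},
    {"Core", "A", "B", "C"},
    {"A", "D"},
]
dependencies = {("A", "Core"), ("B", "Core"), ("C", "Core")}
candidates = shotgun_surgery_candidates(commits, dependencies)
```

Requiring both kinds of evidence for an edge is a design choice of this sketch. It filters out classes that merely happened to be committed together without any structural relationship.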
Funding: Supported by the National Natural Science Foundation of China (60473098) and the IBM China Research Lab Joint Project.
Funding: Anhui Provincial Natural Science Foundation (2008085MF189, 1908085MF206); National Natural Science Foundation of China (No. 61402007); the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.