Open-source software is gradually being integrated into industrial software, while the development of industry protocols in industrial software is gradually moving to open-source communities. Industrial protocol standardization organizations are confronted with numerous, fragmented code pull requests (PRs) and informal proposals, and differing workflows increase operating costs. Open-source community maintenance teams need more intelligent software to guide the identification and classification of these issues. To address these problems, this paper proposes a PR review prediction model based on multi-dimensional features. We extract 43 features of PRs and divide them into five dimensions: contributor, reviewer, software project, PR, and social network of developers. The model integrates these five dimensions of features, and a prediction model is built on a Random Forest classifier to predict PR review results. In addition, to improve the quality of rejected PRs, we focus on problems raised in the review process and on review comments of similar PRs, and propose a PR revision recommendation model based on a PR review knowledge graph. Entity information and relationships between entities are extracted from the text and code of PRs, historical review comments, and related issues. PR revisions are recommended to code contributors through graph-based similarity calculation. The experimental results show that the two models are effective and robust in PR review result prediction and PR revision recommendation.
An open source software (OSS) ecosystem is an OSS development community composed of many software projects and the developers contributing to them. The projects and developers co-evolve within the ecosystem. To keep such ecosystems evolving healthily, it is necessary to attract and retain developers, particularly the project leaders and core developers who have a major impact on a project and its whole team. It is therefore important to identify the factors that influence a developer's chance of evolving into a project leader or core developer. To identify such factors, we conducted a case study on the GNOME ecosystem. First, we collected indicators reflecting developers' subjective willingness to contribute to a project and the project environment they work in. Second, we calculated these indicators on the GNOME dataset. We then fitted logistic regression models, taking as independent variables the indicators that remained after eliminating the most collinear ones, and as the dependent variable the future developer role (core developer or project leader). The results showed that some of these indicators of subjective willingness and project environment (e.g., the total number of projects a developer joined) significantly influenced developers' chances of evolving into core developers and project leaders. Under different validation methods, the resulting model performs well at predicting future core developers, with stable prediction performance (F-value of 0.770).
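The F-value quoted for the GNOME model is usually computed as the harmonic mean of precision and recall. The sketch below assumes the balanced F1 variant, since the abstract does not say which F-measure was used; the function name `f_measure` and the confusion counts are hypothetical and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from confusion counts; beta=1 gives the balanced F1."""
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for a "becomes core developer" classifier.
score = f_measure(tp=8, fp=2, fn=4)
print(f"F1 = {score:.3f}")
```

With `beta=1` the expression reduces to `2*tp / (2*tp + fp + fn)`, which is a convenient way to check the result by hand.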
Open source software has become highly popular and is of great importance for most software engineering activities. To facilitate software organization and retrieval, tagging is used extensively in open source communities. However, finding the desired software through tags in communities such as Freecode and Ohloh is still challenging because of tag insufficiency. In this paper, we propose TRG (tag recommendation based on semantic graph), a novel approach to discovering and enriching the tags of open source software. First, we propose a semantic graph to model the semantic correlations between tags and the words in software descriptions. Then, based on the graph, we design an effective algorithm to recommend tags for software. Through comprehensive experiments on large-scale open source software datasets, comparing against several typical related works, we demonstrate the effectiveness and efficiency of our method in recommending proper tags.
The development, integration, and distribution of the information and spatial data infrastructure (i.e., Digital Earth; DE) necessary to support the vision and goals of Future Earth (FE) will occur in a distributed fashion, in very diverse technological, institutional, socio-cultural, and economic contexts around the world. This complex context and these ambitious goals require bringing to bear not only the best minds, but also the best science and technologies available. Free and Open Source Software for Geospatial Applications (FOSS4G) offers mature, capable, and reliable software to contribute to the creation of this infrastructure. In this paper we point to a selected set of some of the most mature and reliable FOSS4G solutions that can be used to develop the functionality required as part of DE and FE. We provide examples of large-scale, sophisticated, mission-critical applications of each software package to illustrate their power and capabilities in systems where they perform roles or functionality similar to those they could perform as part of DE and FE. We also provide information and resources to assist readers in carrying out their own assessments to select the best FOSS4G solutions for their particular contexts and system development needs.
Open source software (OSS) has become an indispensable part of society, for corporate as well as personal use. Projects that develop and operate OSS are called open source projects, and the number of such projects is increasing. However, because anyone can participate in an open source project, its progress is uncertain owing to differences in project members' skills, development environments, and time zones of activity. Many users and companies therefore need to understand the development and operation status of an open source project so that they can make careful decisions about upgrading or installing new OSS. In this paper, we focus on maintenance effort estimation for open source projects considering uncertainty. We evaluate the project quantitatively using Earned Value Management (EVM), examine the appropriateness of the model for predicting maintenance effort expenditures, and discuss the appropriateness of this EVM method.
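For readers unfamiliar with Earned Value Management, the standard indicators can be sketched as follows. This shows only the textbook EVM formulas, not the effort estimation model of the paper; the class name `EarnedValueSnapshot` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarnedValueSnapshot:
    """Cumulative EVM inputs at one status date, all in the same unit."""

    planned_value: float         # PV: budgeted cost of work scheduled
    earned_value: float          # EV: budgeted cost of work performed
    actual_cost: float           # AC: actual cost of work performed
    budget_at_completion: float  # BAC: total planned budget

    @property
    def cost_variance(self) -> float:
        return self.earned_value - self.actual_cost

    @property
    def schedule_variance(self) -> float:
        return self.earned_value - self.planned_value

    @property
    def cost_performance_index(self) -> float:
        return self.earned_value / self.actual_cost

    @property
    def schedule_performance_index(self) -> float:
        return self.earned_value / self.planned_value

    @property
    def estimate_at_completion(self) -> float:
        # Assumes the current cost efficiency persists for the remaining work.
        return self.budget_at_completion / self.cost_performance_index


# Hypothetical maintenance effort figures in person-hours.
snapshot = EarnedValueSnapshot(
    planned_value=100.0,
    earned_value=80.0,
    actual_cost=90.0,
    budget_at_completion=400.0,
)
print(snapshot.cost_variance, snapshot.schedule_variance)
print(round(snapshot.estimate_at_completion, 1))
```

A negative cost or schedule variance, or an index below 1, signals that the project is over budget or behind schedule at the status date.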
A submersible robot model is presented whose parts have been modified for 3D printing. Our design shows that, with printable connectors, several motor and weight arrangements can be realized, permitting different modes of movement. After presenting our configuration and outlining a set of possible structures, we tentatively present a working model based on open-source hardware and software. The model can be tested easily in dives in streams and lakes around the world. The reliability of the printed models can be tested only in relatively shallow waters. Nonetheless, we believe that their availability will inspire the general public to build and test underwater robots, thereby accelerating the development of innovative solutions and applications.
A software reliability model is a stochastic model for measuring software reliability quantitatively. The hazard-rate model is a well-known, typical software reliability model. We propose hazard-rate models considering fault severity levels (CFSL) for open source software (OSS). The purpose of this research is to adapt the CFSL hazard-rate model to a baseline hazard function and two kinds of fault data in the bug tracking system (BTS); i.e., we use covariate vectors in the Cox proportional hazard-rate model. We also show numerical examples evaluating the performance of our proposed model. As a result, we compare the performance of our model with that of the CFSL hazard-rate model.
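As background, the general Cox proportional hazards form scales a baseline hazard by the exponential of a linear combination of covariates, h(t | x) = h0(t) * exp(beta . x). The sketch below illustrates only that general form and is not the authors' fitted model; the constant baseline, the coefficients, and the meaning given to the covariates are assumptions made for illustration.

```python
import math
from typing import Callable, Sequence


def cox_hazard(
    t: float,
    covariates: Sequence[float],
    coefficients: Sequence[float],
    baseline: Callable[[float], float],
) -> float:
    """Hazard at time t: baseline(t) * exp(sum of coefficient * covariate)."""
    if len(covariates) != len(coefficients):
        raise ValueError("covariates and coefficients must have equal length")
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline(t) * math.exp(linear_predictor)


def constant_baseline(t: float) -> float:
    # Hypothetical baseline hazard: a constant rate of 0.5 faults per day.
    return 0.5


# Hypothetical covariates: [is_high_severity, is_reopened].
beta = [math.log(2.0), 0.3]
high = cox_hazard(10.0, [1.0, 0.0], beta, constant_baseline)
low = cox_hazard(10.0, [0.0, 0.0], beta, constant_baseline)
print(high / low)  # hazard ratio attributable to the first covariate
```

The ratio of two hazards with different covariates does not depend on the baseline, which is the "proportional" property of the model.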
Radial Basis Function (RBF) methods for scattered data interpolation and for the numerical solution of PDEs were originally implemented in a global manner. Subsequently, it was realized that the methods could be implemented more efficiently in a local manner and that the local approaches could match or even surpass the accuracy of the global implementations. In this work, three localization approaches are compared: a local RBF method, a partition of unity method, and a recently introduced modified partition of unity method. A simple shape parameter selection method is introduced, and the application of artificial viscosity to stabilize each of the local methods when approximating time-dependent PDEs is reviewed. Additionally, a new type of quasi-random center is introduced that may be a better choice than other quasi-random points commonly used with RBF methods. All the results in the manuscript are reproducible, as they are included as examples in the freely available Python Radial Basis Function Toolbox.
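To make the global approach concrete, the following is a minimal one-dimensional sketch of global Gaussian RBF interpolation: build the dense kernel matrix on all centers, solve for the weights, then evaluate the weighted sum. It uses only the standard library and does not reproduce the API of the Python Radial Basis Function Toolbox; the function names, the shape parameter value, and the test function are assumptions.

```python
import math


def gaussian_rbf(r: float, shape: float) -> float:
    """Gaussian kernel exp(-(shape * r)^2)."""
    return math.exp(-((shape * r) ** 2))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_rbf(centers, values, shape):
    """Weights of the global interpolant: one dense solve over all centers."""
    matrix = [[gaussian_rbf(abs(xi - xj), shape) for xj in centers] for xi in centers]
    return solve(matrix, values)


def evaluate(x, centers, weights, shape):
    return sum(w * gaussian_rbf(abs(x - c), shape) for w, c in zip(weights, centers))


centers = [0.0, 0.25, 0.5, 0.75, 1.0]
values = [math.sin(2 * math.pi * c) for c in centers]
shape = 3.0  # hypothetical shape parameter
weights = fit_rbf(centers, values, shape)
print(evaluate(0.25, centers, weights, shape))
```

The dense solve couples every center to every other one; the local approaches compared in the paper avoid that coupling by working on small neighborhoods or patches.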
A software reliability model is a tool for measuring software reliability quantitatively, and the hazard-rate model is one of the most popular. The purpose of our research is to propose a hazard-rate model that considers fault level for open source software (OSS). We also aim to adapt the proposed model to a hazard-rate model that considers an imperfect debugging environment. We analyzed the trend of fault severity levels using fault data from a bug tracking system (BTS) and based our model on the results of that analysis. We present a numerical example evaluating the performance of the proposed model. Furthermore, we extended the proposed model to a hazard-rate model considering the imperfect debugging environment and present a numerical example evaluating its applicability. As a result, we found that the proposed model performs better than typical hazard-rate models, and we verified that it can be applied to a hazard-rate model considering imperfect debugging.
The bug tracking system is well known as a project support tool for open source software, and many categorical data sets are recorded in it. Many reliability assessment methods have been proposed in the research area of software reliability. There are also several software project analyses based on software effort data, such as earned value management. In particular, software reliability growth models can be applied to the system testing phase of software development. Software effort analysis, on the other hand, can be applied to all development phases, whereas fault data are recorded only in the testing phase. We focus on the big fault data and effort data of open source software. Such data are difficult to assess with typical statistical assessment methods because the data recorded in the bug tracking system are large in scale. We also discuss a jump diffusion process model based on a method that estimates the jump parameters using discriminant analysis. Moreover, we analyze actual big fault data to show numerical examples of software effort assessment considering many categorical data sets.
Recently, many open source software (OSS) products have been developed by various OSS projects, and several researchers have proposed reliability assessment methods for OSS. Many software reliability assessment methods are based on software reliability growth models, and our research group has also proposed a method of reliability assessment for OSS. Many OSS projects use a bug tracking system (BTS) to manage software faults after release. The BTS keeps a detailed record of the environment in which each fault occurred. Several reliability assessment methods based on deep learning have been proposed for OSS fault data. However, the data registered in the BTS differ between OSS projects, and some projects collect project-specific data. We focus on these recorded data and investigate the difference between general data and project-specific data for the estimation of OSS reliability. As a result, we show that reliability estimation using specific data gives better results than estimation using general data. We also show the characteristics of the specific data and the general data. Finally, we develop GUI-based software to perform these reliability analyses so that even those unfamiliar with deep learning implementations can analyze the reliability of OSS.
Internet-scale open source software (OSS) production in various communities generates abundant reusable resources for software developers. However, finding the desired, mature software with keyword queries among a considerable number of candidates is a significant challenge, especially for newcomers, because current search services often fail to understand the semantics of user queries. In this paper, we construct a software term database (STDB) by analyzing tagging data in Stack Overflow and propose a correlation-based software search (CBSS) approach that performs correlation retrieval based on the term relevance obtained from the STDB. In addition, we design a novel ranking method to optimize the initial retrieval result. We explore four research questions in four experiments to evaluate the effectiveness of the STDB and investigate the performance of CBSS. The experimental results show that the proposed CBSS responds effectively to keyword-based software searches and significantly outperforms existing search services at finding mature software.
Pull-based development has become an important paradigm for distributed software development. In this model, each developer works independently on a copy (i.e., a fork) of the central repository. It is essential for developers to maintain awareness of the state of other forks to improve collaboration efficiency. In this paper, we propose a method to automatically generate a summary of a fork. We first use the random forest method to generate the label of a fork, i.e., feature implementation or bug fix. Based on the information in the fork-related commits, we then use the TextRank algorithm to generate detailed activity information for the fork. Finally, we apply a set of rules to integrate all related information into a complete fork summary. To validate the effectiveness of our method, we conduct 30 groups of manual experiments and 77 groups of case studies on GitHub. We propose Fea_(avg) to evaluate the quality of the generated fork summary, considering content accuracy, content integrity, sentence fluency, and label extraction accuracy. The results show that the average Fea_(avg) of the fork summaries generated by this method is 0.672. More than 63% of project maintainers and contributors believe that the fork summary can improve development efficiency.
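The TextRank step can be illustrated with a simplified sketch: sentences (here, commit messages) become graph nodes, edges are weighted by word overlap, and a PageRank-style iteration scores each node. This is a generic TextRank variant that works on sets of distinct words, not the authors' pipeline; the function names and the commit messages are hypothetical.

```python
import math
import re


def similarity(a: set, b: set) -> float:
    """Word overlap normalized by log sentence lengths (TextRank style)."""
    if not a or not b:
        return 0.0
    denominator = math.log(len(a)) + math.log(len(b))
    if denominator == 0:
        # Both sentences contain a single distinct word.
        return 0.0
    return len(a & b) / denominator


def textrank(sentences, damping: float = 0.85, iterations: int = 50):
    """Score sentences by weighted PageRank over the similarity graph."""
    tokens = [set(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


# Hypothetical commit messages from a fork.
commits = [
    "Fix crash in parser when input is empty",
    "Add unit tests for parser crash on empty input",
    "Update README badge",
]
scores = textrank(commits)
ranked = sorted(zip(scores, commits), reverse=True)
print(ranked[0][1])
```

Messages that share vocabulary with many others receive higher scores, so the top-ranked messages serve as candidates for the activity description of a fork.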
Background: The PandaX-4T experiment is a next-generation dark matter search program located in the China Jinping Underground Laboratory. To ensure the stability of the complex instrument during the planned operation, the status of the facility should be monitored continuously. Purpose: The paper reports the design of the slow control system for the experiment. The system is used to monitor the facility status and generate an alarm signal when the detector is in an abnormal state. Methods: Low-cost hardware is employed for distributed data collection, and a Python-based data collection program was developed. We also used open source software for data storage, visualization, and anomaly detection, which simplified the development of the system. Conclusion: We achieved a low-cost and robust slow control system for the prototype detector in a short time. It is currently running well. The system will be integrated with the PandaX-4T facility in late 2019.
Funding (PR review prediction and revision recommendation study): Supported by the National Social Science Fund (NSSF) under Grant No. 22BTQ033.
Funding (GNOME ecosystem developer-role study): This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB0800400, the National Basic Research 973 Program of China under Grant No. 2014CB340404, the National Natural Science Foundation of China under Grant Nos. 61572371, 61273216, and 61272111, the China Postdoctoral Science Foundation (CPSF) under Grant No. 2015M582272, the Natural Science Foundation of Hubei Province of China under Grant No. 2016CFB158, and the Fundamental Research Funds for the Central Universities of China under Grant No. 2042016kf0033.
文摘An open source software (OSS) ecosystem refers to an OSS development community composed of many software projects and developers contributing to these projects. The projects and developers co-evolve in an ecosystem. To keep healthy evolution of such OSS ecosystems, there is a need of attracting and retaining developers, particularly project leaders and core developers who have major impact on the project and the whole team. Therefore, it is important to figure out the factors that influence developers' chance to evolve into project leaders and core developers. To identify such factors, we conducted a case study on the GNOME ecosystem. First, we collected indicators reflecting developers' subjective willingness to contribute to the project and the project environment that they stay in. Second, we calculated such indicators based on the GNOME dataset. Then, we fitted logistic regression models by taking as independent variables the resulting indicators after eliminating the most collinear ones, and taking as a dependent variable the future developer role (the core developer or project leader). The results showed that part of such indicators (e.g., the total number of projects that a developer joined) of subjective willingness and project environment significantly influenced the developers' chance to evolve into core developers and project leaders. With different validation methods, our obtained model performs well on predicting developmental core developers, resulting in stable prediction performance (0.770, F-value).
文摘Nowadays open source software becomes highly popular and is of great importance for most software engi- neering activities. To facilitate software organization and re- trieval, tagging is extensively used in open source communi- ties. However, finding the desired software through tags in these communities such as Freecode and ohloh is still chal- lenging because of tag insufficiency. In this paper, we propose TRG (tag recommendation based on semantic graph), a novel approach to discovering and enriching tags of open source software. Firstly, we propose a semantic graph to model the semantic correlations between tags and the words in software descriptions. Then based on the graph, we design an effec- tive algorithm to recommend tags for software. With com- prehensive experiments on large-scale open source software datasets by comparing with several typical related works, we demonstrate the effectiveness and efficiency of our method in recommending proper tags.
文摘The development,integration,and distribution of the information and spatial data infrastructure(i.e.Digital Earth;DE)necessary to support the vision and goals of Future Earth(FE)will occur in a distributed fashion,in very diverse technological,institutional,socio-cultural,and economic contexts around the world.This complex context and ambitious goals require bringing to bear not only the best minds,but also the best science and technologies available.Free and Open Source Software for Geospatial Applications(FOSS4G)offers mature,capable and reliable software to contribute to the creation of this infrastructure.In this paper we point to a selected set of some of the most mature and reliable FOSS4G solutions that can be used to develop the functionality required as part of DE and FE.We provide examples of large-scale,sophisticated,mission-critical applications of each software to illustrate their power and capabilities in systems where they perform roles or functionality similar to the ones they could perform as part of DE and FE.We provide information and resources to assist the readers in carrying out their own assessments to select the best FOSS4G solutions for their particular contexts and system development needs.
文摘Open source software (OSS) has become an indispensable part of society, not only for personal use but also for corporate use. Projects developed and operated by OSS are called open source projects, and the number of such projects is increasing. On the other hand, because anyone can participate in an open source project, the progress of the project is uncertain due to differences in project members’ skills, development environments, and time zones of activity. Therefore, many users and companies need to understand the development and operation status of open source project. Then, the developers carefully make decisions on upgrading or installing new OSS. In this paper, we focus on the maintenance effort estimation for open source projects considering uncertainty. Also, we evaluate the project quantitatively using Earned Value Management (EVM). Moreover, we examine the appropriateness of the model for predicting the maintenance effort expeditures. Furthermore, we discuss the appropriateness of this EVM method.
文摘A submergible robot model has been presented, and for 3D printing measures, their parts have been modified enough. It has been shown in our design that using printable connectors—a few engines and weight arrangements can be carried out, permitting distinctive moving prospects. After presenting our configuration and delineating a bunch of potential structures, a helpful model dependent on open-source equipment and programming arrangements has been presented conditionally. The model can be effectively tried in a few makes-a plunge streams and lakes throughout the planet. The unwavering quality of the printed models can be strained distinctly in generally shallow waters. Nonetheless, we accept that their accessibility will inspire the overall population to construct and test submerged robots, subsequently accelerating the improvement of imaginative arrangements and applications.
文摘The </span></span><span><span><span style="font-family:"">software reliability model is the stochastic model to measure the software <span>reliability quantitatively. A Hazard-Rate Model is </span></span></span></span><span><span><span style="font-family:"">the </span></span></span><span><span><span style="font-family:"">well</span></span></span><span><span><span style="font-family:"">-</span></span></span><span><span><span style="font-family:"">known one as the</span></span></span><span><span><span style="font-family:""> typical software reliability model. We propose Hazard-Rate Models Consider<span>ing Fault Severity Levels (CFSL) for Open Source Software (OSS). The purpose of </span><span>this research is to </span></span></span></span><span><span><span style="font-family:"">make </span></span></span><span><span><span style="font-family:"">the Hazard-Rate Model considering CFSL adapt to</span></span></span><span><span><span style="font-family:""> </span></span></span><span><span><span style="font-family:"">baseline hazard function and 2 kinds of faults data in Bug Tracking System <span>(BTS)</span></span></span></span><span><span><span style="font-family:"">,</span></span></span><span><span><span style="font-family:""> <i>i.e.</i>, we use the covariate vectors in Cox proportional Hazard-Rate</span></span></span><span><span><span style="font-family:""> Model. Also, <span>we show the numerical examples by evaluating the performance of our pro</span><span>posed model. As the result, we compare the performance of our model with the</span> Hazard-Rate Model CFSL.
文摘Radial Basis Function methods for scattered data interpolation and for the numerical solution of PDEs were originally implemented in a global manner. Subsequently, it was realized that the methods could be implemented more efficiently in a local manner and that the local approaches could match or even surpass the accuracy of the global implementations. In this work, three localization approaches are compared: a local RBF method, a partition of unity method, and a recently introduced modified partition of unity method. A simple shape parameter selection method is introduced and the application of artificial viscosity to stabilize each of the local methods when approximating time-dependent PDEs is reviewed. Additionally, a new type of quasi-random center is introduced which may be better choices than other quasi-random points that are commonly used with RBF methods. All the results within the manuscript are reproducible as they are included as examples in the freely available Python Radial Basis Function Toolbox.
文摘Software reliability model is the tool to measure the software reliability quantitatively. Hazard-Rate model is one of the most popular ones. The purpose of our research is to propose the hazard-rate model considering fault level for Open Source Software (OSS). Moreover, we aim to adapt our proposed model to the hazard-rate considering the imperfect debugging environment. We have analyzed the trend of fault severity level by using fault data in Bug Tracking System (BTS) and proposed our model based on the result of analysis. Also, we have shown the numerical example for evaluating the performance of our proposed model. Furthermore, we have extended our proposed model to the hazard-rate considering the imperfect debugging environment and showed numerical example for evaluating the possibility of application. As the result, we found out that performance of our proposed model is better than typical hazard-rate models. Also, we verified the possibility of application of proposed model to hazard-rate model considering imperfect debugging.
文摘The bug tracking system is well known as the project support tool of open source software. There are many categorical data sets recorded on the bug tracking system. In the past, many reliability assessment methods have been proposed in the research area of software reliability. Also, there are several software project analyses based on the software effort data such as the earned value management. In particular, the software reliability growth models can </span><span style="font-family:Verdana;">apply to the system testing phase of software development. On the other</span><span style="font-family:Verdana;"> hand, the software effort analysis can apply to all development phase, because the fault data is only recorded on the testing phase. We focus on the big fault data and effort data of open source software. Then, it is difficult to assess by using the typical statistical assessment method, because the data recorded on the bug tracking system is large scale. Also, we discuss the jump diffusion process model based on the estimation method of jump parameters by using the discriminant analysis. Moreover, we analyze actual big fault data to show numerical examples of software effort assessment considering many categorical data set.
Abstract: Recently, many open source software (OSS) products have been developed by various OSS projects, and several researchers have proposed reliability assessment methods for OSS. Many software reliability assessment methods are based on software reliability growth models, and our research group has also proposed a reliability assessment method for OSS. Many OSS projects use a bug tracking system (BTS) to manage software faults after release. The BTS keeps a detailed record of the environment in which each fault occurred. Several deep-learning-based reliability assessment methods for OSS fault data have been proposed in the past. On the other hand, the data registered in the BTS differ depending on the OSS project, and some projects collect project-specific data in their BTS. We focus on these recorded data and investigate the difference between the general data and the specific data for the estimation of OSS reliability. As a result, we show that the reliability estimates obtained with the specific data are better than those obtained with the general data. We also describe the characteristics of the specific data and the general data. In addition, we develop GUI-based software for these reliability analyses so that even those who are not familiar with deep learning implementations can perform reliability analyses of OSS.
Funding: The research was supported by the National Natural Science Foundation of China (Grant Nos. 61432020, 61303064, 61472430, 61502512) and the National Grand R&D Plan (2016YFB1000805).
Abstract: Internet-scale open source software (OSS) production in various communities generates abundant reusable resources for software developers. However, finding the desired and mature software with keyword queries from a considerable number of candidates is a significant challenge, especially for newcomers, because current search services often fail to understand the semantics of user queries. In this paper, we construct a software term database (STDB) by analyzing tagging data in Stack Overflow and propose a correlation-based software search (CBSS) approach that performs correlation retrieval based on the term relevance obtained from the STDB. In addition, we design a novel ranking method to optimize the initial retrieval result. We explore four research questions in four experiments to evaluate the effectiveness of the STDB and investigate the performance of the CBSS. The experimental results show that the proposed CBSS can effectively respond to keyword-based software searches and significantly outperforms existing search services at finding mature software.
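One way to picture term relevance derived from tagging data is tag co-occurrence: two tags that often appear on the same post are treated as related. The sketch below is a minimal illustration under that assumption, using the Jaccard coefficient as the relevance measure. The abstract does not state which measure the STDB uses, so the measure, the function names, and the sample posts are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_relevance(tagged_posts):
    """Map each tag to {other_tag: Jaccard relevance} from per-post tag lists."""
    tag_count = Counter()
    pair_count = Counter()
    for tags in tagged_posts:
        unique = sorted(set(tags))  # sorted so each pair has one canonical order
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))

    relevance = {}
    for (a, b), both in pair_count.items():
        # Jaccard: posts carrying both tags / posts carrying either tag.
        score = both / (tag_count[a] + tag_count[b] - both)
        relevance.setdefault(a, {})[b] = score
        relevance.setdefault(b, {})[a] = score
    return relevance


def related_terms(term, relevance, top_k=3):
    """Return up to top_k tags most related to term (ties broken alphabetically)."""
    neighbours = relevance.get(term, {})
    return sorted(neighbours, key=lambda t: (-neighbours[t], t))[:top_k]


if __name__ == "__main__":
    posts = [  # hypothetical tagging data
        ["python", "pandas"],
        ["python", "numpy"],
        ["python", "pandas", "numpy"],
        ["java", "spring"],
    ]
    table = build_relevance(posts)
    print(related_terms("pandas", table))
```

A search service could use such a table to expand a keyword query with its most related terms before retrieval; how the CBSS approach actually combines relevance with ranking is not described in the abstract.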
Funding: This work was supported by the National Key Research and Development Program of China (2018YFB1004202).
Abstract: Pull-based development has become an important paradigm for distributed software development. In this model, each developer independently works on a copied repository (i.e., a fork) from the central repository. It is essential for developers to maintain awareness of the state of other forks to improve collaboration efficiency. In this paper, we propose a method to automatically generate a summary of a fork. We first use the random forest method to generate the label of a fork, i.e., feature implementation or bug fix. Based on the information of the fork-related commits, we then use the TextRank algorithm to generate detailed activity information of the fork. Finally, we apply a set of rules to integrate all related information to construct a complete fork summary. To validate the effectiveness of our method, we conduct 30 groups of manual experiments and 77 groups of case studies on GitHub. We propose Fea_avg to evaluate the performance of the generated fork summary, considering the content accuracy, content integrity, sentence fluency, and label extraction accuracy. The results show that the average Fea_avg of the fork summary generated by this method is 0.672. More than 63% of project maintainers and contributors believe that the fork summary can improve development efficiency.
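The TextRank step can be sketched as follows: sentences become graph nodes, edges are weighted by word overlap, and a PageRank-style iteration scores each sentence. This is a minimal sketch of generic TextRank, not the authors' implementation. Treating each commit message as one sentence, the tokenizer, the damping factor of 0.85, and the sample messages are all assumptions made for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lowercase word set of a sentence (a deliberately simple tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Word overlap normalised by log sentence lengths, as in TextRank."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    return len(a & b) / denom if denom > 0 else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Return one importance score per sentence."""
    words = [tokenize(s) for s in sentences]
    n = len(sentences)
    weight = [[similarity(words[i], words[j]) if i != j else 0.0
               for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weight]
    scores = [1.0] * n
    for _ in range(iterations):
        # Synchronous update: every new score is computed from the old scores.
        scores = [
            (1 - damping) + damping * sum(
                weight[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def top_sentences(sentences, k=1):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    best = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    commits = [  # hypothetical commit messages from one fork
        "Fix crash in the parser when input is empty",
        "Fix parser crash on empty input files",
        "Update the README badge",
    ]
    print(top_sentences(commits, k=1))
```

Messages that share vocabulary with many others score highest, so the selected sentences tend to describe the dominant activity in the fork.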
Funding: Supported by grants from the National Science Foundation of China (Nos. 11435008, 11455001, 11505112 and 11525522) and a grant from the Ministry of Science and Technology of China (No. 2016YFA0400301). We thank the support of a key laboratory grant from the Office of Science and Technology, Shanghai Municipal Government (No. 11DZ2260700) and the support from the Key Laboratory for Particle Physics, Astrophysics and Cosmology, Ministry of Education. This work is supported in part by the Chinese Academy of Sciences Center for Excellence in Particle Physics (CCEPP).
Abstract: Background: The PandaX-4T experiment is a next-generation dark matter search program located in the China Jinping Underground Laboratory. To ensure the stability of the complex instrument during the planned operation, the status of the facility should be monitored continuously. Purpose: The paper reports the design of the slow control system for the experiment. The system is used to monitor the facility status and generate an alarm signal when the detector is in an abnormal state. Methods: Low-cost hardware is employed for distributed data collection. A Python-based data collection program was developed. We also used open source software for data storage, visualization and anomaly detection, which simplified the development of the system. Conclusion: We achieved a low-cost and robust slow control system for the prototype detector in a short time. It is currently running well. The system will be integrated with the PandaX-4T facility in late 2019.
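The monitor-and-alarm idea can be illustrated with a minimal threshold check written in Python, the language the abstract names for data collection. This is only a sketch under stated assumptions: the sensor names, limits, and message format are hypothetical, and the actual system delegates storage, visualization and anomaly detection to open source software that the abstract does not name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Allowed range for one monitored quantity (hypothetical values)."""
    low: float
    high: float


def check_readings(readings, limits):
    """Return one alarm message per reading that falls outside its limit."""
    alarms = []
    for name, value in readings.items():
        limit = limits.get(name)
        if limit is None:
            continue  # no limit configured for this sensor, nothing to check
        if not (limit.low <= value <= limit.high):
            alarms.append(f"{name}={value} outside [{limit.low}, {limit.high}]")
    return alarms


if __name__ == "__main__":
    # Hypothetical sensors and limits, for illustration only.
    limits = {
        "cryostat_temperature_K": Limit(175.0, 180.0),
        "inner_pressure_bar": Limit(1.8, 2.2),
    }
    readings = {"cryostat_temperature_K": 182.3, "inner_pressure_bar": 2.0}
    for message in check_readings(readings, limits):
        print("ALARM:", message)
```

In a running system a function like this would be called on every polling cycle, with alarms forwarded to whichever notification channel the operators use.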