Funding: We would like to extend our gratitude to the individuals who dedicated their time and effort to participate in the crowdsourcing activity and the annotation of our code fragment corpus. This work was supported in part by the National Program on Key Basic Research Project (2013CB035906), in part by the New Century Excellent Talents in University (NCET-13-0073), and in part by the National Natural Science Foundation of China (Grant Nos. 61175062, 61370144).
Abstract: Recent studies have applied different approaches to summarizing software artifacts, yet very few efforts have been made to summarize the source code fragments available on the web. This paper investigates the feasibility of generating code fragment summaries by using supervised learning algorithms. We hire a crowd of ten individuals from the same workplace to extract source code features on a corpus of 127 code fragments retrieved from the Eclipse and NetBeans official frequently asked questions (FAQs). Human annotators suggest summary lines. Our machine learning algorithms achieve a precision of 82% and perform statistically better than existing code fragment classifiers. Evaluation of the algorithms on several statistical measures supports our result. This result is promising, as it suggests that employing mechanisms such as data-driven crowd enlistment can improve the efficacy of existing code fragment classifiers.