Abstract
To address the heavy dependence of text-based requirement traceability approaches on the quality of the text, this paper proposes a requirement traceability approach that uses named entity recognition to label keywords in software artefact documents. The approach builds a named entity recognition model from the context of code entities, which overcomes the inability of abstract syntax trees and regular expressions to parse software artefacts that are not in source-code form. After the model has marked the code entities in the software artefacts, the approach converts the artefacts into a document set and applies semantic clustering to it. Finally, a mapping algorithm creates the trace links between software artefacts. Experimental results show that, compared with traceability approaches based on all terms and on high-weight terms, the proposed approach effectively improves the quality of requirement tracing results.
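The final steps of the pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the code entities have already been tagged by some named entity recognition model, and it replaces the semantic clustering and mapping algorithm with plain cosine similarity over bags of entities. The function names `cosine` and `trace_links`, the `threshold` parameter, and the toy artefact identifiers are all hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-entities vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def trace_links(requirements, artifacts, threshold=0.3):
    """Link each requirement to artefacts that share enough tagged entities.

    Both arguments map an identifier to the list of code entities that an
    (assumed, external) named entity recognition model tagged in that text.
    Returns (requirement_id, artefact_id, score) tuples, best score first.
    """
    req_vecs = {rid: Counter(ents) for rid, ents in requirements.items()}
    art_vecs = {aid: Counter(ents) for aid, ents in artifacts.items()}
    links = []
    for rid, rv in req_vecs.items():
        for aid, av in art_vecs.items():
            score = cosine(rv, av)
            if score >= threshold:
                links.append((rid, aid, score))
    return sorted(links, key=lambda link: -link[2])


# Toy data (hypothetical): entities already tagged in each artefact.
requirements = {
    "REQ-1": ["UserAccount", "login", "PasswordHash"],
    "REQ-2": ["Invoice", "exportPdf"],
}
artifacts = {
    "design.doc": ["UserAccount", "login", "Session"],
    "billing.md": ["Invoice", "exportPdf", "TaxRate"],
}

links = trace_links(requirements, artifacts)
for rid, aid, score in links:
    print(f"{rid} -> {aid} (similarity {score:.3f})")
```

Restricting the vectors to tagged entities, rather than all terms, mirrors the idea in the abstract that trace quality should depend less on the surrounding text.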
Source
Application Research of Computers (《计算机应用研究》), 2016, No. 1, pp. 132-135, 146 (5 pages)
Indexed in: CSCD; Peking University Core Journals
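Funding
National Natural Science Foundation of China (61402108)
Fujian Provincial Education and Research Project for Young and Middle-aged Teachers (JA15348, JA13227, JB12146)
University Project of the Fujian Provincial Department of Science and Technology (JK2012033)
Scientific Research Start-up Fund of Fujian University of Technology (GY-Z13113, GY-Z14068)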
Keywords
requirement traceability
named entity recognition
semantic clustering
natural language processing
term weighting