Abstract
To improve the accuracy of sentence similarity computation and extend its applicability to complex contexts, this paper designs a hierarchical sentence similarity algorithm oriented toward user query intention, combining edit distance, keyword, and synonym-based semantic methods. Based on a thorough analysis of how the experimental data is used, the feature distribution of the data is studied and, with the help of natural annotation, sentence similarity computation is modeled as a multi-level optimization problem. Simulation experiments confirm that the algorithm is effective, achieving an F value of 0.6019.
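The abstract names the ingredients of the algorithm (edit distance, keywords, synonyms, a layered decision) but not how they are combined. The following is a minimal sketch of one possible layering, not the authors' implementation: the function names, the whitespace tokenization, the synonym table format, the 0.9 threshold, and the use of `max` to combine layers are all assumptions made for illustration.

```python
from typing import Dict, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Normalize edit distance into a similarity score in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def keyword_similarity(a: str, b: str, synonyms: Dict[str, str]) -> float:
    """Jaccard overlap of word sets after mapping synonyms to one canonical form.

    `synonyms` is a hypothetical lookup table (word -> canonical word).
    """
    def normalize(sentence: str) -> Set[str]:
        return {synonyms.get(w, w) for w in sentence.lower().split()}

    ka, kb = normalize(a), normalize(b)
    union = ka | kb
    return 1.0 if not union else len(ka & kb) / len(union)


def layered_similarity(a: str, b: str, synonyms: Dict[str, str],
                       surface_threshold: float = 0.9) -> float:
    """Layer 1: surface match by edit distance. Layer 2: keyword/synonym match.

    The threshold and the max() combination are illustrative assumptions.
    """
    surface = edit_similarity(a, b)
    if surface >= surface_threshold:
        return surface
    return max(surface, keyword_similarity(a, b, synonyms))
```

For example, `layered_similarity("buy cheap phone", "purchase cheap phone", {"purchase": "buy"})` falls through the surface layer and is scored by the keyword layer, where the synonym mapping makes the two word sets identical.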
Source
Computer Science (《计算机科学》)
CSCD
PKU Core (北大核心)
2015, No. 1, pp. 227-231 (5 pages)
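Funding
Supported by the National Natural Science Foundation of China (61070119, 61370139)
Program for Innovation Team Building and Teacher Career Development of Beijing Municipal Universities (IDHT20130519)
Beijing Municipal Education Commission Special Project (PXM2013_014224_000042, PXM2014_014224_000067)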
Keywords
Sentence similarity computation
Semantic conformity
Edit distance
Keyword feature
User query intention