Abstract
To address shortcomings in how existing context-feature-based term similarity algorithms generate and match context patterns, this paper proposes a method that automatically constructs term context patterns from the syntactic dependency relations of terms and then computes term similarity by matching those patterns. The method reduces the difficulty of generating and matching context patterns while preserving the contextual features of terms in the patterns. The paper presents concrete implementation steps for the method and compares it with existing typical methods on experimental data from the genetic engineering domain. The experimental results indicate a clear improvement in computational effectiveness.
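The abstract does not give the pattern format or the matching formula, so the following is only a minimal sketch of the general idea, under stated assumptions: sentences are assumed to be already parsed into (head, relation, dependent) triples, a term's context pattern is assumed to be the set of dependency relations it participates in, and Jaccard overlap stands in for the paper's actual matching measure. The names `build_patterns` and `pattern_similarity`, the relation labels, and the sample triples are all hypothetical.

```python
from collections import defaultdict

# Hypothetical input: dependency triples (head, relation, dependent), as a
# dependency parser might emit them. Labels and words are illustrative only.
TRIPLES = [
    ("encodes", "nsubj", "plasmid"),
    ("encodes", "dobj", "protein"),
    ("replicates", "nsubj", "plasmid"),
    ("encodes", "nsubj", "vector"),
    ("replicates", "nsubj", "vector"),
    ("carries", "nsubj", "vector"),
]


def build_patterns(triples):
    """Map each word to the set of dependency contexts it appears in.

    Assumption: a context pattern is (relation, role of the other word,
    other word). The paper's real pattern format is not given in the abstract.
    """
    patterns = defaultdict(set)
    for head, rel, dep in triples:
        patterns[dep].add((rel, "head", head))   # context seen from the dependent
        patterns[head].add((rel, "dep", dep))    # context seen from the head
    return patterns


def pattern_similarity(term_a, term_b, patterns):
    """Jaccard overlap of two terms' pattern sets (a stand-in measure)."""
    pa = patterns.get(term_a, set())
    pb = patterns.get(term_b, set())
    union = pa | pb
    if not union:
        return 0.0  # neither term has any observed context
    return len(pa & pb) / len(union)


patterns = build_patterns(TRIPLES)
sim_close = pattern_similarity("plasmid", "vector", patterns)
sim_far = pattern_similarity("plasmid", "protein", patterns)
```

In this toy data, "plasmid" and "vector" share two of three distinct contexts, while "plasmid" and "protein" share none, so the first pair scores higher. A realistic setup would obtain the triples from a parser and would likely weight patterns by frequency rather than treat them as a plain set.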
Source
《现代图书情报技术》
CSSCI
Peking University Core Journal (北大核心)
2011, Issue 9, pp. 28-33 (6 pages)
New Technology of Library and Information Service
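Funding
One of the research outcomes of the project "Mining Term Similarity from Scientific and Technical Literature and Its Application in Knowledge Discovery" (Project No. 09YJC870031), funded by the Humanities and Social Sciences Research Program of the Ministry of Education of China.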
Keywords
Term similarity
Context similarity
Similarity computation