Optimization of the Levenshtein algorithm and its application in repeatability judgment for test bank

Cited by: 1
Abstract: The Levenshtein distance algorithm is inefficient on long texts and in large-scale matching. To address this, we propose an early termination strategy for the algorithm. First, from the intrinsic relationships among the elements of the Levenshtein distance matrix, we derive a recurrence relation. Based on this recurrence relation, we then propose an early termination strategy that can decide ahead of time whether two texts meet a preset similarity threshold. Duplicate-detection experiments on question banks from several subjects show that the early termination strategy significantly reduces computation time.
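To make the idea of early termination concrete, here is a minimal Python sketch. It is not the recurrence relation derived in the paper, which the abstract does not state. Instead it uses a well-known property of the distance matrix: the minimum of each row never decreases from one row to the next, so once a row's minimum exceeds the allowed distance, the final distance must exceed it too. The function name `within_distance` and the integer bound `max_dist` are assumptions made for illustration. If similarity is defined as 1 - d / max(len(a), len(b)), which is also an assumption, a similarity threshold translates into such a bound on d.

```python
def within_distance(a: str, b: str, max_dist: int) -> bool:
    """Return True if the Levenshtein distance between a and b is <= max_dist.

    Illustrative sketch only: stops as soon as the bound cannot be met.
    """
    if max_dist < 0:
        return False
    # The distance is at least the difference in lengths.
    if abs(len(a) - len(b)) > max_dist:
        return False

    # prev holds the previous row of the distance matrix.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        # Row minima never decrease, so the final distance is >= min(curr).
        if min(curr) > max_dist:
            return False
        prev = curr
    return prev[-1] <= max_dist
```

For a duplicate check over a question bank, a call such as `within_distance(q1, q2, k)` would replace a full distance computation followed by a comparison. The saving comes from pairs that are clearly dissimilar and are rejected after only a few rows.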
Authors: ZHANG Heng; CHEN Liang-yu (Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China)
Source: Journal of East China Normal University (Natural Science), 2018, No. 5, pp. 154-163 (10 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (11471209)
Keywords: question bank matching; text similarity; Levenshtein edit distance