
Text Proofreading Technology Based on Levenshtein Distance Similarity (cited by: 12)
Abstract: Tree-structured text configurations are widely used in distributed TT&C (Tracking, Telemetry and Command) data processing software, and their correctness is critical to data processing. To check and correct such configurations automatically, the Levenshtein Distance (LD) edit-distance algorithm is introduced and the edit operations defined on strings are generalized to multi-way trees. On this basis, an edit distance between multi-way trees is defined, a method for measuring the similarity between multi-way trees is established, and an automatic proofreading workflow for text configurations based on fuzzy matching is designed. This resolves the distorted recall and the misjudgments that exact matching suffers because of the polysemy of characters. In the experiments, recall and precision reached 87.5% and 100%, respectively, effectively improving the reliability of automatic verification of tree-structured text configurations.
Source: Journal of Spacecraft TT&C Technology (《飞行器测控学报》), CSCD, 2015, No. 4, pp. 389-394 (6 pages)
Funding: Supported by the Shanghai Aerospace Science and Technology Innovation Fund (SAST201251)
Keywords: string similarity; tree Levenshtein distance (tree edit distance); fuzzy matching; text proofreading
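
The abstract builds on the string-level Levenshtein distance before extending it to multi-way trees. As a rough illustration of that starting point only, the sketch below computes the classic edit distance, turns it into a similarity score, and uses the score for fuzzy matching of a configuration key. The normalization by the longer string's length, the function names `levenshtein`, `similarity`, and `best_match`, the 0.8 threshold, and the sample keys are all assumptions made for this example; the paper's tree edit distance and its actual similarity formula are not reproduced here.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance with unit-cost insert, delete, and substitute."""
    # Keep the shorter string in the inner loop so each row is as small as possible.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; the normalization is an assumption, not the paper's formula."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


def best_match(key: str, candidates: Iterable[str],
               threshold: float = 0.8) -> Optional[str]:
    """Return the most similar candidate, or None if none reaches the (assumed) threshold."""
    best = max(candidates, key=lambda c: similarity(key, c), default=None)
    if best is not None and similarity(key, best) >= threshold:
        return best
    return None


if __name__ == "__main__":
    # Hypothetical configuration keys; "telemtry" is a deliberate typo.
    known_keys = ["telemetry", "tracking", "command"]
    print(best_match("telemtry", known_keys))
```

With exact matching, the misspelled key would simply be reported as unknown. A similarity threshold lets the checker propose the closest known key instead, which is the kind of fuzzy matching the abstract describes at the tree level.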