期刊文献+

基于Trie树的相似字符串查找算法 被引量:10

Similar string search algorithm based on Trie tree
下载PDF
导出
摘要 基于Trie树的相似字符串查找算法是利用编辑距离的阈值来计算每个节点的活跃节点集,已有算法由于存在大量的冗余计算,导致时间复杂度和空间复杂度都比较高。针对这个问题,采用了基于活跃节点的对称性和动态规划算法的思想对已有算法进行改进,并对活跃节点集进行了修剪,提出了New-Trie-Stack算法。该算法避免了活跃节点的重复计算,以及已有算法在保存所有已遍历节点的活跃节点集时的空间开销。实验结果表明New-Trie-Stack算法在时间复杂度和空间复杂度上都有明显的下降。 Similar string search algorithms based on Trie tree need to compute active-node set of a node by editing distance threshold.A large number of redundant computation leads to a high time and space complexity.A new algorithm named New-Trie-Stack was proposed,which utilized the symmetrical properties of active-node set and the dynamic programming method to improve the performance.It could avoid the redundancy cost on active-node set computing and storing;moreover,active-node sets were pruned.The experimental results show that New-Trie-Stack algorithm has lower time complexity and space complexity.
出处 《计算机应用》 CSCD 北大核心 2013年第8期2375-2378,共4页 journal of Computer Applications
基金 国家自然科学基金资助项目(61272184 61202090 61100007)
关键词 TRIE树 相似字符串 编辑距离 活跃节点 动态规划 Trie tree similar string edit distance active-node dynamic programming
  • 相关文献

参考文献15

  • 1LI G L, DENG D, WANG J N, et al. Pass-Join: a partition-based method for similarity joins [ J]. Proceedings of the VLDB Endow- ment, 2011,5(3) : 253 - 264.
  • 2JESTES J., LI F F, YAN Z P, et al. Probabilistic string similarity joins[ C] // Proceedings of 29th ACM SIGMOD International Confer- ence on Management of Data. New York: ACM, 2010:327 -338.
  • 3BRYAN B, EBERHARDT F, FALOUTSOS C. Compact similarity joins [ C]//ICDE 2008: Proceeding of the 24th International Con- ference on Data Engineering. Piseataway: IEEE, 2008:346 -355.
  • 4XIAO C, WANG W, LIN X M, et al. Efficient similarity joins for near duplicate detection [ C]// WWW'08: Proceedings of the 17th International Conference on World Wide Web. New York: ACM, 2011:695-704.
  • 5FENG J H, WANG J N, LI G L. Tile-Join: a Tile-based method for efficient string similarity joins [ J]. The VLDB Journal, 2012, 21 (4) : 437 -461.
  • 6李璐 王宏志 李建中 等.Ed-Sjoin;一种优化的字符串相似连接算法.计算机研究与发展,2009,:319-325.
  • 7FENG J H, LI G L. Efficient fuzzy type-ahead search in XML data [ J]. IEEE Transactions on knowledge and Data Engineering, 2012, 24(5) : 882 - 895.
  • 8FENG J H, LI G L, WANG J Y. Finding Top-k answers in keyword search over relational database using tuple units [ J]. IEEE Transac- tions on Knowledge and Data Engineering, 2011, 23 ( 12): 1781 - 1794.
  • 9FENG J H, LI G L, WANG J Y, et al. Finding and ranking com- pact connected trees for effective keyword proximity search in XML documents[ J]. Information Systems, 2010, 35 (2) : 186 - 203.
  • 10AGRAWAL P, WIDOM J. Confidence-aware join algorithms [ C]// ICDE 2009: Proceedings of the 25th International Conference on Da- ta Engineering. Washington, DC: IEEE Computer Society, 2009: 628 - 639.

共引文献1

同被引文献97

引证文献10

二级引证文献27

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部