
Algorithm for Judging Duplicate Real-time Packet in Massive Database (海量数据库中实时包的判重算法)

Cited by: 2
Abstract: Index techniques in relational databases can quickly determine whether a record is a duplicate, but for a massive database that is updated frequently, maintaining the index is costly in both time and resources. Based on the characteristics of traffic-volume data packets and the massive databases that store them, this paper proposes a time-ordered interval model for real-time traffic packets, presents and proves a fast duplicate-detection algorithm based on interval records, analyzes the algorithm's complexity, and discusses ways to improve it. The algorithm's complexity is independent of the database size, and the algorithm is efficient and easy to implement.
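Author: Zhang Lifang (张立芳)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2008, No. 21, pp. 76-77, 80 (3 pages)
Funding: Supported by the Scientific Research Fund of the Hunan Provincial Department of Communications (200610)
Keywords: massive database; duplicate; highway traffic; real-time packet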
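
The abstract describes duplicate detection based on interval records rather than a per-record index, but this record does not give the paper's actual model or algorithm. The following is only a minimal sketch of the general idea, under stated assumptions: each packet from a station carries an integer time-slot number, and the slots already received are stored as sorted, disjoint closed intervals. The class name `IntervalIndex` and its methods are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


class IntervalIndex:
    """Hypothetical helper: sorted, disjoint closed intervals of received slots."""

    def __init__(self):
        self.starts = []  # interval start slots, ascending
        self.ends = []    # interval end slots, aligned with self.starts

    def is_duplicate(self, slot):
        # Find the last interval whose start is <= slot, then test its end.
        i = bisect_right(self.starts, slot) - 1
        return i >= 0 and slot <= self.ends[i]

    def add(self, slot):
        """Record a slot; return False if it was already covered (a duplicate)."""
        i = bisect_right(self.starts, slot) - 1
        if i >= 0 and slot <= self.ends[i]:
            return False
        joins_left = i >= 0 and self.ends[i] + 1 == slot
        joins_right = (i + 1 < len(self.starts)
                       and self.starts[i + 1] == slot + 1)
        if joins_left and joins_right:
            # The new slot fills the gap between two intervals: merge them.
            self.ends[i] = self.ends[i + 1]
            del self.starts[i + 1]
            del self.ends[i + 1]
        elif joins_left:
            self.ends[i] = slot
        elif joins_right:
            self.starts[i + 1] = slot
        else:
            # Isolated slot: start a new single-slot interval.
            self.starts.insert(i + 1, slot)
            self.ends.insert(i + 1, slot)
        return True


index = IntervalIndex()
for slot in (1, 2, 3, 5):
    index.add(slot)
first_try = index.add(4)   # fills the gap, merging the two intervals
second_try = index.add(4)  # the same slot again is reported as a duplicate
```

In this sketch the cost of a lookup depends on the number of stored intervals (that is, on how many gaps exist in the received sequence), not on the total number of packets stored, which is in the spirit of the abstract's claim that complexity is independent of database size.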


