Research on Related Technologies for Big Data Entity Identification
Abstract The explosive growth of information has brought serious data quality problems. Entity recognition is a key data cleaning technique, used to identify the same object when it appears in different forms, or to distinguish different objects that share the same form. This paper introduces the technologies related to entity recognition, describes the process and methods of entity recognition, and looks ahead to entity recognition techniques for big data.
Authors SHA Ren; LIANG Qiong-fang; LI Chang-ming; ZHANG Jia-xin (College of Information Science and Technology, Northeast Normal University; Changchun Guanghua University, Changchun 130000, China)
Source Software Guide (《软件导刊》), 2020, No. 3, pp. 125-127 (3 pages)
Keywords big data; data quality; entity recognition