Abstract
The explosive growth in the volume of information has brought serious data quality problems. Entity recognition is a key data-cleaning technique used to identify the same object appearing in different forms, or to distinguish different objects that share the same form. This paper introduces the technologies related to entity recognition, describes its process and methods, and offers an outlook on entity recognition for big data.
Authors
莎仁
梁琼芳
李长明
张家鑫
SHA Ren; LIANG Qiong-fang; LI Chang-ming; ZHANG Jia-xin (College of Information Science and Technology, Northeast Normal University; Changchun Guanghua University, Changchun 130000, China)
Source
《软件导刊》
2020, No. 3, pp. 125-127 (3 pages)
Software Guide
Keywords
big data
data quality
entity recognition