Abstract
To overcome the shortcomings of traditional name identification methods, which segment text into words before identifying names, a name identification method based on scene information fusion is proposed. Drawing on the characteristics of Chinese names, the method considers how scene resources (context information, information from the word itself, dictionary information, and information internal to the name) affect Chinese name entities, and uses them as the basis for name identification. Evidence theory is then introduced, and personal names are identified by fusing the scene resource information. Open tests on a large-scale corpus of real text randomly sampled from the Internet show that the method achieves a high recall rate while maintaining a high accuracy rate.
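The fusion step can be illustrated with a small sketch. The abstract says only that evidence theory is used to fuse the scene information; it does not give the combination rule or any numbers. The sketch below assumes Dempster's rule of combination over a two-element frame ("name" / "other"), and every mass value, variable name, and the final decision rule is hypothetical, chosen only for illustration.

```python
from itertools import product

# Frame of discernment (assumed): a candidate string is a name or it is not.
NAME = frozenset({"name"})
OTHER = frozenset({"other"})
EITHER = NAME | OTHER  # mass left on the whole frame means "undecided"


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination."""
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            # Contradictory evidence is accumulated as conflict.
            conflict += wa * wb
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in fused.items()}


# Hypothetical masses from the four scene resources named in the abstract.
context = {NAME: 0.6, OTHER: 0.1, EITHER: 0.3}
word = {NAME: 0.5, OTHER: 0.2, EITHER: 0.3}
dictionary = {NAME: 0.4, OTHER: 0.3, EITHER: 0.3}
name_internal = {NAME: 0.7, OTHER: 0.1, EITHER: 0.2}

fused = context
for source in (word, dictionary, name_internal):
    fused = combine(fused, source)

# Assumed decision rule: accept the candidate if "name" outweighs "other".
is_name = fused.get(NAME, 0.0) > fused.get(OTHER, 0.0)
```

Keeping some mass on the whole frame lets a weak source express uncertainty instead of forcing a vote, which is one common reason to prefer evidence theory over a plain weighted average; whether the paper uses masses in this way is not stated in the abstract.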
Source
《计算机工程与应用》
CSCD
Peking University Core Journal (北大核心)
2009, No. 34, pp. 147-151 (5 pages)
Computer Engineering and Applications
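Funding
National Natural Science Foundation of China, No. 60972045
Nanjing University of Posts and Telecommunications Research Fund for Introduced Talents, No. NY207148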
Keywords
name identification
scene information fusion
automatic segmentation
evidence theory