Abstract
The explosive growth of web information poses great challenges for the automatic acquisition of person information. To handle the large volume of person information on the Internet, this paper designs a person information mining framework based on semantic context analysis. It focuses on three components: a method for identifying person résumé information, a named entity recognition method based on the hidden Markov model (HMM), and a person information extraction algorithm based on semantic context analysis. Experiments show that the semantic-context-based mining method achieves relatively high information extraction efficiency and accuracy.
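The abstract names HMM-based named entity recognition but gives no model details. As a rough illustration only, the sketch below shows how Viterbi decoding picks the most probable tag sequence under an HMM. The two-tag set (PER, O), the toy probabilities, the probability floor for unseen words, and the function name viterbi are all assumptions made for this example; none of them come from the paper.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable state sequence for obs under an HMM.

    Unseen emissions get a small floor probability (an assumption of
    this sketch, standing in for proper smoothing).
    """
    if not obs:
        return []

    def lp(p):
        # Log-probability with a floor so log(0) never occurs.
        return math.log(max(p, floor))

    # best[t][s] = (log-prob of the best path ending in s at time t, previous state)
    best = [{s: (lp(start_p[s]) + lp(emit_p[s].get(obs[0], 0.0)), None)
             for s in states}]
    for t in range(1, len(obs)):
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p][0] + lp(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            column[s] = (score + lp(emit_p[s].get(obs[t], 0.0)), prev)
        best.append(column)

    # Backtrack from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    return path[::-1]

# Hypothetical toy model: PER = part of a person name, O = other.
states = ["PER", "O"]
start_p = {"PER": 0.3, "O": 0.7}
trans_p = {"PER": {"PER": 0.4, "O": 0.6},
           "O": {"PER": 0.2, "O": 0.8}}
emit_p = {"PER": {"Zhang": 0.5, "Wei": 0.5},
          "O": {"was": 0.3, "born": 0.3, "in": 0.3, "Hefei": 0.1}}

tags = viterbi(["Zhang", "Wei", "was", "born"], states, start_p, trans_p, emit_p)
print(tags)  # ['PER', 'PER', 'O', 'O']
```

In a real system the transition and emission probabilities would be estimated from an annotated corpus, and the tag set would cover more entity types than person names.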
Source
Journal of Anhui University (Natural Science Edition)
CAS
PKU Core Journals
2009, No. 4, pp. 33-37 (5 pages)
Keywords
person information mining
semantic context
hidden Markov model (HMM)
named entity recognition