Abstract
Named entity recognition (NER) is an important task in information extraction; it mainly covers the automatic recognition of organization names, location names, and person names. Research on English and Chinese NER began relatively early and relies mainly on rule-based and statistical methods, but few studies in China have so far addressed Vietnamese NER. This paper analyzes the linguistic characteristics of Vietnamese named entities, classifies them and gives them formal representations, and proposes a rule-based method for recognizing Vietnamese named entities. Experimental results show that the method achieves relatively high recognition accuracy.
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2014, No. 5, pp. 198-205, 214 (9 pages)
Journal of Chinese Information Processing
Funding
Funded project of the China-ASEAN Research Center (201205)
Keywords
named entity recognition
Vietnamese
rule