Abstract
Named entity recognition plays an important role in natural language processing practice and is a basic tool for information extraction and other natural language processing tasks. This paper presents a preliminary study of musical named entity recognition in Uyghur using a Conditional Random Field (CRF) model. Data are first collected from Uyghur-language websites and preprocessed into plain text. Annotation rules are then defined, and the entities in the corpus are labeled manually. Finally, the model is trained on features including context, keywords, and dictionaries, and a suitable feature template is designed for recognizing musical entities. The experimental results show that the method is both feasible and effective for the Uyghur music domain.
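As an illustration of the kinds of features the abstract names (context, keywords, dictionaries), the following is a minimal sketch of per-token feature extraction for a CRF-style sequence labeler. The abstract does not list the actual features, template, or toolkit, so the function names, the window size, the `<PAD>` marker, and the tiny keyword and dictionary sets are all hypothetical placeholders rather than the authors' setup.

```python
from typing import Dict, List, Set

# Hypothetical resources: placeholders only, not data from the paper.
MUSIC_KEYWORDS: Set[str] = {"song", "album", "singer"}
ENTITY_DICTIONARY: Set[str] = {"ExampleSinger", "ExampleSong"}


def token_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, object]:
    """Build a feature dict for tokens[i] from context, keyword, and dictionary cues."""
    feats: Dict[str, object] = {
        "bias": 1.0,
        "word": tokens[i],
        "is_keyword": tokens[i] in MUSIC_KEYWORDS,        # keyword feature
        "in_dictionary": tokens[i] in ENTITY_DICTIONARY,  # dictionary feature
    }
    # Context features: neighbouring words within the (assumed) window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        if 0 <= j < len(tokens):
            feats[f"word[{offset:+d}]"] = tokens[j]
            feats[f"keyword[{offset:+d}]"] = tokens[j] in MUSIC_KEYWORDS
        else:
            # Mark positions that fall outside the sentence.
            feats[f"word[{offset:+d}]"] = "<PAD>"
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Feature dicts for every token in one sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Placeholder sentence; a real corpus would contain tokenized Uyghur text.
example = ["the", "singer", "ExampleSinger", "released", "ExampleSong"]
features = sentence_features(example)
```

Feature dictionaries of this shape, paired with manually annotated label sequences, are a common input format for CRF training libraries; which toolkit and template the paper used is not stated in the abstract.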
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2017, Issue 2, pp. 59-62 (4 pages)
Keywords
musical named entity recognition
Conditional Random Field model
feature selection