Abstract
This article introduces an entity dictionary into a machine learning model in the form of features and proposes a gene named entity recognition method based on the entity dictionary and machine learning, with experiments conducted on the GENIA 3.02 corpus. The test results indicate that, after the entity dictionary features are introduced, the method achieves relatively high entity recognition accuracy while also optimizing the time complexity of the CRFs recognition model and improving the system's recognition efficiency.
Source
《医学信息学杂志》
CAS
2015, Issue 12, pp. 54-60 (7 pages)
Journal of Medical Informatics
Funding
National Key Technology R&D Program of China (Project No. 2011BAH10B05)
Keywords
Entity dictionary
Machine learning
Gene named entity
Named entity recognition