Abstract
To address the limited classification accuracy of the classifiers used in feature-vector-based entity relation extraction, this paper proposes an entity relation extraction method based on ensemble learning. The method combines entity features and converts them into feature vectors, then uses the AdaBoost.MH ensemble learning algorithm to build the relation extraction classifier, with decision trees as the weak classifiers. Classification accuracy is improved by increasing the weights of the better-performing weak classifiers and of the misclassified samples, which allows the entity relation categories to be identified. In tests on the People's Daily corpus, the method achieved fairly good results.
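The reweighting scheme described in the abstract can be illustrated with a minimal sketch of discrete AdaBoost.MH. This is not the authors' implementation: it assumes single-label samples, numeric feature vectors, and depth-one decision trees (stumps) as weak classifiers, and the names `train_adaboost_mh` and `predict` are hypothetical. Each round picks the stump with the largest weighted edge, assigns it a weight `alpha` that grows with that edge, and then raises the weights of the (sample, label) pairs it got wrong.

```python
import math


def train_adaboost_mh(X, y, n_labels, n_rounds=10):
    """Discrete AdaBoost.MH with decision stumps (illustrative sketch).

    X: list of numeric feature vectors; y: list of class indices.
    Returns a list of (alpha, feature, threshold, votes) tuples.
    """
    m, d = len(X), len(X[0])
    # Y[i][l] is +1 if sample i carries label l, otherwise -1.
    Y = [[1 if y[i] == l else -1 for l in range(n_labels)] for i in range(m)]
    # Uniform initial weights over all (sample, label) pairs.
    D = [[1.0 / (m * n_labels)] * n_labels for _ in range(m)]
    model = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            for theta in sorted({x[j] for x in X}):
                # Weighted label sums on each side of the split.
                sums = [[0.0] * n_labels, [0.0] * n_labels]
                for i in range(m):
                    side = 0 if X[i][j] <= theta else 1
                    for l in range(n_labels):
                        sums[side][l] += D[i][l] * Y[i][l]
                # Edge of the stump when each side votes with the majority sign.
                r = sum(abs(s) for side in sums for s in side)
                if best is None or r > best[0]:
                    votes = [[1 if s >= 0 else -1 for s in side] for side in sums]
                    best = (r, j, theta, votes)
        r, j, theta, votes = best
        r = min(r, 1 - 1e-10)  # guard against division by zero for a perfect stump
        alpha = 0.5 * math.log((1 + r) / (1 - r))  # better stumps get larger weight
        model.append((alpha, j, theta, votes))
        # Increase the weight of misclassified pairs, then renormalize.
        z = 0.0
        for i in range(m):
            side = 0 if X[i][j] <= theta else 1
            for l in range(n_labels):
                D[i][l] *= math.exp(-alpha * Y[i][l] * votes[side][l])
                z += D[i][l]
        for i in range(m):
            for l in range(n_labels):
                D[i][l] /= z
    return model


def predict(model, x, n_labels):
    """Return the label with the highest weighted vote."""
    scores = [0.0] * n_labels
    for alpha, j, theta, votes in model:
        side = 0 if x[j] <= theta else 1
        for l in range(n_labels):
            scores[l] += alpha * votes[side][l]
    return max(range(n_labels), key=lambda l: scores[l])
```

In a relation extraction setting, each row of `X` would be the feature vector built from an entity pair and `y` its relation category. Deeper decision trees, as the abstract describes, could replace the stumps without changing the reweighting logic.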
Source
《西安建筑科技大学学报(自然科学版)》
CSCD
PKU Core (Peking University Core Journals)
2011, Issue 3, pp. 446-450 (5 pages)
Journal of Xi'an University of Architecture & Technology(Natural Science Edition)
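Funding
Natural Science Foundation of Shaanxi Province (2009JM8006)
Special Scientific Research Project of the Shaanxi Provincial Department of Education (2010JK620)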