A Text Categorization Training Method with Incomplete Annotation (cited by: 1)
Abstract: To address the limited performance and precision of traditional methods, this paper proposes a training method for text classification with incomplete annotation, based on optimal category grouping and a genetic algorithm. The method splits the original classification system into multiple classification systems such that the categories within each one are mutually exclusive. Training on the data separately under each split system improves classifier accuracy. The resulting classifiers are run in parallel, each outputting the category it assigns to a sample, and together these outputs give all categories to which the sample actually belongs. Simulation experiments show that the method effectively addresses a problem of the current classification system: a classifier trained on incompletely annotated text cannot reliably identify the boundaries between incompletely annotated categories and the other categories, which leads to poor classification performance.
Authors: DUAN Junhong; LI Xiaoyu; MU Dejun (Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, China)
Source: Microprocessors (《微处理机》), 2019, No. 1, pp. 20-24 (5 pages)
Funding: National Natural Science Foundation of China (61672433); Shenzhen Science and Technology Innovation Commission Basic Research Projects (201703063000511, 201703063000517); National Cryptography Development Fund (MMJJ20170210); State Grid Corporation of China Science and Technology Project (522722180007)
Keywords: Text classification; Incomplete annotation; Optimal grouping; Training methods
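
The splitting-and-parallel-classification idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper's optimal grouping and genetic algorithm are not reproduced here, and a simple greedy grouping is swapped in instead. The function names `group_exclusive_labels` and `predict_all` are hypothetical, and the sketch assumes that two categories may share a group only if they never appear together on any annotated sample. With incomplete annotation that observation is only a heuristic, since a missing label can hide a real co-occurrence.

```python
from itertools import combinations
from typing import Callable, Dict, Iterable, List, Optional, Set


def group_exclusive_labels(label_sets: Iterable[Set[str]]) -> List[Set[str]]:
    """Greedily partition labels so that no two labels in a group co-occur.

    Greedy stand-in for the paper's optimal grouping (an assumption, not the
    authors' method). Each returned group can serve as one classification
    system whose categories are treated as mutually exclusive.
    """
    # Build a conflict map: labels seen together on one sample cannot share a group.
    conflicts: Dict[str, Set[str]] = {}
    for labels in label_sets:
        for label in labels:
            conflicts.setdefault(label, set())
        for a, b in combinations(sorted(labels), 2):
            conflicts[a].add(b)
            conflicts[b].add(a)

    groups: List[Set[str]] = []
    # Place the most-conflicted labels first; ties are broken by name
    # so the result is deterministic.
    for label in sorted(conflicts, key=lambda name: (-len(conflicts[name]), name)):
        for group in groups:
            if not (group & conflicts[label]):
                group.add(label)
                break
        else:
            groups.append({label})
    return groups


def predict_all(
    text: str,
    classifiers: Iterable[Callable[[str], Optional[str]]],
) -> Set[str]:
    """Run one classifier per group and collect every category it outputs.

    Each classifier returns at most one category from its own group,
    or None when the sample belongs to none of that group's categories.
    """
    predicted: Set[str] = set()
    for classify in classifiers:
        label = classify(text)
        if label is not None:
            predicted.add(label)
    return predicted
```

For example, given the annotations `{"sports", "china"}`, `{"finance", "china"}`, `{"sports", "usa"}`, and `{"finance"}`, the grouping separates the region categories from the topic categories, and one single-label classifier would then be trained per group. Any single-label learner could fill the classifier role; the callables above are placeholders for illustration.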