Large Scale Text Hierarchical Classification Method Based on Stacking Ensemble Learning

Cited by: 12
Abstract: [Purpose/significance] Large-scale hierarchical text classification is one of the open difficulties in current text classification research. Because both the volume of data and the number of categories are very large, classification rarely reaches satisfactory performance. To address this problem, a large-scale hierarchical text classification method based on Stacking ensemble learning is proposed. [Method/process] The method classifies documents top-down and uses two different strategies to train the high-level and low-level classifiers. The high-level classifiers (first and second levels) are trained with a multi-class strategy, and a constraint algorithm, driven by the high-level classification results, selects the appropriate low-level classifiers. The low-level classifiers are trained with a binary classification strategy: the Stacking algorithm trains the base classifiers and a fusion classifier for each low-level category, the fusion classifiers' predictions are ranked, and the label with the highest score is returned as the classification result. [Result/conclusion] Experiments on a Chinese journal literature dataset show that the proposed method effectively improves the performance of large-scale hierarchical text classification.
Source: Information Studies: Theory & Application (《情报理论与实践》; indexed in CSSCI and the Peking University Core Journals list), 2020, No. 10, pp. 171-176, 182 (7 pages in total)
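Funding: This work is one of the outputs of the China Knowledge Centre for Engineering Sciences and Technology construction project "Knowledge Organization System Construction" (No. CKCEST-2020-1-19) and the Institute of Scientific and Technical Information of China key project "Research on Key Technologies for Multimodal Knowledge Graph Construction" (No. ZD2020-09).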
Keywords: Stacking algorithm; text classification; hierarchical classification; deep learning; ensemble learning
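
The top-down procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything concrete in it is an assumption made for illustration: 2-D numeric vectors stand in for text features, a single high level replaces the paper's two, a nearest-centroid rule plays the high-level multi-class classifier, the "constraint" simply restricts candidates to the children of the predicted high-level class, each low-level binary classifier is trained against its siblings only, and the two base learners and the logistic fusion classifier are toy stand-ins. All function names are hypothetical.

```python
import math

def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_centroid(X, y):
    """Base learner 1 (assumed): how much closer x is to the positive centroid."""
    cp = mean([x for x, t in zip(X, y) if t == 1])
    cn = mean([x for x, t in zip(X, y) if t == 0])
    return lambda x: dist(x, cn) - dist(x, cp)

def fit_nearest(X, y):
    """Base learner 2 (assumed): same idea with the nearest training points."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    return lambda x: min(dist(x, n) for n in neg) - min(dist(x, p) for p in pos)

def fit_logistic(Z, y, lr=0.1, epochs=200):
    """Fusion (meta) classifier: logistic regression trained by plain SGD."""
    w, b = [0.0] * len(Z[0]), 0.0
    def prob(z):
        s = sum(wi * zi for wi, zi in zip(w, z)) + b
        return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, s))))
    for _ in range(epochs):
        for z, t in zip(Z, y):
            g = prob(z) - t
            w = [wi - lr * g * zi for wi, zi in zip(w, z)]
            b -= lr * g
    return prob

def fit_stacked(X, y, base_fitters, k=2):
    """Stacking: out-of-fold base scores become the fusion classifier's inputs."""
    Z = [[0.0] * len(base_fitters) for _ in X]
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        held = [i for i in range(len(X)) if i % k == fold]
        for j, fit in enumerate(base_fitters):
            base = fit([X[i] for i in train], [y[i] for i in train])
            for i in held:
                Z[i][j] = base(X[i])
    meta = fit_logistic(Z, y)
    bases = [fit(X, y) for fit in base_fitters]  # refit on all data for inference
    return lambda x: meta([base(x) for base in bases])

def train_hierarchy(docs):
    """docs: list of (vector, high_label, low_label)."""
    highs = sorted({h for _, h, _ in docs})
    centroids = {h: mean([x for x, hh, _ in docs if hh == h]) for h in highs}
    low_models = {}
    for h in highs:
        sib = [(x, low) for x, hh, low in docs if hh == h]
        X = [x for x, _ in sib]
        low_models[h] = {
            low: fit_stacked(X, [int(l == low) for _, l in sib],
                             [fit_centroid, fit_nearest])
            for low in sorted({l for _, l in sib})
        }
    return centroids, low_models

def predict(model, x):
    centroids, low_models = model
    # High level: multi-class decision (nearest centroid as a stand-in).
    high = min(centroids, key=lambda h: dist(x, centroids[h]))
    # Constraint (assumed form): only children of the predicted class compete.
    ranked = sorted(low_models[high].items(), key=lambda kv: kv[1](x), reverse=True)
    return high, ranked[0][0]  # label with the highest fused score

def block(x0, y0):
    return [(x0 + dx, y0 + dy) for dx in (0, 1) for dy in (0, 1)]

docs = ([(p, "A", "A1") for p in block(0, 0)] + [(p, "A", "A2") for p in block(0, 4)]
        + [(p, "B", "B1") for p in block(10, 0)] + [(p, "B", "B2") for p in block(10, 4)])
model = train_hierarchy(docs)
print(predict(model, (0.4, 4.6)))
```

Restricting the ranking to the children of the predicted high-level class keeps the number of low-level classifiers evaluated per document small, which is presumably the motivation for a selection step when the category set is very large. In practice the toy learners would be replaced by real text classifiers.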