Abstract
In multi-label classification, traditional methods based on latent Dirichlet allocation (LDA) often run into scalability problems as the number of labels grows. To address this, a subset-based labeled latent Dirichlet allocation model is proposed. Partitioning the data into subsets reduces the time complexity of the algorithm, lets the model scale flexibly when the number of labels reaches the hundreds or thousands, and improves on the prediction accuracy of the traditional labeled LDA model. The method was evaluated on a large-scale experimental dataset and compared with several classic methods. The experimental results show that it achieves good accuracy and efficiency and is an effective tool for multi-label learning problems.
Authors
YANG Ju-ying (杨菊英)
LIU Yi (刘燚)
LUO Jia (罗佳)
Department of Computer Science, Chengdu College of University of Electronic Science and Technology of China, Chengdu 611731, China; School of Computer Science, China West Normal University, Nanchong 637009, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2020, Issue 12, pp. 3432-3437 (6 pages)
Funding
Sichuan Provincial Department of Science and Technology Fund Project (172102210594).