Abstract
Probabilistic topic models are widely used in document analysis tasks because they reduce data dimensionality efficiently and mine topic features from documents. However, most probabilistic topic models are built on directed graphical models, which greatly limits their representational power. Building on studies of distributed topic feature representation and of the Replicated Softmax Model (RSM), which is based on the undirected-graph Restricted Boltzmann Machine (RBM), this paper proposes a Semi-Supervised Replicated Softmax Model (SSRSM). Experimental results show that SSRSM outperforms LDA and RSM in topic extraction. In addition, when the topic features extracted by SSRSM and RSM are applied to a multi-label classification task, SSRSM shows better multi-label discrimination than LDA and RSM.
Source
《计算机工程》
CAS
CSCD
Peking University Core Journal
2015, Issue 9, pp. 209-214 (6 pages)
Computer Engineering
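Funding
National Natural Science Foundation of China (71172219)
Innovation Fund for Technology-Based Small and Medium-Sized Enterprises of China (11C26213402013)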
Keywords
topic model
undirected graph model
Replicated Softmax Model (RSM)
semi-supervised model
feature learning