Abstract
To address the posterior collapse that frequently occurs when variational auto-encoders (VAEs) are applied to text classification, a text classification method based on a spherical auto-encoder is proposed. The distribution of the latent variables in the VAE is changed from a multivariate Gaussian distribution to the von Mises-Fisher (vMF) distribution on the sphere, which resolves posterior collapse in theory and yields high-quality text feature representations. Experimental results on three text classification datasets show that the proposed method outperforms the original VAE-based text classification method.
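The abstract does not spell out why a vMF latent avoids posterior collapse. The usual argument in the spherical VAE literature, sketched below, is that with a uniform prior on the sphere the KL term of the objective depends only on the concentration parameter kappa, not on the mean direction produced by the encoder. If kappa is held fixed, the KL term is a positive constant and cannot be driven to zero. The uniform prior, the fixed kappa, the latent dimension, and the function names `log_bessel_i` and `vmf_kl_to_uniform` are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def log_bessel_i(order, kappa, terms=200):
    """log of the modified Bessel function I_order(kappa), from its power series.

    Terms are summed in log space for stability. 200 terms is an assumption
    that is adequate for moderate kappa (roughly kappa < 100).
    """
    log_half = math.log(kappa / 2.0)
    logs = [
        (2 * m + order) * log_half - math.lgamma(m + 1) - math.lgamma(m + order + 1)
        for m in range(terms)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(t - peak) for t in logs))


def vmf_kl_to_uniform(kappa, dim):
    """KL( vMF(mu, kappa) || Uniform(S^{dim-1}) ) for any unit vector mu.

    The mean direction mu does not appear: the value depends on kappa and dim only.
    """
    order = dim / 2.0 - 1.0
    log_i = log_bessel_i(order, kappa)
    # Log normaliser of the vMF density on the unit sphere in R^dim.
    log_norm = order * math.log(kappa) - (dim / 2.0) * math.log(2.0 * math.pi) - log_i
    # Mean resultant length E[mu^T z] = I_{dim/2}(kappa) / I_{dim/2 - 1}(kappa).
    mean_resultant = math.exp(log_bessel_i(order + 1.0, kappa) - log_i)
    # Log surface area of the unit sphere: 2 * pi^(dim/2) / Gamma(dim/2).
    log_area = math.log(2.0) + (dim / 2.0) * math.log(math.pi) - math.lgamma(dim / 2.0)
    return kappa * mean_resultant + log_norm + log_area


if __name__ == "__main__":
    # Hypothetical settings: a 64-dimensional latent and a few fixed kappa values.
    for kappa in (5.0, 20.0, 50.0):
        print(kappa, vmf_kl_to_uniform(kappa, dim=64))
```

In a Gaussian VAE, by contrast, the KL term depends on the encoder's mean and variance, so the optimiser can shrink it to zero by making the posterior match the prior, which is the collapse the abstract refers to.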
Author
ZHAO Shu’an (赵书安) (School of Information Engineering, Jiangsu Open University, Nanjing, Jiangsu 210065, China; School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China)
Source
Chinese Journal of Electron Devices (《电子器件》)
Indexed in: CAS, PKU Core (北大核心)
2021, No. 6, pp. 1417-1420 (4 pages)
Keywords
text classification
variational auto-encoder
von Mises-Fisher distribution