Abstract
Short texts suffer from sparse features and non-standard wording. To address this, the paper builds on the TextGCN model: part-of-speech filtering is added to reduce the influence of irrelevant words on feature selection, and the TF-CR algorithm is added to raise the weight of category-independent words. Finally, the improved model is compared with several classical models to verify its effectiveness.
Authors
许梦玥
侯秀萍
王俊华
XU Mengyue; HOU Xiuping; WANG Junhua (School of Computer Science & Engineering, Changchun University of Technology, Changchun 130102, China)
Source
《长春工业大学学报》
CAS
2023, No. 6, pp. 546-551 (6 pages)
Journal of Changchun University of Technology
Funding
Jilin Provincial Department of Education "13th Five-Year Plan" Science and Technology Project (JJKH20191311KJ).
Keywords
part-of-speech filtering
feature selection
short text classification