Abstract
Graph attention network (GAT) models classify topics poorly when the training data are small and class-imbalanced. To address this, this paper proposes BC-GAT, a GAT-based sample balancing compensation model that refines how the GAT model is constructed and applies balancing compensation to the under-represented classes in the dataset. By combining the EDA algorithm with a web crawler, the approach expands the under-represented samples more reasonably and efficiently, making the GAT model better suited to topic classification on small, imbalanced datasets. Experiments show that BC-GAT achieves a recognition accuracy above 90% on the under-represented samples, indicating that it can effectively address the very-small-sample and data-skew problems that arise in practical applications.
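The balancing step described in the abstract can be illustrated with a minimal sketch. It assumes that "EDA" refers to Easy Data Augmentation and uses only two of its operations, random swap and random deletion, which need no thesaurus. The function names, the deletion probability `p`, and the policy of topping every class up to the size of the largest class are illustrative assumptions, not the authors' implementation; the web-crawler step and the GAT model itself are not shown.

```python
import random
from collections import Counter


def random_swap(tokens, rng):
    """Return a copy of tokens with two randomly chosen positions swapped."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, rng, p=0.1):
    """Drop each token with probability p, keeping at least one token."""
    tokens = list(tokens)
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def balance_with_eda(samples, seed=0):
    """Augment each minority class until it matches the largest class.

    samples: list of (tokens, label) pairs, where tokens is a list of str.
    Returns the original samples followed by the augmented ones.
    (Hypothetical helper: the balancing policy is an assumption.)
    """
    rng = random.Random(seed)
    by_label = {}
    for tokens, label in samples:
        by_label.setdefault(label, []).append(tokens)
    target = max(len(docs) for docs in by_label.values())

    balanced = list(samples)
    for label, docs in by_label.items():
        for _ in range(target - len(docs)):
            source = rng.choice(docs)
            operation = rng.choice([random_swap, random_deletion])
            balanced.append((operation(source, rng), label))
    return balanced


if __name__ == "__main__":
    # Toy, made-up data: four "sports" documents and one "finance" document.
    data = [(["team", "wins", "final", "match"], "sports")] * 4
    data.append((["stock", "prices", "fall", "sharply"], "finance"))
    balanced = balance_with_eda(data)
    print(Counter(label for _, label in balanced))
```

Augmented copies stay close to their source documents, so a sketch like this mainly evens out class frequencies; adding genuinely new documents for a rare topic would still require an external source such as the crawler the abstract mentions.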
Authors
王琦菲
张大为
WANG Qifei; ZHANG Dawei (College of Computer and Information Technology, Liaoning Normal University, Dalian, Liaoning 116000, China)
Source
《智能计算机与应用》
2023, No. 1, pp. 100-103, 111 (5 pages in total)
Intelligent Computer and Applications
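Funding
National Natural Science Foundation of China (20200037, 20200084)
Liaoning Provincial Department of Science and Technology, Doctoral Research Start-up Fund Program (20210301).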