Abstract
This study uses Python data mining techniques to examine the influence of topics related to ChatGPT, a large-scale pretrained language model, on video-sharing websites. Data samples from these websites were processed and analyzed to understand the relationships among ChatGPT-related video topics, keywords, likes, comments, interactions, and new subscriptions. The analysis combined scatter plots, correlation analysis, multiple regression analysis, and word cloud generation. Applying a logarithmic transformation to the variables markedly improved model performance: the log-transformed model had a mean squared error (MSE) of 0.156, far below the original model's MSE of 93157.23, and a coefficient of determination (R^2) of 0.962, meaning that it explains 96.2% of the variation in new subscriptions, compared with an R^2 of -2.09 for the original model. These results indicate that logarithmic transformation significantly improves model performance and supports a deeper understanding of the influence of ChatGPT-related topics on video-sharing websites.
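The comparison described in the abstract (a multiple regression fitted once on raw counts and once on log-transformed counts, each scored by MSE and R^2) can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, the variable names (`likes`, `comments`, `new_subs`), the 150/50 train/test split, and the use of `np.log1p` are all assumptions made for the example.

```python
import numpy as np


def fit_ols(X, y):
    """Return least-squares coefficients [intercept, b1, b2, ...] for y ~ X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a feature matrix."""
    return coef[0] + X @ coef[1:]


def mse(y, y_hat):
    """Mean squared error."""
    return float(np.mean((y - y_hat) ** 2))


def r2(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    It is negative whenever the predictions fit worse than the mean of y.
    """
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic, heavy-tailed engagement counts (hypothetical, NOT the paper's data).
rng = np.random.default_rng(0)
n = 200
likes = rng.lognormal(mean=7.0, sigma=1.5, size=n)
comments = rng.lognormal(mean=5.0, sigma=1.5, size=n)
# Assumed multiplicative relation between engagement and new subscriptions.
new_subs = 0.5 * likes**0.6 * comments**0.3 * rng.lognormal(0.0, 0.3, size=n)

X_raw = np.column_stack([likes, comments])
X_log = np.log1p(X_raw)      # log(1 + x) stays defined for zero counts
y_log = np.log1p(new_subs)

train, test = slice(0, 150), slice(150, None)

# Model 1: regression on the raw counts.
coef_raw = fit_ols(X_raw[train], new_subs[train])
raw_pred = predict(coef_raw, X_raw[test])
raw_mse, raw_r2 = mse(new_subs[test], raw_pred), r2(new_subs[test], raw_pred)

# Model 2: regression on the log-transformed counts.
coef_log = fit_ols(X_log[train], y_log[train])
log_pred = predict(coef_log, X_log[test])
log_mse, log_r2 = mse(y_log[test], log_pred), r2(y_log[test], log_pred)

print(f"raw model: MSE={raw_mse:.3f}, R^2={raw_r2:.3f}")
print(f"log model: MSE={log_mse:.3f}, R^2={log_r2:.3f}")
```

Two points are worth keeping in mind when reading such output. The two MSE values are on different scales (counts versus log counts), so they are only directly comparable after back-transforming the log-model predictions with `np.expm1`; R^2 is scale-free within each model but is still computed against a different target. A negative R^2, such as the -2.09 reported for the original model, simply means that the model predicted worse than a constant equal to the mean of the target.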
Authors
Lin Hai; Ye Xiaoling (School of Information, City College of Huizhou, Huizhou 516025, China; Department of Ecological Engineering, Huizhou Engineering Vocational College, Huizhou 516025, China)
Source
Modern Computer (《现代计算机》)
2023, Issue 23, pp. 1-8, 41 (9 pages in total)