Abstract
As the mobile Internet increasingly shapes the habits of young readers in university libraries, a growing share of new-generation readers use the QQ instant messaging software. Text mining of library QQ group messages can reveal public opinion about the library, support early warning of public opinion, and strengthen the ability of library decision-makers to respond. Web crawler technology is used to collect chat records from QQ groups between September 2022 and December 2022 as library public opinion data. The raw data are then preprocessed by deduplication and cleaning. Next, Tsinghua University's THULAC word segmentation tool is used to extract keywords from the data and calculate their weights, and the WordCloud library is used to visualize them. The spaCy library is then used to compute a sentiment polarity and score for the public opinion data, and finally the sentiment tendencies of library public opinion are analyzed through experiments.
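The preprocessing and keyword-weighting steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cleaning rules, the function names, the whitespace stand-in segmenter, and the use of plain relative frequency as the keyword weight are all assumptions made for illustration, since the abstract does not specify them. In a real run, a Chinese word segmenter such as THULAC would replace `whitespace_segment`.

```python
import re
from collections import Counter

# Hypothetical cleaning rules; the abstract does not list the actual ones.
URL_RE = re.compile(r"https?://\S+")
BRACKET_RE = re.compile(r"\[[^\[\]]*\]")  # placeholders such as [图片] or [表情]
SPACE_RE = re.compile(r"\s+")


def clean_message(text):
    """Strip URLs, bracketed placeholders, and redundant whitespace."""
    text = URL_RE.sub(" ", text)
    text = BRACKET_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def preprocess(messages):
    """Clean each message, drop empty ones, and remove exact duplicates (order kept)."""
    seen = set()
    result = []
    for raw in messages:
        cleaned = clean_message(raw)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


def whitespace_segment(text):
    """Stand-in segmenter (assumption); a real run would call a Chinese segmenter."""
    return text.split()


def keyword_weights(token_lists, stopwords=frozenset(), top_n=10):
    """Return up to top_n (token, weight) pairs, weight = relative frequency."""
    counts = Counter(
        tok
        for tokens in token_lists
        for tok in tokens
        if tok not in stopwords and len(tok) > 1  # skip single-character tokens
    )
    total = sum(counts.values())
    if total == 0:
        return []
    return [(tok, n / total) for tok, n in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = [
        "图书馆 空调 太冷 [图片]",
        "图书馆 空调 太冷",
        "座位 预约 系统 很 方便 http://example.com/x",
    ]
    cleaned = preprocess(sample)
    for token, weight in keyword_weights([whitespace_segment(m) for m in cleaned]):
        print(token, round(weight, 3))
```

The resulting pairs, converted to a dictionary, have the shape that the WordCloud library's `generate_from_frequencies` method accepts, so the same weights could drive the visualization step.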
Authors
王龙军
王晶
李光华
陈亮
WANG Long-jun; WANG Jing; LI Guang-hua; CHEN Liang (Chengdu Technological University, Chengdu 611730, Sichuan)
Source
《电脑与电信》
2024, No. 3, pp. 13-16 (4 pages)
Computer & Telecommunication
Funding
National Undergraduate Innovation and Entrepreneurship Training Program, 2022; project number: 202211116016.