Abstract
This paper uses a web crawler to collect every movie currently listed in Bilibili's movie section (about 1,000 titles), together with all rating records under each movie (about 650,000 in total). Each rating record contains the rating time and the user's ID. The collected data are analyzed with the Mann-Whitney rank-sum test from nonparametric statistics. The results show that the proportion of first-time raters in Bilibili's movie section has a significant effect on scores. Drawing on the way IMDb's Bayesian weighted rating algorithm counts only ratings from "old" (established) users, the paper offers suggestions for Bilibili's rating system so that scores can serve as a more objective and comprehensive reference for viewers.
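For readers unfamiliar with the Mann-Whitney rank-sum test named in the abstract, the following is a minimal Python sketch of how such a comparison could be run. It is not the authors' code. The function name `mann_whitney_u`, the two group variables, and all score values are hypothetical and invented purely for illustration; they say nothing about the direction or size of the effect reported in the paper. The sketch uses the normal approximation with tie correction and no continuity correction, which is one common variant of the test.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical average scores for two groups of movies, split by the share
# of first-time raters. These numbers are made up for the example.
group_a = [9.8, 9.7, 9.6, 9.9, 9.5, 9.7, 9.8, 9.4]  # higher share (assumed)
group_b = [8.9, 9.1, 8.7, 9.3, 8.8, 9.0, 9.2, 8.6]  # lower share (assumed)

u_stat, p_value = mann_whitney_u(group_a, group_b)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

A rank-based test of this kind makes no assumption that scores are normally distributed, which is presumably why a nonparametric method suits bounded, skewed rating data. In practice a library routine such as `scipy.stats.mannwhitneyu` would usually be preferred over a hand-written version.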
Authors
ZHANG Keqi (张可奇)
LI Qiumin (李秋敏)
Affiliation: not stated
Source
Journal of Science and Education (《科教文汇》)
2021, No. 6, pp. 63-65 (3 pages)