Abstract
This paper designs and implements, in Python, a data collection and cleaning system for the Artists tag under the Browse by Category section of the Douban website. The system crawls and cleans the data for all singers under that tag and for their songs. By crawling the Artists section of Douban Music, it analyzes the detailed music information published there, identifies currently popular music and musicians, and compiles per-musician statistics such as total song count and ratings. The crawled data is then cleaned. The authors consider the resulting dataset to have some commercial value.
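The abstract does not give the system's code or data schema. As a rough illustration of the clean-and-count step it describes, the following is a minimal sketch under stated assumptions: the record fields (`artist`, `title`, `rating`) and the function names `clean_records` and `summarize_by_artist` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def clean_records(raw_records):
    """Normalize and de-duplicate crawled song records.

    Assumes each record is a dict with hypothetical keys
    'artist', 'title', and 'rating' (a string or None).
    """
    seen = set()
    cleaned = []
    for rec in raw_records:
        artist = (rec.get("artist") or "").strip()
        title = (rec.get("title") or "").strip()
        if not artist or not title:
            continue  # drop rows missing a required field
        key = (artist.lower(), title.lower())
        if key in seen:
            continue  # drop case-insensitive duplicates
        seen.add(key)
        try:
            rating = float(rec.get("rating"))
        except (TypeError, ValueError):
            rating = None  # missing or unparseable rating
        cleaned.append({"artist": artist, "title": title, "rating": rating})
    return cleaned


def summarize_by_artist(cleaned):
    """Count songs and average the available ratings per artist."""
    song_counts = defaultdict(int)
    ratings = defaultdict(list)
    for rec in cleaned:
        song_counts[rec["artist"]] += 1
        if rec["rating"] is not None:
            ratings[rec["artist"]].append(rec["rating"])
    summary = {}
    for artist, count in song_counts.items():
        values = ratings[artist]
        mean = sum(values) / len(values) if values else None
        summary[artist] = {"song_count": count, "mean_rating": mean}
    return summary
```

De-duplicating on a lowercased (artist, title) pair is only one possible choice; a real crawler would more likely key on a site-specific identifier if one is available.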
Authors
欧丽粤
毛红霞
赵春
熊浩宇
李荟
Ou Liyue; Mao Hongxia; Zhao Chun; Xiong Haoyu; Li Hui (School of Computer Science, Jincheng College of Sichuan University, Chengdu, Sichuan 611731, China)
Source
《信息与电脑》
2019, No. 18, pp. 151-153 (3 pages)
Information & Computer