Abstract
With the rapid development of the information age, fragmented reading has become a new trend in learning. Given an understanding of HTML syntax, a web crawler written in Python can collect large amounts of information from the Internet, apply three filtering methods to search and match the data quickly, and store the results in a database. The raw information is then condensed a second time through a combination of machine algorithms and manual review, yielding concise, easy-to-understand content. This is an effective way to improve the efficiency of fragmented learning.
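The crawl, filter, and store steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not say which three filtering methods were used, so the keyword, regular-expression, and length filters here are hypothetical stand-ins. The `snippets` table, the sample HTML, and all function names are also assumptions. Fetching pages over the network and the second-pass condensing step are left out.

```python
import re
import sqlite3
from html.parser import HTMLParser

# Hypothetical page content standing in for a crawled document.
SAMPLE_HTML = """
<html><body>
<p>Python web crawlers fetch pages and parse HTML in 2019.</p>
<p>Short.</p>
<p>Cooking tips for busy weekday evenings and weekends.</p>
</body></html>
"""


class ParagraphExtractor(HTMLParser):
    """Collect the text content of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buf = []

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self.paragraphs.append("".join(self._buf).strip())
            self._in_p = False


# Three assumed filters; the source does not name the real ones.
def by_keyword(text, keyword):
    return keyword.lower() in text.lower()


def by_pattern(text, pattern):
    return re.search(pattern, text) is not None


def by_length(text, min_chars):
    return len(text) >= min_chars


def crawl_and_store(html, conn, keyword, pattern, min_chars):
    """Parse paragraphs, keep those passing all filters, store them."""
    parser = ParagraphExtractor()
    parser.feed(html)
    kept = [
        p for p in parser.paragraphs
        if by_keyword(p, keyword)
        and by_pattern(p, pattern)
        and by_length(p, min_chars)
    ]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snippets "
        "(id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO snippets (body) VALUES (?)", [(p,) for p in kept]
    )
    conn.commit()
    return kept
```

Calling `crawl_and_store(SAMPLE_HTML, sqlite3.connect(":memory:"), "crawler", r"\d{4}", 20)` keeps only the first paragraph, since it is the only one that contains the keyword, a four-digit number, and at least 20 characters.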
Authors
叶葳蕤
郁惠闻
陈杰
王昊天
王彦之
Ye Weirui; Yu Huiwen; Chen Jie; Wang Haotian; Wang Yanzhi (Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China)
Source
《信息与电脑》
2019, Issue 22, pp. 125-126, 130 (3 pages)
Information & Computer
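Funding
Shanghai Lixin University of Accounting and Finance Undergraduate Innovation and Entrepreneurship Training Program (Project No. 201911639112)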
Keywords
fragmented reading
web crawler
data screening
secondary processing