Filtering and Application of Aged Information Based on AdaBoost Algorithm
Abstract To address older people's need for elderly care information in the information society, this paper uses a Python-based web crawler to collect news and official documents from the website of the Ministry of Civil Affairs. The crawling process and the training set are optimized for the characteristics of news on portal sites, and the AdaBoost algorithm is trained on a given collection of texts to obtain a filtering model. Feature selection uses the χ² statistic, which effectively reduces the feature dimensionality; the trained model is then applied to the collected items to filter out elderly care information. Finally, the filtering results are analyzed. The experimental results show that the proposed method filters elderly care information effectively and, in application, can meet older people's need to obtain such information.
Authors Cheng Guangyang (程光洋), Lian Bin (廉彬)
Source Computer and Modernization (《计算机与现代化》), 2016, No. 12, pp. 102-106, 110 (6 pages)
Keywords Web crawler; AdaBoost; aged information; government press; information filtering
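The abstract names the χ² statistic as the feature selection criterion but gives no formula or code. The following is a minimal sketch of how χ² term scoring for a two-class text filter is commonly computed, not the authors' implementation. It assumes documents are already tokenized (Chinese text would first need word segmentation), that labels are 1 for elderly care items and 0 otherwise, and it uses the standard 2×2 contingency form χ² = N(AD − BC)² / ((A+B)(C+D)(A+C)(B+D)). The function names `chi2_scores` and `select_top_k` are hypothetical.

```python
from collections import Counter


def chi2_scores(docs, labels):
    """Return {term: chi-square score} for binary labels (1 = relevant).

    docs   : list of token lists (assumed already segmented)
    labels : list of 0/1 class labels, same length as docs
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        target = pos_df if label == 1 else neg_df
        target.update(set(tokens))  # count each term once per document

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a = pos_df[term]   # relevant docs containing the term
        b = neg_df[term]   # irrelevant docs containing the term
        c = n_pos - a      # relevant docs without the term
        d = n_neg - b      # irrelevant docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy, made-up data purely for illustration.
    docs = [
        ["pension", "elderly"],
        ["elderly", "care"],
        ["road", "budget"],
        ["budget", "meeting"],
    ]
    labels = [1, 1, 0, 0]
    scores = chi2_scores(docs, labels)
    print(select_top_k(scores, 2))
```

The selected terms would then define the reduced feature vectors passed to a boosted classifier such as AdaBoost; that training step is not shown here.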