Abstract
This paper designs and implements a web page health rating system, built on the lightweight Spring Boot framework, for judging whether the text content of Internet web pages is healthy. The software uses jsoup to crawl a page's content from its URL, detects sensitive words in that content and rates the page accordingly, and then applies the same crawling and detection to every link on the specified page. The front end visualizes the analysis results as ECharts charts, and administrators can review pages reported by users. The system can mobilize the public and encourage Internet users to cooperate with Internet surveillance authorities in cleaning up the online cultural environment.
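The detection and rating step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the word list, the counting rule, or the rating thresholds, so the class name `PageHealthRater`, the two sample words, and the cut-offs below are all hypothetical. The sketch uses only the Java standard library and takes the page text as a plain string; in a system like the one described, that text would presumably come from jsoup, for example via `Jsoup.connect(url).get().text()`.

```java
import java.util.List;
import java.util.Locale;

public class PageHealthRater {
    // Hypothetical word list; the paper does not publish its sensitive-word dictionary.
    static final List<String> SENSITIVE_WORDS = List.of("gamble", "violence");

    /** Counts non-overlapping, case-insensitive occurrences of each word in the text. */
    static int countHits(String text, List<String> words) {
        String lower = text.toLowerCase(Locale.ROOT);
        int hits = 0;
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // an empty word would match everywhere and never advance
            }
            String needle = word.toLowerCase(Locale.ROOT);
            int from = 0;
            while ((from = lower.indexOf(needle, from)) != -1) {
                hits++;
                from += needle.length();
            }
        }
        return hits;
    }

    /** Maps a hit count to a coarse rating; the thresholds are assumptions. */
    static String rate(int hits) {
        if (hits == 0) {
            return "HEALTHY";
        }
        if (hits <= 3) {
            return "SUSPICIOUS";
        }
        return "UNHEALTHY";
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a crawled page.
        String pageText = "Gamble now! No violence here, just gamble.";
        int hits = countHits(pageText, SENSITIVE_WORDS);
        System.out.println(hits + " hits -> " + rate(hits));
    }
}
```

Rating the linked pages would repeat the same two calls on the text of each page reachable from the start URL. A real deployment would likely need a larger dictionary and a score normalized by page length, neither of which the abstract specifies.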
Authors
程岚岚
田文涛
汪剑
CHENG Lan-lan; TIAN Wen-tao; WANG Jian (School of Computer Science & Information Engineering, Tianjin University of Science and Technology, Tianjin 300457, China)
Source
《电脑与信息技术》
2018, No. 2, pp. 45-47 (3 pages)
Computer and Information Technology
Funding
Tianjin University of Science and Technology Undergraduate Laboratory Innovation Fund (Project No. 1610A111)