Abstract
Necessary and effective content filtering of Web pages is important for maintaining a healthy and secure network environment. Reproducing the content of Web pages that users have successfully accessed allows network access to be supervised after the fact and provides data for improving the filtering mechanism. This paper analyzes the Web page access flow and, based on an HTTP proxy server, implements keyword-based filtering and semantic-based content filtering of Web pages at the application layer. Content recurrence is implemented by storing the Web pages that client machines have successfully accessed on the hard disk of the proxy server. Experiments show that semantic filtering distinguishes the different viewpoints expressed in a text reasonably well, and that its accuracy is markedly higher than that of keyword filtering alone.
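The keyword-filtering and page-storage steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no code, so the function names (`find_blocked_keywords`, `archive_page`, `handle_response`), the keyword list, and the hashed file-naming scheme are all assumptions made for illustration. The semantic filter is omitted because the abstract does not describe how it works.

```python
import hashlib
from pathlib import Path

# Hypothetical keyword list; the abstract does not give the list actually used.
BLOCKED_KEYWORDS = ["forbidden-term", "another-term"]


def find_blocked_keywords(text, keywords):
    """Return the blocked keywords that occur in text (case-insensitive)."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


def archive_page(url, body, archive_dir):
    """Store a delivered page on disk so it can be reviewed later."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: hash the URL to get a filesystem-safe file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = archive_dir / name
    path.write_text(body, encoding="utf-8")
    return path


def handle_response(url, body, keywords, archive_dir):
    """Decide whether the proxy should deliver a page; archive it if so.

    Returns (allowed, detail). When the page is allowed, detail is the
    archive path; otherwise it is the list of matched keywords.
    """
    hits = find_blocked_keywords(body, keywords)
    if hits:
        return False, hits
    return True, archive_page(url, body, archive_dir)
```

In this sketch only pages that pass the filter are archived, which matches the abstract's statement that successfully accessed pages are stored. A real proxy would also need to handle character encodings, binary content, and repeated visits to the same URL.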
Source
Computer Technology and Development (《计算机技术与发展》)
2007, No. 4, pp. 120-124 (5 pages)
Funding
Natural Science Research Program of the Henan Provincial Department of Education (2006520022)
Zhoukou Normal University Youth Fund (ZKNUQN200615)
Keywords
content filtering
proxy
content recurrence
semantic