Abstract
This paper presents the program framework of a full-text search system for a target website and describes its working principle and implementation. For building the full-text information database, a web page feature extraction technique is proposed based on the characteristics of HTML documents, which effectively reduces the amount of information that must be stored. Finally, application results are given.
Source
Journal of Donghua University (Natural Science) (《东华大学学报(自然科学版)》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2007, No. 5, pp. 639-643 (5 pages)
Keywords
feature extraction
website
full-text searching system
full-text information database
searching agent