Abstract
Uighur websites use different character encodings or different font systems, so a single search engine has difficulty covering Uighur web content across all of them. Based on the characteristics of Uighur web page retrieval, this paper proposes using a meta-search engine to collect web page information. Information filtering is applied to the dynamic information stream to extract the items that meet users' personalized needs. After duplicates are removed, these items are loaded into a collection database and compared against a sensitive-information database to identify web pages that publish specific information.
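The pipeline described in the abstract (collect via meta-search, filter, deduplicate, store, compare) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the substring-based filter, the SHA-1 exact-duplicate check, and the in-memory lists standing in for the collection and sensitive-information databases are all assumptions made for illustration. Unicode NFKC normalization is used only as a stand-in for reconciling differently encoded text; converting site-specific legacy encodings or fonts is outside this sketch.

```python
import hashlib
import unicodedata


def normalize(text):
    """Fold compatibility forms (e.g. Arabic presentation forms) and collapse whitespace.

    Assumption: input is already decoded to Unicode; legacy encodings are not handled.
    """
    return " ".join(unicodedata.normalize("NFKC", text).split())


def collect(result_lists):
    """Meta-search step: merge (url, text) results returned by several engines."""
    for results in result_lists:
        yield from results


def run_pipeline(result_lists, profile_terms, sensitive_terms):
    """Filter, deduplicate, store, and flag pages (hypothetical helper)."""
    seen_digests = set()   # digests of texts already stored
    stored = []            # stands in for the collection database
    flagged = []           # URLs whose text matches the sensitive-term list
    for url, raw_text in collect(result_lists):
        text = normalize(raw_text)
        # Information filtering: keep only items matching the user's profile.
        if not any(term in text for term in profile_terms):
            continue
        # Deduplication: skip items whose normalized text was already stored.
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in seen_digests:
            continue
        seen_digests.add(digest)
        stored.append((url, text))
        # Comparison with the sensitive-information list.
        if any(term in text for term in sensitive_terms):
            flagged.append(url)
    return stored, flagged


# Placeholder data; the URLs and terms are invented for the example.
engine_a = [("http://a.example/1", "alpha  news item"),
            ("http://a.example/2", "sports report")]
engine_b = [("http://b.example/9", "alpha news item"),
            ("http://b.example/3", "alpha beta item")]
stored, flagged = run_pipeline([engine_a, engine_b], ["alpha"], ["beta"])
```

With this placeholder data, the second engine's copy of the first item is dropped as a duplicate after whitespace normalization, and only the page containing the sensitive term is flagged. A real system would likely need near-duplicate detection and a more expressive user profile than substring matching.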
Source
Modern Computer (《现代计算机》)
2008, No. 10, pp. 40-42 (3 pages)
Keywords
Uighur (维吾尔文)
Information Retrieval (信息检索)
Meta-Search Engine (元搜索引擎)