Abstract
As an extensible markup language, XML has advantages that HTML cannot match. XML not only supports user-defined tags but can also express semantics, which makes it possible to improve the precision of retrieval on the Internet. This paper first analyzes why traditional search engines have low precision, then introduces XML and the current state of research on XML search engines, and discusses in detail the key techniques involved in XML search engines, such as document storage, indexing, and querying. On this basis, a model of an XML search engine for the current network environment is designed. The authors argue that the model can take full advantage of the DTD schema information of XML documents and can substantially improve query precision.
Source
《图书情报工作》
CSSCI
PKU Core Journal (北大核心)
2007, No. 1, pp. 114-117, 121 (5 pages)
Library and Information Service