Abstract
This paper presents TagGraph, a new data model for the Web, and gives a rigorous formal description of it. TagGraph focuses on tags, the key elements of HTML documents: it captures the tag relations of an HTML document precisely and, as a semi-structured data model, is well suited to describing semi-structured Web data. Query languages built on TagGraph can be used for Web queries, Web views, and resource discovery. Queries targeted at FORM tags can reach the content of dynamic HTML documents, which are generated on the fly by a CGI program on the server side. The paper also briefly introduces applications of TagGraph in Web query optimization, extraction of structural information from HTML documents, and view techniques, and discusses the relationship between TagGraph and other semi-structured data models such as OEM.
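As a rough illustration of the idea of describing an HTML document through its tag relations, the following Python sketch builds a small graph whose nodes are tag occurrences and whose edges record nesting. It is not the paper's formal definition of TagGraph, which the abstract does not give; the names `TagGraphBuilder`, `build_tag_graph` and `children_with_tag`, and the choice of parent-child nesting as the only edge type, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagGraphBuilder(HTMLParser):
    """Hypothetical builder: nodes are tag occurrences, edges are nesting."""

    def __init__(self):
        super().__init__()
        self.nodes = [{"tag": "#root", "attrs": {}, "text": ""}]
        self.edges = []      # (parent_id, child_id) pairs
        self._stack = [0]    # ids of the currently open tags

    def handle_starttag(self, tag, attrs):
        node_id = len(self.nodes)
        self.nodes.append({"tag": tag, "attrs": dict(attrs), "text": ""})
        self.edges.append((self._stack[-1], node_id))
        if tag not in VOID_TAGS:
            self._stack.append(node_id)

    def handle_endtag(self, tag):
        # Pop back to the nearest open tag with this name, if there is one.
        for depth in range(len(self._stack) - 1, 0, -1):
            if self.nodes[self._stack[depth]]["tag"] == tag:
                del self._stack[depth:]
                break

    def handle_data(self, data):
        # Attach non-blank text to the innermost open tag.
        if data.strip():
            self.nodes[self._stack[-1]]["text"] += data.strip()


def build_tag_graph(html):
    """Parse an HTML string and return (nodes, edges)."""
    builder = TagGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges


def children_with_tag(nodes, edges, parent_tag, child_tag):
    """Return nodes tagged child_tag that sit directly under a parent_tag node."""
    return [nodes[c] for p, c in edges
            if nodes[p]["tag"] == parent_tag and nodes[c]["tag"] == child_tag]
```

Under these assumptions, a query such as `children_with_tag(nodes, edges, "form", "input")` returns the input fields of every form on a page, which loosely mirrors the FORM-targeted queries mentioned in the abstract.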
Source
Chinese Journal of Computers (《计算机学报》)
EI
CSCD
PKU Core Journals (北大核心)
1999, No. 3, pp. 306-312 (7 pages)
Funding
National Natural Science Foundation of China