Abstract
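Hypertext is an unstructured form of document. Although it does not support cross-page queries or full-text retrieval, it is an important way of organizing and storing information on the Internet. This paper proposes an algorithm for converting hypertext into a structured database. It analyzes the requirements of structured hypertext conversion and uses graph theory to analyze and describe the conversion model and its implementation algorithm. The algorithm has been applied and verified in practice in the Lu Xun digital library system.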
Hypertext is a kind of unstructured document, which makes content-based and topic-based search over hypertext documents impossible. However, hypertext is one of the most important ways of storing and organizing information on the Internet. To enable effective management and search of hypertext documents, a new and practical method named HtoDB for converting unstructured hypertext into a database is presented. The requirements and functions of the conversion are analyzed, and a conversion model and algorithm are put forward based on graph theory. The model and algorithm have been verified in the “LU XUN digital library system” project.
Source
Journal of Software (《软件学报》), 2001, No. 2, pp. 167-172 (6 pages)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
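Funding
National High Technology Research and Development Program of China (863 Program), project No. 863-317-01-04-99
Fund of the State Key Laboratory for Manufacturing Systems Engineering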
Xi'an Jiaotong University scientific research…