Abstract
The approach builds on the W3C Document Object Model (DOM) and metadata. The information to be extracted is represented as path expressions over the DOM hierarchy. The path expressions for the required information are obtained by inductive learning and are then used to extract that information. Metadata plays a key role in the extraction process: it is expressed as an XML DTD and can be supplied either by the information service provider or by the developer. This lets the method adapt to information sources that change constantly, and to the ever-increasing scale and diversity of information and applications on the Internet.
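As a rough illustration of the idea of describing target information by its path in the DOM hierarchy and inducing that path from examples, the following minimal sketch uses Python's standard `xml.dom.minidom`. It is not the paper's algorithm. The sample document, the slash-separated path notation, and the helper names `path_of`, `learn_path`, and `extract` are all assumptions made for illustration, and the DTD metadata is left out.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = """<catalog>
  <book><title>DOM Basics</title><price>30</price></book>
  <book><title>XML in Practice</title><price>45</price></book>
</catalog>"""


def path_of(element):
    """Return the tag path from the root to an element, e.g. 'catalog/book/title'."""
    parts = []
    node = element
    # Walk up until the Document node, which is not an element, is reached.
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        parts.append(node.tagName)
        node = node.parentNode
    return "/".join(reversed(parts))


def text_of(element):
    """Concatenate the direct text children of an element."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    ).strip()


def learn_path(doc, examples):
    """Induce one path expression from labelled example strings (a simplification)."""
    paths = {
        path_of(el)
        for el in doc.getElementsByTagName("*")
        if text_of(el) in examples
    }
    if len(paths) != 1:
        raise ValueError("examples do not share a single path: %r" % sorted(paths))
    return paths.pop()


def extract(doc, path):
    """Return the text of every element whose tag path equals the learned path."""
    return [
        text_of(el)
        for el in doc.getElementsByTagName("*")
        if path_of(el) == path
    ]
```

Given one labelled example such as `"DOM Basics"`, `learn_path` yields the path shared by all titles, and `extract` then returns every title in the document. A fuller treatment would generalize across differing example paths and consult the DTD to constrain which paths are legal.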
Source
Computer and Modernization (《计算机与现代化》)
2003, No. 10, pp. 81-82, 94 (3 pages)