Abstract
The rapid growth of the World Wide Web has made it an increasingly important source of useful data. This paper presents an implementation of Web-based information extraction that uses rule patterns to extract information from semi-structured web pages repeatedly and automatically.
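The idea of applying a rule pattern repeatedly to a semi-structured page can be illustrated with a minimal sketch. This is not the paper's implementation, which the abstract does not describe in detail. It assumes that a rule is a regular expression with named groups, and the names `BOOK_RULE`, `extract`, and `SAMPLE_PAGE`, as well as the sample table markup, are hypothetical.

```python
import re
from html import unescape
from typing import Dict, List

# Hypothetical rule: a regular expression whose named groups are the fields
# to extract from one repeating record (here, one table row).
BOOK_RULE = re.compile(
    r'<tr>\s*<td class="title">(?P<title>.*?)</td>\s*'
    r'<td class="price">(?P<price>.*?)</td>\s*</tr>',
    re.DOTALL,
)


def extract(page: str, rule: "re.Pattern[str]") -> List[Dict[str, str]]:
    """Apply one rule pattern across the page; return one dict per match."""
    return [
        {name: unescape(value.strip()) for name, value in m.groupdict().items()}
        for m in rule.finditer(page)
    ]


# Hypothetical semi-structured page: every record shares the same layout.
SAMPLE_PAGE = """
<table>
  <tr><td class="title">Data Mining</td><td class="price">35.00</td></tr>
  <tr><td class="title">Web &amp; XML</td><td class="price">28.50</td></tr>
</table>
"""

records = extract(SAMPLE_PAGE, BOOK_RULE)
for record in records:
    print(record["title"], record["price"])
```

Because the rule is kept separate from the extraction function, the same function could in principle be reused on other pages by supplying a different pattern. A regular expression is only one possible way to encode such a rule, and it is fragile when the page markup changes.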
Source
Journal of Luoyang Technology College (《洛阳工业高等专科学校学报》)
2005, No. 3, pp. 30-31 (2 pages)
Keywords
Information extraction
Rule
Pattern