Abstract
This paper investigates the content and characteristics of open knowledge resources and, on that basis, studies the requirements for collecting them. Using seed data sources selected by experts as an empirical case, the authors summarize and analyze the characteristics of the different sources. The study arrives at three strategies for automatic metadata collection: harvesting via the standard OAI metadata harvesting protocol, extracting from ordinary dynamic web pages, and parsing RSS feed interfaces.
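As a rough illustration of two of the three strategies named in the abstract (OAI harvesting and RSS parsing), the following is a minimal Python sketch using only the standard library. It is not the authors' implementation. The endpoint `http://example.org/oai`, the sample XML documents, and the function names are all assumptions made for illustration; only the `ListRecords` verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH and Dublin Core specifications. The web page extraction strategy is site-specific and is not sketched here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Dublin Core element namespace, as used in oai_dc records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical OAI-PMH ListRecords response, trimmed to one record.
SAMPLE_OAI = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

# Hypothetical RSS 2.0 feed with a single item.
SAMPLE_RSS = """
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Sample item</title>
      <link>http://example.org/item/1</link>
    </item>
  </channel>
</rss>
""".strip()


def build_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_oai_titles(xml_text):
    """Return every Dublin Core title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//dc:title", NS)]


def parse_rss_items(xml_text):
    """Return the title and link of each item in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iterfind("./channel/item")
    ]


if __name__ == "__main__":
    print(build_list_records_url("http://example.org/oai"))
    print(parse_oai_titles(SAMPLE_OAI))
    print(parse_rss_items(SAMPLE_RSS))
```

A real harvester would fetch the response over HTTP and, for OAI-PMH, follow `resumptionToken` values to page through large result sets; both are omitted to keep the sketch short.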
Source
Research on Library Science (《图书馆学研究》)
CSSCI
Peking University Core Journal
2013, No. 12, pp. 47-51 (5 pages)
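Funding
One of the research outputs of the Chinese Academy of Sciences special project on literature and information capacity, "Open Knowledge Resource Registration System (Phase II)".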
Keywords
open knowledge resources
metadata
collecting strategy
web page extraction
OAI protocol
RSS feeds