Abstract
This paper briefly reviews the current state of construction of the NSTL Database of International Science Citation (DISC) and discusses why automatic splitting of journal-type citation data is both necessary and feasible. After studying the description conventions of journal-type citations in depth, we propose a classification-based approach: citation data are first divided into different types, such as journals and books, and each type is then split separately. We design the workflow and technical framework for automatic splitting and implement a tool that classifies and splits citation data automatically. Batch splitting applied in the agricultural domain shows that the tool strengthens the automated processing of large-scale data and improves the overall quality and timeliness of the data in DISC.
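The classify-then-split idea can be illustrated with a minimal sketch. It is not the tool described in the paper. The pattern `JOURNAL_RE`, the functions `classify` and `split_journal`, the keyword heuristic for books, and the sample citation strings are all hypothetical and assume a single citation style of the form "Authors. Title. Journal, Year, Volume(Issue): Pages."

```python
import re

# Hypothetical pattern for ONE assumed journal citation style:
#   Authors. Title. Journal, Year, Volume(Issue): Start-End.
# Real citation data would need many such patterns, one per style.
JOURNAL_RE = re.compile(
    r"^(?P<authors>[^.]+)\.\s+"
    r"(?P<title>[^.]+)\.\s+"
    r"(?P<journal>[^,]+),\s*"
    r"(?P<year>\d{4}),\s*"
    r"(?P<volume>\d+)(?:\((?P<issue>\d+)\))?:\s*"
    r"(?P<pages>\d+(?:-\d+)?)\.?$"
)

# Assumed keyword heuristic for recognising book citations.
BOOK_HINT_RE = re.compile(r"\b(Press|Publisher|Publishing)\b")


def classify(citation):
    """Step 1: assign a citation string to a coarse type."""
    text = citation.strip()
    if JOURNAL_RE.match(text):
        return "journal"
    if BOOK_HINT_RE.search(text):
        return "book"
    return "other"


def split_journal(citation):
    """Step 2: split a journal citation into fields, or return None."""
    match = JOURNAL_RE.match(citation.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Authors are assumed to be comma-separated in this style.
    fields["authors"] = [a.strip() for a in fields["authors"].split(",")]
    return fields


if __name__ == "__main__":
    # Made-up sample citations, for illustration only.
    samples = [
        "Smith J, Doe A. Crop yield under drought stress. "
        "Field Crops Research, 2005, 12(3): 45-67.",
        "Jones B. Soil Science. Oxford University Press, 1999.",
    ]
    for s in samples:
        kind = classify(s)
        print(kind, split_journal(s) if kind == "journal" else s)
```

Classifying first keeps each splitter simple: a string reaches the journal splitter only if it already fits a journal pattern, and other types can be routed to their own splitters or to manual review.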
Source
Digital Library Forum (《数字图书馆论坛》)
2010, No. 10, pp. 91-95 (5 pages)