Abstract
[Objective] To apply an open source format identification tool in a Digital Preservation System (DPS) to obtain format information for complex objects. [Context] Building on existing open source tools, an appropriate tool must be selected for secondary development and integration so that the DPS meets its practical requirements for efficiency and effectiveness. [Methods] Two commonly used existing tools are analyzed and compared, and DROID is selected as the format identification tool for the DPS. To meet the efficiency requirements of the DPS, an approach based on DROID batch format identification is proposed, and DROID is encapsulated accordingly. [Results] DROID is encapsulated as the "DPS batch format processing module" and applied in practice to format identification and technical metadata extraction in the DPS. [Conclusions] DROID is an excellent open source tool, and its automatic batch processing largely meets the format processing requirements of the DPS.
Source
《现代图书情报技术》
CSSCI
2015, No. 1, pp. 75-81 (7 pages)
New Technology of Library and Information Service
Keywords
Format identification
Long-term preservation
Complex object