Abstract
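The rapid development of modern information technology has made computer recognition of cited books in ancient Chinese texts feasible. This study examines the citing patterns of cited books in ancient texts and then explores how pattern recognition methods can be applied to cited-book recognition. Using the produce sections of Guangdong local gazetteers from the Ming, Qing, and Republican periods (《广东方志物产》) as the corpus, the authors extract the citing patterns of all cited books (the book titles and the ways they are expressed) and study the title patterns and the citing expression patterns separately.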
The rapid development of modern information technology makes it possible to recognize cited books by computer. This paper explores citing patterns and the application of pattern recognition to cited-book recognition. Taking Local Chronicle of Guangdong: Produce as an example, the authors extract all citing patterns from the documents and survey them in two respects: the ways titles are cited and the citing expression patterns. Title citations are classified as document name, author name, or author name + document name. Citing expression patterns are classified as pre-symbol, post-symbol, or seal. Finally, the authors randomly select 12 documents to test the citing expression patterns, obtaining a recall of 84.95% and a precision of 72.88%.
Source
《图书情报工作》
CSSCI
Peking University Core Journal
2009, No. 15, pp. 142-145, 141 (5 pages in total)
Library and Information Service
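Funding
Key Project of the National Social Science Fund of China, "Research on Intelligent Technologies for the Collation and Development of Cultural Classics" (Project No. 08ATQ002)
One of the research outputs of the 2008 Foshan Social Science Planning Project, "Research on Digital Collation Models for Foshan Local Gazetteers" (Project No. W-15)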
Keywords
citing pattern
cited book recognition
citing expression pattern
pattern recognition