An Approach to Auto-detection, Segmentation and Tagging of Bibliographic Metadata (cited by: 2)
Abstract: After reviewing existing methods for citation metadata extraction, the authors propose a new approach that relies on a common typesetting convention of bibliographies: citations within the same document follow a consistent style. The paper focuses on the automatic detection and segmentation of citation metadata, which previous research has rarely addressed, and explores how style consistency can be applied to citation metadata tagging. Experimental results show that the method performs well in citation metadata detection, segmentation, and tagging.
Source: Acta Scientiarum Naturalium Universitatis Pekinensis (indexed in EI, CAS, CSCD, and PKU Core), 2010, No. 6, pp. 893-900 (8 pages)
Funding: Supported by the National Key Technology R&D Program of China (2006BAH02A21)
Keywords: bibliographic metadata; style consistency; metadata extraction; digital library
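
The style-consistency idea in the abstract can be illustrated with a small sketch. This record does not give the authors' algorithm, so the code below is not their method; it only assumes that every reference in a document starts with the same kind of marker, and uses the most frequent marker style to decide where one reference ends and the next begins. The marker styles and the function names `dominant_style` and `segment_references` are hypothetical, chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical marker styles; the paper's actual features are not given in this record.
MARKER_STYLES = {
    "bracketed": re.compile(r"^\[\d+\]\s*"),      # e.g. "[1] ..."
    "dotted": re.compile(r"^\d+\.\s+"),           # e.g. "1. ..."
    "parenthesized": re.compile(r"^\(\d+\)\s*"),  # e.g. "(1) ..."
}


def dominant_style(lines):
    """Return the name of the marker style that starts the most lines, or None."""
    counts = Counter()
    for line in lines:
        for name, pattern in MARKER_STYLES.items():
            if pattern.match(line.strip()):
                counts[name] += 1
                break
    return counts.most_common(1)[0][0] if counts else None


def segment_references(lines):
    """Split lines into references, starting a new one at each dominant-style marker."""
    style = dominant_style(lines)
    if style is None:
        return []
    pattern = MARKER_STYLES[style]
    references = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if pattern.match(line):
            # A line carrying the dominant marker opens a new reference.
            references.append(pattern.sub("", line, count=1))
        elif references:
            # Any other line is treated as a wrapped continuation.
            references[-1] += " " + line
    return references
```

In this sketch a wrapped line such as `2008. Proc X.` looks like a "dotted" marker in isolation, but because the bracketed style dominates the document it is attached to the preceding reference instead of starting a new one.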

References (15)

  • 1. Zhang Ming, Yin Ping, Deng Zhihong, Yang Dongqing. SVM+BiHMM: a hybrid statistical model for metadata extraction [J]. Journal of Software, 2008, 19(2): 358-368. Cited by: 27
  • 2. Jiang Xin. Several major citation styles in British and American academic literature [J]. Library and Information, 2003(3): 26-30. Cited by: 8
  • 3. Li Chaoguang, Zhang Ming, Deng Zhihong, Yang Dongqing, Tang Shiwei. Automatic extraction of paper metadata [J]. Computer Engineering and Applications, 2002, 38(21): 189-191. Cited by: 38
  • 4. Wei W, King I, Lee J H M. Bibliographic attributes extraction with layer-upon-layer tagging. Proc ICDAR'07. 2007
  • 5. Besagni D, Belaid A, Benet N. A segmentation method for bibliographic references by contextual tagging of fields. Proc ICDAR'03. 2003
  • 6. Day M Y, Tsai R T H, Sung C L, et al. Reference metadata extraction using a hierarchical knowledge representation framework. Decision Support Systems. 2007
  • 7. Eli C, Altigran S, Silva D, et al. FLUX-CIM: flexible unsupervised extraction of citation metadata. Proc JCDL'07. 2007
  • 8. Huang A, Ho J M, Kao H Y, et al. Extracting citation metadata from online publication lists using BLAST. Proc PAKDD'04. 2004
  • 9. Chen C C, Yang K H, Kao H Y, et al. BibPro: a citation parser based on sequence alignment techniques. Proc AINA'08. 2008
  • 10. Jewell M. ParaCite: an overview. http://paracite.eprints.org/. 2009
