
The Study of OCR Proofreading Method of Patent Document (专利文献OCR校对方法研究)

Cited by: 2
Abstract: Codification of patent documents is important for paperless patent examination, patent analysis, patent search, and patent management. This paper presents an OCR proofreading framework for patent documents and a spelling proofreading method for the Chinese text produced by patent document OCR. Both are based on an OCR proofreading dictionary and the technical-field features of patent documents, and both use Chinese word segmentation and a hidden Markov model (HMM). The method reduces labor costs and improves the efficiency and quality of patent document codification. Finally, the experimental system and its results are presented.
Source: Journal of Intelligence (《情报杂志》), CSSCI, Peking University Core Journal, 2011, No. 3, pp. 182-184, 190 (4 pages)
Keywords: OCR proofing; patent document; HMM model; proofreading dictionary
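The abstract names a hidden Markov model as the core of the spelling proofreading step but gives no implementation detail. As a rough illustration of the general idea only, the sketch below treats the true characters as hidden states and the OCR output as observations, then decodes the most likely character sequence with the Viterbi algorithm. Everything in it is an assumption rather than the paper's method: the confusion sets, the bigram probabilities, the smoothing floor, and the function names are hypothetical toy values, and the sketch works at character level without the word segmentation and proofreading dictionary described in the abstract.

```python
import math

# Hypothetical confusion sets: observed OCR character ->
# {candidate true character: P(observed | true)}. Toy values only.
CONFUSION = {
    "己": {"己": 0.6, "已": 0.4},
}

# Hypothetical character bigram probabilities P(cur | prev). Toy values only.
BIGRAM = {
    ("利", "已"): 0.05,
    ("利", "己"): 0.001,
    ("已", "公"): 0.05,
    ("己", "公"): 0.001,
    ("自", "己"): 0.05,
}

FLOOR = 1e-4  # assumed smoothing value for bigrams missing from the toy table


def candidates(ch):
    """Return candidate true characters for an observed character."""
    return CONFUSION.get(ch, {ch: 1.0})


def transition(prev, cur):
    """Return P(cur | prev), falling back to the smoothing floor."""
    return BIGRAM.get((prev, cur), FLOOR)


def viterbi_correct(observed):
    """Decode the most likely true character sequence for an OCR string."""
    if not observed:
        return ""
    # Each trellis column maps candidate -> (log score, best previous candidate).
    # Initial state probabilities are assumed uniform and therefore omitted.
    trellis = [{c: (math.log(p), None) for c, p in candidates(observed[0]).items()}]
    for ch in observed[1:]:
        column = {}
        for cand, emit in candidates(ch).items():
            best_prev, best_score = max(
                ((prev, score + math.log(transition(prev, cand)))
                 for prev, (score, _) in trellis[-1].items()),
                key=lambda item: item[1],
            )
            column[cand] = (best_score + math.log(emit), best_prev)
        trellis.append(column)
    # Backtrack from the best final candidate.
    last = max(trellis[-1], key=lambda c: trellis[-1][c][0])
    path = [last]
    for i in range(len(trellis) - 1, 0, -1):
        last = trellis[i][last][1]
        path.append(last)
    return "".join(reversed(path))


print(viterbi_correct("专利己公开"))  # 专利已公开
```

With these toy numbers the misrecognized 己 in 专利己公开 is corrected to 已 because the surrounding bigrams favor it, while 自己 is left unchanged. A realistic system would estimate the confusion sets and language model from corpora in the relevant technical field.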
