
Research on Mathematical Formulas Extraction from Chinese Document Based on Statistical Features (cited by: 3)
Abstract: Based on an analysis of formula features, this paper proposes a formula extraction method that combines Parzen windows with the Bayes classification rule. Isolated formulas are extracted from the printed document with an improved Parzen window method, and embedded formulas are extracted from text lines with the Bayes classification rule. Experiments show that the combined method adapts well to Chinese documents and achieves a high extraction success rate.
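Source: Computer Engineering (《计算机工程》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2006, No. 19, pp. 211-213 (3 pages)
Funding: Natural Science Foundation of Hebei Province (Grant No. F2004000132)
Keywords: OCR technique; mathematical formulas extraction; Bayes theorem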
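
The abstract names two building blocks, Parzen window density estimation and a Bayes decision rule, but this record does not give the features, kernel, window width, or the paper's "improved" Parzen variant. The following is only a minimal sketch of how the two pieces can fit together in general: a plain Gaussian-kernel Parzen estimate supplies class-conditional densities, and the Bayes rule picks the class with the larger prior-weighted density. The one-dimensional feature, the sample values, the width `h`, and the function names `parzen_density` and `classify_line` are all assumptions made for illustration.

```python
import math


def parzen_density(x, samples, h):
    """Parzen-window estimate of p(x) with a Gaussian kernel of width h.

    This is the textbook estimator, not the improved variant the paper uses.
    """
    if not samples:
        raise ValueError("need at least one training sample")
    if h <= 0:
        raise ValueError("window width h must be positive")
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)


def classify_line(x, text_samples, formula_samples, h=0.1, prior_formula=0.5):
    """Bayes decision rule: choose the class with the larger prior * likelihood."""
    score_formula = prior_formula * parzen_density(x, formula_samples, h)
    score_text = (1.0 - prior_formula) * parzen_density(x, text_samples, h)
    return "formula" if score_formula > score_text else "text"


# Hypothetical training data: one toy feature per line, for example the
# fraction of the line width covered by ink. These values are invented.
text_lines = [0.90, 0.95, 0.88, 0.92, 0.97]
formula_lines = [0.30, 0.45, 0.38, 0.50, 0.42]

print(classify_line(0.40, text_lines, formula_lines))  # near the formula samples
print(classify_line(0.93, text_lines, formula_lines))  # near the text samples
```

In a real system the feature would be multi-dimensional and the priors would be estimated from labeled pages; the window width `h` in particular controls how smooth the density estimate is and would normally be tuned on held-out data.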


