Text Extraction Based on the Maximum-Minimum Similarity Training Method (基于最大-最小相似度学习方法的文本提取)
Abstract: This paper applies the maximum-minimum similarity (MMS) training method to optimize the parameters of a text region extraction method based on Gaussian mixture modeling of neighboring characters. MMS training seeks the best classification ability by maximizing the similarities of positive samples and minimizing the similarities of negative samples. Following this discriminative training idea, an objective function is defined for text extraction, and steepest gradient descent is used to search for its minimum, which yields the optimal parameter set for the extraction method. Text region extraction experiments show that applying MMS training after the expectation maximization (EM) algorithm has produced maximum likelihood estimates of the parameters clearly improves overall extraction performance: in the open test, recall reaches 98.55% and precision reaches 93.56%. In these experiments, MMS training also outperforms a commonly used discriminative training method, minimum classification error (MCE) training.
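The training idea summarized in the abstract can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration, not the paper's formulation: the abstract does not give the objective function, so the code assumes one plausible form (squared shortfall from 1 for positive samples plus squared similarity for negative samples); a single one-dimensional Gaussian-shaped similarity stands in for the Gaussian mixture model; the sample mean and standard deviation of the positives stand in for the EM maximum likelihood estimate; and gradients are approximated by finite differences. The names `similarity`, `mms_objective`, and `train_mms` are hypothetical.

```python
import math

def similarity(x, mean, log_sigma):
    """Gaussian-shaped similarity in (0, 1]; equals 1 when x is at the mean."""
    sigma = math.exp(log_sigma)  # optimize log(sigma) so sigma stays positive
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def mms_objective(params, positives, negatives):
    """Assumed objective: push positive similarities to 1, negative ones to 0."""
    mean, log_sigma = params
    pos_term = sum((1.0 - similarity(x, mean, log_sigma)) ** 2 for x in positives)
    neg_term = sum(similarity(x, mean, log_sigma) ** 2 for x in negatives)
    return pos_term + neg_term

def numerical_gradient(f, params, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at params."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def train_mms(params, positives, negatives, lr=0.01, steps=2000):
    """Plain gradient descent on the assumed objective."""
    def objective(p):
        return mms_objective(p, positives, negatives)
    params = list(params)
    for _ in range(steps):
        grad = numerical_gradient(objective, params)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

# Toy one-dimensional features (made up for illustration).
positives = [0.8, 0.9, 1.0, 1.1, 1.2]
negatives = [2.0, 2.2, 2.5, 3.0]

# Stand-in for the EM / maximum likelihood starting point.
mean0 = sum(positives) / len(positives)
std0 = math.sqrt(sum((x - mean0) ** 2 for x in positives) / len(positives))
initial = [mean0, math.log(std0)]

trained = train_mms(initial, positives, negatives)
initial_loss = mms_objective(initial, positives, negatives)
trained_loss = mms_objective(trained, positives, negatives)
print(f"objective before: {initial_loss:.4f}, after: {trained_loss:.4f}")
```

In this toy setting the likelihood-based starting point fits the positives without regard to the negatives, and the discriminative step then adjusts the same parameters using both sample sets, which mirrors the two-stage procedure the abstract describes.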
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2008, No. 3, pp. 621-629 (9 pages)
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60473049; the National Basic Research Program of China (973 Program) under Grant No. 2006CB303105; and the Excellent Young Scholars Research Fund of Beijing Institute of Technology of China under Grant No. 2006Y1202.
Keywords: text extraction; Gaussian mixture modeling; discriminative training; maximum-minimum similarity training; minimum classification error training