A Method of Handwriting Texts and Shapes Separation (一种手写图文分离方法)

Cited by: 1


Abstract: As one of the technologies for improving human-computer interaction, handwriting recognition is becoming increasingly important, and a large body of research on handwritten text and hand-drawn shapes has emerged. However, distinguishing shapes from text, an important part of handwriting recognition, has not received enough attention. In this paper, we design and implement a handwritten text and shape separation method based on Weka, an open-source data mining tool. Test results for three classifiers, LogitBoost, Random Forest and LADTree, show that LogitBoost gives the best overall classification performance. Combining the three classifiers allows shapes to be identified precisely, but the classification performance for text is then limited by the worst-performing classifier. In addition, the influence of different features on text-shape classification is analyzed based on information gain evaluation results.
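The information-gain evaluation mentioned in the abstract can be illustrated with a minimal sketch. The paper reports using Weka; the Python below is not Weka's implementation and not the authors' code. It only shows the underlying quantity for a discrete feature: the entropy of the class labels minus their expected entropy after splitting on that feature. The feature names (`stroke_length`, `pen_color`) and the toy data are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy strokes: two labelled "text", two labelled "shape".
labels = ["text", "text", "shape", "shape"]
stroke_length = ["short", "short", "long", "long"]  # separates the classes perfectly
pen_color = ["black", "blue", "black", "blue"]      # carries no class information

gain_length = information_gain(stroke_length, labels)
gain_color = information_gain(pen_color, labels)
```

In this toy data the first feature removes all uncertainty about the class and the second removes none, which is the kind of contrast a feature ranking by information gain is meant to expose.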

Authors: Hu Xinghong (胡兴鸿), Shi Dapeng (施大鹏), Feng Guihuan (冯桂焕)

Affiliations: State Key Laboratory for Novel Software Technology; Software Institute, Nanjing University

Source: Computer and Modernization (《计算机与现代化》), 2013, No. 12, pp. 145-148, 154 (5 pages)

Funding: National Natural Science Foundation of China (61100109)

Keywords: handwriting recognition (sketch recognition); data mining; text-shape separation; classification model

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]



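Citing literature (1)

1. Chen Quan, Shi Dapeng, Feng Guihuan, Zhao Xiaoyan, Luo Bin. Online hand-drawn flowchart recognition based on a grammar description language [J]. Computer Science (计算机科学), 2015, 42(B11): 113-118.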

