Word Segmentation in Offline Handwritten Uyghur Text Images  Cited by: 2

Word extraction from Uyghur handwritten documents
Abstract To address word segmentation in text-line images of offline handwritten Uyghur, this paper proposes a clustering algorithm that fuses FCM with K-means. The algorithm sorts the distances in a text line into two classes: within-word distances and between-word distances. Based on the clustering result, text regions are merged to obtain the segmentation points, and the text between segmentation points is then labeled by connected-component analysis and colored. The experiments use 50 offline handwritten Uyghur text images written by different people, comprising 536 lines and 4,002 words; the correct segmentation rate reaches 80.68%. The experimental results show that the method resolves the segmentation difficulty caused by irregular distances between words in handwritten Uyghur, as well as some cases of overlap between adjacent words. It also allows a large handwritten text image to be processed as a whole.
Authors AYSADET·Abliz; HOJAHMAT·Ismayil; KAMIL·Muyidin; ASKAR·Hamdulla (Institute of Information Science and Engineering, Xinjiang University, Urumqi 830046, China)
Source Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 9, pp. 133-138 (6 pages)
Funding National Natural Science Foundation of China (No. 61462080)
Keywords Uyghur; handwritten text image; word extraction; clustering; coloring
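
The abstract describes clustering the distances in a text line into within-word and between-word classes, but it does not say how FCM and K-means are fused. The sketch below is one plausible reading, not the authors' implementation: fuzzy c-means supplies initial centers for a hard K-means refinement on 1-D gap widths. The function names (`fcm_1d`, `kmeans_1d`, `split_points`), the fuzzifier `m=2.0`, and the idea that gaps are measured as pixel widths between adjacent connected components are all assumptions made for illustration.

```python
def fcm_1d(values, c=2, m=2.0, iters=100, tol=1e-6):
    """Fuzzy c-means on 1-D data; returns the cluster centers, sorted."""
    lo, hi = min(values), max(values)
    # Deterministic initialization: spread centers evenly over the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    n = len(values)
    for _ in range(iters):
        # Membership of each value in each cluster, from relative distances.
        u = []
        for x in values:
            d = [max(abs(x - ck), 1e-12) for ck in centers]
            u.append([
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for k in range(c)
            ])
        # Update each center as the membership-weighted mean.
        new = [
            sum((u[i][k] ** m) * values[i] for i in range(n))
            / sum(u[i][k] ** m for i in range(n))
            for k in range(c)
        ]
        shift = max(abs(a - b) for a, b in zip(new, centers))
        centers = new
        if shift < tol:
            break
    return sorted(centers)


def kmeans_1d(values, centers, iters=100):
    """Hard K-means starting from the given centers; returns (centers, labels)."""
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            for x in values
        ]
        new = []
        for k in range(len(centers)):
            members = [x for x, lab in zip(values, labels) if lab == k]
            # Keep the old center if a cluster ends up empty.
            new.append(sum(members) / len(members) if members else centers[k])
        if new == centers:
            break
        centers = new
    return centers, labels


def split_points(gaps):
    """Indices of gaps classified as between-word (the wider cluster)."""
    if len(set(gaps)) < 2:
        return []  # all gaps equal: nothing to separate
    centers, labels = kmeans_1d(gaps, fcm_1d(gaps))
    wide = max(range(len(centers)), key=lambda k: centers[k])
    return [i for i, lab in enumerate(labels) if lab == wide]
```

For a hypothetical line with gap widths `[2, 3, 14, 2, 4, 16, 3, 15]`, `split_points` returns the indices of the three wide gaps, which would then serve as candidate segmentation points before the merging and coloring steps the abstract mentions.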
