IBidi: a bidirectional algorithm oriented to information processing (cited by: 2)
Abstract: To address the problems that bidirectional text poses in multilingual information processing, this paper proposes IBidi, a self-describing bidirectional text processing algorithm oriented to information processing. IBidi first preprocesses the character stream, mainly tagging special characters such as digits. It then analyzes the stream and adds predefined tags that describe the properties of the characters, for use by information processing systems. Finally, a reordering algorithm produces the output. On a typical test set, IBidi reaches an accuracy of 96.7%, about 17 percentage points higher than the Unicode Bidirectional Algorithm (80%). On a random-sample test, IBidi's accuracy is also about 5% higher than that of the Unicode Bidirectional Algorithm.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 6, pp. 1513-1517 (5 pages)
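Funding: National Natural Science Foundation of China (60673041); Jiangsu Province High-Tech Research Project (BG2005020); Natural Science Foundation of Jiangsu Province (BK2003030)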
Keywords: bidirectional algorithm; IBidi; tag; bidirectional text
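The abstract names three stages (preprocessing that tags digits, tagging of the character stream, and reordering) but gives neither IBidi's tag set nor its rules. The Python sketch below only illustrates that three-stage shape on a toy scale and is not the authors' algorithm. The function names, the `NUM`/`LTR`/`RTL`/`NEUTRAL` labels, the rule that neutral characters and digits inherit the direction of the preceding strong character, and the fixed left-to-right base direction are all assumptions made for illustration.

```python
import unicodedata
from itertools import groupby

RTL_CLASSES = {"R", "AL"}      # strong right-to-left Unicode bidi classes
DIGIT_CLASSES = {"EN", "AN"}   # European and Arabic numbers


def preprocess(text):
    """Stage 1 (assumed): label every character, marking digits as NUM."""
    labelled = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in DIGIT_CLASSES:
            labelled.append((ch, "NUM"))
        elif cls in RTL_CLASSES:
            labelled.append((ch, "RTL"))
        elif cls == "L":
            labelled.append((ch, "LTR"))
        else:
            labelled.append((ch, "NEUTRAL"))
    return labelled


def tag_runs(labelled, base="LTR"):
    """Stage 2 (assumed): merge characters into runs tagged with a direction.

    Neutral characters and digits inherit the direction of the most recent
    strong character, or the base direction if there is none yet.
    """
    runs = []
    current = base
    for ch, label in labelled:
        if label in ("LTR", "RTL"):
            current = label
        is_num = label == "NUM"
        if runs and runs[-1]["dir"] == current and runs[-1]["num"] == is_num:
            runs[-1]["text"] += ch
        else:
            runs.append({"dir": current, "num": is_num, "text": ch})
    return runs


def reorder(runs):
    """Stage 3 (assumed): visual order for a left-to-right base direction.

    Consecutive RTL runs are emitted in reverse order; characters inside a
    non-numeric RTL run are reversed, while digit runs keep their order.
    """
    out = []
    for direction, group in groupby(runs, key=lambda r: r["dir"]):
        group = list(group)
        if direction == "RTL":
            group.reverse()
            out.extend(r["text"] if r["num"] else r["text"][::-1] for r in group)
        else:
            out.extend(r["text"] for r in group)
    return "".join(out)


def visual_order(text):
    """Run the three toy stages in sequence."""
    return reorder(tag_runs(preprocess(text)))
```

For example, `visual_order("ab \u05d0\u05d1 12")` moves the digit run in front of the reversed Hebrew letters while keeping the digits in reading order. Because the runs carry their tags through to the last stage, a downstream system could inspect them before reordering, which is the kind of self-description the abstract refers to.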