Research and Implementation of Text Segmentation Algorithm in Bilingual Web Page (双语网页分句匹配算法的研究与实现)
Abstract: In implementing a translation-assistance system based on a web corpus and bilingual web page search, a web robot retrieves Chinese-English parallel web pages from the Internet. The pages are filtered to keep only the useful information, and the Chinese and English sentences are then matched and stored in a database. The text segmentation (sentence matching) algorithm performs the bilingual sentence alignment step familiar from language translation processing: it matches the useful information obtained from the web page cleaning module to produce the final bilingual corpus. The paper describes the text segmentation algorithm and studies the process of implementing it.
Authors: Liu Dongfei (刘东飞), Lu Wei (卢苇)
Source: Journal of Wuhan University of Technology: Information & Management Engineering (《武汉理工大学学报(信息与管理工程版)》), CAS, 2008, No. 5, pp. 708-710 (3 pages)
Keywords: text segmentation; bilingual pairs of sentences; best match
References (5)

  • 1 LIU Feifan, ZHAO Jun, XU Bo. Construction of a large-scale open-domain Chinese-English bilingual corpus and research on sentence alignment [M]. Beijing: Tsinghua University Press, 2003.
  • 2 ZHANG Xiaojun, ZHANG Linglan, LIU Jun. Web-based corpus mining technology and its system design [J]. Journal of Shanghai University of Electric Power, 2004, 20(2): 39-43. Cited by: 5
  • 3 RESNIK P. A preliminary investigation into mining the Web for bilingual text [R]. Maryland: University of Maryland, 1998.
  • 4 CHRISTOPHER C. Mining English/Chinese parallel documents from the World Wide Web [C]. Proceedings of the International World Wide Web Conference. Hawaii: [s. n.], 1999: 156-167.
  • 5 JISONG C, ROWENA C. Discovering parallel text from the World Wide Web [R]. Australia: Monash University, 2001.
