Japanese Time Expression Recognition and Translation
(Original title: 日语时间表达式识别与日汉翻译研究)

Abstract: Building on a custom knowledge base, this paper proposes a Japanese time expression recognition method that combines a knowledge-base-strengthened rule set with a statistical model, with the aim of steadily improving recognition accuracy. Following the fine-grained classification of time expressions in the TIMEX2 standard and taking the characteristics of Japanese time words into account, the Japanese time expression knowledge base is progressively expanded and restructured, so that the rule set derived from it is optimized and updated. A conditional random field (CRF) model is integrated to improve the generalization ability of recognition. By examining how accurately a phrase-based translation model translates time words, the paper argues that statistical machine translation (SMT) needs to be combined with rules when translating Japanese time words. Experimental results show that recognition reaches an F1 of 0.8987 in the open test, and that translation based on rules and the Japanese-Chinese Parallel Dictionary of Time Words (《日汉时间词平行字典》) achieves slightly higher precision and recall than translation based on the SMT model.
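The rule-plus-dictionary idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the word list `TIME_WORDS`, the patterns in `RULES`, the mapping `JA_ZH`, and the function names `recognize` and `translate` are hypothetical and are not taken from the paper's knowledge base, rule set, or parallel dictionary. The CRF component is not modelled here; only the rule-based recognition step and a dictionary-driven Japanese-to-Chinese rewrite are shown.

```python
import re

# Hypothetical mini knowledge base (illustrative only, not the paper's data).
TIME_WORDS = ["今日", "明日", "昨日", "来週", "先月", "午前", "午後"]

# Half-width digits, full-width digits, and basic kanji numerals.
NUM = "[0-9０-９一二三四五六七八九十]+"

# Rules built from the knowledge base. Longer patterns come first so that
# "2014年1月5日" is matched as one span rather than as three fragments.
RULES = [
    f"(?:{NUM}年)?(?:{NUM}月)?{NUM}日",      # dates ending in a day
    f"{NUM}年(?:{NUM}月)?",                  # year, optionally with month
    f"{NUM}月",                              # month only
    f"(?:午前|午後)?{NUM}時(?:{NUM}分)?",    # clock times, optional AM/PM
    "|".join(TIME_WORDS),                    # lexical time words
]
PATTERN = re.compile("|".join(f"(?:{rule})" for rule in RULES))

# Hypothetical Japanese-to-Chinese entries standing in for a parallel dictionary.
JA_ZH = {
    "今日": "今天", "明日": "明天", "昨日": "昨天",
    "来週": "下周", "先月": "上个月",
    "午前": "上午", "午後": "下午",
    "時": "点",
}
# Single-pass substitution, longest keys first, so output is never re-translated.
_JA_KEYS = re.compile(
    "|".join(sorted((re.escape(k) for k in JA_ZH), key=len, reverse=True))
)


def recognize(text):
    """Return the time expressions found in a Japanese sentence, in order."""
    return [m.group(0) for m in PATTERN.finditer(text)]


def translate(expr):
    """Rewrite a recognized Japanese time expression into Chinese."""
    return _JA_KEYS.sub(lambda m: JA_ZH[m.group(0)], expr)


if __name__ == "__main__":
    sentence = "明日の午後3時に会議があります。2014年1月5日に発表します。"
    for expr in recognize(sentence):
        print(expr, "->", translate(expr))
```

A pure rule set like this only covers the patterns it lists, which is the gap a statistical model such as a CRF is meant to narrow in the approach the abstract describes.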

Authors: 赵紫玉 (Zhao Ziyu), 徐金安 (Xu Jin'an), 张玉洁 (Zhang Yujie), 刘江鸣 (Liu Jiangming)

Affiliation: School of Computer and Information Technology, Beijing Jiaotong University

Source: Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报（自然科学版）》), 2014, No. 1, pp. 180-186 (7 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.

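Funding: Supported by the National Natural Science Foundation of China (61370130); the International Science and Technology Cooperation Program of the Ministry of Science and Technology (K11F100010); the Fundamental Research Funds for the Central Universities (2010JBZ2007); the Open Project of the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (IIP2010-4); and the Beijing Jiaotong University Talent Fund (2011RC034).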

Keywords: knowledge base; rule; statistical model; statistical machine translation; parallel dictionary of time words

Classification: TP391.2 [Automation and Computer Technology: Computer Application Technology]

