A New Ranking Method for Chinese Discourse Tree Building (Chinese title: 基于排序方法的汉语句际关系树自动分析; the article is written in English)

Abstract: This paper proposes a novel method for sentence-level Chinese discourse tree building. The authors construct a Chinese discourse-annotated corpus in the framework of Rhetorical Structure Theory and propose a ranking-like SVM (SVM-R) model to build the tree structure automatically. Unlike traditional methods, which consider only two adjacent spans, SVM-R captures the relative association strength among three consecutive text spans. The experimental results show that the proposed SVM-R method significantly outperforms state-of-the-art methods in discourse parsing accuracy. The paper also proposes and validates a rich set of linguistic features suited to Chinese discourse analysis, and the useful features are shown to be consistent with characteristics of the Chinese language.
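The idea of deciding, for three consecutive spans, whether the middle one attaches more strongly to its left or right neighbour can be illustrated with a small greedy tree builder. This is only a sketch under stated assumptions, not the SVM-R model from the paper: the abstract does not describe the features or the training procedure, so the scorer `overlap_score` below is a hypothetical toy stand-in (word overlap) for a learned ranker, and `build_tree` and `leaves` are illustrative names. Choosing the strongest adjacent pair at each step implicitly ranks the two competing attachments of every middle span.

```python
from typing import Callable, List, Tuple, Union

# A tree is either a leaf (one text unit) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def leaves(tree: Tree) -> List[str]:
    """Return the text units covered by a (sub)tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def overlap_score(left: Tree, right: Tree) -> float:
    """Hypothetical stand-in for a learned ranker: Jaccard word overlap."""
    a = set(" ".join(leaves(left)).split())
    b = set(" ".join(leaves(right)).split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_tree(
    units: List[str],
    score: Callable[[Tree, Tree], float] = overlap_score,
) -> Tree:
    """Greedily merge the adjacent pair of spans with the highest score."""
    if not units:
        raise ValueError("at least one text unit is required")
    spans: List[Tree] = list(units)
    while len(spans) > 1:
        # Compare every adjacent pair; for a triple (a, b, c) this decides
        # whether b joins a or c first. Ties go to the leftmost pair.
        best = max(
            range(len(spans) - 1),
            key=lambda i: score(spans[i], spans[i + 1]),
        )
        spans[best:best + 2] = [(spans[best], spans[best + 1])]
    return spans[0]


units = [
    "the match was cancelled",
    "the match was rescheduled",
    "fans went home",
]
tree = build_tree(units)
```

With these three units, the first two share most of their words, so they are merged first and the third unit attaches to the resulting span. Any other scoring function with the same signature, such as the decision value of a trained classifier, could be passed in as `score`.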

Authors: Wu Yunfang (吴云芳), Wan Fuqiang (万富强), Xu Yifeng (徐艺峰), Lü Xueqiang (吕学强)

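Affiliations: Key Laboratory of Computational Linguistics, Ministry of Education; Beijing Key Laboratory of Internet Culture and Digital Dissemination Research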

Source: Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报（自然科学版）》), 2016, No. 1, pp. 65-74 (10 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

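Funding: National Natural Science Foundation of China (61371129); National Key Basic Research Program of China (2014CB340504); Major Program of the National Social Science Foundation of China (12&ZD227); Open Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDD201302).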

Keywords: discourse tree building; ranking method; Chinese discourse annotated corpus

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
