
Automatic Labeling of Semantic Roles on Chinese FrameNet

Cited by: 41
Abstract: Building on Chinese FrameNet (CFN), the frame-semantic knowledge base developed at Shanxi University, this work recasts semantic role labeling as a word-level sequence labeling problem via the IOB (inside/outside/begin) strategy and solves it with a conditional random fields (CRF) model. The word is the basic tagging unit. The features are the word, its part of speech, its position relative to the target word, the target word itself, and combinations of these. Each feature is given several candidate window sizes, and combining them yields the model's feature templates; an orthogonal array from statistical experimental design is used to select a better-performing template. All experiments use a corpus of 6,692 example sentences drawn from 25 selected frames. For each frame, a separate model is trained on that frame's example sentences, with role boundary identification and role classification performed simultaneously, and evaluated by 2-fold cross-validation. Given the target word in a sentence and the frame that target word belongs to, cross-validation over the 25 frames yields precision, recall, and F1-measure of 74.16%, 52.70%, and 61.62%, respectively.
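The IOB conversion and windowed features described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example sentence, the role labels (`Buyer`, `Time`, `Goods`), the part-of-speech tags, the `<PAD>` symbol, and the window size are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# (start, end_exclusive, role_label); the labels used below are illustrative only.
Span = Tuple[int, int, str]


def spans_to_iob(n_words: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping role spans as one IOB tag per word."""
    tags = ["O"] * n_words
    for start, end, role in spans:
        tags[start] = f"B-{role}"          # first word of the role
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"          # remaining words of the role
    return tags


def window_features(words: List[str], pos_tags: List[str],
                    target_idx: int, i: int, window: int = 1) -> Dict[str, str]:
    """Collect word and POS features in a symmetric window around word i,
    plus the target word and the position of word i relative to it."""
    if i < target_idx:
        rel_pos = "before"
    elif i > target_idx:
        rel_pos = "after"
    else:
        rel_pos = "target"
    feats = {"target": words[target_idx], "rel_pos": rel_pos}
    for off in range(-window, window + 1):
        j = i + off
        inside = 0 <= j < len(words)
        feats[f"w[{off}]"] = words[j] if inside else "<PAD>"
        feats[f"p[{off}]"] = pos_tags[j] if inside else "<PAD>"
    return feats


# Hypothetical example sentence with target word "买" (index 2).
words = ["他", "昨天", "买", "了", "一", "本", "书"]
pos_tags = ["r", "t", "v", "u", "m", "q", "n"]
spans = [(0, 1, "Buyer"), (1, 2, "Time"), (4, 7, "Goods")]

iob_tags = spans_to_iob(len(words), spans)
first_word_feats = window_features(words, pos_tags, target_idx=2, i=0, window=1)
```

Here `iob_tags` holds one label per word, which is the form a sequence labeler such as a CRF consumes. Varying `window` per feature type is one way to generate the candidate templates that the abstract says are screened with an orthogonal array.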
Source: Journal of Software (《软件学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2010, No. 4, pp. 597-611 (15 pages)
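Funding: National Natural Science Foundation of China, No. 60873128; National High Technology Research and Development Program of China (863 Program), No. 2006AA01Z142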
Keywords: Chinese FrameNet; semantic role labeling; orthogonal array; feature selection; conditional random fields
