Abstract
This paper reports a partial semantic role labeling experiment on Chinese text using support vector machines (SVMs), based on the Chinese Penn Treebank. Five main semantic roles were selected: subject, object, indirect object, time, and location. The first 1,652 sentences of Chinese PropBank 5.0 served as the training and test sets, and eight attributes, including path, phrase type, predicate, head word, and head-word part of speech, were used as classification features. With a two-stage classification method, and despite the small training set, overall semantic role labeling on the test set reached a precision of 89.73% and a recall of 91.26%. The results indicate that the method is effective for shallow semantic analysis of Chinese.
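The abstract names "path" as one of the classification features but does not define it. As a minimal sketch, assuming the definition common in the semantic role labeling literature (the chain of phrase labels from a candidate constituent up to the lowest common ancestor and down to the predicate), the feature could be computed as below. The `Node` class, the `path_feature` function, and the toy tree are hypothetical illustrations, not the authors' implementation.

```python
# Illustrative sketch only: Node, ancestors, and path_feature are hypothetical
# names, and the path definition is an assumption, not taken from the paper.

class Node:
    """A parse-tree node with a phrase or POS label and a parent pointer."""

    def __init__(self, label, children=(), word=None):
        self.label = label
        self.children = list(children)
        self.word = word
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Return [node, parent, grandparent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def path_feature(constituent, predicate):
    """Labels from the constituent up to the lowest common ancestor,
    then down to the predicate, e.g. 'NP↑IP↓VP↓VV'."""
    up = ancestors(constituent)
    down = ancestors(predicate)
    down_ids = {id(n) for n in down}
    # First node on the upward chain that also dominates the predicate.
    i_up = next(i for i, n in enumerate(up) if id(n) in down_ids)
    lca = up[i_up]
    i_down = next(i for i, n in enumerate(down) if n is lca)
    up_labels = [n.label for n in up[: i_up + 1]]
    down_labels = [n.label for n in reversed(down[:i_down])]
    return "↑".join(up_labels) + "".join("↓" + label for label in down_labels)


# Toy tree for "他 来" ("he comes"): (IP (NP (PN 他)) (VP (VV 来)))
pred = Node("VV", word="来")
subj = Node("NP", [Node("PN", word="他")])
root = Node("IP", [subj, Node("VP", [pred])])

print(path_feature(subj, pred))
```

In a two-stage setup, a string like this would typically be one categorical input, alongside phrase type, predicate, and head word, to both stages of classification.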
Source
《计算机应用研究》
CSCD
Peking University Core Journals (北大核心)
2008, No. 3, pp. 674-676, 680 (4 pages)
Application Research of Computers
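Funding
National "863" Program of China (2002AA117010-10)
National Natural Science Foundation of China (60673043)
Ministry of Education Science and Technology Infrastructure Platform Construction Project, a key program of the Tenth Five-Year Plan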