Subcategorization Acquisition Based on Weakly Supervised SVM for Chinese Verbs (基于弱指导SVM的汉语动词次范畴化自动获取)

Cited by: 2

Abstract: Automatic acquisition of verb subcategorization typically involves two steps: (1) subcategorization hypotheses are generated according to heuristic rules; (2) the hypothesis set is filtered with statistical methods, and reliable subcategorization types are selected. Previous efforts to improve acquisition performance have focused on the statistical filtering stage, and hypothesis generation in the related experiments involved no supervised training, so all of these methods are unsupervised. This paper proposes a weakly supervised scheme for the automatic acquisition of Chinese verb subcategorization, in which an SVM classifier replaces the unsupervised hypothesis generation module of the original system. Experimental results show a statistically significant improvement in final acquisition performance.
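The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' system: the names `acquire_scf` and `toy_generator`, the frame labels `NP` and `INTRANS`, and the relative-frequency cutoff are all assumptions made for the example, since the abstract does not name the statistical filter it uses. The `generate` argument stands in for the hypothesis generator, which in the proposed scheme would be a trained SVM classifier rather than a hand-written rule.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

# A hypothesis generator maps one (verb, sentence) observation to a frame label, or None.
Generator = Callable[[str, str], Optional[str]]


def acquire_scf(
    observations: Iterable[Tuple[str, str]],
    generate: Generator,
    min_rel_freq: float = 0.25,
) -> Dict[str, Set[str]]:
    """Two-step acquisition: generate hypotheses, then filter them statistically."""
    counts: Dict[str, Counter] = defaultdict(Counter)

    # Step 1: hypothesis generation (heuristic rules, or a trained classifier).
    for verb, sentence in observations:
        frame = generate(verb, sentence)
        if frame is not None:
            counts[verb][frame] += 1

    # Step 2: statistical filtering. A relative-frequency cutoff is assumed here
    # purely for illustration; it is not taken from the paper.
    accepted: Dict[str, Set[str]] = {}
    for verb, frame_counts in counts.items():
        total = sum(frame_counts.values())
        accepted[verb] = {
            frame for frame, c in frame_counts.items() if c / total >= min_rel_freq
        }
    return accepted


def toy_generator(verb: str, sentence: str) -> Optional[str]:
    """Hypothetical rule: guess a frame from whether anything follows the verb."""
    tokens = sentence.split()
    if verb not in tokens:
        return None
    rest = tokens[tokens.index(verb) + 1:]
    return "NP" if rest else "INTRANS"
```

With four observations of "I eat rice" and one of "I eat", a cutoff of 0.25 keeps only the `NP` hypothesis for "eat", because the single `INTRANS` observation accounts for only 0.2 of the total. Swapping `toy_generator` for a classifier's prediction function leaves the filtering step unchanged, which mirrors the replacement of the generation module described in the abstract.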

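Authors: Han Xiwu (韩习武), Zhao Tiejun (赵铁军)

Affiliations: School of Computer Science, Heilongjiang University; School of Computer Science, Harbin Institute of Technology

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 28, pp. 9-11, 27 (4 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant No. 60373101)

Keywords: Chinese verbs, subcategorization, weakly supervised, SVM

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]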
