Abstract
Automatic predicate recognition is an important part of shallow parsing. Taking the "predicate as head" theory of Chinese (谓词中枢论) as its linguistic basis, this paper analyzes in detail the contexts in which predicates occur in Chinese sentences and discusses the main contextual factors that affect the presence of a predicate. It then presents a statistical probability model for automatically recognizing the predicate of a Chinese sentence: the probability that a candidate word serves as the predicate is approximated by maximum likelihood estimation, and the parameters are smoothed with an absolute discounting model. In experiments on a small corpus, the predicate recognition rate reached at most 80.6% for verbal predicates and 83.2% for adjectival predicates, which indicates that the method is feasible and effective.
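The abstract names two ingredients, maximum likelihood estimation and a discounting model for smoothing, without giving formulas. The following is only a minimal sketch of how those two pieces can fit together, not the paper's actual model: the context labels (`after_adverb`, `before_de`), the discount value `d = 0.5`, and the choice to spread the reserved mass evenly over unseen outcomes are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def absolute_discount(counts, outcomes, d=0.5):
    """Smooth a relative-frequency (MLE) estimate with absolute discounting.

    A fixed discount d (0 < d < 1) is subtracted from every non-zero count,
    and the mass freed this way is shared evenly among unseen outcomes.
    """
    total = sum(counts.values())
    if total == 0:
        # No evidence for this context: fall back to a uniform distribution.
        return {o: 1 / len(outcomes) for o in outcomes}
    seen = [o for o in outcomes if counts[o] > 0]
    unseen = [o for o in outcomes if counts[o] == 0]
    if not unseen:
        # Every outcome was observed, so nothing needs redistributing: plain MLE.
        return {o: counts[o] / total for o in outcomes}
    reserved = d * len(seen) / total  # probability mass taken from seen outcomes
    return {
        o: (counts[o] - d) / total if counts[o] > 0 else reserved / len(unseen)
        for o in outcomes
    }


def train(observations, d=0.5):
    """Estimate P(is_predicate | context) from (context, is_predicate) pairs."""
    by_context = defaultdict(Counter)
    for context, is_predicate in observations:
        by_context[context][is_predicate] += 1
    return {
        context: absolute_discount(counts, (True, False), d)
        for context, counts in by_context.items()
    }


# Hypothetical toy data: a candidate word observed in two invented contexts.
toy = (
    [("after_adverb", True)] * 4
    + [("before_de", False)] * 3
    + [("before_de", True)]
)
model = train(toy)
print(model["after_adverb"])
print(model["before_de"])
```

In the toy data the `after_adverb` context is only ever seen with the candidate acting as predicate, so unsmoothed MLE would assign probability 0 to the opposite outcome; the discount keeps that estimate non-zero, which is the usual motivation for smoothing on a small corpus.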
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2007, No. 17, pp. 176-178 (3 pages)
Funding
Natural Science Foundation of Zhejiang Province, China (Grant No. M603025)
Keywords
Chinese information processing
shallow parsing
predicate recognition
probability model
discounted model