Abstract
This paper presents an Expectation-Maximisation (EM) method based on the latent semantic clustering (LSC) model for learning Chinese semantic selectional preferences. The procedure has three steps: first, the model parameters are initialised with random values; second, the EM algorithm is run iteratively until it converges; finally, the semantic association between verbs and nouns is computed to measure how likely they are to collocate. Extensive experimental results show that the LSC model captures verb-noun collocation patterns well and that the algorithm converges quickly. Because the method does not require a syntactically annotated corpus, it is well suited to Chinese.
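The three steps in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used LSC factorisation p(v, n) = Σ_c p(c) p(v|c) p(n|c) over latent classes c, and it uses the resulting joint probability as the verb-noun association score, which the abstract does not specify. The names `train_lsc`, `association` and `toy_counts`, the number of classes, and the stopping tolerance are all hypothetical.

```python
import math
import random


def train_lsc(counts, n_classes=2, max_iter=200, tol=1e-8, seed=0):
    """Fit p(c), p(v|c), p(n|c) to verb-noun pair counts with EM (assumed LSC form)."""
    rng = random.Random(seed)
    verbs = sorted({v for v, _ in counts})
    nouns = sorted({n for _, n in counts})
    total = sum(counts.values())

    def random_dist(keys):
        # Strictly positive random weights, normalised to sum to 1.
        weights = [rng.random() + 0.1 for _ in keys]
        z = sum(weights)
        return {k: w / z for k, w in zip(keys, weights)}

    # Step 1: random initial values for all model parameters.
    p_c = list(random_dist(range(n_classes)).values())
    p_v = [random_dist(verbs) for _ in range(n_classes)]
    p_n = [random_dist(nouns) for _ in range(n_classes)]

    trace = []  # log-likelihood before each parameter update
    # Step 2: iterate EM until the log-likelihood stops improving.
    for _ in range(max_iter):
        exp_c = [0.0] * n_classes
        exp_v = [dict.fromkeys(verbs, 0.0) for _ in range(n_classes)]
        exp_n = [dict.fromkeys(nouns, 0.0) for _ in range(n_classes)]
        ll = 0.0
        for (v, n), f in counts.items():
            # E-step: posterior p(c | v, n), weighted by the pair count f.
            joint = [p_c[c] * p_v[c][v] * p_n[c][n] for c in range(n_classes)]
            z = sum(joint)
            ll += f * math.log(z)
            for c in range(n_classes):
                r = f * joint[c] / z
                exp_c[c] += r
                exp_v[c][v] += r
                exp_n[c][n] += r
        # M-step: re-estimate parameters from the expected counts.
        for c in range(n_classes):
            p_c[c] = exp_c[c] / total
            p_v[c] = {v: x / exp_c[c] for v, x in exp_v[c].items()}
            p_n[c] = {n: x / exp_c[c] for n, x in exp_n[c].items()}
        trace.append(ll)
        if len(trace) > 1 and trace[-1] - trace[-2] < tol:
            break
    return p_c, p_v, p_n, trace


def association(p_c, p_v, p_n, verb, noun):
    """Step 3 (assumed score): joint probability p(verb, noun) under the model."""
    return sum(
        pc * pv.get(verb, 0.0) * pn.get(noun, 0.0)
        for pc, pv, pn in zip(p_c, p_v, p_n)
    )


# Hypothetical toy counts of (verb, noun) co-occurrences, for illustration only.
toy_counts = {
    ("eat", "rice"): 5,
    ("eat", "bread"): 4,
    ("drink", "water"): 6,
    ("drink", "tea"): 3,
}

p_c, p_v, p_n, trace = train_lsc(toy_counts)
print(len(trace), association(p_c, p_v, p_n, "eat", "rice"))
```

Only raw verb-noun co-occurrence counts are needed as input, which is consistent with the abstract's point that no syntactically annotated corpus is required. Because the initial values are random, different seeds may lead to different local optima.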
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
Peking University Core Journal (北大核心)
2012, No. 1, pp. 155-158, 216 (5 pages in total)
Funding
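Youth Fund of the Jilin Province Science and Technology Development Plan (20100155)
Key Science and Technology Support Project of the Jilin Province Scientific Research Development Plan (20100214)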
Keywords
Semantic selectional preferences
Latent semantic clustering
Unsupervised learning