Abstract
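This paper addresses the prediction of event types of Mandarin Chinese verbs. Event type is an important classification of Mandarin verbs according to their internal temporal structure, comprising state, activity, and transition (accomplishment and achievement). Theoretically, predicting event types makes it possible to validate the features proposed in previous linguistic studies; practically, it can serve tasks such as machine translation. The paper builds word vectors for prediction in two ways: supervised vectors constructed from linguistic features, and unsupervised embedding vectors built with word2vec. Using multinomial logistic regression, support vector machine, and artificial neural network classifiers, it achieves an overall accuracy of 73.6%.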
This paper investigates the prediction of event types of Mandarin verbs, which are divided either into three classes (state, activity, and transition) or into four classes (state, activity, accomplishment, and achievement). Previous linguistic studies of the event types of Mandarin verbs have proposed various features for the different event types, but none of these features has been validated by statistical or computational methods. Both supervised and unsupervised vectors are examined for prediction, i.e., vectors built from linguistic features and embedding vectors produced by word2vec, respectively. We achieve an overall accuracy of 73.6% using multinomial logistic regression, support vector machine, and neural network classifiers.
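As a rough illustration of the supervised, feature-based setting described in the abstract, the sketch below trains a multinomial logistic regression (softmax) classifier on binary linguistic-feature vectors using only the Python standard library. The three features, the six training rows, and the function names `train` and `predict` are hypothetical and chosen for illustration; they are not the paper's feature set, data, or implementation, and the support vector machine, neural network, and word2vec variants are not shown.

```python
import math

# Hypothetical toy data, NOT taken from the paper. Each verb is a vector of
# three made-up binary features:
# [compatible with progressive "zai", compatible with degree adverb "hen",
#  compatible with perfective "le"]
LABELS = ["state", "activity", "transition"]
TRAIN = [
    ([0, 1, 0], "state"),
    ([0, 1, 0], "state"),
    ([1, 0, 0], "activity"),
    ([1, 0, 1], "activity"),
    ([0, 0, 1], "transition"),
    ([0, 0, 1], "transition"),
]


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, x):
    """Linear score of every class for feature vector x (bias appended)."""
    xb = list(x) + [1.0]
    return [sum(w * v for w, v in zip(ws, xb)) for ws in weights]


def train(data, labels, epochs=500, lr=0.5):
    """Fit softmax regression with plain stochastic gradient descent."""
    n_features = len(data[0][0])
    # One weight vector per class; the last entry of each is the bias.
    weights = [[0.0] * (n_features + 1) for _ in labels]
    for _ in range(epochs):
        for x, y in data:
            xb = list(x) + [1.0]
            probs = softmax(class_scores(weights, x))
            for k, ws in enumerate(weights):
                target = 1.0 if labels[k] == y else 0.0
                grad = probs[k] - target  # cross-entropy gradient w.r.t. score k
                for j, v in enumerate(xb):
                    ws[j] -= lr * grad * v
    return weights


def predict(weights, labels, x):
    """Return the label with the highest score for feature vector x."""
    scores = class_scores(weights, x)
    return labels[scores.index(max(scores))]


weights = train(TRAIN, LABELS)
print(predict(weights, LABELS, [1, 0, 0]))
```

For the unsupervised setting, the same classifier could in principle be fed dense embedding vectors in place of the binary feature vectors; only the input representation changes.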
Authors
刘洪超
黄居仁
侯仁魁
李洪政
LIU Hongchao; HUANG Churen; HOU Renkui; LI Hongzheng (CBS, The Hong Kong Polytechnic University, Hong Kong, China; School of Chinese Language and Literature, Ludong University, Yantai, Shandong 264001, China; Institute of Chinese Information Processing, Beijing Normal University, Beijing 100875, China)
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2018, Issue 1, pp. 26-33 (8 pages)
Journal of Chinese Information Processing
Funding
National Social Science Fund of China (16BYY110)
Keywords
event type (事件类型)
Mandarin verbs (汉语动词)
linguistic features (语言学特征)
word embedding (词嵌入)
classification (分类)
prediction (预测)