Abstract
We propose a CNN_BiLSTM dual-channel model for coreference resolution of Uyghur noun phrases. First, hand-crafted features that capture linguistic characteristics of Uyghur are used to pre-screen candidate antecedents and anaphors, which reduces the number of unnecessary negative examples. The remaining antecedents and anaphors are then vectorized with word embeddings and fed to the CNN_BiLSTM dual-channel model, which extracts spatial semantic features and temporal semantic features. Finally, the two kinds of features are fused and used to train a softmax classifier that performs the coreference resolution task. On Uyghur noun phrase coreference resolution, the method reaches a precision of 84.3%, a recall of 78.1%, and an F1 score of 81%. The experimental results show that using the CNN to extract spatial features and the BiLSTM to extract temporal features effectively improves the performance of Uyghur noun phrase coreference resolution.
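As a quick consistency check on the reported metrics, the F1 score can be recomputed from the stated precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
reported_precision = 84.3
reported_recall = 78.1

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.1f}%")
```

Under this assumption the recomputed value comes out at roughly 81%, which agrees with the figure given in the abstract.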
Authors
张江
田生伟
禹龙
ZHANG Jiang; TIAN Sheng-wei; YU Long (School of Software, Xinjiang University, Urumqi, Xinjiang 830008, China; Network Center, Xinjiang University, Urumqi, Xinjiang 830046, China)
Source
《计算机仿真》
Peking University Core Journal (北大核心)
2020, Issue 4, pp. 255-259 (5 pages)
Computer Simulation
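Funding
National Natural Science Foundation of China (61563051, 61662074, 61262064)
Key Program of the National Natural Science Foundation of China (61331011)
Xinjiang Autonomous Region Science and Technology Talent Training Project (QN2016YX0051)
Xinjiang Tianshan Youth Program (2017Q001)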