Funding: Supported by the National Natural Science Foundation of China (No. 61876144).
Abstract: Knowledge graphs (KGs) are missing many valid triples. To address this, a novel model based on a convolutional neural network (CNN), called ConvKG, is proposed; it employs a joint learning strategy for knowledge graph completion (KGC). Related work has shown the strength of CNNs in extracting semantic features from triple embeddings. However, these studies use only a single filter shape and fail to extract semantic features of different granularity. To solve this problem, ConvKG applies multi-shaped filters that convolve jointly over the triple embeddings, learning semantic features of different granularity together. Filters of different shapes cover regions of different sizes on the triple embeddings and capture pairwise interactions of different granularity among the triple elements. Experimental results confirm the strength of joint learning: compared with state-of-the-art CNN-based KGC models, ConvKG achieves better mean rank (MR) and Hits@10 on the WN18RR dataset, and better MR on the FB15k-237 dataset.
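The idea of convolving filters of several shapes over one triple embedding can be illustrated with a minimal sketch. It assumes the triple is laid out as a k x 3 matrix (one column each for head, relation and tail), and the three filter shapes (1x3, 1x2, 2x2), the filter values and the function names are all hypothetical choices made for illustration; the abstract does not state the shapes ConvKG actually uses.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, kernel: Matrix) -> Matrix:
    """2D cross-correlation with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu_flatten(feature_map: Matrix) -> List[float]:
    """Apply ReLU and flatten a feature map row by row."""
    return [max(0.0, v) for row in feature_map for v in row]


def multi_shape_features(triple: Matrix, filters: List[Matrix]) -> List[float]:
    """Convolve every filter over the same triple matrix and concatenate
    the resulting feature maps into one joint feature vector."""
    features: List[float] = []
    for kernel in filters:
        features.extend(relu_flatten(conv2d_valid(triple, kernel)))
    return features


# Toy triple with embedding size k = 4; columns are (head, relation, tail).
triple = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [2.0, 1.0, 0.0],
    [1.0, 0.0, 2.0],
]

# Hypothetical filters of three different shapes.
filters = [
    [[1.0, 1.0, -1.0]],          # 1x3: head, relation and tail at one dimension
    [[1.0, -1.0]],               # 1x2: adjacent pairs (head-relation, relation-tail)
    [[1.0, 0.0], [0.0, 1.0]],    # 2x2: pairs across two neighbouring dimensions
]

features = multi_shape_features(triple, filters)
print(len(features), features)
```

In a trained model the filter values would be learned and the concatenated vector would feed a scoring layer; here the sketch only shows how differently shaped filters yield feature maps of different sizes from the same input.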
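Abstract: In named entity recognition, lexicon matching can add rich textual features, but the matched word information is usually combined through static normalization, which lacks automatic inference ability. A semantically enhanced Chinese named entity recognition method based on dynamic lexicon matching is proposed. For each character in the input sentence, words are dynamically matched in the lexicon, a neural network weights the matched words, and word2vec is combined with ALBERT to obtain an enhanced feature representation of the character. In the sequence modeling layer, a BiLSTM is trained on the characters' word2vec vectors and the enhanced character features; in the label inference layer, a conditional random field (CRF) identifies the named entities. Experiments on the Chinese Resume and Weibo datasets show that the method performs better than traditional methods.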
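The lexicon matching and weighting step can be sketched as follows. This is an illustrative simplification, not the paper's implementation: the toy lexicon, the `max_len` limit, and the `word_scores` dictionary (a stand-in for scores that a neural network would produce) are all assumptions, and the word2vec/ALBERT, BiLSTM and CRF components are omitted.

```python
import math
from typing import Dict, List, Set


def match_words(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[List[str]]:
    """For every character position, list the lexicon words that cover it."""
    covering: List[List[str]] = [[] for _ in sentence]
    for start in range(len(sentence)):
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 2, last_end + 1):  # words of length >= 2
            word = sentence[start:end]
            if word in lexicon:
                for pos in range(start, end):
                    covering[pos].append(word)
    return covering


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_word_feature(words: List[str],
                          word_vecs: Dict[str, List[float]],
                          word_scores: Dict[str, float],
                          dim: int) -> List[float]:
    """Combine the vectors of the matched words using softmax weights.
    word_scores is a hypothetical stand-in for learned, context-dependent scores."""
    if not words:
        return [0.0] * dim
    weights = softmax([word_scores[w] for w in words])
    return [sum(wt * word_vecs[w][d] for wt, w in zip(weights, words))
            for d in range(dim)]


# Toy example with a hypothetical lexicon.
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
covering = match_words("南京市长江大桥", lexicon)

word_vecs = {"长江": [1.0, 0.0], "长江大桥": [0.0, 1.0]}
word_scores = {"长江": 0.0, "长江大桥": 0.0}  # equal scores give equal weights
feature = weighted_word_feature(covering[4], word_vecs, word_scores, dim=2)
print(covering[3], feature)
```

With static normalization the weights would be fixed in advance (for example by word frequency); replacing them with scores produced per sentence is what makes the matching dynamic in this sketch.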
Funding: Supported by the National Natural Science Foundation of China (No. 61876144) and the Strategic Priority Research Program of Chinese Academy of Sciences (No. XDC02070600).
Abstract: Research on named entity recognition (NER) for domains with few labels is becoming increasingly important. In this paper, a novel algorithm, positive unlabeled named entity recognition (PUNER) with multi-granularity language information, is proposed. It combines positive unlabeled (PU) learning and deep learning to obtain multi-granularity language information from a few labeled instances and many unlabeled instances in order to recognize named entities. First, PUNER selects reliable negative instances from the unlabeled data, trains the PU learning classifier on the positive instances and a corresponding number of negative instances, and iterates until all unlabeled instances are labeled. Second, the PU learning classifier is implemented with a neural network-based architecture that captures comprehensive text semantics through multi-granularity language information, which helps the classifier recognize named entities correctly. PUNER is evaluated on three multilingual NER datasets: CoNLL 2003, CoNLL 2002 and SIGHAN Bakeoff 2006. Experimental results demonstrate the effectiveness of the proposed PUNER.
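The two-step PU learning loop described above can be sketched in a few lines. This is a toy illustration under strong assumptions: instances are plain feature vectors, a nearest-centroid rule stands in for the neural classifier, reliable negatives are chosen as the unlabeled points farthest from the positive centroid, and the names `pu_label` and `batch` are hypothetical. It also assumes at least one positive instance and more unlabeled instances than positives.

```python
from typing import List, Optional

Vector = List[float]


def centroid(points: List[Vector]) -> Vector:
    """Mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_label(positives: List[Vector], unlabeled: List[Vector],
             batch: int = 2) -> List[bool]:
    """Label every unlabeled instance as positive (True) or negative (False)."""
    pos, neg = list(positives), []
    labels: List[Optional[bool]] = [None] * len(unlabeled)

    # Step 1: reliable negatives are the unlabeled instances farthest from
    # the positive centroid, as many as there are positives.
    pc = centroid(pos)
    order = sorted(range(len(unlabeled)),
                   key=lambda i: dist2(unlabeled[i], pc), reverse=True)
    for i in order[:len(pos)]:
        labels[i] = False
        neg.append(unlabeled[i])

    # Step 2: refit the (stand-in) classifier and label the most confident
    # pending instances, repeating until nothing is left unlabeled.
    while any(label is None for label in labels):
        pc, nc = centroid(pos), centroid(neg)
        pending = [i for i, label in enumerate(labels) if label is None]
        # Positive margin means the instance is closer to the positive centroid.
        margin = {i: dist2(unlabeled[i], nc) - dist2(unlabeled[i], pc)
                  for i in pending}
        pending.sort(key=lambda i: abs(margin[i]), reverse=True)
        for i in pending[:batch]:
            labels[i] = margin[i] > 0
            (pos if labels[i] else neg).append(unlabeled[i])
    return [bool(label) for label in labels]


positives = [[1.0, 1.0], [1.2, 0.9]]
unlabeled = [[0.9, 1.1], [5.0, 5.0], [5.2, 4.8], [1.1, 1.0], [4.9, 5.1], [3.5, 3.5]]
result = pu_label(positives, unlabeled)
print(result)
```

Unlike the description in the abstract, this sketch lets the positive and negative sets grow unevenly across rounds; keeping them balanced would require resampling the negatives before each refit.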