Named entity recognition (NER) is an important part of knowledge extraction and one of the main tasks in constructing knowledge graphs. In today's Chinese named entity recognition (CNER) task, the BERT-BiLSTM-CRF model is widely used and often yields notable results. However, recognizing each entity with high accuracy remains challenging. Many entities do not appear as single words but as parts of complex phrases, making it difficult to achieve accurate recognition using word embedding information alone, because the intricate lexical structure often impacts performance. To address this issue, we propose an improved Bidirectional Encoder Representations from Transformers (BERT) character-word conditional random field (BCWC) model. It incorporates a pre-trained word embedding model built with the skip-gram with negative sampling (SGNS) method, alongside traditional BERT embeddings. By comparing datasets segmented with different word segmentation tools, we obtain enhanced word embedding features for the segmented data. These features are then processed using multi-scale convolutions and iterated dilated convolutional neural networks (IDCNNs) with varying dilation rates to capture features at multiple scales and extract diverse contextual information. Additionally, a multi-attention mechanism is employed to fuse word and character embeddings. Finally, conditional random fields (CRFs) are applied to learn sequence constraints and optimize entity label annotations. A series of experiments conducted on three public datasets demonstrates that the proposed method outperforms recent advanced baselines. BCWC is capable of addressing the challenge of recognizing complex entities by combining character-level and word-level embedding information, thereby improving the accuracy of CNER. Such a model holds potential for applications requiring more precise knowledge extraction, such as knowledge graph construction and information retrieval, particularly in domain-specific natural language processing tasks that demand high entity recognition precision.
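The abstract's exact IDCNN configuration is not given; the following is a minimal pure-Python sketch of the core idea, a dilated 1-D convolution stacked with growing dilation rates, using a hypothetical toy kernel to show how the receptive field widens without pooling.

```python
def dilated_conv1d(seq, kernel, dilation):
    """Dilated 1-D convolution with zero padding ('same' output length).

    seq: list of floats (one feature channel over sequence positions),
    kernel: odd-length list of tap weights,
    dilation: gap between kernel taps (1 = ordinary convolution).
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation
            if 0 <= j < len(seq):  # positions outside the sequence read as zero
                acc += w * seq[j]
        out.append(acc)
    return out

# Stacking layers with dilation rates 1, 2, 4 grows the receptive field
# exponentially, letting later layers mix in distant context.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
k = [0.25, 0.5, 0.25]  # toy smoothing kernel, not the paper's learned weights
h = x
for rate in (1, 2, 4):
    h = dilated_conv1d(h, k, rate)
```

In a real IDCNN the kernels are learned and applied over embedding channels; this sketch only illustrates the dilation mechanism the abstract refers to.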
Graph contrastive learning is widely used in recommender systems because it effectively mitigates the data sparsity problem. However, most existing graph contrastive learning recommendation algorithms learn from a single perspective, which greatly limits the generalization ability of the model, and the over-smoothing problem inherent in graph convolutional networks also affects model stability. To address this, a multi-view graph contrastive learning recommendation method incorporating a layer attention mechanism is proposed. On the one hand, the method introduces three kinds of contrastive learning under two different perspectives. At the view level, a perturbation-augmented view is constructed by adding random noise to the original graph, and an SVD-augmented view is constructed by reconstruction via singular value decomposition (SVD); view-level contrastive learning is then performed between these two augmented views. At the node level, semantic information between nodes is used to perform candidate-node and candidate-structure-neighbor contrastive learning. The three contrastive learning auxiliary tasks and the recommendation task are jointly optimized through multi-task learning to improve the quality of node embeddings and thus the generalization ability of the model. On the other hand, when the graph convolutional network learns user and item node embeddings, a layer attention mechanism is used to aggregate the final node embeddings, improving the model's high-order connectivity and alleviating over-smoothing. Compared against 10 classic models on four public datasets (LastFM, Gowalla, Ifashion, and Yelp), the method achieves average improvements of 3.12%, 3.22%, and 4.06% in Recall, Precision, and NDCG, respectively, demonstrating its effectiveness.
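The abstract does not give the view-level contrastive loss in closed form; a standard choice in this line of work is an InfoNCE-style objective between the two augmented views, sketched below in pure Python with cosine similarity and a hypothetical temperature `tau`.

```python
import math

def info_nce(view_a, view_b, tau=0.2):
    """InfoNCE-style view-level contrastive loss.

    view_a, view_b: lists of node embedding vectors in the same node
    order (e.g. from the perturbation view and the SVD view).  Each
    node's two views form the positive pair; every other node in
    view_b serves as a negative.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def sim(u, v):  # cosine similarity
        return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

    loss = 0.0
    for i, u in enumerate(view_a):
        scores = [math.exp(sim(u, v) / tau) for v in view_b]
        loss += -math.log(scores[i] / sum(scores))  # pull positive pair together
    return loss / len(view_a)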
Clustering a single large non-uniform hypergraph aims to partition its nodes into multiple clusters so that nodes within the same cluster are more similar while nodes in different clusters are less similar; it has a wide range of applications. Currently, CIAH (co-cluster the interactions via attentive hypergraph neural network), the state-of-the-art hypergraph-neural-network-based method for non-uniform hypergraph clustering, learns the relational information of non-uniform hypergraphs fairly well, but it still has two shortcomings: (1) insufficient mining of local relational information, and (2) it ignores hidden high-order relations. Therefore, a non-uniform hypergraph clustering model named MADC (non-uniform hypergraph clustering combining multi-scale attention and dynamic construction) is proposed. On the one hand, multi-scale attention is used to fully learn the local relational information between nodes within each hyperedge; on the other hand, dynamic construction is adopted to mine hidden high-order relations, further enriching the hypergraph feature embeddings. Extensive experiments on real datasets verify that MADC outperforms CIAH and all other baseline methods on non-uniform hypergraph clustering in terms of clustering accuracy (ACC), normalized mutual information (NMI), and adjusted Rand index (ARI).
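The abstract describes learning over hyperedges of varying size; the hypergraph neural network layer underlying such models can be sketched as a node-to-hyperedge-to-node aggregation. The mean aggregation below is a simplified stand-in for the attentive aggregation the paper uses.

```python
def hypergraph_propagate(node_feats, hyperedges):
    """One round of node -> hyperedge -> node mean aggregation.

    node_feats: {node_id: [float, ...]} feature vectors.
    hyperedges: list of node-id lists; edges may differ in size,
    i.e. the hypergraph is non-uniform.
    """
    dim = len(next(iter(node_feats.values())))

    def mean(vectors):
        return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

    # Hyperedge embedding: average of its member nodes' features.
    edge_feats = [mean([node_feats[n] for n in edge]) for edge in hyperedges]

    # Node update: average of incident hyperedges (unchanged if isolated).
    updated = {}
    for n, f in node_feats.items():
        incident = [edge_feats[i] for i, e in enumerate(hyperedges) if n in e]
        updated[n] = mean(incident) if incident else f
    return updated
```

Replacing both mean steps with learned attention weights over members and incident edges would recover the multi-scale attentive aggregation the abstract refers to.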
Funding: This work was supported by the International Research Center of Big Data for Sustainable Development Goals under Grant No. CBAS2022GSP05, the Open Fund of the State Key Laboratory of Remote Sensing Science under Grant No. 6142A01210404, and the Hubei Key Laboratory of Intelligent Geo-Information Processing under Grant No. KLIGIP-2022-B03.