Funding: Supported by the Yunnan Provincial Major Science and Technology Special Plan Projects (Grant Nos. 202202AD080003, 202202AE090008, 202202AD080004, 202302AD080003), the National Natural Science Foundation of China (Grant Nos. U21B2027, 62266027, 62266028, 62266025), and the Yunnan Province Young and Middle-Aged Academic and Technical Leaders Reserve Talent Program (Grant No. 202305AC160063).
Abstract: Chinese named entity recognition (CNER) has received widespread attention as an important task in Chinese information extraction. Most previous research has studied flat, overlapped, or discontinuous CNER in isolation, whereas real-world scenarios often call for a unified CNER model. Recent studies have shown that grid-tagging methods, which classify relationships between character pairs, hold great potential for unified NER. However, how to enrich Chinese character-pair grid representations and capture deeper dependencies between character pairs remains an open challenge. In this study, we enhance the character-pair grid representation by incorporating both local and global information. Notably, we treat the character-pair grid representation matrix as a specialized image, converting character-pair relationship classification into a pixel-level semantic segmentation task. We devise a U-shaped network to extract multi-scale and deeper semantic information from the grid image, which gives a more comprehensive view of the associative features between character pairs. This improves the accuracy of relationship prediction and, ultimately, entity recognition performance. We conducted experiments on two public CNER datasets in the biomedical domain, CMeEE-V2 and Diakg. Our approach improves F1-score by 7.29 and 1.64 percentage points, respectively, over the current state-of-the-art (SOTA) models.
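To make the grid-tagging idea concrete, the following is a minimal sketch of how a character-pair label grid could be built. The abstract does not specify the tagging scheme, so the scheme here is an assumption: consecutive characters of an entity are linked in the upper triangle, and the tail-to-head cell carries the entity type. The names `build_grid`, `NNW`, and `type_ids`, along with the placeholder sentence and entities, are hypothetical.

```python
# Hypothetical label ids; the abstract does not specify the tagging scheme.
NONE, NNW = 0, 1  # NNW: link from one entity character to the next one


def build_grid(n_chars, entities, type_ids):
    """Build an n x n label grid for character-pair relation classification.

    entities: list of (char_indices, type_name); indices may be non-contiguous
    type_ids: maps type_name -> label id >= 2, stored in the tail-to-head cell
    """
    grid = [[NONE] * n_chars for _ in range(n_chars)]
    for indices, type_name in entities:
        # Upper triangle: link each entity character to the next one.
        for a, b in zip(indices, indices[1:]):
            grid[a][b] = NNW
        # Tail-to-head cell (lower triangle or diagonal) stores the entity type.
        grid[indices[-1]][indices[0]] = type_ids[type_name]
    return grid


# Placeholder 6-character sentence with one flat entity and one
# discontinuous entity that overlaps it (illustrative data only).
type_ids = {"body": 2, "symptom": 3}
entities = [([0, 1], "body"), ([0, 1, 4, 5], "symptom")]
grid = build_grid(6, entities, type_ids)
for row in grid:
    print(row)
```

Each cell of `grid` plays the role of one pixel with a class label, which is why predicting the whole grid can be framed as pixel-level semantic segmentation. Flat, overlapped, and discontinuous entities all fit in the same matrix under this assumed scheme.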
Funding: Supported by the National Natural Science Foundation of China (No. 61763025), the Gansu Science and Technology Program Project (No. 18JR3RA104), the Industrial Support Program for Colleges and Universities in Gansu Province (No. 2020C-19), and the Lanzhou Science and Technology Project (No. 2019-4-49).
Abstract: Efficient statistics on fault information from high-speed railway on-board equipment are of great significance and also improve the efficiency of fault analysis. Against this background, this paper presents an empirical exploration of named entity recognition (NER) for on-board equipment fault information. Based on historical fault records of on-board equipment, a fault information recognition model built on the collaboration of multiple neural networks is proposed. First, considering the characteristics of data recorded in Chinese, a method of constructing semantic features and additional features at character granularity is proposed. Then, the two feature representations are concatenated and passed into a gated convolutional layer, which extracts dependencies from multiple subspaces and adjacent characters in parallel. Next, the local features are passed to a bidirectional long short-term memory (BiLSTM) network to learn long-term dependency information. On top of the BiLSTM, a sequential conditional random field (CRF) jointly decodes the optimal tag sequence for the whole sentence. The model is tested and compared with other representative baseline models. The results show that the proposed model not only accounts for the language characteristics of on-board fault records but also has clear advantages in fault information recognition performance.
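The final step, in which a CRF jointly decodes the tag sequence for the whole sentence, is commonly implemented with Viterbi decoding. The following is a minimal sketch of that technique, not the paper's implementation. The function name `viterbi_decode`, the three-tag set, and all scores are hypothetical, and the emission scores stand in for what the BiLSTM would produce.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence for one sentence.

    emissions:   emissions[t][k] = score of tag k at character t
    transitions: transitions[j][k] = score of moving from tag j to tag k
    """
    n_tags = len(transitions)
    scores = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for emit in emissions[1:]:
        step_scores, step_back = [], []
        for k in range(n_tags):
            # Best previous tag for landing on tag k at this position.
            best_prev = max(range(n_tags),
                            key=lambda j: scores[j] + transitions[j][k])
            step_scores.append(scores[best_prev] + transitions[best_prev][k] + emit[k])
            step_back.append(best_prev)
        scores = step_scores
        backpointers.append(step_back)
    # Trace the best path backwards from the best final tag.
    path = [max(range(n_tags), key=lambda k: scores[k])]
    for step_back in reversed(backpointers):
        path.append(step_back[path[-1]])
    path.reverse()
    return path


# Hypothetical tags: 0 = O, 1 = B, 2 = I. The transition O -> I is penalized.
transitions = [[0.0, 0.0, -100.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
emissions = [[1.0, 0.8, 0.0],   # character 0: O scores slightly above B
             [0.0, 0.0, 1.0]]   # character 1: I scores highest
path = viterbi_decode(emissions, transitions)
print(path)
```

Choosing the best tag at each character independently would give O followed by I, which the transition scores penalize heavily. Decoding over the whole sentence instead selects B followed by I, which illustrates why the CRF layer is placed on top of the per-character scores.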