418 articles found
SciCN:A Scientific Dataset for Chinese Named Entity Recognition
1
Authors: Jing Yang, Bin Ji, Shasha Li, Jun Ma, Jie Yu. 《Computers, Materials & Continua》 (SCIE, EI), 2024, No. 3, pp. 4303-4315 (13 pages)
Named entity recognition (NER) is a fundamental task of information extraction (IE), and it has attracted considerable research attention in recent years. Abundant annotated English NER datasets have significantly promoted NER research for English. By contrast, far less effort has gone into Chinese NER research, especially in the scientific domain, owing to the scarcity of Chinese NER datasets. To alleviate this problem, we present a Chinese scientific NER dataset, SciCN, which contains entity annotations of titles and abstracts derived from 3,500 scientific papers. We manually annotate a total of 62,059 entities, classified into six types. Compared with English scientific NER datasets, SciCN is larger and more diverse: it contains more paper abstracts, and these abstracts come from more research fields. To investigate the properties of SciCN and provide baselines for future research, we adapt a number of previous state-of-the-art Chinese NER models to evaluate SciCN. Experimental results show that SciCN is more challenging than other Chinese NER datasets. In addition, previous studies have proven the effectiveness of using lexicons to enhance Chinese NER models. Motivated by this, we provide a scientific domain-specific lexicon. Validation results demonstrate that our lexicon delivers better performance gains than lexicons of other domains. We hope that the SciCN dataset and the lexicon will make it possible to benchmark NER in the Chinese scientific domain and support future research. The dataset and lexicon are available at: https://github.com/yangjingla/SciCN.git.
Keywords: named entity recognition; dataset; scientific information extraction; lexicon
Implicit Modality Mining: An End-to-End Method for Multimodal Information Extraction
2
Authors: Jinle Lu, Qinglang Guo. 《Journal of Electronic Research and Application》, 2024, No. 2, pp. 124-139 (16 pages)
Multimodal named entity recognition (MNER) and relation extraction (MRE) are key in social media analysis but face challenges such as inefficient visual processing and non-optimal modality interaction. (1) Heavy visual embedding: visual embedding is expensive in both time and computation because explicit visual cues must be extracted from the original image before it is fed into the multimodal model. Consequently, these approaches cannot achieve efficient online reasoning. (2) Suboptimal interaction handling: the prevalent way of managing interaction between modalities relies on alternating self-attention and cross-attention mechanisms or on excessive dependence on a gating mechanism. This explicit modeling may fail to capture some nuanced relations between image and text, ultimately undermining the model's ability to extract optimal information. To address these challenges, we introduce Implicit Modality Mining (IMM), a novel end-to-end framework for fine-grained image-text correlation without heavy visual embedders. IMM uses an Implicit Semantic Alignment module with a Transformer to obtain cross-modal clues and an Insert-Activation module to use these clues effectively. Our approach achieves state-of-the-art performance on three datasets.
Keywords: multimodal; named entity recognition; relation extraction; patch projection
Corpus of Carbonate Platforms with Lexical Annotations for Named Entity Recognition
3
Authors: Zhichen Hu, Huali Ren, Jielin Jiang, Yan Cui, Xiumian Hu, Xiaolong Xu. 《Computer Modeling in Engineering & Sciences》 (SCIE, EI), 2023, No. 4, pp. 91-108 (18 pages)
An obvious challenge in named entity recognition is constructing data sets of entities. Although some research has been conducted on entity database construction, most of it targets Wikipedia, and a minority targets structured entities such as people, locations, and organizations in the news. This paper focuses on identifying scientific entities in English literature, using carbonate platforms in sedimentology as the example. First, considering that literature in key disciplines is likely to be written by experts from multiple disciplines, this paper designs a literature content extraction method that can handle complex text structures. Second, based on the extracted content, we formalize the entity extraction task (lexicon and lexical-based entity extraction). Furthermore, to test the accuracy of entity extraction, three currently popular recognition methods are chosen to perform entity detection. Experiments show that the entity data set produced by the lexicon and lexical-based entity extraction method is of significant help for the named entity recognition task. This work is a pilot study of entity extraction from complexly structured, specialized English literature on carbonate platforms.
Keywords: named entity recognition; carbonate platform; corpus; entity extraction; English literature; detection
Overview of Named Entity Recognition (cited by: 2)
4
Authors: Xing Liu, Huiqin Chen, Wangui Xia. 《Journal of Contemporary Educational Research》, 2022, No. 5, pp. 65-68 (4 pages)
Named entity recognition, a sub-task of information extraction, has attracted widespread attention from scholars at home and abroad since it was proposed, and a series of studies and discussions have been built on it. This paper discusses existing named entity recognition technology in light of its history of development.
Keywords: named entity recognition; information extraction
Low Resource Chinese Geological Text Named Entity Recognition Based on Prompt Learning
5
Authors: Hang He, Chao Ma, Shan Ye, Wenqiang Tang, Yuxuan Zhou, Zhen Yu, Jiaxin Yi, Li Hou, Mingcai Hou. 《Journal of Earth Science》 (SCIE, CAS, CSCD), 2024, No. 3, pp. 1035-1043 (9 pages)
Geological reports are a significant output for geologists involved in geological investigations and scientific research, as they contain rich data and textual information. With the rapid development of science and technology, a large number of textual reports have accumulated in the field of geology. However, many non-hot topics and non-English-speaking regions are neglected by mainstream geoscience databases for geological information mining, which makes it harder for some researchers to extract the information they need from these texts. Natural language processing (NLP) has clear advantages in processing large amounts of textual data. The objective of this paper is to identify geological named entities in Chinese geological texts using NLP techniques. We propose the RoBERTa-Prompt-Tuning-NER method, which draws on prompt learning and requires only a small amount of annotated data to train strong models for recognizing geological named entities in low-resource dataset configurations. The RoBERTa layer captures context-based information and longer-distance dependencies through dynamic word vectors. Finally, we conducted experiments on the constructed Geological Named Entity Recognition (GNER) dataset. The results show that the proposed model achieves the highest F1 score, 80.64%, in comparison with the four baseline algorithms, demonstrating the reliability and robustness of the model for named entity recognition in geological texts.
Keywords: prompt learning; named entity recognition (NER); low resource; geological text; text information mining; big data; geology
A Two-Phase Paradigm for Joint Entity-Relation Extraction (cited by: 2)
6
Authors: Bin Ji, Hao Xu, Jie Yu, Shasha Li, Jun Ma, Yuke Ji, Huijun Liu. 《Computers, Materials & Continua》 (SCIE, EI), 2023, No. 1, pp. 1303-1318 (16 pages)
Span-based models for the joint entity and relation extraction task have been studied extensively. However, these models sample a large number of negative entities and negative relations during training; these samples are essential, but they produce grossly imbalanced data distributions and in turn cause suboptimal model performance. To address these issues, we propose a two-phase paradigm for span-based joint entity and relation extraction, which classifies the entities and relations in the first phase and predicts the types of these entities and relations in the second phase. The two-phase paradigm enables our model to significantly reduce the data distribution gap, including the gap between negative entities and other entities, as well as the gap between negative relations and other relations. In addition, we make the first attempt at combining entity type and entity distance as global features, which has proven effective, especially for relation extraction. Experimental results on several datasets demonstrate that the span-based joint extraction model augmented with the two-phase paradigm and the global features consistently outperforms previous state-of-the-art span-based models for the joint extraction task, establishing a new standard benchmark. Qualitative and quantitative analyses further validate the effectiveness of the proposed paradigm and the global features.
Keywords: joint extraction; span-based; named entity recognition; relation extraction; data distribution; global features
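As a rough illustration of why span-based models see so many negative samples, and of what the two phases might be trained on, the sketch below enumerates candidate spans and derives binary first-phase targets and typed second-phase targets. The function names, the example sentence, and the labels are hypothetical; the abstract does not describe the actual classifiers, which are neural models.

```python
def enumerate_spans(tokens, max_width):
    """Return every (start, end) span with end exclusive and width <= max_width."""
    n = len(tokens)
    return [
        (start, end)
        for start in range(n)
        for end in range(start + 1, min(n, start + max_width) + 1)
    ]


def two_phase_targets(spans, gold_entities):
    """Build illustrative training targets for a two-phase setup.

    Phase 1: one binary label per span (entity vs. negative span).
    Phase 2: a type label only for the spans that are gold entities.
    """
    phase1 = [int(span in gold_entities) for span in spans]
    phase2 = [(span, gold_entities[span]) for span in spans if span in gold_entities]
    return phase1, phase2


# Hypothetical example sentence and gold annotations.
tokens = ["Alice", "Smith", "joined", "Acme", "Corp", "today"]
gold = {(0, 2): "PER", (3, 5): "ORG"}

spans = enumerate_spans(tokens, max_width=3)
phase1, phase2 = two_phase_targets(spans, gold)

# Most candidate spans are negatives, which is the imbalance the abstract describes.
print(len(spans), "candidate spans,", sum(phase1), "gold entities")
print(phase2)
```

In this toy setting the type classifier in the second phase never sees negative spans, which is one plausible reading of how the paradigm narrows the distribution gap.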
Overview of CCKS 2020 Task 3: Named Entity Recognition and Event Extraction in Chinese Electronic Medical Records (cited by: 6)
7
Authors: Xia Li, Qinghua Wen, Hu Lin, Zengtao Jiao, Jiangtao Zhang. 《Data Intelligence》, 2021, No. 3, pp. 376-388 (13 pages)
Evaluation Task 3 of the China Conference on Knowledge Graph and Semantic Computing (CCKS) 2020 covered clinical named entity recognition and event extraction for Chinese electronic medical records. Two annotated data sets and some additional resources for these two subtasks were provided to participants. The competition attracted 354 teams, 46 of which submitted valid results. Pre-trained language models were widely applied in this evaluation task. Data augmentation and external resources were also helpful.
Keywords: Chinese electronic medical records; event extraction; named entity recognition; clinical text; CCKS
A review on cyber security named entity recognition (cited by: 5)
8
Authors: Chen GAO, Xuan ZHANG, Mengting HAN, Hui LIU. 《Frontiers of Information Technology & Electronic Engineering》 (SCIE, EI, CSCD), 2021, No. 9, pp. 1153-1168 (16 pages)
With the rapid development of Internet technology and the advent of the era of big data, more and more cyber security texts are available on the Internet. These texts include not only security concepts, incidents, tools, guidelines, and policies, but also risk management approaches, best practices, assurances, technologies, and more. By integrating large-scale, heterogeneous, unstructured cyber security information, the identification and classification of cyber security entities can help handle cyber security issues. Because texts in the cyber security domain are complex and diverse, it is difficult to identify security entities in them using traditional named entity recognition (NER) methods. This paper describes various approaches and techniques for NER in this domain, including rule-based, dictionary-based, and machine-learning-based approaches, and discusses the problems faced by NER research in this domain, such as conjunction and disjunction, non-standardized naming conventions, abbreviation, and massive nesting. Three future directions for NER in cyber security are proposed: (1) application of unsupervised or semi-supervised technology; (2) development of a more comprehensive cyber security ontology; (3) development of a more comprehensive deep learning model.
Keywords: named entity recognition (NER); information extraction; cyber security; machine learning; deep learning
Word Embedding Bootstrapped Deep Active Learning Method to Information Extraction on Chinese Electronic Medical Record
9
Authors: MA Qunsheng, CEN Xingxing, YUAN Junyi, HOU Xumin. 《Journal of Shanghai Jiaotong University (Science)》 (EI), 2021, No. 4, pp. 494-502 (9 pages)
Electronic medical records (EMRs) contain rich biomedical information and have great potential in disease diagnosis and biomedical research. However, EMR information is usually in the form of unstructured text, which raises the cost of use and hinders its application. In this work, an effective named entity recognition (NER) method is presented for information extraction on Chinese EMRs. It relies on word-embedding-bootstrapped deep active learning to promote the acquisition of medical information from Chinese EMRs and to release its value. Deep active learning with a bi-directional long short-term memory network followed by a conditional random field (Bi-LSTM+CRF) is used to capture the characteristics of different information from a labeled corpus, and the continuous bag-of-words and skip-gram word embedding models are combined in this model to capture the text features of Chinese EMRs from an unlabeled corpus. To evaluate the method, NER tasks on Chinese EMRs with "medical history" content were used. Experimental results show that the word-embedding-bootstrapped deep active learning method using an unlabeled medical corpus achieves better performance than other models.
Keywords: deep active learning; named entity recognition (NER); information extraction; word embedding; Chinese electronic medical record (EMR)
Joint Entity-Relation Extraction of Wheat Germplasm Information Based on Deep Character-Word Fusion
10
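Authors: 刘合兵, 贾笑笑, 时雷, 熊蜀峰, 马新明, 席磊. 《计算机工程与设计》 (PKU Core), 2024, No. 4, pp. 1079-1086 (8 pages)
To obtain structured descriptions of the phenotypes and genetics of wheat varieties, and to address the fuzzy entity boundaries and overlapping relations found in unstructured wheat germplasm data, this paper proposes WGIE-DCWF (wheat germplasm information extraction model based on deep character and word fusion), a joint entity-relation extraction model for wheat germplasm information. The encoding layer fuses deep character-word features with contextual semantic features to improve recognition of densely packed entities; the triple extraction layer builds a cascaded pointer network to improve extraction of overlapping relations. A series of comparative experiments on a wheat germplasm dataset and on public datasets show that WGIE-DCWF effectively improves joint entity-relation extraction on wheat germplasm data and generalizes well, providing technical support for building a wheat germplasm knowledge base.
Keywords: wheat germplasm information; character-word fusion; entity-relation extraction; joint extraction; cascaded pointer network; entity recognition; relation extraction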
Document-Level Relation Extraction Fusing Entity and Context Information
11
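Authors: 黄河燕, 袁长森, 冯冲. 《自动化学报》 (EI, CAS, CSCD, PKU Core), 2024, No. 10, pp. 1953-1962 (10 pages)
Document-level relation extraction aims to identify the relations between entity pairs in a document. Compared with traditional sentence-level relation extraction, the document-level task is closer to practical applications, but it poses new challenges such as cross-sentence reasoning over entity pairs and awareness of contextual information. This paper proposes a document-level relation extraction method that fuses entity and context information (FECI). It consists of two modules: an entity information extraction module and a context information extraction module. The entity information extraction module automatically extracts, from the two entities, features that represent the relation of the entity pair. The context information extraction module extracts different contextual relation features from the document according to the mention positions of the entity pair. Experiments on three document-level relation extraction datasets show significant improvements.
Keywords: document-level relation extraction; entity information; context information; mention position information; cross-sentence reasoning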
Chinese Named Entity Recognition with Joint Characters and Words Based on a Multi-Head Attention Mechanism
12
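Authors: 王进, 王猛旗, 张昕跃, 孙开伟, 朴昌浩. 《江苏大学学报(自然科学版)》 (CAS, PKU Core), 2024, No. 1, pp. 77-84 (8 pages)
Existing Chinese NER methods that combine characters and words introduce interference from redundant words, rely on complex network structures, and are hard to transfer. To address these problems, a Chinese named entity recognition algorithm based on multi-head attention over joint characters and words is proposed. The algorithm uses multi-head attention to fuse word boundary information and reduces interference from redundant words by grouping and fusing BIE word sets. A multi-head attention character-word joint model is built, comprising character-word matching, multi-head attention, and fusion modules. Compared with existing Chinese NER methods, the algorithm avoids designing a complex sequence model and is easy to combine with existing character-based Chinese NER models. Recall, precision, and F1 are used as evaluation metrics, and ablation experiments verify the contribution of each part of the model. Results show that F1 improves by 0.28 and 0.69 on the MSRA and Weibo datasets, respectively, and precision improves by 0.07 on the Resume dataset.
Keywords: Chinese named entity recognition; lexical redundancy; word boundary information; character-word combination; multi-head attention mechanism; BIE word sets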
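The abstract does not say how the BIE word sets are built, so the following is only a minimal sketch under an assumed reading: each character collects the lexicon words in which it appears at the beginning (B), inside (I), or end (E). The function name, the `max_len` parameter, and the toy lexicon are hypothetical, and the attention-based fusion step is not shown.

```python
def bie_word_sets(sentence, lexicon, max_len=4):
    """Group matched lexicon words per character by the character's position in the word.

    Returns one dict per character with keys "B", "I", and "E", each holding
    the set of matched words in which the character is first, interior, or last.
    """
    n = len(sentence)
    sets = [{"B": set(), "I": set(), "E": set()} for _ in range(n)]
    for start in range(n):
        # Only multi-character words carry boundary information here.
        for end in range(start + 2, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            sets[start]["B"].add(word)
            for mid in range(start + 1, end - 1):
                sets[mid]["I"].add(word)
            sets[end - 1]["E"].add(word)
    return sets


# Toy lexicon, for illustration only.
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
word_sets = bie_word_sets("南京市长江大桥", lexicon)

for char, groups in zip("南京市长江大桥", word_sets):
    print(char, {tag: sorted(words) for tag, words in groups.items()})
```

The character 长 ends up with 市长 in its E set and 长江 and 长江大桥 in its B set, which shows how conflicting segmentations become separate signals that a downstream model could weigh.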
A Dynamic Multi-Task Learning Method for Contract Information Extraction
13
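Authors: 王浩畅, 郑冠彧, 赵铁军. 《软件学报》 (EI, CSCD, PKU Core), 2024, No. 7, pp. 3377-3391 (15 pages)
Accurately extracting two kinds of information from contract text, elements and clauses, can improve the efficiency of contract review and provide convenient services for trading parties. However, current contract information extraction methods generally train single-task models to extract elements and clauses separately; they do not dig deeply into the features of contract text and ignore the correlation between tasks. This work therefore uses deep neural network architectures to study the correlation between element extraction and clause extraction, and proposes a multi-task learning method. The method first merges the two tasks to build a basic multi-task learning model for contract information extraction. It then optimizes that model, using an attention mechanism to further mine the correlation and form an attention-based dynamic multi-task learning model. Finally, for the complex semantic environment of document-level contract text, it builds on the previous two models to propose a dynamic multi-task learning model that incorporates lexical knowledge. Experimental results show that the method fully captures features shared across tasks: it not only achieves better extraction results than single-task models, but also effectively handles nested entities between contract elements and clauses, achieving joint extraction of elements and clauses. In addition, to verify robustness, experiments were run on public datasets from several domains, and the method outperformed the baselines on all of them.
Keywords: multi-task learning; contract text; joint information extraction; attention mechanism; nested entities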
A Boundary-Aware Named Entity Recognition Method for Industrial Equipment Faults
14
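Authors: 葛卫京, 刘晓丽, 杜亚峰. 《计算机应用与软件》 (PKU Core), 2024, No. 6, pp. 237-242, 249 (7 pages)
Named entity recognition plays a key role in identifying industrial equipment faults and supports fault prediction, maintenance management, and intelligent decision-making. To address the nested structures and long spans in industrial equipment fault data, a boundary-aware entity recognition method is proposed. The method uses boundary awareness to locate entity spans precisely and combines it with category prediction to determine the category of each span, improving recognition performance. In addition, to address the lack of annotated data, an entity recognition dataset for industrial equipment faults is constructed. Experimental results demonstrate the effectiveness of the method for industrial equipment fault entity recognition and provide a solid foundation for subsequent data analysis and knowledge graph construction.
Keywords: named entity recognition; pre-trained language model; industrial equipment; fault information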
Joint Entity-Relation Extraction Based on a BERT Pre-trained Model for Ancient Chinese
15
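Authors: 李智杰, 杨盛杰, 李昌华, 张颉, 董玮, 介军. 《计算机系统应用》, 2024, No. 8, pp. 187-195 (9 pages)
Ancient Chinese texts carry rich historical and cultural information, and studying entity-relation extraction on such texts and building related knowledge graphs is important for cultural heritage. To address the many rare characters, semantic vagueness, and polysemy in ancient Chinese texts, an entity relation joint extraction model based on a BERT-ancient-Chinese pre-trained model (JEBAC) is proposed. First, a BERT-ancient-Chinese pre-trained model integrated with a BiLSTM neural network and an attention mechanism (BACBA) identifies all subject entities and object entities in a sentence, providing a basis for the joint extraction of relations and object entities. Next, the normalized encoding vector of the subject entity is added to the embedding vector of the whole sentence to better capture the semantic features of the subject entity in the sentence. Finally, combining the sentence vector carrying subject entity features with prompt information about the object entities, BACBA jointly extracts relations and object entities, yielding all triples (subject entity, relation, object entity) in the sentence. Performance was compared with existing methods on the Chinese entity-relation extraction dataset DuIE2.0 and on the CCKS 2021 classical Chinese entity-relation extraction few-shot dataset CCLUE. Experimental results show that the method extracts more effectively, reaching F1 scores of 79.2% and 55.5%, respectively.
Keywords: ancient Chinese text; entity-relation extraction; BERT ancient-Chinese pre-trained model; BiLSTM; attention; triple information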
A Textbook Knowledge Graph Construction Method Based on Multimodality and Knowledge Distillation
16
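Authors: 刘军, 冷芳玲, 吴旺旺, 鲍玉斌. 《计算机科学与探索》 (CSCD, PKU Core), 2024, No. 11, pp. 2901-2911 (11 pages)
To build multimodal subject knowledge graphs for education efficiently, an entity-relation extraction algorithm for textbook text is proposed, based on large-model knowledge distillation and multi-model collaborative inference. In the training stage, a closed-source model with hundreds of billions of parameters annotates the text data, achieving implicit knowledge distillation. An open-source model with billions of parameters is then instruction-tuned on domain data to improve its instruction-following ability on the entity-relation extraction task. In the inference stage, the closed-source model serves as the guiding model and the open-source billion-parameter model serves as the executing model. Experimental results show that knowledge distillation, multi-model collaboration, and domain-data instruction tuning are effective and significantly improve instruction-prompted entity-relation extraction from textbook text. A multimodal named entity recognition algorithm for textbook diagrams, enhanced with explicit and implicit knowledge, is also proposed. Image OCR, vision-language models, and related techniques extract the text and a global content description from each textbook diagram. Explicit knowledge-base retrieval augmentation and implicit LLM prompt augmentation yield auxiliary knowledge potentially associated with each image-caption pair, and the knowledge from the explicit knowledge base and the implicit LLM is further fused into the final auxiliary knowledge. The diagram's auxiliary knowledge is concatenated with the diagram caption to perform multimodal named entity recognition on textbook diagram captions. Experimental results show that the algorithm is state of the art and also improves interpretability.
Keywords: large language model; subject knowledge graph; entity-relation extraction; multimodal named entity recognition; knowledge distillation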
CLGLF: A Multimodal Named Entity Recognition Method with Confidence-Learning-Guided Label Fusion
17
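Authors: 王海荣, 王彤, 徐玺, 荆博祥, 陈芳萍. 《电子学报》 (EI, CAS, CSCD, PKU Core), 2024, No. 7, pp. 2429-2437 (9 pages)
To address biases in visual semantic understanding and in multimodal semantics in multimodal named entity recognition, this paper proposes a multimodal NER method with confidence-learning-guided label fusion. The method calls the BLIP-2 pre-trained model to generate an image caption, concatenates it with the input text, and jointly encodes image and text to fuse multimodal features; decoding the multimodal representation and the text representation yields candidate labels and text labels. After aligning the two sets of labels with a KL divergence loss, it computes confidence scores to assess the quality of the multimodal representation, sets a confidence threshold to help filter out biased candidate labels, and replaces each biased candidate label with the text label at the same position, thereby fusing the labels and completing multimodal NER. To validate the method, experiments were conducted on the Twitter-2015 and Twitter-2017 multimodal datasets and the results were compared with seven mainstream methods, including MSB and UMT; the results demonstrate the effectiveness of the method.
Keywords: multimodal named entity recognition; image captioning; confidence learning; multimodal semantic bias; information extraction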
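The label-fusion step described above can be pictured with a short sketch. It assumes, beyond what the abstract states, that a candidate label counts as biased when its confidence score falls below the threshold; the function name, the threshold value, and the example tags are hypothetical, and the scoring model itself is not shown.

```python
def fuse_labels(candidate_labels, text_labels, confidences, threshold=0.5):
    """Keep a multimodal candidate label when its confidence reaches the threshold;
    otherwise fall back to the text-only label at the same position."""
    if not (len(candidate_labels) == len(text_labels) == len(confidences)):
        raise ValueError("all three sequences must have the same length")
    return [
        candidate if confidence >= threshold else text
        for candidate, text, confidence in zip(candidate_labels, text_labels, confidences)
    ]


# Hypothetical token-level predictions for a four-token post.
candidate = ["B-PER", "O", "B-LOC", "O"]      # decoded from the multimodal representation
text_only = ["B-PER", "O", "B-ORG", "O"]      # decoded from the text representation
confidence = [0.93, 0.88, 0.31, 0.97]         # assumed per-token confidence scores

fused = fuse_labels(candidate, text_only, confidence, threshold=0.5)
print(fused)
```

Only the third token is swapped, because its assumed confidence of 0.31 is below the threshold.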
Named Entity Recognition for Monitoring Text of Mine Electromechanical Equipment
18
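Authors: 邱云飞, 邢浩然, 于智龙, 张文文. 《计算机工程与应用》 (CSCD, PKU Core), 2024, No. 11, pp. 129-138 (10 pages)
Correctly extracting entities such as equipment names, parameter standards, fault locations, and fault types from monitoring text of mine electromechanical equipment can help experts detect abnormal equipment early and improve the efficiency and accuracy of fault analysis. Entities in this domain are mostly nested, consist of long character sequences, and depend strongly on context. To handle these characteristics, an entity recognition method combining multi-granularity features is proposed: a machine reading comprehension framework first determines the boundaries of long nested entities, and a BiLSTM network with an attention mechanism then mines the contextual associations between entities. Experimental results show that the method recognizes entities in mine electromechanical equipment monitoring text well and also improves named entity recognition in other low-resource scenarios.
Keywords: mine electromechanical equipment; named entity recognition; multi-granularity information; machine reading comprehension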
A Joint Entity-Relation Extraction Model Based on Latent Relations
19
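Authors: 彭晏飞, 张睿思, 王瑞华, 郭家隆. 《计算机科学与探索》 (CSCD, PKU Core), 2024, No. 4, pp. 1047-1056 (10 pages)
Joint entity-relation extraction identifies entities and their relations in a given text, and it underlies knowledge graph construction and updating. Current joint extraction methods pursue performance while overlooking information redundancy in the extraction process. To address this, a joint entity-relation extraction model based on latent relations is proposed. It designs a new decoding scheme to reduce redundant information about relations, entities, and triples during prediction, and extracts triples from a sentence in two steps: extracting latent entity pairs and decoding relations. First, a latent entity pair extractor predicts whether a latent relation exists between entities and selects the high-confidence entity pairs as the final latent entity pairs. Second, relation decoding is treated as a multi-label binary classification task, in which a relation decoder predicts the confidence of every relation for each latent entity pair. Finally, the number and types of relations are determined from the confidences, completing triple extraction. Experiments on two general-purpose datasets show that the proposed model outperforms baseline models in accuracy and F1, verifying its effectiveness; ablation experiments also confirm the effectiveness of each component.
Keywords: joint entity-relation extraction; latent relations; latent entity pairs; multi-label binary classification task; information redundancy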
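The two decoding steps can be sketched as plain thresholding over scores. This is a simplified illustration rather than the paper's decoder: the function name, both threshold values, and the example scores are assumptions, and in the actual model the scores would come from the latent entity pair extractor and the relation decoder.

```python
def decode_triples(pair_scores, relation_scores, pair_threshold=0.5, relation_threshold=0.5):
    """Two-step decoding: filter latent entity pairs, then treat each relation
    as an independent binary decision for every pair that survives."""
    triples = []
    for (subject, obj), pair_confidence in pair_scores.items():
        if pair_confidence < pair_threshold:
            continue  # low-confidence pair: skipped before any relation is scored
        for relation, confidence in relation_scores.get((subject, obj), {}).items():
            if confidence >= relation_threshold:
                triples.append((subject, relation, obj))
    return triples


# Hypothetical scores for one sentence.
pair_scores = {
    ("Marie Curie", "Warsaw"): 0.91,
    ("Marie Curie", "physics"): 0.22,
}
relation_scores = {
    ("Marie Curie", "Warsaw"): {"born_in": 0.87, "lived_in": 0.64, "died_in": 0.03},
    ("Marie Curie", "physics"): {"field_of_work": 0.95},
}

print(decode_triples(pair_scores, relation_scores))
```

Because each relation is thresholded independently, one pair can yield several triples, and the number of relations per pair falls out of the confidences instead of being fixed in advance.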
A Method for Recognizing Geological Mineral Attributes Based on Large-Scale Pre-trained Models and Its Application
20
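Authors: 王彬彬, 周可法, 王金林, 汪玮, 李超, 程寅益. 《新疆地质》 (CAS, CSCD), 2024, No. 1, pp. 139-144 (6 pages)
Research results in the earth sciences are usually recorded in technical reports, journal papers, books, and similar literature, yet many detailed geoscience reports go unused, which creates an opportunity for information extraction. We therefore propose a deep neural network model named GMNER (Geological Minerals named entity recognize, MNER) to identify and extract key information such as mineral types, geological structures, rocks, and geological time. Unlike traditional methods, it uses the large-scale pre-trained model BERT (Bidirectional Encoder Representations from Transformers) and a deep neural network to capture contextual information, combined with a conditional random field (CRF) to obtain accurate results. Experimental results show that the MNER model performs well on Chinese geological literature, with an average precision of 0.8984, an average recall of 0.9227, and an average F1 score of 0.9104. The study offers a new route to automatic mineral information extraction and may also promote the management and sustainable use of mineral resources.
Keywords: mineral information extraction; deep neural network; mineral literature; named entity recognition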
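For readers who want to relate the three reported figures, F1 is the harmonic mean of precision and recall. The short check below applies that standard formula to the averages quoted in the abstract; the function name is illustrative, and an average of per-class F1 scores need not equal the F1 of the averaged precision and recall, so this is a consistency check only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Average precision and recall as quoted in the abstract.
reported_precision = 0.8984
reported_recall = 0.9227

print(round(f1_score(reported_precision, reported_recall), 4))  # 0.9104
```

The value agrees with the reported average F1 to four decimal places.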