Funding: State Key Program of National Natural Science of China (Grant/Award Number: 61533018); National Natural Science Foundation of China (Grant/Award Number: 61402220); Philosophy and Social Science Foundation of Hunan Province (Grant/Award Number: 16YBA323); Natural Science Foundation of Hunan Province (Grant/Award Number: 2020JJ4525); Scientific Research Fund of Hunan Provincial Education Department (Grant/Award Numbers: 18B279, 19A439).
Abstract: Joint entity and relation extraction has attracted growing attention in natural language processing (NLP). However, most existing methods rely on NLP tools to construct dependency trees in order to obtain sentence structure information. The adjacency matrix constructed from the dependency tree can convey syntactic information. Dependency trees obtained through NLP tools depend heavily on those tools and may not describe contextual semantics accurately. At the same time, a large amount of irrelevant information causes redundancy. This paper presents a novel end-to-end joint entity and relation extraction method based on a multi-head attention graph convolutional network model (MAGCN), which does not rely on external tools. MAGCN generates an adjacency matrix through a multi-head attention mechanism to form an attention graph convolutional network model, uses head selection to identify multiple relations, and effectively improves the prediction of overlapping relations. The authors conduct extensive experiments on three public datasets, NYT, WebNLG, and CoNLL04, to demonstrate the method's effectiveness. The results show that the authors' method outperforms state-of-the-art results on the entity and relation extraction task.
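The idea of deriving the adjacency matrix from attention rather than from a dependency parser can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MAGCN implementation: the function names, the tensor shapes, the averaging over heads, and the random weights are all hypothetical, and the head-selection step mentioned in the abstract is not shown.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_adjacency(H, Wq, Wk):
    """Build one soft adjacency matrix per attention head.

    H:  (n_tokens, d_model) token representations.
    Wq, Wk: (n_heads, d_model, d_k) query/key projections (hypothetical).
    Returns (n_heads, n_tokens, n_tokens); each row sums to 1.
    """
    Q = np.einsum("nd,hdk->hnk", H, Wq)
    K = np.einsum("nd,hdk->hnk", H, Wk)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(Wq.shape[-1])
    return softmax(scores, axis=-1)

def attention_gcn_layer(H, A, W):
    """One graph-convolution step per head, averaged over heads, then ReLU.

    A: (n_heads, n_tokens, n_tokens) attention-derived adjacency.
    W: (n_heads, d_model, d_out) per-head layer weights (hypothetical).
    """
    per_head = np.einsum("hnm,md,hdo->hno", A, H, W)
    return np.maximum(per_head.mean(axis=0), 0.0)

# Toy example with random inputs; the sizes are illustrative only.
rng = np.random.default_rng(0)
n_tokens, d_model, d_k, d_out, n_heads = 5, 8, 4, 6, 2
H = rng.normal(size=(n_tokens, d_model))
Wq = rng.normal(size=(n_heads, d_model, d_k))
Wk = rng.normal(size=(n_heads, d_model, d_k))
W = rng.normal(size=(n_heads, d_model, d_out))

A = attention_adjacency(H, Wq, Wk)
H_next = attention_gcn_layer(H, A, W)
```

Because each attention row is a probability distribution over tokens, the resulting graph is dense and weighted, in contrast to the sparse 0/1 adjacency matrix that a dependency tree would give.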
Funding: Supported by the National Natural Science Foundation of China (Grant No. 81972492); National Science Fund for Young Scholars (Grant No. 21904107); Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars (Grant No. LR19C050001); Hangzhou Agriculture and Society Advancement Program (Grant No. 20190101A04); Westlake Startup Grant; research funds from the National Cancer Centre Singapore and Singapore General Hospital, Singapore; the National Key R&D Program of China (Grant No. 2016YFC0901704); Zhejiang Innovation Discipline Project of Laboratory Animal Genetic Engineering (Grant No. 201510); the Netherlands Cancer Society (Grant No. NKI 2014-6651); The Netherlands Organization for Scientific Research (NWO)-Middelgroot (Grant No. 91116017).
Abstract: To address the increasing need for detecting and validating protein biomarkers in clinical specimens, mass spectrometry (MS)-based targeted proteomic techniques, including selected reaction monitoring (SRM), parallel reaction monitoring (PRM), and massively parallel data-independent acquisition (DIA), have been developed. For optimal performance, they require the fragment ion spectra of targeted peptides as prior knowledge. In this report, we describe an MS pipeline and spectral resource to support targeted proteomics studies of human tissue samples. To build the spectral resource, we integrated common open-source MS computational tools to assemble a freely accessible computational workflow based on Docker. We then applied the workflow to generate DPHL, a comprehensive DIA pan-human library, from 1096 data-dependent acquisition (DDA) MS raw files for 16 types of cancer samples. This extensive spectral resource was then applied to a proteomic study of 17 prostate cancer (PCa) patients. Thereafter, PRM validation was applied to a larger study of 57 PCa patients, and the differential expression of three proteins in prostate tumors was validated. As a second application, the DPHL spectral resource was applied to a study consisting of plasma samples from 19 diffuse large B cell lymphoma (DLBCL) patients and 18 healthy control subjects. Differentially expressed proteins between DLBCL patients and healthy control subjects were detected by DIA-MS and confirmed by PRM. These data demonstrate that DPHL supports DIA and PRM MS pipelines for robust protein biomarker discovery. DPHL is freely accessible at https://www.iprox.org/page/project.html?id=IPX0001400000.
Funding: The State Key Program of National Natural Science of China (Grant/Award Number: 61533018); National Natural Science Foundation of China (Grant/Award Number: 61402220); The Philosophy and Social Science Foundation of Hunan Province (Grant/Award Number: 16YBA323); Natural Science Foundation of Hunan Province (Grant/Award Numbers: 2020JJ4525, 2022JJ30495); Scientific Research Fund of Hunan Provincial Education Department (Grant/Award Numbers: 18B279, 19A439).
Abstract: Few-shot learning has been proposed and is rapidly emerging as a viable means of completing various tasks. Many few-shot models have been widely used for relation learning tasks. However, each of these models falls short in capturing a certain aspect of semantic features: for example, CNNs on long-range dependencies and Transformers on local features. It is difficult for a single model to adapt to varied relation learning tasks, which results in a high-variance problem. An ensemble strategy can be competitive in improving the accuracy of few-shot relation extraction and mitigating high-variance risks. This paper explores an ensemble approach to reduce the variance and introduces fine-tuning and feature attention strategies to calibrate relation-level features. Results on several few-shot relation learning tasks show that our model significantly outperforms the previous state-of-the-art models.
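The variance-reduction argument behind ensembling can be illustrated with a minimal sketch. The abstract does not state how the models are combined, so plain averaging of class probabilities is an assumption here, and the function names and example logits are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(per_model_logits):
    """Average class probabilities across models (assumed combination rule).

    per_model_logits: one list of relation-class logits per base model.
    Returns the averaged probabilities and the index of the top class.
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return avg, label

# Hypothetical logits from three base models (e.g. a CNN, a Transformer,
# and a third encoder) for one query instance over three relation classes.
logits = [
    [2.0, 0.5, 0.1],
    [0.2, 1.5, 0.3],
    [1.8, 0.4, 0.2],
]
avg_probs, label = ensemble_predict(logits)
```

In this toy case the second model disagrees with the other two, and averaging lets the majority evidence decide, which is the sense in which an ensemble dampens the variance of any single model.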