Journal Articles
2 articles found
1. Masked Sentence Model Based on BERT for Move Recognition in Medical Scientific Abstracts (cited 20 times)
Authors: Gaihong Yu, Zhixiong Zhang, Huan Liu, Liangping Ding. Journal of Data and Information Science (CSCD), 2019, No. 4, pp. 42-55.
Purpose: Move recognition in scientific abstracts is an NLP task of classifying sentences of the abstracts into different types of language units. To improve the performance of move recognition in scientific abstracts, a novel model of move recognition is proposed that outperforms the BERT-based method.
Design/methodology/approach: Prevalent models based on BERT for sentence classification often classify sentences without considering the context of the sentences. In this paper, inspired by the BERT masked language model (MLM), we propose a novel model called the masked sentence model that integrates the content and contextual information of the sentences in move recognition. Experiments are conducted on the benchmark dataset PubMed 20k RCT in three steps. Then, we compare our model with HSLN-RNN, BERT-based and SciBERT using the same dataset.
Findings: Compared with the BERT-based and SciBERT models, the F1 score of our model outperforms them by 4.96% and 4.34%, respectively, which shows the feasibility and effectiveness of the novel model; the result of our model comes closest to the state-of-the-art results of HSLN-RNN at present.
Research limitations: The sequential features of move labels are not considered, which might be one of the reasons why HSLN-RNN has better performance. Our model is restricted to dealing with biomedical English literature because we use a dataset from PubMed, which is a typical biomedical database, to fine-tune our model.
Practical implications: The proposed model is better and simpler in identifying move structures in scientific abstracts and is worthy of text classification experiments for capturing contextual features of sentences.
Originality/value: The study proposes a masked sentence model based on BERT that considers the contextual features of the sentences in abstracts in a new way. The performance of this classification model is significantly improved by rebuilding the input layer without changing the structure of the neural network.
Keywords: move recognition, BERT, masked sentence model, scientific abstracts
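The abstract describes rebuilding the BERT input layer so that each sentence is classified together with its abstract-level context. Below is a minimal sketch of one plausible way to construct such a "masked sentence" input with Hugging Face Transformers: the target sentence is paired with the full abstract in which that sentence has been replaced by [MASK] tokens. The exact input construction, model names, and label set are assumptions for illustration, not the authors' published configuration.

```python
# Hedged sketch: pair-style BERT input of (target sentence, abstract context
# with the target sentence masked out). Untrained classification head, so the
# predicted label here is arbitrary; shown only to illustrate the input format.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

LABELS = ["BACKGROUND", "OBJECTIVE", "METHODS", "RESULTS", "CONCLUSIONS"]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

def build_masked_input(sentences, target_idx):
    """Pair the target sentence with its abstract context, replacing the
    target sentence inside the context with a [MASK] placeholder."""
    context = " ".join(
        "[MASK]" if i == target_idx else s for i, s in enumerate(sentences)
    )
    return tokenizer(
        sentences[target_idx], context,
        truncation=True, max_length=512, return_tensors="pt"
    )

abstract = [
    "Previous studies have reported conflicting results.",
    "We aimed to evaluate the effect of drug X on blood pressure.",
    "Patients were randomly assigned to treatment or placebo.",
]
inputs = build_masked_input(abstract, target_idx=1)
with torch.no_grad():
    logits = model(**inputs).logits
print(LABELS[logits.argmax(-1).item()])
```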
2. RCMR 280k: Refined Corpus for Move Recognition Based on PubMed Abstracts
Authors: Jie Li, Gaihong Yu, Zhixiong Zhang. Data Intelligence (EI), 2023, No. 3, pp. 511-536.
Existing datasets for move recognition, such as PubMed 200k RCT, exhibit several problems that significantly impact recognition performance, especially for the Background and Objective labels. To improve move recognition performance, we introduce a method and construct a refined corpus based on PubMed, named RCMR 280k. This corpus comprises approximately 280,000 structured abstracts totaling 3,386,008 sentences, each labeled with one of five categories: Background, Objective, Method, Result, or Conclusion. We also construct a subset of RCMR, named RCMR_RCT, corresponding to the medical subdomain of RCTs. We conduct comparison experiments using our RCMR and RCMR_RCT against PubMed 380k and PubMed 200k RCT, respectively. The best results, obtained using the MSMBERT model, show that: (1) our RCMR outperforms PubMed 380k by 0.82%, while our RCMR_RCT outperforms PubMed 200k RCT by 9.35%; (2) compared with PubMed 380k, our corpus achieves larger improvements in the Results and Conclusions categories, with average F1 improving by 1% and 0.82%, respectively; (3) compared with PubMed 200k RCT, our corpus significantly improves performance in the Background and Objective categories, with average F1 scores improving by 28.31% and 37.22%, respectively. To the best of our knowledge, our RCMR is among the few high-quality, resource-rich refined PubMed corpora available. Our work in this paper has been applied in SciAlEngine, which is openly accessible for researchers to conduct move recognition tasks.
Keywords: refined corpus, move recognition, sequential sentence classification, corpus construction, corpus analysis
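This abstract describes building a move-recognition corpus from structured PubMed abstracts, where each sentence inherits a label from its section heading. The sketch below illustrates that general recipe; the heading-to-label mapping and filtering rules are hypothetical and do not reproduce the authors' exact construction procedure for RCMR 280k.

```python
# Hedged sketch of the generic corpus-construction recipe for move recognition:
# split each structured abstract into sentences and map section headings to the
# five canonical labels. Mapping table and discard policy are assumptions.
import re

# Hypothetical mapping from structured-abstract headings to canonical labels.
HEADING_TO_LABEL = {
    "BACKGROUND": "BACKGROUND", "INTRODUCTION": "BACKGROUND",
    "OBJECTIVE": "OBJECTIVE", "OBJECTIVES": "OBJECTIVE", "AIM": "OBJECTIVE",
    "METHODS": "METHODS", "MATERIALS AND METHODS": "METHODS",
    "RESULTS": "RESULTS", "FINDINGS": "RESULTS",
    "CONCLUSION": "CONCLUSIONS", "CONCLUSIONS": "CONCLUSIONS",
}

def label_structured_abstract(sections):
    """sections: list of (heading, text) pairs from one structured abstract.
    Returns (sentence, label) pairs; abstracts with unmapped headings are skipped."""
    labeled = []
    for heading, text in sections:
        label = HEADING_TO_LABEL.get(heading.strip().upper())
        if label is None:
            return []  # discard abstracts whose headings cannot be mapped
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                labeled.append((sentence, label))
    return labeled

example = [
    ("OBJECTIVE", "To assess the efficacy of drug X in adults with hypertension."),
    ("METHODS", "We conducted a double-blind randomized controlled trial."),
    ("RESULTS", "Drug X lowered systolic blood pressure by 8 mmHg on average."),
]
for sentence, label in label_structured_abstract(example):
    print(f"{label}\t{sentence}")
```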