45 articles found
Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
1
Authors: R.Sujatha K.Nimala, "Computers, Materials & Continua" (SCIE, EI), 2024, Issue 2, pp. 1669-1686, 18 pages
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic highlights than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress and comparing impacts. An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are used for analyzing the conversation progress and predicting the pecking order of the conversation. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus with hyperparameters. Hyperparameter tuning is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1 score of 0.88.
Keywords: Bidirectional encoder for representation of transformer; conversation; ensemble model; fine-tuning; generalized autoregressive pretraining for language understanding; generative pre-trained transformer; hyperparameter tuning; natural language processing; robustly optimized BERT pretraining approach; sentence classification; transformer models
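The abstract of entry 1 does not say how the five models' outputs are combined. One common option is soft voting (averaging per-class probabilities), sketched below under that assumption. The probabilities are invented for illustration, and `soft_vote` is a hypothetical helper, not the authors' EPLM-HT code.

```python
from typing import Dict, List

# The four categories named in the abstract.
LABELS = ["information", "question", "directive", "commission"]


def soft_vote(model_probs: Dict[str, List[float]]) -> str:
    """Average per-class probabilities across models and return the top label."""
    n_models = len(model_probs)
    averaged = [
        sum(probs[i] for probs in model_probs.values()) / n_models
        for i in range(len(LABELS))
    ]
    best_index = max(range(len(LABELS)), key=lambda i: averaged[i])
    return LABELS[best_index]


# Hypothetical class probabilities for one sentence (not taken from the paper).
example = {
    "bert":       [0.10, 0.70, 0.15, 0.05],
    "roberta":    [0.20, 0.55, 0.20, 0.05],
    "gpt":        [0.40, 0.35, 0.20, 0.05],
    "distilbert": [0.15, 0.60, 0.20, 0.05],
    "xlnet":      [0.25, 0.50, 0.15, 0.10],
}

print(soft_vote(example))  # question
```

Soft voting lets a confident model outweigh an uncertain one; hard (majority) voting would be an equally plausible reading of "ensemble" here.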
Adapter Based on Pre-Trained Language Models for Classification of Medical Text
2
Authors: Quan Li, "Journal of Electronic Research and Application", 2024, Issue 3, pp. 129-134, 6 pages
We present an approach to classify medical text at a sentence level automatically. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Keywords: Classification of medical text; Adapter; Pre-trained language model
Construction and application of knowledge graph for grid dispatch fault handling based on pre-trained model
3
Authors: Zhixiang Ji Xiaohui Wang Jie Zhang Di Wu, "Global Energy Interconnection" (EI, CSCD), 2023, Issue 4, pp. 493-504, 12 pages
With the construction of new power systems, the power grid has become extremely large, with an increasing proportion of new energy and AC/DC hybrid connections. The dynamic characteristics and fault patterns of the power grid are complex; additionally, power grid control is difficult, operation risks are high, and the task of fault handling is arduous. Traditional power-grid fault handling relies primarily on human experience. Differences and gaps in the knowledge of control personnel restrict the accuracy and timeliness of fault handling. Therefore, this mode of operation is no longer suitable for the requirements of new systems. Based on the multi-source heterogeneous data of power grid dispatch, this paper proposes a joint entity–relationship extraction method for power-grid dispatch fault processing based on a pre-trained model, constructs a knowledge graph of power-grid dispatch fault processing, and designs and develops a fault-processing auxiliary decision-making system based on the knowledge graph. The system was applied at a provincial dispatch control center, where it effectively improved the accident processing ability and the level of intelligence in accident management and control of the power grid.
Keywords: Power-grid dispatch fault handling; Knowledge graph; Pre-trained model; Auxiliary decision-making
Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis
4
Authors: Jieyu An Wan Mohd Nazmee Wan Zainon Binfen Ding, "Intelligent Automation & Soft Computing" (SCIE), 2023, Issue 8, pp. 1673-1689, 17 pages
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes, such as text and image, to accurately assess sentiment. However, conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities. This limitation is attributed to their training on unimodal data, and necessitates the use of complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby enabling the capture of rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which leads to improved performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness. We achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Keywords: Multimodal sentiment analysis; vision–language pre-trained model; contrastive learning; sentiment classification
Evaluating the role of large language models in inflammatory bowel disease patient information
5
Authors: Eun Jeong Gong Chang Seok Bang, "World Journal of Gastroenterology" (SCIE, CAS), 2024, Issue 29, pp. 3538-3540, 3 pages
This letter evaluates the article by Gravina et al on ChatGPT's potential in providing medical information for inflammatory bowel disease patients. While promising, it highlights the need for advanced techniques like reasoning+action and retrieval-augmented generation to improve accuracy and reliability. Emphasizing that simple question and answer testing is insufficient, it calls for more nuanced evaluation methods to truly gauge large language models' capabilities in clinical applications.
Keywords: Crohn's disease; Ulcerative colitis; Inflammatory bowel disease; Chat generative pre-trained transformer; Large language model; Artificial intelligence
A Classification–Detection Approach of COVID-19 Based on Chest X-ray and CT by Using Keras Pre-Trained Deep Learning Models (cited by: 10)
6
Authors: Xing Deng Haijian Shao Liang Shi Xia Wang Tongling Xie, "Computer Modeling in Engineering & Sciences" (SCIE, EI), 2020, Issue 11, pp. 579-596, 18 pages
The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, placing enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient's lungs based on the chest X-ray and CT generated by radiation imaging. In this paper, five Keras-related deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning and pre-trained VGGNet16) are applied to formulate a classification-detection approach for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Conventional Neural Networks), are compared with the classification-detection approach on the performance indicators, i.e., precision, recall, F1 scores, confusion matrix, classification accuracy and three types of AUC (Area Under Curve). The highest classification accuracies obtained on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-related deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
Keywords: COVID-19 detection; deep learning; transfer learning; pre-trained models
Vulnerability Detection of Ethereum Smart Contract Based on SolBERT-BiGRU-Attention Hybrid Neural Model
7
Authors: Guangxia Xu Lei Liu Jingnan Dong, "Computer Modeling in Engineering & Sciences" (SCIE, EI), 2023, Issue 10, pp. 903-922, 20 pages
In recent years, with the great success of pre-trained language models, the pre-trained BERT model has been gradually applied to the field of source code understanding. However, the time cost of training a language model from scratch is very high, and how to transfer a pre-trained language model to the field of smart contract vulnerability detection is currently a hot research direction. In this paper, we propose a hybrid model to detect common vulnerabilities in smart contracts, based on a lightweight pre-trained language model BERT connected to a bidirectional gated recurrent unit model. The downstream neural network adopts the bidirectional gated recurrent unit neural network model with a hierarchical attention mechanism to mine more semantic features contained in the source code of smart contracts by using their characteristics. In our experiments, the proposed hybrid neural network model SolBERT-BiGRU-Attention is fitted on a large number of data samples with smart contract vulnerabilities. Compared with existing methods, the accuracy of our model reaches 93.85% and the Micro-F1 score is 94.02%.
Keywords: Smart contract; pre-trained language model; deep learning; recurrent neural network; blockchain security
A PERT-BiLSTM-Att Model for Online Public Opinion Text Sentiment Analysis
8
Authors: Mingyong Li Zheng Jiang Zongwei Zhao Longfei Ma, "Intelligent Automation & Soft Computing" (SCIE), 2023, Issue 8, pp. 2387-2406, 20 pages
As an essential category of public event management and control, sentiment analysis of online public opinion text plays a vital role in public opinion early warning, network rumor management, and netizens' personality portraits under massive public opinion data. The traditional sentiment analysis model is not sensitive to the location information of words, has difficulty solving the problem of polysemy, and differs greatly in its ability to learn representations of long and short sentences, which leads to low accuracy of sentiment classification. This paper proposes a sentiment analysis model, PERT-BiLSTM-Att, for public opinion text based on the pre-training model of the disordered language model, a bidirectional long short-term memory network (BiLSTM) and an attention mechanism. The model first uses the PERT model, pre-trained on the lexical location information of a large corpus, to process the text data and obtain the dynamic feature representation of the text. Then the semantic features are input into BiLSTM to learn context sequence information and enhance the model's ability to represent long sequences. Finally, the attention mechanism is used to focus on the words that contribute more to the overall emotional tendency, to make up for the traditional model's weak representation of short text, and the classification results are output through the fully connected network. The experimental results show that the classification accuracy of the model on the NLPCC14 and weibo_senti_100k public data sets reaches 88.56% and 97.05%, respectively, and the accuracy reaches 95.95% on the data set MDC22, composed of Meituan, Dianping and Ctrip comments. This shows that the model is effective for sentiment analysis of online public opinion texts on different platforms. The experimental results on different datasets verify the model's effectiveness in sentiment analysis of texts. At the same time, the model has a strong generalization ability and can achieve good results for sentiment analysis of datasets in different fields.
Keywords: Natural language processing; PERT pre-training model; emotional analysis; BiLSTM
Robust Deep Learning Model for Black Fungus Detection Based on Gabor Filter and Transfer Learning
9
Authors: Esraa Hassan Fatma M.Talaat Samah Adel Samir Abdelrazek Ahsan Aziz Yunyoung Nam Nora El-Rashidy, "Computer Systems Science & Engineering" (SCIE, EI), 2023, Issue 11, pp. 1507-1525, 19 pages
Black fungus is a rare and dangerous mycosis that usually affects the brain and lungs and could be life-threatening in diabetic cases. Recently, some COVID-19 survivors, especially those with co-morbid diseases, have been susceptible to black fungus. Therefore, recovered COVID-19 patients should seek medical support when they notice mucormycosis symptoms. This paper proposes a novel ensemble deep-learning model that includes three pre-trained models: ResNet-50, VGG-19, and Inception. Our approach is medically intuitive and efficient compared to the traditional deep learning models. An image dataset was aggregated from various resources and divided into two classes: a black fungus class and a skin infection class. To the best of our knowledge, our study is the first that is concerned with building black fungus detection models based on deep learning algorithms. The proposed approach can significantly improve the performance of the classification task and increase the generalization ability of such a binary classification task. According to the reported results, it has empirically achieved a sensitivity value of 0.9907, a specificity value of 0.9938, a precision value of 0.9938, and a negative predictive value of 0.9907.
Keywords: Black fungus; COVID-19; Transfer learning; pre-trained models; medical image
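The four figures reported in entry 9 (sensitivity, specificity, precision, negative predictive value) are all ratios over a binary confusion matrix. A minimal sketch of those standard definitions follows; the counts are invented for illustration and are not the paper's data, and `binary_metrics` is a hypothetical helper name.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard ratios computed from binary confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are normalized by the true class sizes, whereas precision and NPV are normalized by the predicted class sizes, so the latter pair shifts with class balance.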
Intelligent Deep Convolutional Neural Network Based Object Detection Model for Visually Challenged People
10
Authors: S.Kiruthika Devi Amani Abdulrahman Albraikan Fahd N.Al-Wesabi Mohamed K.Nour Ahmed Ashour Anwer Mustafa Hilal, "Computer Systems Science & Engineering" (SCIE, EI), 2023, Issue 9, pp. 3191-3207, 17 pages
Artificial Intelligence (AI) and Computer Vision (CV) advancements have led to many useful methodologies in recent years, particularly to help visually challenged people. Object detection includes a variety of challenges, for example, handling multiple class images, images that get augmented when captured by a camera and so on. The test images include all these variants as well. These detection models alert visually challenged people about their surroundings when they want to walk independently. This study compares four CNN-based pre-trained models: Residual Network (ResNet-50), Inception v3, Dense Convolutional Network (DenseNet-121), and SqueezeNet, predominantly used in image recognition applications. Based on the analysis performed on these test images, the study infers that Inception v3 outperformed the other pre-trained models in terms of accuracy and speed. To further improve the performance of the Inception v3 model, the thermal exchange optimization (TEO) algorithm is applied to tune the hyperparameters (number of epochs, batch size, and learning rate), showing the novelty of the work. Better accuracy was achieved owing to the inclusion of an auxiliary classifier as a regularizer, hyperparameter optimizer, and factorization approach. Additionally, Inception v3 can handle images of different sizes. This makes Inception v3 the optimum model for assisting visually challenged people in real-world communication when integrated with Internet of Things (IoT)-based devices.
Keywords: pre-trained models; object detection; visually challenged people; deep learning; Inception V3; DenseNet-121
Efficient Grad-Cam-Based Model for COVID-19 Classification and Detection
11
Authors: Saleh Albahli Ghulam Nabi Ahmad Hassan Yar, "Computer Systems Science & Engineering" (SCIE, EI), 2023, Issue 3, pp. 2743-2757, 15 pages
Corona Virus (COVID-19) is a novel virus that crossed an animal-human barrier and emerged in Wuhan, China. Until now it has affected more than 119 million people. Detection of COVID-19 is a critical task, and the large number of patients has led to a shortage of doctors for its detection. In this paper, a model has been suggested that not only detects COVID-19 using X-ray and CT-Scan images but also shows the affected areas. Three classes have been defined for X-ray images: COVID-19, normal, and Pneumonia. For CT-Scan images, two classes have been defined: COVID-19 and non-COVID-19. For classification purposes, pretrained models like ResNet50, VGG-16, and VGG19 have been used with some tuning. For detecting the affected areas, Gradient-weighted Class Activation Mapping (Grad-CAM) has been used. As the X-ray and CT images are taken at different intensities, contrast limited adaptive histogram equalization (CLAHE) has been applied to see the effect on the training of the models. As a result of these experiments, we achieved a maximum validation accuracy of 88.10% with a training accuracy of 88.48% for CT-Scan images using the ResNet50 model, while for X-ray images we achieved a maximum validation accuracy of 97.31% with a training accuracy of 95.64% using the VGG16 model.
Keywords: Convolutional neural networks (CNN); COVID-19; pre-trained models; CLAHE; Grad-Cam; X-ray; data augmentation
Personality Trait Detection via Transfer Learning
12
Authors: Bashar Alshouha Jesus Serrano-Guerrero Francisco Chiclana Francisco P.Romero Jose A.Olivas, "Computers, Materials & Continua" (SCIE, EI), 2024, Issue 2, pp. 1933-1956, 24 pages
Personality recognition plays a pivotal role when developing user-centric solutions such as recommender systems or decision support systems across various domains, including education, e-commerce, or human resources. Traditional machine learning techniques have been broadly employed for personality trait identification; nevertheless, the development of new technologies based on deep learning has led to new opportunities to improve their performance. This study focuses on the capabilities of pre-trained language models such as BERT, RoBERTa, ALBERT, ELECTRA, ERNIE, or XLNet to deal with the task of personality recognition. These models are able to capture structural features from textual content and comprehend a multitude of language facets and complex features such as hierarchical relationships or long-term dependencies. This makes them suitable for classifying multi-label personality traits from reviews while mitigating computational costs. The approach centers on developing an architecture based on different layers able to capture the semantic context and structural features from texts. Moreover, it is able to fine-tune the previous models using the MyPersonality dataset, which comprises 9,917 status updates contributed by 250 Facebook users. These status updates are categorized according to the well-known Big Five personality model, setting the stage for a comprehensive exploration of personality traits. To test the proposal, a set of experiments has been performed using different metrics such as the exact match ratio, hamming loss, zero-one loss, precision, recall, F1-score, and weighted averages. The results reveal that ERNIE is the top-performing model, achieving an exact match ratio of 72.32%, an accuracy rate of 87.17%, and an F1-score of 84.41%. The findings demonstrate that the tested models substantially outperform other state-of-the-art studies, enhancing the accuracy by at least 3% and confirming them as powerful tools for personality recognition. These findings represent substantial advancements in personality recognition, making them appropriate for the development of user-centric applications.
Keywords: Personality trait detection; pre-trained language model; big five model; transfer learning
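Entry 12 evaluates multi-label predictions with the exact match ratio and Hamming loss, two metrics that are easy to confuse. The sketch below shows the usual definitions on made-up Big Five style label vectors (five binary traits per sample); the data and the helper names are assumptions for illustration, not the paper's code.

```python
from typing import List

Labels = List[List[int]]


def exact_match_ratio(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of samples whose whole label vector is predicted correctly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual labels that are predicted incorrectly."""
    total = sum(len(t) for t in y_true)
    wrong = sum(a != b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    return wrong / total


# Hypothetical data: three samples, five binary traits each.
y_true = [[1, 0, 1, 0, 1], [0, 1, 0, 0, 1], [1, 1, 0, 0, 0]]
y_pred = [[1, 0, 1, 0, 1], [0, 1, 1, 0, 1], [0, 1, 0, 1, 0]]

print(exact_match_ratio(y_true, y_pred))  # 1 of 3 samples fully correct
print(hamming_loss(y_true, y_pred))       # 3 of 15 labels wrong
```

The exact match ratio is strict (one wrong trait fails the whole sample), which is why it is typically much lower than per-label accuracy, as in the figures the abstract reports.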
Short-term displacement prediction for newly established monitoring slopes based on transfer learning
13
Authors: Yuan Tian Yang-landuo Deng Ming-zhi Zhang Xiao Pang Rui-ping Ma Jian-xue Zhang, "China Geology" (CAS, CSCD), 2024, Issue 2, pp. 351-364, 14 pages
This study makes significant progress in addressing the challenges of short-term slope displacement prediction in the Universal Landslide Monitoring Program, an unprecedented disaster mitigation program in China. Many newly established monitoring slopes in the program lack sufficient historical deformation data, making it difficult to extract deformation patterns and provide effective predictions, which play a crucial role in the early warning and forecasting of landslide hazards. A slope displacement prediction method based on transfer learning is therefore proposed. Initially, the method transfers the deformation patterns learned from slopes with relatively rich deformation data, by a pre-trained model based on a multi-slope integrated dataset, to newly established monitoring slopes with limited or even no useful data, thus enabling rapid and efficient predictions for these slopes. Subsequently, as time goes on and monitoring data accumulates, fine-tuning of the pre-trained model for individual slopes can further improve prediction accuracy, enabling continuous optimization of prediction results. A case study indicates that, after being trained on a multi-slope integrated dataset, the TCN-Transformer model can efficiently serve as a pretrained model for displacement prediction at newly established monitoring slopes. The three-day average RMSE is reduced by 34.6% compared to models trained only on individual slope data, and the model also successfully predicts the majority of deformation peaks. The fine-tuned model based on accumulated data on the target newly established monitoring slope further reduced the three-day RMSE by 37.2%, demonstrating considerable predictive accuracy. In conclusion, taking advantage of transfer learning, the proposed slope displacement prediction method effectively utilizes the available data, which enables the rapid deployment and continual refinement of displacement predictions on newly established monitoring slopes.
Keywords: Landslide; Slope displacement prediction; Transfer learning; Integrated dataset; Transformer; Pre-trained model; Universal Landslide Monitoring Program (ULMP); Geological hazards survey engineering
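Entry 13 reports its gains as percentage reductions in RMSE. As a rough illustration, the sketch below computes RMSE and a relative reduction, assuming the percentages are measured relative to the baseline RMSE (the abstract does not state this explicitly). All numbers and function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical displacement series (mm) and two sets of predictions.
observed = [1.0, 2.0, 3.0, 4.0]
single_slope_model = [2.0, 3.0, 4.0, 5.0]  # off by 1.0 everywhere
pretrained_model = [1.5, 2.5, 3.5, 4.5]    # off by 0.5 everywhere

baseline_rmse = rmse(observed, single_slope_model)
improved_rmse = rmse(observed, pretrained_model)
print(relative_reduction(baseline_rmse, improved_rmse))  # 50.0
```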
SHEL:a semantically enhanced hardware-friendly entity linking method
14
Authors: 亓东林 CHEN Shudong DU Rong TONG Da YU Yong, "High Technology Letters" (EI, CAS), 2024, Issue 1, pp. 13-22, 10 pages
With the help of pre-trained language models, the accuracy of the entity linking task has made great strides in recent years. However, most models with excellent performance require fine-tuning on a large amount of training data using large pre-trained language models, which sets a hardware threshold for accomplishing this task. Some researchers have achieved competitive results with less training data through ingenious methods, such as utilizing information provided by the named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, named semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement approach consists of three aspects: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the capture of mention contexts using asymmetric heuristics; (3) calculating a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while taking into account the hardware constraints, and improves the model's convergence speed by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to the previous method, with superior performance on six well-established datasets, even though SHEL is trained using a smaller pre-trained language model as the encoder.
Keywords: entity linking (EL); pre-trained models; knowledge graph; text summarization; semantic enhancement
May ChatGPT be a tool producing medical information for common inflammatory bowel disease patients' questions? An evidence-controlled analysis
15
Authors: Antonietta Gerarda Gravina Raffaele Pellegrino Marina Cipullo Giovanna Palladino Giuseppe Imperio Andrea Ventura Salvatore Auletta Paola Ciamarra Alessandro Federico, "World Journal of Gastroenterology" (SCIE, CAS), 2024, Issue 1, pp. 17-33, 17 pages
Artificial intelligence is increasingly entering everyday healthcare. Large language model (LLM) systems such as Chat Generative Pre-trained Transformer (ChatGPT) have become potentially accessible to everyone, including patients with inflammatory bowel diseases (IBD). However, significant ethical issues and pitfalls exist in innovative LLM tools. The hype generated by such systems may lead to unweighted patient trust in these systems. Therefore, it is necessary to understand whether LLMs (trendy ones, such as ChatGPT) can produce plausible medical information (MI) for patients. This review examined ChatGPT's potential to provide MI regarding questions commonly addressed by patients with IBD to their gastroenterologists. From the review of the outputs provided by ChatGPT, this tool showed some attractive potential while having significant limitations in updating and detailing information and providing inaccurate information in some cases. Further studies and refinement of ChatGPT, possibly aligning the outputs with the leading medical evidence provided by reliable databases, are needed.
Keywords: Crohn's disease; Ulcerative colitis; Inflammatory bowel disease; Chat Generative pre-trained Transformer; Large language model; Artificial intelligence
An Approach to Detect Structural Development Defects in Object-Oriented Programs
16
Authors: Maxime Seraphin Gnagne Mouhamadou Dosso Mamadou Diarra Souleymane Oumtanaga, "Open Journal of Applied Sciences", 2024, Issue 2, pp. 494-510, 17 pages
Structural development defects essentially refer to code structure that violates object-oriented design principles. They make program maintenance challenging and deteriorate software quality over time. Various detection approaches, ranging from traditional heuristic algorithms to machine learning methods, are used to identify these defects. Ensemble learning methods have strengthened the detection of these defects. However, existing approaches do not simultaneously exploit the capabilities of extracting relevant features from pre-trained models and the performance of neural networks for the classification task. Therefore, our goal has been to design a model that combines a pre-trained model, to extract relevant features from code excerpts through transfer learning, with a bagging method whose base estimator is a dense neural network, for defect classification. To achieve this, we composed multiple samples of the same size, drawn with replacement, from the imbalanced dataset MLCQ1. For all the samples, we used the CodeT5-small variant to extract features and trained a bagging method with the neural network Roberta Classification Head to classify defects based on these features. We then compared this model to RandomForest, one of the ensemble methods that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. Next, we observed that it was not necessary to use a data balancing technique with our model when the imbalance rate was 23%. Finally, for blob detection, RandomForest had a median MCC value of 0.36 compared to 0.12 for our method. However, our method was predominant in Long Method detection, with a median MCC value of 0.53 compared to 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects is dependent on specific defects.
Keywords: Object-Oriented Programming; Structural Development Defect Detection; Software Maintenance; Pre-trained models; Features Extraction; Bagging; Neural Network
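Entry 16 compares models by the Matthews correlation coefficient (MCC), which suits imbalanced data because it uses all four confusion-matrix cells. A minimal sketch of the standard formula follows; the counts in the usage lines are invented, and returning 0.0 when the denominator is zero is a common convention assumed here rather than something the abstract specifies.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention (assumed here): undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts.
print(mcc(tp=50, fp=0, tn=50, fn=0))     # 1.0: perfect prediction
print(mcc(tp=25, fp=25, tn=25, fn=25))   # 0.0: no better than chance
print(mcc(tp=0, fp=50, tn=0, fn=50))     # -1.0: total disagreement
```

On this scale, the medians quoted in the abstract (0.12 to 0.53) indicate weak to moderate agreement, well short of the value of 1.0 for perfect prediction.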
Embedding Extraction for Arabic Text Using the AraBERT Model
17
作者 Amira Hamed Abo-Elghit Taher Hamza Aya Al-Zoghby 《Computers, Materials & Continua》 SCIE EI 2022年第7期1967-1994,共28页
Nowadays,we can use the multi-task learning approach to train a machine-learning algorithm to learn multiple related tasks instead of training it to solve a single task.In this work,we propose an algorithm for estimat... Nowadays,we can use the multi-task learning approach to train a machine-learning algorithm to learn multiple related tasks instead of training it to solve a single task.In this work,we propose an algorithm for estimating textual similarity scores and then use these scores in multiple tasks such as text ranking,essay grading,and question answering systems.We used several vectorization schemes to represent the Arabic texts in the SemEval2017-task3-subtask-D dataset.The used schemes include lexical-based similarity features,frequency-based features,and pre-trained model-based features.Also,we used contextual-based embedding models such as Arabic Bidirectional Encoder Representations from Transformers(AraBERT).We used the AraBERT model in two different variants.First,as a feature extractor in addition to the text vectorization schemes’features.We fed those features to various regression models to make a prediction value that represents the relevancy score between Arabic text units.Second,AraBERT is adopted as a pre-trained model,and its parameters are fine-tuned to estimate the relevancy scores between Arabic textual sentences.To evaluate the research results,we conducted several experiments to compare the use of the AraBERT model in its two variants.In terms of Mean Absolute Percentage Error(MAPE),the results showminor variance between AraBERT v0.2 as a feature extractor(21.7723)and the fine-tuned AraBERT v2(21.8211).On the other hand,AraBERT v0.2-Large as a feature extractor outperforms the finetuned AraBERT v2 model on the used data set in terms of the coefficient of determination(R2)values(0.014050,−0.032861),respectively. 展开更多
Keywords: semantic textual similarity; Arabic language; embeddings; AraBERT; pre-trained models; regression; contextual-based models; concurrency concept
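The two metrics quoted in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the sample scores below are hypothetical, and it assumes MAPE is reported as a percentage, which the abstract does not state. It mainly shows why R² can be negative, as in the value reported for the fine-tuned model: a negative R² means the predictions fit worse than always predicting the mean of the true scores.

```python
def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent (assumes no true value is zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical relevancy scores, for illustration only.
y_true = [1.0, 2.0, 4.0]
good_pred = [1.5, 2.0, 3.0]
mean_pred = [sum(y_true) / len(y_true)] * len(y_true)  # always predict the mean
bad_pred = [4.0, 2.0, 1.0]                              # reversed ranking

print(mape(y_true, good_pred))
print(r2(y_true, good_pred), r2(y_true, mean_pred), r2(y_true, bad_pred))
```

A predictor that always returns the mean scores an R² of zero, so values close to zero on either side, like the two in the abstract, indicate that neither variant explains much of the variance in the relevancy scores.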
Textual Content Prediction via Fuzzy Attention Neural Network Model without Predefined Knowledge
18
Authors: Canghong Jin, Guangjie Zhang, Minghui Wu, Shengli Zhou, Taotao Fu. China Communications (SCIE, CSCD), 2020, No. 6, pp. 211-222 (12 pages)
Text analysis is a popular technique for finding the most significant information in texts, including semantic, emotional, and other hidden features, and it has become a research hotspot in the last few years. In particular, some text analysis tasks work with judgment reports, such as analyzing the criminal process and predicting prison terms. Traditional research on text analysis is generally based on special feature selection and ontology model generation, or requires legal experts to provide external knowledge. All these methods require a lot of time and labor. Therefore, in this paper, we use textual data such as judgment reports to perform prison term prediction without external legal knowledge. We propose a framework that combines value-based rules and a fuzzy text to predict the target prison term. The procedure in our framework includes information extraction, term fuzzification, and document vector regression. We carry out experiments with real-world judgment reports and compare our model's performance with that of ten traditional classification and regression models and two deep learning models. The results show that our model achieves competitive results compared with other models as evaluated by the RMSE and R-squared metrics. Finally, we implement a prototype system with a user-friendly GUI that can be used to predict prison terms from the legal text entered by the user.
Keywords: judgment content understanding; pre-trained model; fuzzification; content representation vectors
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey (cited by: 5)
19
Authors: Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, Yaowei Wang, Yonghong Tian, Wen Gao. Machine Intelligence Research (EI, CSCD), 2023, No. 4, pp. 447-482 (36 pages)
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations from transformers (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc. Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper can provide new insights and help new researchers track the most cutting-edge works. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training works in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Keywords: multi-modal (MM); pre-trained model (PTM); information fusion; representation learning; deep learning
EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training (cited by: 1)
20
Authors: Yuxian Gu, Jiaxin Wen, Hao Sun, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang. Machine Intelligence Research (EI, CSCD), 2023, No. 2, pp. 207-219 (13 pages)
Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous works mainly focus on showing and evaluating the conversational performance of the released dialogue model, ignoring the discussion of some key factors towards a powerful human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture designs, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and code publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases and pose some future research directions on large-scale Chinese open-domain dialogue systems.
Keywords: natural language processing; deep learning (DL); large-scale pre-training; dialogue systems; Chinese open-domain conversational model