Multimodal sentiment analysis is an essential area of artificial intelligence research that combines multiple modalities, such as text and images, to assess sentiment accurately. However, conventional approaches that rely on unimodal pre-trained models to extract features from each modality often overlook the intrinsic semantic connections between modalities. This limitation stems from their training on unimodal data and necessitates complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses transfer learning by using a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby capturing rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which improves performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness. We achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
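The abstract above does not spell out its contrastive objective. As a rough illustration only, the sketch below shows a symmetric InfoNCE-style loss over paired image and text embeddings, a common choice for image-text contrastive learning. The function name `info_nce`, the temperature value, and the toy embeddings are assumptions made for this example, not details taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of image_embs is assumed to be paired with row i of text_embs;
    every other row in the batch serves as a negative.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j]: cosine similarity of image i and text j, scaled by temperature.
    sim = [[dot(imgs[i], txts[j]) / temperature for j in range(n)] for i in range(n)]

    def cross_entropy(rows):
        # Mean cross-entropy where the correct class of row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    sim_t = [list(col) for col in zip(*sim)]  # text-to-image direction
    return 0.5 * (cross_entropy(sim) + cross_entropy(sim_t))


# Toy batch: correctly paired embeddings versus deliberately swapped ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss for the aligned pairs is close to zero, while the swapped pairs are penalized heavily, which is the behavior a contrastive objective is meant to produce.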
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic features than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress and comparing impacts. An ensemble pre-trained language model is used here to classify the sentences in a conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are used to analyze the progress of the conversation and to predict its pecking order. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus with hyperparameters. Hyperparameter tuning is carried out to improve sentence classification performance. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1 score of 0.88.
We present an approach to classify medical text at a sentence level automatically. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
With the construction of new power systems, the power grid has become extremely large, with an increasing proportion of new energy and AC/DC hybrid connections. The dynamic characteristics and fault patterns of the power grid are complex; additionally, power grid control is difficult, operation risks are high, and the task of fault handling is arduous. Traditional power-grid fault handling relies primarily on human experience. Differences and gaps in the knowledge of control personnel restrict the accuracy and timeliness of fault handling. Therefore, this mode of operation is no longer suitable for the requirements of new systems. Based on the multi-source heterogeneous data of power grid dispatch, this paper proposes a joint entity-relationship extraction method for power-grid dispatch fault processing based on a pre-trained model, constructs a knowledge graph of power-grid dispatch fault processing, and designs and develops a fault-processing auxiliary decision-making system based on the knowledge graph. The system was applied at a provincial dispatch control center, where it effectively improved the accident processing ability and the level of intelligence in accident management and control of the power grid.
The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, placing enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient's lungs based on chest X-rays and CT scans generated by radiation imaging. In this paper, five Keras-related deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning, and pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Network), are provided for comparison with the classification-detection approaches based on the performance indicators, i.e., precision, recall, F1 score, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies obtained on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-related deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
This article elucidates the concept of large model technology, summarizes the research status of large model technology both domestically and internationally, provides an overview of the application status of large models in vertical industries, outlines the challenges and issues confronted in applying large models in the oil and gas sector, and offers prospects for the application of large models in the oil and gas industry. The existing large models can be briefly divided into three categories: large language models, visual large models, and multimodal large models. The application of large models in the oil and gas industry is still in its infancy. Based on open-source large language models, some oil and gas enterprises have released large language model products using methods like fine-tuning and retrieval-augmented generation. Scholars have attempted to develop scenario-specific models for oil and gas operations by using visual/multimodal foundation models. A few researchers have constructed pre-trained foundation models for seismic data processing and interpretation, as well as core analysis. The application of large models in the oil and gas industry faces challenges such as data quantity and quality that are currently insufficient to support the training of large models, high research and development costs, and poor algorithm autonomy and control. The application of large models should be guided by the needs of the oil and gas business, taking the application of large models as an opportunity to improve data lifecycle management, enhance data governance capabilities, promote the construction of computing power, strengthen the construction of “artificial intelligence + energy” composite teams, and boost the autonomy and control of large model technology.
This letter evaluates the article by Gravina et al. on ChatGPT's potential in providing medical information for inflammatory bowel disease patients. While promising, it highlights the need for advanced techniques like reasoning + action and retrieval-augmented generation to improve accuracy and reliability. Emphasizing that simple question and answer testing is insufficient, it calls for more nuanced evaluation methods to truly gauge large language models' capabilities in clinical applications.
With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but needs to save and compute the gradient of the whole computational graph. In this paper, we propose y-Tuning, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks. y-Tuning learns dense representations for the labels y defined in a given task and aligns them to a fixed feature representation. Because it does not compute the gradients of the text encoder during the training phase, y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa-XXL with 1.6 billion parameters, y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE benchmark with only 2% tunable parameters and much lower training costs.
In recent years, with the great success of pre-trained language models, the pre-trained BERT model has been gradually applied to the field of source code understanding. However, the time cost of training a language model from scratch is very high, and how to transfer a pre-trained language model to the field of smart contract vulnerability detection is currently a hot research direction. In this paper, we propose a hybrid model to detect common vulnerabilities in smart contracts, based on a lightweight pre-trained BERT language model connected to a bidirectional gated recurrent unit model. The downstream neural network adopts the bidirectional gated recurrent unit model with a hierarchical attention mechanism to mine more of the semantic features contained in the source code of smart contracts by using their characteristics. In our experiments, the proposed hybrid neural network model, SolBERT-BiGRU-Attention, is fitted on a large number of data samples with smart contract vulnerabilities. Compared with existing methods, the accuracy of our model reaches 93.85% and its Micro-F1 score reaches 94.02%.
As an essential category of public event management and control, sentiment analysis of online public opinion text plays a vital role in public opinion early warning, network rumor management, and netizens' personality portraits under massive public opinion data. The traditional sentiment analysis model is not sensitive to the location information of words, struggles with polysemy, and differs greatly in its ability to learn representations of long and short sentences, all of which lead to low sentiment classification accuracy. This paper proposes a sentiment analysis model for public opinion text, PERT-BiLSTM-Att, based on a pre-training model of the disordered language model, a bidirectional long short-term memory network (BiLSTM), and an attention mechanism. The model first uses the PERT model, pre-trained on the lexical location information of a large corpus, to process the text data and obtain a dynamic feature representation of the text. Then the semantic features are input into the BiLSTM to learn context sequence information and enhance the model's ability to represent long sequences. Finally, the attention mechanism is used to focus on the words that contribute more to the overall emotional tendency, compensating for the traditional model's weak representation of short texts, and the classification results are output through a fully connected network. The experimental results show that the classification accuracy of the model on the NLPCC14 and weibo_senti_100k public datasets reaches 88.56% and 97.05%, respectively, and the accuracy reaches 95.95% on the MDC22 dataset composed of Meituan, Dianping, and Ctrip comments. This shows that the model performs well on sentiment analysis of online public opinion texts from different platforms. The experimental results on different datasets verify the model's effectiveness for text sentiment analysis. At the same time, the model has strong generalization ability and can achieve good results for sentiment analysis of datasets in different fields.
Black fungus is a rare and dangerous mycosis that usually affects the brain and lungs and could be life-threatening in diabetic cases. Recently, some COVID-19 survivors, especially those with co-morbid diseases, have been susceptible to black fungus. Therefore, recovered COVID-19 patients should seek medical support when they notice mucormycosis symptoms. This paper proposes a novel ensemble deep-learning model that includes three pre-trained models: ResNet-50, VGG-19, and Inception. Our approach is medically intuitive and efficient compared to traditional deep learning models. An image dataset was aggregated from various resources and divided into two classes: a black fungus class and a skin infection class. To the best of our knowledge, our study is the first concerned with building black fungus detection models based on deep learning algorithms. The proposed approach can significantly improve the performance and generalization ability of this binary classification task. According to the reported results, it achieved a sensitivity of 0.9907, a specificity of 0.9938, a precision of 0.9938, and a negative predictive value of 0.9907.
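For readers less familiar with the four metrics reported in the abstract above, the sketch below shows how each is derived from the counts of a binary confusion matrix. The function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return the four standard diagnostic metrics from confusion-matrix counts.

    tp/fn: positive cases predicted correctly / missed.
    tn/fp: negative cases predicted correctly / flagged by mistake.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives that are found
        "specificity": tn / (tn + fp),  # share of true negatives that are found
        "precision": tp / (tp + fp),    # share of positive predictions that are right
        "npv": tn / (tn + fn),          # share of negative predictions that are right
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```

With these illustrative counts, sensitivity is 90/100 and specificity is 80/100, while precision (90/110) and negative predictive value (80/90) depend on how many predictions fall into each class.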
Advancements in Artificial Intelligence (AI) and Computer Vision (CV) have led to many useful methodologies in recent years, particularly to help visually challenged people. Object detection involves a variety of challenges, for example, handling multiple-class images and images that get augmented when captured by a camera. The test images include all these variants as well. Detection models alert visually challenged people about their surroundings when they want to walk independently. This study compares four CNN-based pre-trained models predominantly used in image recognition applications: Residual Network (ResNet-50), Inception v3, Dense Convolutional Network (DenseNet-121), and SqueezeNet. Based on the analysis performed on these test images, the study infers that Inception v3 outperformed the other pre-trained models in terms of accuracy and speed. To further improve the performance of the Inception v3 model, the thermal exchange optimization (TEO) algorithm is applied to tune the hyperparameters (number of epochs, batch size, and learning rate), which constitutes the novelty of the work. Better accuracy was achieved owing to the inclusion of an auxiliary classifier as a regularizer, a hyperparameter optimizer, and a factorization approach. Additionally, Inception v3 can handle images of different sizes. This makes Inception v3 the optimum model for assisting visually challenged people in real-world communication when integrated with Internet of Things (IoT)-based devices.
Coronavirus (COVID-19) is a novel virus that crossed the animal-human barrier and emerged in Wuhan, China. To date it has affected more than 119 million people. Detection of COVID-19 is a critical task, and the large number of patients has led to a shortage of doctors available to detect it. In this paper, a model is proposed that not only detects COVID-19 using X-ray and CT scan images but also shows the affected areas. Three classes are defined for X-ray images: COVID-19, normal, and pneumonia. For CT scan images, two classes are defined: COVID-19 and non-COVID-19. For classification purposes, pre-trained models such as ResNet50, VGG16, and VGG19 are used with some tuning. To detect the affected areas, Gradient-weighted Class Activation Mapping (Grad-CAM) is used. Because X-ray and CT images are taken at different intensities, contrast limited adaptive histogram equalization (CLAHE) is applied to observe its effect on the training of the models. As a result of these experiments, we achieved a maximum validation accuracy of 88.10% with a training accuracy of 88.48% for CT scan images using the ResNet50 model. For X-ray images, we achieved a maximum validation accuracy of 97.31% with a training accuracy of 95.64% using the VGG16 model.
Artificial intelligence is increasingly entering everyday healthcare. Large language model (LLM) systems such as Chat Generative Pre-trained Transformer (ChatGPT) have become potentially accessible to everyone, including patients with inflammatory bowel diseases (IBD). However, significant ethical issues and pitfalls exist in innovative LLM tools. The hype generated by such systems may lead to unwarranted patient trust in them. Therefore, it is necessary to understand whether LLMs (popular ones, such as ChatGPT) can produce plausible medical information (MI) for patients. This review examined ChatGPT's potential to provide MI in response to questions commonly addressed by patients with IBD to their gastroenterologists. Based on a review of the outputs provided by ChatGPT, this tool showed some attractive potential but also significant limitations: its information was not always up to date or detailed, and in some cases it was inaccurate. Further studies and refinement of ChatGPT, possibly aligning the outputs with the leading medical evidence provided by reliable databases, are needed.
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations from transformers (BERT), vision transformer (ViT), and generative pre-trained transformers (GPT). Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper can provide new insights and help new researchers track the most cutting-edge works. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training works in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Personality recognition plays a pivotal role when developing user-centric solutions such as recommender systems or decision support systems across various domains, including education, e-commerce, and human resources. Traditional machine learning techniques have been broadly employed for personality trait identification; nevertheless, the development of new technologies based on deep learning has led to new opportunities to improve their performance. This study focuses on the capabilities of pre-trained language models such as BERT, RoBERTa, ALBERT, ELECTRA, ERNIE, and XLNet to deal with the task of personality recognition. These models are able to capture structural features from textual content and comprehend a multitude of language facets and complex features such as hierarchical relationships or long-term dependencies. This makes them suitable for classifying multi-label personality traits from reviews while mitigating computational costs. The approach centers on developing an architecture based on different layers able to capture the semantic context and structural features of texts. Moreover, it fine-tunes the previous models using the MyPersonality dataset, which comprises 9,917 status updates contributed by 250 Facebook users. These status updates are categorized according to the well-known Big Five personality model, setting the stage for a comprehensive exploration of personality traits. To test the proposal, a set of experiments was performed using different metrics such as the exact match ratio, Hamming loss, zero-one loss, precision, recall, F1-score, and weighted averages. The results reveal that ERNIE is the top-performing model, achieving an exact match ratio of 72.32%, an accuracy of 87.17%, and an F1-score of 84.41%. The findings demonstrate that the tested models substantially outperform other state-of-the-art studies, enhancing the accuracy by at least 3% and confirming them as powerful tools for personality recognition. These findings represent substantial advancements in personality recognition, making the models appropriate for the development of user-centric applications.
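Two of the multi-label metrics named in the abstract above, the exact match ratio and the Hamming loss, are easy to confuse. The sketch below computes both for label vectors with one binary entry per trait. The function names and the toy labels are illustrative assumptions and are not data from the study.

```python
def exact_match_ratio(y_true, y_pred):
    """Share of samples whose whole label vector is predicted correctly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def hamming_loss(y_true, y_pred):
    """Share of individual labels that are predicted incorrectly."""
    wrong = sum(
        1
        for t_row, p_row in zip(y_true, y_pred)
        for t, p in zip(t_row, p_row)
        if t != p
    )
    total = sum(len(row) for row in y_true)
    return wrong / total


# Two hypothetical samples with five binary trait labels each.
y_true = [[1, 0, 1, 0, 1], [0, 1, 0, 0, 1]]
y_pred = [[1, 0, 1, 0, 1], [0, 1, 1, 0, 1]]

emr = exact_match_ratio(y_true, y_pred)
hl = hamming_loss(y_true, y_pred)
```

Here one of the two samples is fully correct, so the exact match ratio is 0.5, while only one of the ten individual labels is wrong, so the Hamming loss is 0.1. The exact match ratio is the stricter of the two, which is consistent with it being lower than per-label accuracy in reported results.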
This study makes significant progress in addressing the challenges of short-term slope displacement prediction in the Universal Landslide Monitoring Program, an unprecedented disaster mitigation program in China. Many newly established monitoring slopes in the program lack sufficient historical deformation data, which makes it difficult to extract deformation patterns and provide the effective predictions that play a crucial role in the early warning and forecasting of landslide hazards. A slope displacement prediction method based on transfer learning is therefore proposed. Initially, the method uses a model pre-trained on a multi-slope integrated dataset to transfer the deformation patterns learned from slopes with relatively rich deformation data to newly established monitoring slopes with limited or even no useful data, thus enabling rapid and efficient predictions for these slopes. Subsequently, as monitoring data accumulate over time, fine-tuning the pre-trained model for individual slopes can further improve prediction accuracy, enabling continuous optimization of prediction results. A case study indicates that, after being trained on a multi-slope integrated dataset, the TCN-Transformer model can efficiently serve as a pre-trained model for displacement prediction at newly established monitoring slopes. The three-day average RMSE is reduced by 34.6% compared to models trained only on individual slope data, and the model also successfully predicts the majority of deformation peaks. The model fine-tuned on accumulated data from the target newly established monitoring slope further reduces the three-day RMSE by 37.2%, demonstrating considerable predictive accuracy. In conclusion, by taking advantage of transfer learning, the proposed slope displacement prediction method effectively utilizes the available data, which enables the rapid deployment and continual refinement of displacement predictions on newly established monitoring slopes.
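The percentages quoted in the abstract above read most naturally as relative reductions in RMSE. As a minimal sketch under that assumption, the code below computes RMSE and the relative reduction between a baseline and an improved model. The helper names and all numeric values are hypothetical and chosen only to illustrate the arithmetic.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return 1.0 - new_error / baseline_error


# Hypothetical displacement series (mm) and two sets of predictions.
observed = [0.0, 0.0]
baseline_pred = [3.0, 4.0]
improved_pred = [1.5, 2.0]

baseline_rmse = rmse(observed, baseline_pred)
improved_rmse = rmse(observed, improved_pred)
reduction = relative_reduction(baseline_rmse, improved_rmse)
```

In this toy case the improved predictions halve every error, so the RMSE also halves and the relative reduction is 0.5. A reported reduction of 34.6% would correspond to a value of 0.346 from `relative_reduction`.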
With the help of pre-trained language models, the accuracy of the entity linking task has made great strides in recent years. However, most models with excellent performance require fine-tuning on a large amount of training data using large pre-trained language models, which is a hardware threshold to accomplish this task. Some researchers have achieved competitive results with less training data through ingenious methods, such as utilizing information provided by the named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, named semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement approach consists of three aspects: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the capture of mention contexts using asymmetric heuristics; (3) calculating a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while taking into account the hardware constraints, and significantly improves the model's convergence speed by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to the previous method, with superior performance on six well-established datasets, even though SHEL is trained using a smaller pre-trained language model as the encoder.
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recently, an unsupervised statistical text simplification system based on phrase-based machine translation (UnsupPBMT) achieved good performance; it initializes the phrase tables using similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase table in UnsupPBMT contains many dissimilar words. In this paper, we propose an unsupervised statistical text simplification method that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms the state-of-the-art unsupervised text simplification methods on three benchmarks and even outperforms some supervised baselines.
Structural development defects essentially refer to code structure that violates object-oriented design principles. They make program maintenance challenging and deteriorate software quality over time. Various detection approaches, ranging from traditional heuristic algorithms to machine learning methods, are used to identify these defects. Ensemble learning methods have strengthened the detection of these defects. However, existing approaches do not simultaneously exploit the ability of pre-trained models to extract relevant features and the performance of neural networks on the classification task. Therefore, our goal has been to design a model that combines a pre-trained model, which extracts relevant features from code excerpts through transfer learning, with a bagging method whose base estimator is a dense neural network for defect classification. To achieve this, we drew multiple samples of the same size, with replacement, from the imbalanced MLCQ dataset. For all the samples, we used the CodeT5-small variant to extract features and trained a bagging method with the Roberta Classification Head neural network to classify defects based on these features. We then compared this model to RandomForest, one of the ensemble methods that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. Next, we observed that it was not necessary to use a data balancing technique with our model when the imbalance rate was 23%. Finally, for Blob detection, RandomForest had a median MCC value of 0.36 compared to 0.12 for our method. However, our method performed better in Long Method detection, with a median MCC value of 0.53 compared to 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects depends on the specific defect.
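The abstract above relies on two standard ingredients: sampling with replacement, as used by bagging, and the Matthews correlation coefficient (MCC) as the evaluation metric. The sketch below illustrates both in plain Python. The function names, the seed, and the toy counts are assumptions made for the example and do not reproduce the authors' pipeline.

```python
import math
import random


def bootstrap_sample(items, size, seed):
    """Draw `size` items with replacement, as bagging does for each base estimator."""
    rng = random.Random(seed)
    return [rng.choice(items) for _ in range(size)]


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical dataset of ten sample identifiers.
data = list(range(10))
sample = bootstrap_sample(data, size=10, seed=0)

perfect = mcc(tp=50, fp=0, tn=50, fn=0)
chance = mcc(tp=25, fp=25, tn=25, fn=25)
inverted = mcc(tp=0, fp=50, tn=0, fn=50)
```

MCC ranges from -1 (every prediction inverted) through 0 (no better than chance) to 1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it a reasonable choice for imbalanced datasets such as the one described above.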
Funding: Supported by the Science and Technology Research Project of Jiangxi Education Department (Grant No. GJJ2203306).
文摘Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes,such as text and image,to accurately assess sentiment.However,conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities.This limitation is attributed to their training on unimodal data,and necessitates the use of complex fusion mechanisms for sentiment analysis.In this study,we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method.Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework.We employ a Transformer architecture to integrate these representations,thereby enabling the capture of rich semantic infor-mation in image-text pairs.To further enhance the representation learning of these pairs,we introduce our proposed multimodal contrastive learning method,which leads to improved performance in sentiment analysis tasks.Our approach is evaluated through extensive experiments on two publicly accessible datasets,where we demonstrate its effectiveness.We achieve a significant improvement in sentiment analysis accuracy,indicating the supe-riority of our approach over existing techniques.These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Abstract: Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic features than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress, or comparing impacts. An ensemble of pre-trained language models is used here to classify the conversation sentences from the conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are used for analyzing the conversation progress and predicting the pecking order of the conversation. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, and hyperparameter tuning is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1 score of 0.88.
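As a rough illustration of how an ensemble of classifiers can produce one label per sentence, the sketch below averages per-model class probabilities (soft voting) over the four categories named in the abstract above. The abstract does not state how the EPLM-HT models are combined, so soft voting, the function name `soft_vote`, and the probability values are all assumptions made for illustration.

```python
# Soft-voting sketch: average the class probabilities of several models.
# The label set comes from the abstract; everything else is hypothetical.
LABELS = ["information", "question", "directive", "commission"]


def soft_vote(prob_rows):
    """Average per-model probability vectors and return (label, averages)."""
    n_models = len(prob_rows)
    averages = [
        sum(row[i] for row in prob_rows) / n_models
        for i in range(len(LABELS))
    ]
    best = max(range(len(LABELS)), key=lambda i: averages[i])
    return LABELS[best], averages


# Made-up outputs of three models for one sentence.
model_outputs = [
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.4, 0.3, 0.2, 0.1],
]
label, averages = soft_vote(model_outputs)
print(label, averages)
```

Soft voting lets a confident model outweigh an uncertain one; hard majority voting over predicted labels would be an equally plausible reading of "ensemble".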
Abstract: We present an approach to automatically classify medical text at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Funding: Supported by the Science and Technology Project of the State Grid Corporation, “Research on Key Technologies of Power Artificial Intelligence Open Platform” (5700-202155260A-0-0-00).
Abstract: With the construction of new power systems, the power grid has become extremely large, with an increasing proportion of new energy and AC/DC hybrid connections. The dynamic characteristics and fault patterns of the power grid are complex; additionally, power grid control is difficult, operation risks are high, and the task of fault handling is arduous. Traditional power-grid fault handling relies primarily on human experience. Differences and gaps in the knowledge of control personnel restrict the accuracy and timeliness of fault handling. Therefore, this mode of operation is no longer suitable for the requirements of new systems. Based on the multi-source heterogeneous data of power grid dispatch, this paper proposes a joint entity-relationship extraction method for power-grid dispatch fault processing based on a pre-trained model, constructs a knowledge graph of power-grid dispatch fault processing, and designs and develops a fault-processing auxiliary decision-making system based on the knowledge graph. The system was applied in a study at a provincial dispatch control center, where it effectively improved the accident processing ability and the level of intelligence of accident management and control of the power grid.
Funding: This project is supported by the National Natural Science Foundation of China (NSFC) (Nos. 61902158, 61806087) and the Graduate Student Innovation Program for Academic Degrees in General Universities in Jiangsu Province (No. KYZZ16-0337).
Abstract: The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, placing enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient’s lungs based on chest X-ray and CT images generated by radiation imaging. In this paper, five Keras-related deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning, and pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Conventional Neural Networks), are provided for comparison with the classification-detection approaches based on the performance indicators, i.e., precision, recall, F1 scores, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies obtained on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-related deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
Funding: Supported by the National Natural Science Foundation of China (72088101, 42372175) and the PetroChina Science and Technology Innovation Fund Program (2021DQ02-0904).
Abstract: This article elucidates the concept of large model technology, summarizes the research status of large model technology both domestically and internationally, provides an overview of the application status of large models in vertical industries, outlines the challenges and issues confronted in applying large models in the oil and gas sector, and offers prospects for the application of large models in the oil and gas industry. The existing large models can be broadly divided into three categories: large language models, visual large models, and multimodal large models. The application of large models in the oil and gas industry is still in its infancy. Based on open-source large language models, some oil and gas enterprises have released large language model products using methods like fine-tuning and retrieval-augmented generation. Scholars have attempted to develop scenario-specific models for oil and gas operations by using visual/multimodal foundation models. A few researchers have constructed pre-trained foundation models for seismic data processing and interpretation, as well as core analysis. The application of large models in the oil and gas industry faces challenges such as data quantity and quality that are currently insufficient to support the training of large models, high research and development costs, and poor algorithm autonomy and control. The application of large models should be guided by the needs of the oil and gas business, taking the application of large models as an opportunity to improve data lifecycle management, enhance data governance capabilities, promote the construction of computing power, strengthen the construction of “artificial intelligence + energy” composite teams, and boost the autonomy and control of large model technology.
Abstract: This letter evaluates the article by Gravina et al. on ChatGPT’s potential in providing medical information for inflammatory bowel disease patients. While promising, it highlights the need for advanced techniques like reasoning + action and retrieval-augmented generation to improve accuracy and reliability. Emphasizing that simple question-and-answer testing is insufficient, it calls for more nuanced evaluation methods to truly gauge large language models’ capabilities in clinical applications.
Funding: National Key R&D Program of China (No. 2020AAA0108702) and National Natural Science Foundation of China (Grant No. 62022027).
Abstract: With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but still needs to save and compute the gradient of the whole computational graph. In this paper, we propose y-Tuning, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks. y-Tuning learns dense representations for the labels y defined in a given task and aligns them to a fixed feature representation. Without computing the gradients of the text encoder in the training phase, y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa-XXL with 1.6 billion parameters, y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE Benchmark with only 2% tunable parameters and much lower training costs.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62272120, 62106030, U20B2046, 62272119, 61972105) and the Technology Innovation and Application Development Projects of Chongqing (Grant Nos. cstc2021jscx-gksbX0032, cstc2021jscxgksbX0029).
Abstract: In recent years, with the great success of pre-trained language models, the pre-trained BERT model has been gradually applied to the field of source code understanding. However, the time cost of training a language model from scratch is very high, and how to transfer a pre-trained language model to the field of smart contract vulnerability detection is currently a hot research direction. In this paper, we propose a hybrid model to detect common vulnerabilities in smart contracts based on a lightweight pre-trained language model, BERT, connected to a bidirectional gated recurrent unit model. The downstream neural network adopts the bidirectional gated recurrent unit neural network model with a hierarchical attention mechanism to mine more of the semantic features contained in the source code of smart contracts by using their characteristics. In our experiments, the proposed hybrid neural network model, SolBERT-BiGRU-Attention, is fitted on a large number of data samples with smart contract vulnerabilities. Compared with existing methods, the accuracy of our model reaches 93.85%, and the Micro-F1 score is 94.02%.
Funding: Supported by the Chongqing Natural Science Foundation of China (Grant No. CSTB2022NSCQ-MSX1417), the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K202200513), and the Chongqing Normal University Fund (Grant No. 22XLB003).
Abstract: As an essential category of public event management and control, sentiment analysis of online public opinion text plays a vital role in public opinion early warning, network rumor management, and netizens’ personality portraits under massive public opinion data. The traditional sentiment analysis model is not sensitive to the location information of words, has difficulty solving the problem of polysemy, and learns representations of long and short sentences with very different ability, which leads to low accuracy of sentiment classification. This paper proposes a sentiment analysis model, PERT-BiLSTM-Att, for public opinion text based on the pre-training model of the disordered language model, a bidirectional long short-term memory network (BiLSTM), and an attention mechanism. The model first uses the PERT model, pre-trained from the lexical location information of a large corpus, to process the text data and obtain the dynamic feature representation of the text. Then the semantic features are input into the BiLSTM to learn context sequence information and enhance the model’s ability to represent long sequences. Finally, the attention mechanism is used to focus on the words that contribute more to the overall emotional tendency, making up for the traditional model’s weak representation of short text, and the classification results are then output through the fully connected network. The experimental results show that the classification accuracy of the model on the NLPCC14 and weibo_senti_100k public data sets reaches 88.56% and 97.05%, respectively, and the accuracy reaches 95.95% on the data set MDC22, composed of Meituan, Dianping, and Ctrip comments. This shows that the model performs well on sentiment analysis of online public opinion texts from different platforms. The experimental results on different datasets verify the model’s effectiveness in sentiment analysis of texts. At the same time, the model has a strong generalization ability and can achieve good results for sentiment analysis of datasets in different fields.
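The attention step described in the abstract above (weighting words by their contribution before classification) can be sketched as softmax-weighted pooling over per-token hidden states. This is a generic dot-product attention written in plain Python; the abstract does not specify the exact scoring function used in PERT-BiLSTM-Att, so the `query` vector and the function name `attention_pool` are assumptions.

```python
import math


def attention_pool(hidden_states, query):
    """Return (weights, pooled) for dot-product attention over token vectors.

    hidden_states: list of equal-length token vectors (e.g. BiLSTM outputs).
    query: a learned context vector of the same length (hypothetical here).
    """
    # Score each token by its dot product with the query vector.
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of token vectors gives one fixed-size sentence vector.
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


weights, sentence_vector = attention_pool([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
print(weights, sentence_vector)
```

The pooled vector is what a fully connected classification layer would consume in a design of this kind.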
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the ICAN (ICT Challenge and Advanced Network of HRD) Program (IITP-2023-2020-0-01832) supervised by the IITP (Institute of Information & Communications Technology Planning & Evaluation), and by the Soonchunhyang University Research Fund.
Abstract: Black fungus is a rare and dangerous mycosis that usually affects the brain and lungs and can be life-threatening in diabetic cases. Recently, some COVID-19 survivors, especially those with co-morbid diseases, have been susceptible to black fungus. Therefore, recovered COVID-19 patients should seek medical support when they notice mucormycosis symptoms. This paper proposes a novel ensemble deep-learning model that includes three pre-trained models: ResNet(50), VGG(19), and Inception. Our approach is medically intuitive and efficient compared to traditional deep learning models. An image dataset was aggregated from various resources and divided into two classes: a black fungus class and a skin infection class. To the best of our knowledge, our study is the first concerned with building black fungus detection models based on deep learning algorithms. The proposed approach can significantly improve the performance of the classification task and increase the generalization ability of such a binary classification task. According to the reported results, it achieved a sensitivity of 0.9907, a specificity of 0.9938, a precision of 0.9938, and a negative predictive value of 0.9907.
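The four figures reported in the abstract above are standard quantities derived from a binary confusion matrix. The sketch below shows the definitions only; the counts in the example are made up for illustration and are not the study's data, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, precision and NPV from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```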
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R191), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code (22UQU4310373DSR61). This study is supported via funding from Prince Sattam bin Abdulaziz University project number (PSAU/2023/R/1444).
Abstract: Artificial Intelligence (AI) and Computer Vision (CV) advancements have led to many useful methodologies in recent years, particularly to help visually challenged people. Object detection involves a variety of challenges, for example, handling multiple-class images, images that get augmented when captured by a camera, and so on. The test images include all these variants as well. Detection models alert visually challenged people about their surroundings when they want to walk independently. This study compares four CNN-based pre-trained models that are predominantly used in image recognition applications: Residual Network (ResNet-50), Inception v3, Dense Convolutional Network (DenseNet-121), and SqueezeNet. Based on the analysis performed on the test images, the study infers that Inception v3 outperformed the other pre-trained models in terms of accuracy and speed. To further improve the performance of the Inception v3 model, the thermal exchange optimization (TEO) algorithm is applied to tune the hyperparameters (number of epochs, batch size, and learning rate), which is the novelty of the work. Better accuracy was achieved owing to the inclusion of an auxiliary classifier as a regularizer, the hyperparameter optimizer, and the factorization approach. Additionally, Inception v3 can handle images of different sizes. This makes Inception v3 the optimum model for assisting visually challenged people in real-world communication when integrated with Internet of Things (IoT)-based devices.
Abstract: Corona Virus (COVID-19) is a novel virus that crossed an animal-human barrier and emerged in Wuhan, China. Until now it has affected more than 119 million people. Detection of COVID-19 is a critical task, and due to the large number of patients, there is a shortage of doctors for its detection. In this paper, a model is suggested that not only detects COVID-19 using X-ray and CT-scan images but also shows the affected areas. Three classes have been defined for X-ray images: COVID-19, normal, and pneumonia. For CT-scan images, two classes have been defined: COVID-19 and non-COVID-19. For classification purposes, pre-trained models such as ResNet50, VGG16, and VGG19 have been used with some tuning. For detecting the affected areas, Gradient-weighted Class Activation Mapping (Grad-CAM) has been used. As the X-ray and CT images are taken at different intensities, contrast limited adaptive histogram equalization (CLAHE) has been applied to see the effect on the training of the models. As a result of these experiments, we achieved a maximum validation accuracy of 88.10% with a training accuracy of 88.48% for CT-scan images using the ResNet50 model. For X-ray images, we achieved a maximum validation accuracy of 97.31% with a training accuracy of 95.64% using the VGG16 model.
Abstract: Artificial intelligence is increasingly entering everyday healthcare. Large language model (LLM) systems such as Chat Generative Pre-trained Transformer (ChatGPT) have become potentially accessible to everyone, including patients with inflammatory bowel diseases (IBD). However, significant ethical issues and pitfalls exist in innovative LLM tools. The hype generated by such systems may lead to unweighted patient trust in these systems. Therefore, it is necessary to understand whether LLMs (trendy ones, such as ChatGPT) can produce plausible medical information (MI) for patients. This review examined ChatGPT’s potential to provide MI regarding questions commonly addressed by patients with IBD to their gastroenterologists. From the review of the outputs provided by ChatGPT, this tool showed some attractive potential while having significant limitations in updating and detailing information and providing inaccurate information in some cases. Further studies and refinement of ChatGPT, possibly aligning the outputs with the leading medical evidence provided by reliable databases, are needed.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61872256 and 62102205), the Key-Area Research and Development Program of Guangdong Province, China (No. 2021B0101400002), the Peng Cheng Laboratory Key Research Project, China (No. PCL 2021A07), and Multi-source Cross-platform Video Analysis and Understanding for Intelligent Perception in Smart City, China (No. U20B2052).
Abstract: With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations from transformers (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc. Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper can provide new insights and help new researchers track the most cutting-edge works. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training works in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Funding: This work has been partially supported by FEDER and the State Research Agency (AEI) of the Spanish Ministry of Economy and Competition under Grant SAFER: PID2019-104735RB-C42 (AEI/FEDER, UE), the General Subdirection for Gambling Regulation of the Spanish Consumption Ministry under the Grant Detec-EMO: SUBV23/00010, and the Project PLEC2021-007681 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.
Abstract: Personality recognition plays a pivotal role when developing user-centric solutions such as recommender systems or decision support systems across various domains, including education, e-commerce, or human resources. Traditional machine learning techniques have been broadly employed for personality trait identification; nevertheless, the development of new technologies based on deep learning has led to new opportunities to improve their performance. This study focuses on the capabilities of pre-trained language models such as BERT, RoBERTa, ALBERT, ELECTRA, ERNIE, or XLNet to deal with the task of personality recognition. These models are able to capture structural features from textual content and comprehend a multitude of language facets and complex features such as hierarchical relationships or long-term dependencies. This makes them suitable for classifying multi-label personality traits from reviews while mitigating computational costs. The approach centers on developing an architecture based on different layers able to capture the semantic context and structural features from texts. Moreover, it is able to fine-tune the previous models using the MyPersonality dataset, which comprises 9,917 status updates contributed by 250 Facebook users. These status updates are categorized according to the well-known Big Five personality model, setting the stage for a comprehensive exploration of personality traits. To test the proposal, a set of experiments has been performed using different metrics such as the exact match ratio, Hamming loss, zero-one loss, precision, recall, F1-score, and weighted averages. The results reveal that ERNIE is the top-performing model, achieving an exact match ratio of 72.32%, an accuracy of 87.17%, and an F1-score of 84.41%. The findings demonstrate that the tested models substantially outperform other state-of-the-art studies, enhancing the accuracy by at least 3% and confirming them as powerful tools for personality recognition. These findings represent substantial advancements in personality recognition, making them appropriate for the development of user-centric applications.
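Two of the multi-label metrics named in the abstract above, the exact match ratio and the Hamming loss, are easy to confuse, so a small worked sketch may help. Each sample is a vector of five binary trait labels, following the Big Five framing; the sample data and function names are hypothetical, and the zero-one loss is simply one minus the exact match ratio.

```python
def exact_match_ratio(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted correctly."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def hamming_loss(y_true, y_pred):
    """Fraction of individual labels that are predicted incorrectly."""
    total = sum(len(t) for t in y_true)
    wrong = sum(
        a != b
        for t, p in zip(y_true, y_pred)
        for a, b in zip(t, p)
    )
    return wrong / total


# Two made-up samples with five binary trait labels each.
y_true = [[1, 0, 1, 0, 1], [0, 1, 0, 0, 1]]
y_pred = [[1, 0, 1, 0, 1], [0, 1, 1, 0, 1]]

emr = exact_match_ratio(y_true, y_pred)   # 0.5: one of two samples fully correct
hl = hamming_loss(y_true, y_pred)         # 0.1: one of ten labels wrong
zero_one = 1.0 - emr                      # 0.5
print(emr, hl, zero_one)
```

The example shows why the two can diverge: a single wrong trait costs a whole sample under exact match but only one label in ten under Hamming loss.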
Funding: Funded by the project of the China Geological Survey (DD20211364) and the Science and Technology Talent Program of the Ministry of Natural Resources of China (grant number 121106000000180039–2201).
Abstract: This study makes significant progress in addressing the challenges of short-term slope displacement prediction in the Universal Landslide Monitoring Program, an unprecedented disaster mitigation program in China. Many newly established monitoring slopes in the program lack sufficient historical deformation data, making it difficult to extract deformation patterns and provide effective predictions, which play a crucial role in the early warning and forecasting of landslide hazards. A slope displacement prediction method based on transfer learning is therefore proposed. Initially, the method uses a model pre-trained on a multi-slope integrated dataset to transfer the deformation patterns learned from slopes with relatively rich deformation data to newly established monitoring slopes with limited or even no useful data, thus enabling rapid and efficient predictions for these slopes. Subsequently, as time goes on and monitoring data accumulate, fine-tuning of the pre-trained model for individual slopes can further improve prediction accuracy, enabling continuous optimization of prediction results. A case study indicates that, after being trained on a multi-slope integrated dataset, the TCN-Transformer model can efficiently serve as a pre-trained model for displacement prediction at newly established monitoring slopes. The three-day average RMSE is reduced by 34.6% compared to models trained only on individual slope data, and the model also successfully predicts the majority of deformation peaks. The model fine-tuned on data accumulated at the target newly established monitoring slope further reduced the three-day RMSE by 37.2%, demonstrating considerable predictive accuracy. In conclusion, by taking advantage of transfer learning, the proposed slope displacement prediction method effectively utilizes the available data, which enables the rapid deployment and continual refinement of displacement predictions on newly established monitoring slopes.
Funding: The Beijing Municipal Science and Technology Program (Z231100001323004).
Abstract: With the help of pre-trained language models, the accuracy of the entity linking task has made great strides in recent years. However, most models with excellent performance require fine-tuning large pre-trained language models on a large amount of training data, which sets a hardware threshold for accomplishing this task. Some researchers have achieved competitive results with less training data through ingenious methods, such as utilizing information provided by the named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, named semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement approach consists of three aspects: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the capture of mention contexts using asymmetric heuristics; (3) calculating a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while taking into account the hardware constraints, and improves the model's convergence speed by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to the previous method, with superior performance on six well-established datasets, even though SHEL is trained using a smaller pre-trained language model as the encoder.
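The third aspect in the abstract above, a fixed-size mention representation obtained through pooling, can be illustrated with mean pooling over the token vectors of a mention. The abstract does not say which pooling operation SHEL uses, so mean pooling and the function name `mean_pool` are assumptions; the point is only that mentions of different lengths map to vectors of the same size.

```python
def mean_pool(token_vectors):
    """Average a variable-length list of token vectors into one vector."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]


# Hypothetical encoder outputs for a three-token and a one-token mention.
long_mention = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
short_mention = [[0.5, -0.5]]

print(mean_pool(long_mention))   # [3.0, 4.0]
print(mean_pool(short_mention))  # [0.5, -0.5]
```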
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62076217 and 61906060) and the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT) of the Ministry of Education, China (IRT17R32).
Abstract: Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recently, an unsupervised statistical text simplification method based on a phrase-based machine translation system (UnsupPBMT) achieved good performance; it initializes the phrase tables using similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase table in UnsupPBMT contains many dissimilar words. In this paper, we propose an unsupervised statistical text simplification method that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms the state-of-the-art unsupervised text simplification methods on three benchmarks and even outperforms some supervised baselines.
Abstract: Structural development defects essentially refer to code structure that violates object-oriented design principles. They make program maintenance challenging and deteriorate software quality over time. Various detection approaches, ranging from traditional heuristic algorithms to machine learning methods, are used to identify these defects. Ensemble learning methods have strengthened the detection of these defects. However, existing approaches do not simultaneously exploit the capabilities of extracting relevant features from pre-trained models and the performance of neural networks for the classification task. Therefore, our goal has been to design a model that combines a pre-trained model to extract relevant features from code excerpts through transfer learning and a bagging method with a base estimator, a dense neural network, for defect classification. To achieve this, we composed multiple samples of the same size, drawn with replacement, from the imbalanced dataset MLCQ1. For all the samples, we used the CodeT5-small variant to extract features and trained a bagging method with the neural network Roberta Classification Head to classify defects based on these features. We then compared this model to RandomForest, one of the ensemble methods that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. Next, we observed that it was not necessary to use a data balancing technique with our model when the imbalance rate was 23%. Finally, for blob detection, RandomForest had a median MCC value of 0.36 compared to 0.12 for our method. However, our method was predominant in Long Method detection with a median MCC value of 0.53 compared to 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects is dependent on specific defects.
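Two ingredients of the abstract above lend themselves to a short sketch: drawing equally sized samples with replacement (the bootstrap step of bagging) and the Matthews correlation coefficient (MCC) used to compare the models. The code below uses only the Python standard library; the function names, the seed, and the toy data are hypothetical, and the feature extraction with CodeT5 and the neural base estimator are deliberately left out.

```python
import math
import random


def bootstrap_samples(data, n_samples, sample_size, seed=0):
    """Draw n_samples lists of sample_size items from data, with replacement."""
    rng = random.Random(seed)
    return [
        [rng.choice(data) for _ in range(sample_size)]
        for _ in range(n_samples)
    ]


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denominator


# Toy dataset of (snippet_id, label) pairs; one base estimator per sample.
toy_data = [(i, i % 4 == 0) for i in range(20)]
samples = bootstrap_samples(toy_data, n_samples=5, sample_size=20)
print(len(samples), len(samples[0]))
print(mcc(tp=50, fp=0, tn=50, fn=0))     # perfect agreement gives 1.0
print(mcc(tp=25, fp=25, tn=25, fn=25))   # chance-level agreement gives 0.0
```

MCC is a reasonable choice for an imbalanced dataset because it uses all four confusion-matrix cells, so a classifier cannot score well by predicting only the majority class.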