Journal Articles
Found 223 articles
1. Terrorism Attack Classification Using Machine Learning: The Effectiveness of Using Textual Features Extracted from GTD Dataset
Authors: Mohammed Abdalsalam, Chunlin Li, Abdelghani Dahou, Natalia Kryvinska. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 2, pp. 1427-1467 (41 pages)
One of the biggest dangers to society today is terrorism, where attacks have become one of the most significant risks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) have become the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management, medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related, initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terrorist attacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database (GTD) can influence the accuracy of the model's classification of terrorist attacks, where each part of the data can provide vital information to enrich the ability of classifier learning. Each data point in a multiclass taxonomy has one or more tags attached to it, referred to as "related tags." We applied machine learning classifiers to classify terrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts and learns contextual features from text attributes to acquire more information from text data. The extracted contextual features are combined with the "key features" of the dataset and used to perform the final classification. The study explored different experimental setups with various classifiers to evaluate the model's performance. The experimental results show that the proposed framework outperforms the latest techniques for classifying terrorist attacks with an accuracy of 98.7% using a combined feature set and an extreme gradient boosting classifier.
Keywords: Artificial intelligence; machine learning; natural language processing; data analytic; DistilBERT; feature extraction; terrorism classification; GTD dataset
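The pipeline this abstract describes — contextual features extracted from the incident text, fused with the dataset's tabular "key features", then a gradient-boosting classifier — can be sketched as follows. This is a minimal illustration, not the paper's implementation: TF-IDF stands in for DistilBERT embeddings, scikit-learn's GradientBoostingClassifier stands in for XGBoost, and the incident records are invented.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier

# Toy incident records: a free-text summary plus numeric "key features".
texts = ["bomb detonated near market", "armed assault on convoy",
         "bomb planted at station", "assault with firearms on patrol"]
key_features = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])  # e.g. weapon-type flags
labels = ["bombing", "assault", "bombing", "assault"]

# TF-IDF stands in here for DistilBERT contextual embeddings.
text_feats = TfidfVectorizer().fit_transform(texts).toarray()

# Fuse textual and key features into one matrix, then classify.
X = np.hstack([text_feats, key_features])
clf = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X, labels)
print(clf.predict(X))
```

The same fusion step works with any dense text embedding in place of the TF-IDF rows.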
2. Audio-visual keyword transformer for unconstrained sentence-level keyword spotting
Authors: Yidi Li, Jiale Ren, Yawei Wang, Guoquan Wang, Xia Li, Hong Liu. CAAI Transactions on Intelligence Technology (SCIE, EI), 2024, No. 1, pp. 142-152 (11 pages)
As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio-visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging problem. To this end, an Audio-Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video clips. The authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual inputs. The outputs of the audio and visual branches are combined in a decision fusion module. As humans can easily notice whether a keyword appears in a sentence or not, our AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified keyword. Moreover, the position of the keyword is localised in the attention map without additional position labels. Experimental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy conditions. The code is available at https://github.com/jialeren/AVKT.
Keywords: artificial intelligence; multimodal approaches; natural language processing; neural network; speech processing
3. Identification of Software Bugs by Analyzing Natural Language-Based Requirements Using Optimized Deep Learning Features
Authors: Qazi Mazhar ul Haq, Fahim Arif, Khursheed Aurangzeb, Noor ul Ain, Javed Ali Khan, Saddaf Rubab, Muhammad Shahid Anwar. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 4379-4397 (19 pages)
Software project outcomes heavily depend on natural language requirements, often causing diverse interpretations and issues like ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies are not generalized and efficient when extended to other datasets. Therefore, this paper proposes a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The methods involve feature selection, which is used to reduce the dimensionality and redundancy of features and select only the relevant ones; transfer learning, which is used to train and test the model on different datasets to analyze how much of the learning is passed to other datasets; and an ensemble method, which is utilized to explore the increase in performance upon combining multiple classifiers in a model. Four National Aeronautics and Space Administration (NASA) and four Promise datasets are used in the study, showing an increase in the model's performance by providing better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values when different classifiers were combined. It reveals that using an amalgam of techniques such as those used in this study (feature selection, transfer learning, and ensemble methods) proves helpful in optimizing software bug prediction models and providing a high-performing, useful end model.
Keywords: Natural language processing; software bug prediction; transfer learning; ensemble learning; feature selection
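Two of the techniques the abstract combines — feature selection to drop redundant metrics, then a classifier ensemble scored by AUC-ROC — can be sketched on synthetic data. Everything below (the generated dataset, k=10, the two base classifiers) is an illustrative assumption, not the paper's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a NASA/Promise defect dataset (rows = modules).
X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: feature selection keeps only the most relevant metrics.
sel = SelectKBest(f_classif, k=10).fit(X_tr, y_tr)
X_tr_s, X_te_s = sel.transform(X_tr), sel.transform(X_te)

# Step 2: a soft-voting ensemble of heterogeneous classifiers.
ens = VotingClassifier([("lr", LogisticRegression(max_iter=500)),
                        ("rf", RandomForestClassifier(n_estimators=50,
                                                      random_state=0))],
                       voting="soft").fit(X_tr_s, y_tr)

auc = roc_auc_score(y_te, ens.predict_proba(X_te_s)[:, 1])
print(f"AUC-ROC: {auc:.2f}")
```

Transfer learning in the paper's sense would correspond to fitting on one dataset and evaluating on another instead of a random split.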
4. Literature classification and its applications in condensed matter physics and materials science by natural language processing
Authors: 吴思远, 朱天念, 涂思佳, 肖睿娟, 袁洁, 吴泉生, 李泓, 翁红明. Chinese Physics B (SCIE, EI, CAS, CSCD), 2024, No. 5, pp. 117-123 (7 pages)
The exponential growth of literature is constraining researchers' access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate labelled datasets iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, i.e., battery, superconductor, topological material, and artificial intelligence (AI) in materials science. The trained NLP "battery" model, applied to a larger dataset different from the training and testing datasets, achieves an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that even with insufficient data, the not-yet-well-trained model in the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
Keywords: natural language processing; text mining; materials science
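One standard way to "generate a labelled dataset iteratively with limited human input" is self-training, which scikit-learn packages directly. The sketch below uses it as a stand-in for the paper's scheme; the paper titles, the label meanings, and the confidence threshold are all invented for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Toy corpus: a few labelled titles; -1 marks unlabelled examples.
titles = ["solid state battery electrolyte", "lithium battery cathode design",
          "topological insulator surface states", "battery anode materials",
          "quantum hall effect in graphene", "fast-charging battery electrodes"]
labels = np.array([1, 1, 0, -1, -1, -1])  # 1 = battery-related

X = TfidfVectorizer().fit_transform(titles)

# Self-training iteratively pseudo-labels confident unlabelled examples.
clf = SelfTrainingClassifier(LogisticRegression(), threshold=0.6)
clf.fit(X, labels)
print(clf.predict(X))
```

In the paper's workflow, the human-in-the-loop step would review a slice of these pseudo-labels each cycle before retraining.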
5. Sentiment Analysis of Low-Resource Language Literature Using Data Processing and Deep Learning
Authors: Aizaz Ali, Maqbool Khan, Khalil Khan, Rehan Ullah Khan, Abdulrahman Aloraini. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 713-733 (21 pages)
Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, resource-poor languages like Urdu remain a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, Saraiki, and more. Urdu literature, characterized by distinct character sets and linguistic features, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu-language sentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. Subsequent to this data collection, a thorough process of cleaning and preprocessing is implemented to ensure the quality of the data.
The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models' efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, gaining a significantly higher accuracy rate of 91%. This result accentuates the exceptional performance of RNN, solidifying its status as a compelling option for conducting sentiment analysis tasks in the Urdu language.
Keywords: Urdu sentiment analysis; convolutional neural networks; recurrent neural network; deep learning; natural language processing; neural networks
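The recurrent model the study favours consumes a sentence token by token and classifies its final hidden state. A minimal, untrained Elman-style cell in NumPy shows the mechanics; all sizes and weights below are illustrative stand-ins, and a real system would learn them (and tokenize Urdu text rather than use integer ids directly).

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, embed_dim, hidden_dim, n_classes = 50, 8, 16, 5  # five sentiment labels

E = rng.normal(size=(vocab, embed_dim))         # embedding table
Wx = rng.normal(size=(embed_dim, hidden_dim))   # input-to-hidden weights
Wh = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights
Wo = rng.normal(size=(hidden_dim, n_classes))   # hidden-to-output weights

def rnn_classify(token_ids):
    """Run the recurrence over a sentence and classify its final state."""
    h = np.zeros(hidden_dim)
    for t in token_ids:
        h = np.tanh(E[t] @ Wx + h @ Wh)
    return int(np.argmax(h @ Wo))

labels = ["Positive", "Negative", "Neutral", "Mixed", "Ambiguous"]
print(labels[rnn_classify([3, 17, 42, 8])])
```

Training would backpropagate through the same recurrence to fit the five-label annotations.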
6. A Joint Entity Relation Extraction Model Based on Relation Semantic Template Automatically Constructed
Authors: Wei Liu, Meijuan Yin, Jialong Zhang, Lunchong Cui. Computers, Materials & Continua (SCIE, EI), 2024, No. 1, pp. 975-997 (23 pages)
The joint entity relation extraction model which integrates the semantic information of relations is favored by relevant researchers because of its effectiveness in solving the overlapping of entities, and the method of manually defining the semantic template of a relation is particularly prominent in extraction effect because it can obtain the deep semantic information of the relation. However, this method has some problems, such as relying on expert experience and poor portability. Inspired by the rule-based entity relation extraction method, this paper proposes a joint entity relation extraction model based on automatically constructed relation semantic templates, abbreviated as RSTAC. This model refines the extraction rules of relation semantic templates from a relation corpus through dependency parsing and realizes the automatic construction of relation semantic templates. Based on the relation semantic templates, the processes of relation classification and triplet extraction are constrained, and finally, the entity relation triplet is obtained. The experimental results on the three major Chinese datasets DuIE, SanWen, and FinRE show that the RSTAC model successfully obtains rich deep semantics of relations, improves the extraction of entity relation triples, and increases F1 scores by an average of 0.96% compared with classical joint extraction models such as CasRel, TPLinker, and RFBFN.
Keywords: Natural language processing; deep learning; information extraction; relation extraction; relation semantic template
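The idea of constraining triplet extraction with relation templates can be illustrated with two hand-written surface patterns. Note the paper induces its templates automatically from dependency parses; the regexes, relation names, and example sentence below are purely illustrative.

```python
import re

# Hand-written stand-ins for automatically constructed relation templates.
TEMPLATES = {
    "founded": re.compile(r"(?P<head>[A-Z]\w+) founded (?P<tail>[A-Z]\w+)"),
    "located_in": re.compile(r"(?P<head>[A-Z]\w+) is located in (?P<tail>[A-Z]\w+)"),
}

def extract_triples(text):
    """Return (head, relation, tail) triplets constrained by the templates."""
    triples = []
    for relation, pattern in TEMPLATES.items():
        for m in pattern.finditer(text):
            triples.append((m["head"], relation, m["tail"]))
    return triples

print(extract_triples("Kodak is located in Rochester. Eastman founded Kodak."))
```

A dependency-parse template would match syntactic paths rather than surface strings, making the same idea robust to word order and intervening modifiers.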
7. Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
Authors: R. Sujatha, K. Nimala. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 1669-1686 (18 pages)
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic highlights than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress and comparing impacts. An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are for analyzing the conversation progress and predicting the pecking order of the conversation. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus with hyperparameters. A hyperparameter tuning approach is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1 score of 0.88.
Keywords: Bidirectional encoder representations from transformers; conversation; ensemble model; fine-tuning; generalized autoregressive pretraining for language understanding; generative pre-trained transformer; hyperparameter tuning; natural language processing; robustly optimized BERT pretraining approach; sentence classification; transformer models
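Decision-level ensembling of the five language models reduces, at its core, to averaging their per-class probabilities. The sketch below shows only that fusion step; the probability rows are made-up placeholders for real BERT/RoBERTa/GPT/DistilBERT/XLNet outputs over the four sentence classes.

```python
import numpy as np

classes = ["information", "question", "directive", "commission"]

# Placeholder per-model class probabilities for one input sentence.
model_probs = np.array([
    [0.70, 0.10, 0.10, 0.10],  # e.g. BERT
    [0.60, 0.20, 0.10, 0.10],  # e.g. RoBERTa
    [0.40, 0.35, 0.15, 0.10],  # e.g. XLNet
])

# Soft-voting fusion: average the distributions, then take the argmax.
avg = model_probs.mean(axis=0)
print(classes[int(np.argmax(avg))])  # → "information"
```

Weighted averages or majority voting over argmaxes are common variants of the same fusion step.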
8. Digital Disparities: How Artificial Intelligence Can Facilitate Anti-Black Racism in the U.S. Healthcare Sector
Author: Anthony Victor Onwuegbuzia. International Relations and Diplomacy, 2024, No. 1, pp. 40-50 (11 pages)
This paper delves into the intricate interplay between artificial intelligence (AI) systems and the perpetuation of Anti-Black racism within the United States medical industry. Despite the promising potential of AI to enhance healthcare outcomes and reduce disparities, there is a growing concern that these technologies may inadvertently or advertently exacerbate existing racial inequalities. Focusing specifically on the experiences of Black patients, this research investigates how the following AI components — medical algorithms, machine learning, and natural language processing — are contributing to the unequal distribution of medical resources, diagnosis, and healthcare treatment of those classified as Black. Furthermore, this review employs a multidisciplinary approach, combining insights from computer science, medical ethics, and social justice theory to analyze the mechanisms through which AI systems may encode and reinforce racial biases. By dissecting the three primary components of AI, this paper aims to present a clear understanding of how these technologies work, how they intersect, and how they may inherently perpetuate harmful stereotypes resulting in negligent outcomes for Black patients. Furthermore, this paper explores the ethical implications of deploying AI in healthcare settings and calls for increased transparency, accountability, and diversity in the development and implementation of these technologies. Finally, it is important that I preface the following paper with a clear and concise definition of what I refer to as Anti-Black racism throughout the text. Therefore, I assert the following: Anti-Black racism refers to prejudice, discrimination, or antagonism directed against individuals or communities of African descent based on their race. It involves the belief in the inherent superiority of one race over another and the systemic and institutional practices that perpetuate inequality and disadvantage for Black people. Furthermore, I proclaim that this form of racism can be manifested in various ways, such as unequal access to opportunities, resources, education, employment, and fair treatment within social, economic, and political systems. It is also pertinent to acknowledge that Anti-Black racism is deeply rooted in historical and societal structures throughout the U.S. borders and beyond, leading to systemic disadvantages and disparities that impact the well-being and life chances of Black individuals and communities. Addressing Anti-Black racism involves recognizing and challenging both individual attitudes and systemic structures that contribute to discrimination and inequality. Efforts to combat Anti-Black racism include promoting awareness, education, advocacy for policy changes, and fostering a culture of inclusivity and equality.
Keywords: Bias in algorithms; Racial disparities in U.S. healthcare; Discriminatory healthcare practices; Black patient outcomes; Automated decision-making and racism; Machine learning; Natural language processing
9. Research and Application of AI-Based Interactive Exhibits in Wuhan Museum of Science and Technology
Author: Ting Yan. Journal of Electronic Research and Application, 2024, No. 2, pp. 95-102 (8 pages)
This article aims to explore the development and application of AI-based interactive exhibits in Wuhan Museum of Science and Technology. By utilizing computer vision, natural language processing, and machine learning technologies, an innovative exhibit development and application system is proposed. This system employs deep learning algorithms and data analysis methods to achieve real-time perception of visitor behavior and adaptive interaction. The development process involves designing user interfaces and interaction methods to effectively enhance visitor engagement and learning outcomes. Through evaluation and comparison in practical applications, the potential of this system in enhancing exhibit interaction, increasing visitor engagement, improving educational effectiveness, and expanding avenues for scientific knowledge dissemination is validated.
Keywords: Artificial intelligence; Interactive exhibits; Computer vision; Natural language processing; Machine learning
10. Cyber Deception Using NLP
Authors: Igor Godefroy Kouam Kamdem, Marcellin Nkenlifack. Journal of Information Security, 2024, No. 2, pp. 279-297 (19 pages)
Cyber security addresses the protection of information systems in cyberspace. These systems face multiple attacks on a daily basis, with the level of complication getting increasingly challenging. Despite the existence of multiple solutions, attackers are still quite successful at identifying vulnerabilities to exploit. This is why cyber deception is increasingly being used to divert attackers' attention and, therefore, enhance the security of information systems. To be effective, deception environments need fake data. This is where Natural Language Processing (NLP) comes in. Many cyber security models have used NLP for vulnerability detection in information systems, email classification, fake citation detection, and many other tasks. Although NLP is used for text generation, existing models seem unsuitable for data generation in a deception environment. Our goal is to use text generation in NLP to generate data in the deception context that will be used to build multi-level deception in information systems. Our model consists of three components: the connection component; the deception component, composed of several states in which an attacker may be, depending on whether he is malicious or not; and the text generation component. The text generation component takes as input the real data of the information system and produces several texts as output, which are usable at different deception levels.
Keywords: Cyber deception; Cybersecurity; Natural language processing; Text generation
11. Smart Approaches to Efficient Text Mining for Categorizing Sexual Reproductive Health Short Messages into Key Themes
Authors: Tobias Makai, Mayumbo Nyirenda. Open Journal of Applied Sciences, 2024, No. 2, pp. 511-532 (22 pages)
To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization process of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. Firstly, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Secondly, the study presents a proof-of-concept tool that automates the categorization of U-report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the developed proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorise this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool developed demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-report platform and other similar text message-based platforms.
Keywords: Knowledge Discovery in Text (KDT); Sexual Reproductive Health (SRH); Text categorization; Text classification; Text extraction; Text mining; Feature extraction; Automated classification; Process performance; Stemming and lemmatization; Natural Language Processing (NLP)
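An SVM text categorizer of the kind the study trains can be sketched in a few lines of scikit-learn. The messages and theme labels below are invented; the real model was trained on the 206,625-message U-Report corpus with counselor-defined categories.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented SMS messages with SRH themes; real labels come from counselors.
msgs = ["how do i prevent hiv", "where can i get condoms",
        "signs of pregnancy", "hiv testing centres near me",
        "is the pill safe", "am i pregnant if i missed my period"]
themes = ["HIV", "contraception", "pregnancy",
          "HIV", "contraception", "pregnancy"]

# TF-IDF features feed a linear SVM, one prediction per incoming message.
model = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(msgs, themes)
print(model.predict(["free hiv test this week"]))
```

With stemming or lemmatization added to the vectorizer's preprocessing (as the keywords suggest the study uses), morphological variants like "testing"/"test" would map to the same feature.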
12. Intelligent Deep Learning Based Cybersecurity Phishing Email Detection and Classification (cited by 1)
Authors: R. Brindha, S. Nandagopal, H. Azath, V. Sathana, Gyanendra Prasad Joshi, Sung Won Kim. Computers, Materials & Continua (SCIE, EI), 2023, No. 3, pp. 5901-5914 (14 pages)
Phishing is a type of cybercrime in which cyber-attackers pose as authorized persons or entities and hack the victims' sensitive data. E-mails, instant messages, and phone calls are some of the common modes used in cyberattacks. Though security models are continuously upgraded to prevent cyberattacks, hackers find innovative ways to target victims. Against this background, there has been a drastic increase in the number of phishing emails sent to potential targets. This scenario necessitates the design of an effective classification model. Numerous conventional models are available in the literature for the classification of phishing emails, and both Machine Learning (ML) techniques and Deep Learning (DL) models have been employed. The current study presents an Intelligent Cuckoo Search (CS) Optimization Algorithm with a Deep Learning-based Phishing Email Detection and Classification (ICSOA-DLPEC) model. The aim of the proposed ICSOA-DLPEC model is to effectually distinguish emails as either legitimate or phishing. At the initial stage, pre-processing is performed through three stages: email cleaning, tokenization, and stop-word elimination. Then, the N-gram approach is applied to extract useful feature vectors. Moreover, the CS algorithm is employed with the Gated Recurrent Unit (GRU) model to detect and classify phishing emails; in particular, the CS algorithm is used to fine-tune the parameters involved in the GRU model. The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset, and the results were assessed under several dimensions. Extensive comparative studies were conducted, and the results confirmed the superior performance of the proposed ICSOA-DLPEC model over other existing approaches. The proposed model achieved a maximum accuracy of 99.72%.
Keywords: Phishing email; data classification; natural language processing; deep learning; cybersecurity
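The three pre-processing stages the abstract lists (email cleaning, tokenization, stop-word elimination) followed by N-gram extraction can be sketched as below. The stop-word list, regexes, and example emails are illustrative assumptions, and the downstream CS-tuned GRU classifier is omitted.

```python
import re
from sklearn.feature_extraction.text import CountVectorizer

STOP = {"the", "a", "to", "your", "is", "of"}  # illustrative stop-word list

def preprocess(email):
    """Email cleaning, tokenization, and stop-word elimination."""
    text = re.sub(r"<[^>]+>|http\S+", " ", email.lower())  # strip HTML and URLs
    tokens = re.findall(r"[a-z]+", text)
    return " ".join(t for t in tokens if t not in STOP)

emails = ["<p>Verify your account at http://phish.example now</p>",
          "Meeting notes attached, see agenda for Monday"]
cleaned = [preprocess(e) for e in emails]
print(cleaned[0])

# Word n-grams over the cleaned text become the classifier's feature vectors.
ngrams = CountVectorizer(ngram_range=(1, 2)).fit(cleaned)
print(len(ngrams.vocabulary_))
```

In the paper's pipeline, vectors like these would be consumed by the GRU whose hyperparameters the cuckoo search tunes.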
13. A Semi-Supervised Approach for Aspect Category Detection and Aspect Term Extraction from Opinionated Text (cited by 1)
Authors: Bishrul Haq, Sher Muhammad Daudpota, Ali Shariq Imran, Zenun Kastrati, Waheed Noor. Computers, Materials & Continua (SCIE, EI), 2023, No. 10, pp. 115-137 (23 pages)
The Internet has become one of the significant sources for sharing information and expressing users' opinions about products and their interests with the associated aspects. It is essential to learn about product reviews; however, to react to such reviews, extracting the aspects of the entity to which these reviews belong is equally important. Aspect-based Sentiment Analysis (ABSA) refers to aspects extracted from an opinionated text. The literature proposes different approaches for ABSA; however, most research is focused on supervised approaches, which require labeled datasets with manual sentiment polarity labeling and aspect tagging. This study proposes a semi-supervised approach with minimal human supervision to extract aspect terms by detecting the aspect categories. Hence, the study deals with two main sub-tasks in ABSA, named Aspect Category Detection (ACD) and Aspect Term Extraction (ATE). In the first sub-task, aspect categories are extracted using topic modeling, further filtered by an oracle, and fed to zero-shot learning as the prompts and the augmented text. The predicted categories are the input for finding similar phrases, curated by extracting meaningful phrases (e.g., nouns, proper nouns, NER (Named Entity Recognition) entities) to detect the aspect terms. The study sets a baseline accuracy for the two main sub-tasks in ABSA on the Multi-Aspect Multi-Sentiment (MAMS) dataset, along with SemEval-2014 Task 4 subtask 1, to show that the proposed approach helps detect aspect terms via aspect categories.
Keywords: Natural language processing; sentiment analysis; aspect-based sentiment analysis; topic modeling; POS tagging; zero-shot learning
14. Fake News Detection Based on Multimodal Inputs (cited by 1)
Author: Zhiping Liang. Computers, Materials & Continua (SCIE, EI), 2023, No. 5, pp. 4519-4534 (16 pages)
In view of its various adverse effects, fake news detection has become an extremely important task. So far, many detection methods have been proposed, but these methods still have some limitations. For example, two independently encoded unimodal representations are merely concatenated together, rather than integrated with multimodal information to complement each other and to obtain the correlated information in the news content. This simple fusion approach may lead to the omission of some information and bring some interference to the model. To solve the above problems, this paper proposes the Fake News Detection model based on BLIP (FNDB). First, XLNet- and VGG-19-based feature extractors are used to extract textual and visual feature representations respectively, and a BLIP-based multimodal feature extractor obtains the multimodal feature representation of the news content. Then, the feature fusion layer fuses these features with the help of a cross-modal attention module to promote information complementation across the modal feature representations. The fake news detector uses these fused features to identify the input content and finally completes fake news detection. Based on this design, FNDB can extract as much information as possible from the news content and fuse the information between multiple modalities effectively. The fake news detector in the FNDB can also learn more information to achieve better performance. Verification experiments on Weibo and Gossipcop, two widely used real-world datasets, show that FNDB is 4.4% and 0.6% higher in accuracy than the state-of-the-art fake news detection methods, respectively.
Keywords: Natural language processing; fake news detection; machine learning; text classification
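The cross-modal attention fusion at the heart of such a model — text tokens attending over image-region features before concatenation — can be shown with random vectors. The dimensions and features below are illustrative stand-ins for real XLNet and VGG-19 outputs, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
text = rng.normal(size=(6, 32))   # 6 token embeddings (stand-in for XLNet)
image = rng.normal(size=(4, 32))  # 4 region embeddings (stand-in for VGG-19)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Each text token gathers the image regions most relevant to it.
attn = softmax(text @ image.T / np.sqrt(32))          # (6, 4) attention weights
fused = np.concatenate([text, attn @ image], axis=1)  # (6, 64) fused features
print(fused.shape)
```

This is the integration step the abstract contrasts with plain concatenation: the attended image summary is conditioned on the text rather than independent of it.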
15. Sentiment Analysis with Tweets Behaviour in Twitter Streaming API (cited by 1)
Authors: Kuldeep Chouhan, Mukesh Yadav, Ranjeet Kumar Rout, Kshira Sagar Sahoo, NZ Jhanjhi, Mehedi Masud, Sultan Aljahdali. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 5, pp. 1113-1128 (16 pages)
Twitter is a radiant platform with a quick and effective technique to analyze users' perceptions of activities on social media. Many researchers and industry experts direct their attention to Twitter sentiment analysis to recognize stakeholder groups. Sentiment analysis needs an advanced level of approaches, including those adopted to encompass data sentiment analysis and various machine learning tools. Sentiment is assessed in multiple fields that affect people in real time by using Naive Bayes and Support Vector Machine (SVM) classifiers. This paper focuses on analysing the distinguished sentiment techniques in tweet-behaviour datasets for various spheres such as healthcare, behaviour estimation, etc. In addition, the results in this work explore and validate the statistical machine learning classifiers that provide the accuracy percentages attained in terms of positive, negative, and neutral tweets. In this work, we obtained a Twitter Application Programming Interface (API) account and programmed the sentiment analysis approach in Python for a computational measure of users' perceptions that extracts a massive number of tweets and provides market value to the Twitter account proprietor. To distinguish the results in terms of performance evaluation, an error analysis investigates the features of various stakeholders, comprising social media analytics researchers, Natural Language Processing (NLP) developers, engineering managers, and experts involved, to support a decision-making approach.
Keywords: Machine learning, Naive Bayes, natural language processing, sentiment analysis, social media analytics, support vector machine, Twitter application programming interface
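The Naive Bayes baseline this abstract compares against SVM can be illustrated with a minimal, self-contained sketch. The toy tweets, labels, and whitespace tokenization below are invented for illustration and are not from the paper's dataset:

```python
import math
from collections import Counter, defaultdict

# Toy training tweets (illustrative only; not from the paper's data).
train = [
    ("great service loved it", "positive"),
    ("awful delay terrible support", "negative"),
    ("loved the quick response", "positive"),
    ("terrible app awful crash", "negative"),
]

# Per-class word frequencies for a multinomial Naive Bayes model.
word_counts = defaultdict(Counter)
class_counts = Counter()
vocab = set()
for text, label in train:
    tokens = text.split()
    word_counts[label].update(tokens)
    class_counts[label] += 1
    vocab.update(tokens)

def classify(text):
    """Return the class with the highest Laplace-smoothed log-posterior."""
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for tok in text.split():
            score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("loved the service"))    # → positive
print(classify("awful terrible delay")) # → negative
```

A smoothed multinomial model like this is the usual lightweight baseline; the paper's SVM comparison would swap the scoring rule for a max-margin classifier over the same token counts.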
Deep-BERT: Transfer Learning for Classifying Multilingual Offensive Texts on Social Media (Cited by: 1)
16
Authors: Md. Anwar Hussen Wadud, M.F. Mridha, Jungpil Shin, Kamruddin Nur, Aloke Kumar Saha 《Computer Systems Science & Engineering》 SCIE EI 2023, No. 2, pp. 1775-1791 (17 pages)
Offensive messages on social media have recently been frequently used to harass and criticize people. In recent studies, many promising algorithms have been developed to identify offensive texts. Most algorithms analyze text in a unidirectional manner, whereas a bidirectional method can maximize performance and capture the semantic and contextual information in sentences. In addition, there are many separate models for identifying monolingual or multilingual offensive texts, but few models can detect both. In this study, a detection system has been developed for both monolingual and multilingual offensive texts by combining a deep convolutional neural network with bidirectional encoder representations from transformers (Deep-BERT) to identify offensive posts on social media that are used to harass others. This paper explores a variety of ways to deal with multilingualism, including collaborative multilingual and translation-based approaches. Deep-BERT is then tested on the Bengali and English datasets, including different bidirectional encoder representations from transformers (BERT) pre-trained word-embedding techniques, and its efficacy outperformed all existing offensive text classification algorithms, reaching an accuracy of 91.83%. The proposed model is a state-of-the-art model that can classify both monolingual and multilingual offensive texts.
Keywords: Offensive text classification, deep convolutional neural network (DCNN), bidirectional encoder representations from transformers (BERT), natural language processing (NLP)
Extraction and analysis of risk factors from Chinese chemical accident reports
17
Authors: Xi Luo, Xiayuan Feng, Xu Ji, Yagu Dang, Li Zhou, Kexin Bi, Yiyang Dai 《Chinese Journal of Chemical Engineering》 SCIE EI CAS CSCD 2023, No. 9, pp. 68-81 (14 pages)
Accidents in chemical production usually result in fatal injury, economic loss and negative social impact. Chemical accident reports, which record past accident information, contain a large amount of expert knowledge. However, manually finding the key factors causing accidents requires reading and analyzing numerous accident reports, which is time-consuming and labor-intensive. In this paper, a semiautomatic method based on natural language processing (NLP) technology is developed to construct a knowledge graph of chemical accidents. Firstly, we build a named entity recognition (NER) model using SoftLexicon (simplify the usage of lexicon) + BERT-Transformer-CRF (conditional random field) to automatically extract the accident information and risk factors. The risk factors leading to accidents in chemical accident reports are divided into five categories: human, machine, material, management, and environment. Through analysis of the extraction results for different chemical industries and different accident types, corresponding accident prevention suggestions are given. Secondly, based on the definition of classes and hierarchies of information in chemical accident reports, the seven-step method developed at Stanford University is used to construct the ontology-based chemical accident knowledge description model. Finally, the ontology knowledge description model is imported into the graph database Neo4j, and the knowledge graph is constructed to realize the structured storage of chemical accident knowledge. In a case study of information extraction from 290 Chinese chemical accident reports, SoftLexicon + BERT-Transformer-CRF shows the best extraction performance among nine experimental models, demonstrating that the method developed in the current work can be a promising tool for obtaining the factors causing accidents, which contributes to intelligent accident analysis and auxiliary accident prevention.
Keywords: Chemical processes, chemical process safety, natural language process, knowledge graph, neural networks, algorithm
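The final step this abstract describes, loading extracted entities into Neo4j, can be sketched by rendering one extraction record as Cypher MERGE statements. The record schema, property names, and CAUSED_BY relationship below are illustrative assumptions, not the paper's actual ontology:

```python
# Hypothetical entities extracted from one accident report. The five
# risk-factor categories follow the abstract (human, machine, material,
# management, environment); the field names are made up for this sketch.
extraction = {
    "accident": "Tank overpressure, 2019-05-12",
    "risk_factors": [
        ("human", "operator skipped pre-start checklist"),
        ("machine", "relief valve stuck closed"),
        ("management", "missing maintenance schedule"),
    ],
}

def to_cypher(record):
    """Render one extraction record as Neo4j Cypher statements."""
    stmts = [f'MERGE (a:Accident {{title: "{record["accident"]}"}})']
    for category, description in record["risk_factors"]:
        # One node per risk factor, plus an edge linking it to the accident.
        stmts.append(
            f'MERGE (f:RiskFactor {{category: "{category}", '
            f'description: "{description}"}})'
        )
        stmts.append(
            f'MATCH (a:Accident {{title: "{record["accident"]}"}}), '
            f'(f:RiskFactor {{description: "{description}"}}) '
            f'MERGE (a)-[:CAUSED_BY]->(f)'
        )
    return stmts

statements = to_cypher(extraction)
print(len(statements))  # 1 accident node + 2 statements per factor = 7
```

In practice the statements would be executed through the Neo4j driver inside a transaction; generating them as plain strings keeps the sketch runnable without a database.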
Investigating the tourism image of mountain scenic spots in China through the lens of tourist perception
18
Authors: LI Feng-jiao, LIAO Xia, LIU Jia-ming, JIANG Li-li, WANG Meng-di, LIU Jin-feng 《Journal of Mountain Science》 SCIE CSCD 2023, No. 8, pp. 2298-2314 (17 pages)
A favorable tourism image of high-quality mountain scenic spots (HQMSS) is crucial for tourism prosperity and sustainability. This paper establishes a framework for investigating the tourism image based on cognitive-emotion theory and uses natural language processing (NLP) tools to clarify the cognition, emotion, and overall tourist image of HQMSS in China from the perspective of tourist perception. It examines the multi-dimensional spatial differentiation of China's overall image at both province and scenic-spot scales, as well as the spatial pattern of the overall comprehensive tourism image, and formulates strategies for comprehensively improving the HQMSS tourism image. The results show that: (1) The cognitive image of Chinese HQMSS is categorized into core and marginal images, and core images such as scenery and cable car express the uniqueness of mountainous scenic spots. Additionally, the cognitive image is classified into six dimensions: tourism environment, tourism supporting facilities, tourism experience, tourism price, tourism service, and tourism safety. (2) Positive emotions are the dominant mood type of HQMSS in China, followed by neutral emotions, with negative emotions being the least frequent. Emotional images vary across dimensions, with tourism environment and tourism experience evoking relatively higher emotion. (3) The spatial pattern of HQMSS for each dimension at the national, provincial, and scenic scales is diversifying. This article provides a multidimensional perspective for investigating the tourism image of mountainous scenic spots, proposes targeted recommendations to improve the overall image of HQMSS in China, and can greatly contribute to the sustainable development of mountain tourism.
Keywords: Mountain scenic spot, tourism image, spatial differentiation, natural language processing, cognitive-emotion theory, tourist perception
Predicting Carpark Prices Indices in Hong Kong Using AutoML
19
Authors: Rita Yi Man Li, Lingxi Song, Bo Li, M. James C. Crabbe, Xiao-Guang Yue 《Computer Modeling in Engineering & Sciences》 SCIE EI 2023, No. 3, pp. 2247-2282 (36 pages)
The aims of this study were threefold: 1) study the research gap in carparks and price indices via big data and natural language processing, 2) examine the research gap of carpark indices, and 3) construct carpark price indices via repeat sales methods and predict carpark indices via AutoML. By searching the keyword "carpark" in Google Scholar, the largest electronic academic database, which covers Web of Science and Scopus indexed articles, this study obtained 999 articles and book chapters from 1910 to 2019. It confirmed that most carpark research threw light on multi-storey carparks, management and ventilation systems, and reinforced concrete carparks. The most common research method was case studies. Regarding price index research, many previous studies focused on consumer, stock, press and futures indices, with many keywords related to finance and economics. These findings indicated that there is no research predicting carpark price indices based on an AutoML approach. This study constructed repeat sales indices for 18 districts in Hong Kong using 34,562 carpark transaction records from December 2009 to June 2019. Wanchai's carpark price was about four times that of Yuen Long's, indicating the considerable carpark price differences in Hong Kong. This research evidenced the features that affected the carpark price index models most: gold price ranked first in all 19 models; oil price or Link stock price ranked second depending on the district; and carpark affordability ranked third.
Keywords: Carpark, repeat sales index, AutoML, Hong Kong, natural language processing, tokenization
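The repeat sales method named in this abstract can be sketched with the classic Bailey-Muth-Nourse regression: log price relatives are regressed on time dummies (-1 at purchase, +1 at sale). The three transaction pairs below are invented and deliberately internally consistent, so the fitted index is exact; they are not the paper's Hong Kong data:

```python
import math

# Toy pairs: (buy_period, sell_period, buy_price, sell_price).
pairs = [
    (0, 1, 1_000_000, 1_100_000),
    (1, 2, 900_000, 945_000),
    (0, 2, 1_200_000, 1_386_000),
]
n_periods = 3  # period 0 is the base (index = 100)

# Build the BMN design matrix: -1 at purchase, +1 at sale;
# the period-0 dummy is dropped as the base.
X, y = [], []
for b, s, pb, ps in pairs:
    row = [0.0] * (n_periods - 1)
    if b > 0:
        row[b - 1] -= 1.0
    row[s - 1] += 1.0
    X.append(row)
    y.append(math.log(ps / pb))

# Solve the normal equations (X'X) beta = X'y by Gauss-Jordan elimination.
k = n_periods - 1
A = [[sum(X[r][i] * X[r][j] for r in range(len(X))) for j in range(k)]
     for i in range(k)]
b_vec = [sum(X[r][i] * y[r] for r in range(len(X))) for i in range(k)]
for col in range(k):
    piv = A[col][col]
    for j in range(col, k):
        A[col][j] /= piv
    b_vec[col] /= piv
    for row in range(k):
        if row != col:
            factor = A[row][col]
            for j in range(col, k):
                A[row][j] -= factor * A[col][j]
            b_vec[row] -= factor * b_vec[col]

# Exponentiate the coefficients to get the price index (base period = 100).
index = [100.0] + [round(100 * math.exp(beta), 1) for beta in b_vec]
print(index)  # → [100.0, 110.0, 115.5]
```

With noisy real transactions the regression averages over all pairs instead of fitting them exactly; the paper then feeds the resulting district indices into AutoML for prediction.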
SA-Model:Multi-Feature Fusion Poetic Sentiment Analysis Based on a Hybrid Word Vector Model
20
Authors: Lingli Zhang, Yadong Wu, Qikai Chu, Pan Li, Guijuan Wang, Weihan Zhang, Yu Qiu, Yi Li 《Computer Modeling in Engineering & Sciences》 SCIE EI 2023, No. 10, pp. 631-645 (15 pages)
Sentiment analysis of Chinese classical poetry has become a prominent topic in historical and cultural tracing, ancient literature research, etc. However, existing research on such sentiment analysis is relatively scarce and does not effectively solve problems such as the weak feature-extraction ability of poetry text, which leads to low model performance on sentiment analysis of Chinese classical poetry. In this research, we offer SA-Model, a poetic sentiment analysis model. SA-Model first extracts text vector information and fuses it through Bidirectional encoder representation from transformers-Whole word masking-extension (BERT-wwm-ext) and Enhanced representation through knowledge integration (ERNIE) to enrich text vector information; secondly, it incorporates numerous encoders to extract text features at multiple levels, thereby increasing text feature information, improving text semantic accuracy, and enhancing the model's learning and generalization capabilities; finally, the multi-feature fusion poetry sentiment analysis model is constructed. The feasibility and accuracy of the model are validated on an ancient-poetry sentiment corpus. Compared with other baseline models, the experimental findings indicate that SA-Model can increase the accuracy of text semantics and hence improve the capability of poetry sentiment analysis.
Keywords: Sentiment analysis, Chinese classical poetry, natural language processing, BERT-wwm-ext, ERNIE, multi-feature fusion