The outbreak of the pandemic caused by Coronavirus Disease 2019 (COVID-19) has affected the daily activities of people across the globe. During the COVID-19 outbreak and the successive lockdowns, Twitter was heavily used and the number of tweets regarding COVID-19 increased tremendously. Several studies have used Sentiment Analysis (SA) to analyze the emotions expressed in tweets about COVID-19. In the current study, a new Artificial Bee Colony (ABC) with Machine Learning-driven SA (ABCML-SA) model is developed for conducting Sentiment Analysis of COVID-19 Twitter data. The prime focus of the presented ABCML-SA model is to recognize the sentiments expressed in tweets about COVID-19. It involves data pre-processing at the initial stage, followed by n-gram-based feature extraction to derive the feature vectors. For identification and classification of the sentiments, the Support Vector Machine (SVM) model is used. Finally, the ABC algorithm is applied to fine-tune the parameters of the SVM. To demonstrate the improved performance of the proposed ABCML-SA model, a sequence of simulations was conducted. The comparative assessment results confirmed the effective performance of the proposed ABCML-SA model over other approaches.
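The n-gram feature extraction step named in this abstract can be illustrated with a minimal sketch. The tokenizer, the function names `tokenize`, `ngram_counts`, and `vectorize`, and the choice of word unigrams plus bigrams are assumptions made for illustration; the abstract does not specify the authors' settings, and the SVM and ABC tuning stages are not shown.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep word-like tokens (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(text, n_max=2):
    """Count every word n-gram of length 1..n_max in a single tweet."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def vectorize(tweets, n_max=2):
    """Return a sorted vocabulary and one count vector per tweet."""
    per_tweet = [ngram_counts(t, n_max) for t in tweets]
    vocab = sorted(set().union(*per_tweet))
    vectors = [[counts[term] for term in vocab] for counts in per_tweet]
    return vocab, vectors
```

Each row returned by `vectorize` is a fixed-length feature vector of the kind a classifier such as an SVM could consume.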
Today, social media has become a channel through which people share their happiness, sadness, and anger with others. Knowing people’s emotions is very important for identifying depressed people from their messages. Early depression detection helps to save lives and to prevent other dangerous mental illnesses. There are many intelligent algorithms for predicting depression with high accuracy, but they lack the definition of such cases. Several machine learning methods help to identify depressed people, but the accuracy of existing methods is not satisfactory. To overcome this issue, a deep learning method is used for depression detection. In this paper, a novel Deep Learning Multi-Aspect Depression Detection with Hierarchical Attention Network (MDHAN) is used for classifying the depression data. Initially, the Twitter data is preprocessed by tokenization, punctuation mark removal, stop word removal, stemming, and lemmatization. The Adaptive Particle and Grey Wolf optimization methods are used for feature selection. The MDHAN classifies the Twitter data and predicts depressed and non-depressed users. Finally, the proposed method is compared with existing methods such as Convolutional Neural Network (CNN), Support Vector Machine (SVM), Minimum Description Length (MDL), and MDHAN. The suggested MDH-PWO architecture attains 99.86% accuracy, higher than frequency-based deep learning models, with a lower false-positive rate. The experimental results show that the proposed method achieves better accuracy, precision, recall, and F1-measure. It also minimizes the execution time.
A flood is a significantly damaging natural calamity that causes loss of life and property. Earlier work on the construction of flood prediction models was intended to reduce risks, suggest policies, reduce mortality, and limit property damage caused by floods. The massive amount of data generated by social media platforms such as Twitter opens the door to flood analysis. Because of the real-time nature of Twitter data, some government agencies and authorities have used it to track natural catastrophe events in order to build a more rapid rescue strategy. However, due to the short duration of tweets, it is difficult to construct a perfect prediction model for detecting floods. Machine learning (ML) and deep learning (DL) approaches can be used to statistically develop flood prediction models. At the same time, the vast number of tweets necessitates the use of a big data analytics (BDA) tool for flood prediction. In this regard, this work provides an optimal deep learning-based flood forecasting model with big data analytics (ODLFF-BDA) based on Twitter data. The suggested ODLFF-BDA technique intends to anticipate the existence of floods using tweets in a big data setting. The ODLFF-BDA technique comprises data pre-processing to convert the input tweets into a usable format. In addition, a Bidirectional Encoder Representations from Transformers (BERT) model is used to generate emotive contextual embeddings from tweets. Furthermore, a gated recurrent unit (GRU) with a Multilayer Convolutional Neural Network (MLCNN) is used to extract local data and predict the flood. Finally, an Equilibrium Optimizer (EO) is used to fine-tune the hyper-parameters of the GRU and MLCNN models in order to increase prediction performance. Memory usage is brought down to less than 3.5 MB when compared with the other techniques. The ODLFF-BDA technique’s performance was validated using a benchmark Kaggle dataset, and the findings showed that it outperformed other recent approaches significantly.
Arabic is one of the most spoken languages across the globe. However, there are few studies concerning Sentiment Analysis (SA) in Arabic. In recent years, the detection of sentiments and emotions expressed in tweets has received significant interest. The substantial role played by the Arab region in international politics and the global economy has created a need to examine sentiments and emotions in the Arabic language. Two common families of approaches are available to address emotion classification problems: Machine Learning and lexicon-based approaches. With this motivation, the current research article develops a Teaching and Learning Optimization with Machine Learning Based Emotion Recognition and Classification (TLBOML-ERC) model for Sentiment Analysis on tweets made in the Arabic language. The presented TLBOML-ERC model focuses on recognising emotions and sentiments expressed in Arabic tweets. To attain this, the proposed TLBOML-ERC model initially carries out data pre-processing and a Continuous Bag Of Words (CBOW)-based word embedding process. In addition, a Denoising Autoencoder (DAE) model is exploited to categorise the different emotions expressed in Arabic tweets. To improve the efficacy of the DAE model, the Teaching and Learning-based Optimization (TLBO) algorithm is utilized to optimize the parameters. The proposed TLBOML-ERC method was experimentally validated with the help of an Arabic tweets dataset. The obtained results show the promising performance of the proposed TLBOML-ERC model on Arabic emotion classification.
Currently, individuals use online social media, such as Facebook or Twitter, to share their thoughts and emotions. Detection of emotions on social networking sites is useful in several applications in social welfare, commerce, public health, and so on. Emotion is expressed through several means, such as facial and speech expressions, gestures, and written text. Emotion recognition in a text document is a content-based classification problem that includes notions from the deep learning (DL) and natural language processing (NLP) domains. This article proposes a Deer Hunting Optimization with Deep Belief Network Enabled Emotion Classification (DHODBN-EC) model for English Twitter data. The presented DHODBN-EC model aims to examine the existence of distinct emotion classes in tweets. At the introductory level, the DHODBN-EC technique pre-processes the tweets at different levels. Besides, the word2vec feature extraction process is applied to generate word embeddings. For emotion classification, the DHODBN-EC model utilizes the DBN model, which helps to determine distinct emotion class labels. Lastly, the DHO algorithm is leveraged for optimal hyperparameter adjustment of the DBN technique. An extensive range of experimental analyses was carried out to demonstrate the enhanced performance of the DHODBN-EC approach. A comprehensive comparison study exhibited the improvements of the DHODBN-EC model over other approaches, with an increased accuracy of 96.67%.
This study is an exploratory analysis of applying natural language processing techniques such as Term Frequency-Inverse Document Frequency (TF-IDF) and Sentiment Analysis to Twitter data. The uniqueness of this work is established by determining the overall sentiment of a politician’s tweets based on the TF-IDF values of terms used in their published tweets. By calculating the TF-IDF value of terms from the corpus, this work displays the correlation between TF-IDF score and polarity. The results of this work show that calculating the TF-IDF score of the corpus allows for a more accurate representation of the overall polarity, since terms are given a weight based on their uniqueness and relevance rather than just the frequency at which they appear in the corpus.
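To make the TF-IDF weighting described in this abstract concrete, here is a minimal sketch. It assumes one common variant (term frequency as count divided by document length, inverse document frequency as the natural log of N over document frequency); the abstract does not state which variant the authors used, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

Under this variant, a term that appears in every document receives a weight of zero, which is how frequent but uninformative terms are discounted relative to raw frequency.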
Social media data has created a paradigm shift in assessing situational awareness during natural disasters or emergencies such as wildfires, hurricanes, and tropical storms. Twitter, as an emerging data source, is an effective and innovative digital platform for observing trends from the perspective of social media users who are direct or indirect witnesses of a calamitous event. This paper aims to collect and analyze Twitter data related to the recent wildfire in California to perform a trend analysis by classifying firsthand and credible information from Twitter users. This work investigates tweets on the recent wildfire in California and classifies them based on witnesses into two types: 1) direct witnesses and 2) indirect witnesses. The collected and analyzed information can be useful for law enforcement agencies and humanitarian organizations for communication and verification of situational awareness during wildfire hazards. Trend analysis is an aggregated approach that includes sentiment analysis and topic modeling performed through domain-expert manual annotation and machine learning. Trend analysis ultimately builds a fine-grained analysis to assess evacuation routes and provide valuable information to firsthand emergency responders.
In March 2021, we witnessed a surge in Bitcoin price. The cause seemed to be a tweet by Elon Musk. Are other blockchains as sensitive to social media as Bitcoin? And more precisely, could Ethereum's popularity be explained using social media data? This work aims to explore the determinants of Ethereum's popularity. We use both Twitter data and data from Etherscan, from which we retrieve the relevant historic Ethereum factors. Our sample consists of data ranging from 2015 to 2022. We use Ordinary Least Squares to assess the relationship between these factors (Ethereum characteristics and Twitter data) and Ethereum's popularity. Our findings show that Ethereum's popularity (measured here by the number of daily new addresses) is related to the following elements: the Ether (ETH) price, the transaction fees, and the polarity of tweets related to Ethereum. The results could have multiple practical implications for both researchers and practitioners. First, we believe that they will enable readers to better understand the technology of Ethereum and its stakes. Second, they will help the community identify pointers for anticipating or explaining the popularity of existing or future platforms. Finally, the results could help in understanding the factors facilitating the design of future platforms.
Social media platforms such as Twitter are data repositories where people exchange views on global issues like the COVID-19 pandemic. Social media has been shown to influence the low acceptance of vaccines. This work aims to identify public sentiments concerning the COVID-19 vaccines and better understand the individual’s sensitivities and feelings that lead to achievement. This work proposes a method to analyze the opinion expressed in an individual’s tweet about the COVID-19 vaccines. This paper introduces a sigmoidal particle swarm optimization (SPSO) algorithm. First, the performance of SPSO is measured on a set of 12 benchmark problems, and later it is deployed for selecting optimal text features and categorizing sentiment. The proposed method uses TextBlob and VADER for sentiment analysis, and CountVectorizer and a term frequency-inverse document frequency (TF-IDF) vectorizer for feature extraction, followed by SPSO-based feature selection. A COVID-19 vaccination tweets dataset was created and used for training, validating, and testing. The proposed approach outperformed the considered algorithms in terms of accuracy. Additionally, we augmented the newly created dataset to balance it and increase performance. A classical support vector machine (SVM) gives better accuracy on the augmented dataset without a feature selection algorithm. This shows that augmentation improves the overall accuracy of tweet analysis. After augmentation, the performance of PSO and SPSO improved by almost 7% and 5%, respectively, and a simple SVM with 10-fold cross-validation improved significantly compared to the primary dataset.
One of the main purposes for which people use Twitter is to share emotions with others. Users can easily post a message as a short text when they experience emotions such as pleasure or sadness. Such a tweet serves to acquire empathy from followers, and can possibly influence others' emotions. In this study, we analyze the influence of emotional behaviors on user relationships based on Twitter data using two dictionaries of emotional words. Emotion scores are calculated via keyword matching. Moreover, we design three experiments with different settings: calculating the average emotion score of a user with random sampling, calculating the average emotion score using all emotional tweets, and calculating the average emotion score using emotional tweets while excluding users with few emotional tweets. We evaluate the influence of emotional behaviors on user relationships through the Brunner-Munzel test. The result shows that a positive user is more active than a negative user in constructing user relationships in a specific condition.
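A minimal sketch of keyword-matching emotion scores, as described in this abstract, might look as follows. The two word sets, the scoring rule (positive matches minus negative matches), and the `min_emotional` threshold are hypothetical stand-ins; the abstract does not give the dictionaries or the exact scoring formula.

```python
POSITIVE = {"happy", "glad", "fun"}    # hypothetical positive-word dictionary
NEGATIVE = {"sad", "angry", "tired"}   # hypothetical negative-word dictionary


def tweet_score(text):
    """Return (#positive - #negative) keyword matches, or None for a non-emotional tweet."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return None
    return pos - neg


def user_average(tweets, min_emotional=1):
    """Average score over a user's emotional tweets.

    Returns None when the user has fewer than `min_emotional` emotional
    tweets, mirroring the setting that excludes such users.
    """
    scores = [s for s in map(tweet_score, tweets) if s is not None]
    if not scores or len(scores) < min_emotional:
        return None
    return sum(scores) / len(scores)
```

Raising `min_emotional` corresponds to the third experimental setting, in which users with few emotional tweets are left out of the comparison.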
Health authorities worldwide strive to detect influenza prevalence as early as possible in order to prepare for it and minimize its impact. To this end, we address the influenza prevalence surveillance and prediction problem. In this paper, we develop a new influenza prevalence prediction model, called Tweetluenza, to predict the spread of influenza in real time using cross-lingual data harvested from Twitter data streams, with an emphasis on the United Arab Emirates (UAE). Based on the features of tweets, Tweetluenza filters the influenza tweets and classifies them into two classes, reporting and non-reporting. The reporting tweets are used to monitor the growth of influenza. Furthermore, a linear regression model leverages the reporting tweets to predict future influenza-related hospital visits. We evaluated Tweetluenza empirically to study its feasibility and compared the results with the actual hospital visits recorded by the UAE Ministry of Health. The results of our experiments demonstrate the practicality of Tweetluenza, which was verified by the high correlation between the influenza-related Twitter data and hospital visits due to influenza. Furthermore, the evaluation of the analysis and prediction of influenza shows that combining English and Arabic tweets improves the correlation results.
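The linear regression step in this abstract can be sketched with ordinary least squares on a single predictor. This is only an illustration under assumptions: the function names are hypothetical, the single feature (a count of reporting tweets per period) is a simplification, and any numbers passed in would be the reader's own data, not results from the paper.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted hospital visits for a given count of reporting tweets."""
    return slope * x + intercept
```

In use, `xs` would hold the per-period counts of reporting tweets and `ys` the recorded hospital visits for the same periods; the fitted line is then applied to a new tweet count.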
With the huge increase in popularity of Twitter in recent years, the ability to draw information regarding public sentiment from Twitter data has become an area of immense interest. Numerous methods of determining the sentiment of tweets, both in general and in regard to a specific topic, have been developed; however, most of these operate in a batch learning environment where instances may be passed over multiple times. Since Twitter data in real-world situations is far more similar to a stream environment, we propose several algorithms that classify the sentiment of tweets in a data stream. We were able to determine whether a tweet was subjective or objective with an error rate as low as 0.24 and an F-score as high as 0.85. For the determination of positive or negative sentiment in subjective tweets, an error rate as low as 0.23 and an F-score as high as 0.78 were achieved.
Social media plays a crucial role in the organization of massive social movements. However, the sheer quantity of data generated by the events, as well as the data collection restrictions that researchers encounter, leads to a series of challenges for researchers who want to analyze dynamic public discourse and opinion in response to and in the creation of world events. In this paper we present gatherTweet, a Python package that helps researchers efficiently collect social media data for events that are composed of many decentralized actions (across both space and time). The package is useful for studies that require analysis of the organizational or baseline messaging before an action, the action itself, and the effects of the action on subsequent public discourse. By capturing these aspects of world events, gatherTweet enables the study of events and actions like protests, natural disasters, and elections.
Social media is an essential component of our personal and professional lives. We use it extensively to share various things, including our opinions on daily topics and feelings about different subjects. This sharing of posts provides insights into someone’s current emotions. In artificial intelligence (AI) and deep learning (DL), researchers emphasize opinion mining and analysis of sentiment, particularly on social media platforms such as Twitter (currently known as X), which has a global user base. This research work revolves explicitly around a comparison between two popular approaches: lexicon-based and deep learning-based approaches. To conduct this study, we used a Twitter dataset called sentiment140, which contains over 1.5 million data points. The primary focus was the Long Short-Term Memory (LSTM) deep learning sequence model. First, we used particular techniques to preprocess the data. The dataset was divided into training and test data. We evaluated the performance of our model using the test data. We also applied the lexicon-based approach to the same test data and recorded the outputs. Finally, we compared the two approaches by creating confusion matrices based on their respective outputs. This allows us to assess their precision, recall, and F1-score, enabling us to determine which approach yields better accuracy. This research achieved 98% model accuracy for the deep learning approach and 95% model accuracy for the lexicon-based approach.
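The comparison step in this abstract rests on confusion matrices and the precision, recall, and F1-score derived from them. A minimal sketch for binary labels follows; the function names and the convention of returning 0.0 when a denominator is zero are assumptions, not details taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1; 0.0 is returned where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Running both functions on the outputs of two classifiers over the same test labels gives directly comparable scores, which is the kind of side-by-side assessment the abstract describes.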
Vendor lock-in can occur at any layer of the cloud stack: Infrastructure, Platform, and Software as a Service. This paper covers the vendor lock-in issue at the Platform as a Service (PaaS) level, where applications can be created, deployed, and managed without worrying about the underlying infrastructure. These applications and their persisted data on one PaaS provider are not easy to port to another provider. To overcome this issue, we propose a middleware that abstracts the database services and makes them cloud-agnostic. The middleware supports several SQL and NoSQL data stores that can be hosted and ported among disparate PaaS providers. It provides developers with data portability and data migration among relational and NoSQL-based cloud databases. NoSQL databases are fundamental to sustaining Big Data applications, as they support the handling of an enormous volume of highly variable data while assuring fault tolerance, availability, and scalability. The implementation of the middleware shows that using it reduces the effort of rewriting the application code when changing the backend database system. A working protocol of a migration tool has been developed using this middleware to facilitate the migration of the database (moving existing data from a database on one cloud to a new database, even on a different cloud). Although the middleware adds some overhead compared to the native code for the cloud services being used, the experimental evaluation on a Twitter (a Big Data application) data set shows that this overhead is negligible.
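The idea of a cloud-agnostic database middleware can be sketched as a common interface with interchangeable backends. Everything here is hypothetical: the `DataStore` interface, the class names, and the use of in-memory SQLite and a plain dict as stand-ins for a cloud SQL service and a cloud NoSQL store are illustrative assumptions, not the design of the middleware in the paper.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class DataStore(ABC):
    """Hypothetical cloud-agnostic interface that application code targets."""

    @abstractmethod
    def put(self, key, record): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def keys(self): ...


class SqlStore(DataStore):
    """SQL backend; in-memory SQLite stands in for a cloud SQL service."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, body TEXT)")

    def put(self, key, record):
        self.conn.execute(
            "INSERT OR REPLACE INTO records (id, body) VALUES (?, ?)",
            (key, json.dumps(record)),
        )

    def get(self, key):
        row = self.conn.execute(
            "SELECT body FROM records WHERE id = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def keys(self):
        rows = self.conn.execute("SELECT id FROM records ORDER BY id")
        return [row[0] for row in rows]


class DocumentStore(DataStore):
    """NoSQL-style backend; a dict stands in for a cloud document store."""

    def __init__(self):
        self.docs = {}

    def put(self, key, record):
        self.docs[key] = dict(record)

    def get(self, key):
        return self.docs.get(key)

    def keys(self):
        return sorted(self.docs)


def migrate(source, target):
    """Copy every record from one store to another through the shared interface."""
    for key in source.keys():
        target.put(key, source.get(key))
```

Because application code and `migrate` depend only on `DataStore`, swapping the backend does not require rewriting the callers, which is the portability property the abstract emphasizes.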
Funding (ABCML-SA COVID-19 sentiment analysis study): The Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia has funded this project under Grant No. (FP-205-43).
Funding (MDHAN depression detection study): Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R300), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Funding (TLBOML-ERC Arabic emotion classification study): Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4340237DSR36. The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Groups Funding program grant code (NU/RG/SERC/11/7).
Funding (DHODBN-EC emotion classification study): Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; Deanship of Scientific Research at Umm Al-Qura University, for supporting this work by Grant Code (22UQU4340237DSR61).
文摘Currently,individuals use online social media,namely Facebook or Twitter,for sharing their thoughts and emotions.Detection of emotions on social networking sites’finds useful in several applications in social welfare,commerce,public health,and so on.Emotion is expressed in several means,like facial and speech expressions,gestures,and written text.Emotion recognition in a text document is a content-based classification problem that includes notions from deep learning(DL)and natural language processing(NLP)domains.This article proposes a Deer HuntingOptimizationwithDeep Belief Network Enabled Emotion Classification(DHODBN-EC)on English Twitter Data in this study.The presented DHODBN-EC model aims to examine the existence of distinct emotion classes in tweets.At the introductory level,the DHODBN-EC technique pre-processes the tweets at different levels.Besides,the word2vec feature extraction process is applied to generate the word embedding process.For emotion classification,the DHODBN-EC model utilizes the DBN model,which helps to determine distinct emotion class labels.Lastly,the DHO algorithm is leveraged for optimal hyperparameter adjustment of the DBN technique.An extensive range of experimental analyses can be executed to demonstrate the enhanced performance of the DHODBN-EC approach.A comprehensive comparison study exhibited the improvements of the DHODBN-EC model over other approaches with increased accuracy of 96.67%.
Abstract: This study is an exploratory analysis of applying natural language processing techniques, such as Term Frequency-Inverse Document Frequency (TF-IDF) and Sentiment Analysis, to Twitter data. The novelty of this work lies in determining the overall sentiment of a politician’s tweets based on the TF-IDF values of the terms used in their published tweets. By calculating the TF-IDF value of terms from the corpus, this work displays the correlation between TF-IDF score and polarity. The results show that calculating the TF-IDF score of the corpus allows for a more accurate representation of the overall polarity, since terms are weighted by their uniqueness and relevance rather than just the frequency at which they appear in the corpus.
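The TF-IDF weighting described in the abstract above can be illustrated with a minimal sketch. It assumes the common formulation tf × log(N / df); the abstract does not say which variant the authors used. The toy tweets, the tiny polarity lexicon, and the `weighted_polarity` helper are hypothetical and serve only to show how rare terms can dominate an overall polarity estimate.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

def weighted_polarity(term_weights, lexicon):
    """Average lexicon polarity, weighting each term by its TF-IDF score.

    Hypothetical helper: terms missing from the lexicon count as neutral (0.0).
    """
    total = sum(term_weights.values())
    if total == 0:
        return 0.0
    return sum(w * lexicon.get(t, 0.0) for t, w in term_weights.items()) / total

# Toy corpus and lexicon (illustrative only, not from the study).
tweets = [
    "we will cut taxes".split(),
    "we will protect jobs".split(),
    "taxes hurt jobs".split(),
]
lexicon = {"hurt": -1.0, "protect": 1.0}

weights = tf_idf(tweets)
polarity_last = weighted_polarity(weights[2], lexicon)
```

In this toy corpus, "cut" appears in one tweet and "we" in two, so "cut" receives the larger weight in the first tweet. In the third tweet, the rare word "hurt" outweighs the more common "taxes" and "jobs" and pulls the weighted polarity negative.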
Abstract: Social media data has created a paradigm shift in assessing situational awareness during natural disasters or emergencies such as wildfires, hurricanes, and tropical storms. Twitter, as an emerging data source, is an effective and innovative digital platform for observing trends from the perspective of social media users who are direct or indirect witnesses of a calamitous event. This paper aims to collect and analyze Twitter data related to the recent wildfire in California and to perform a trend analysis by classifying firsthand and credible information from Twitter users. The tweets are classified by witness type into two groups: 1) direct witnesses and 2) indirect witnesses. The collected and analyzed information can be useful to law enforcement agencies and humanitarian organizations for communicating and verifying situational awareness during wildfire hazards. Trend analysis is an aggregated approach that includes sentiment analysis and topic modeling, performed through domain-expert manual annotation and machine learning. Trend analysis ultimately builds a fine-grained analysis to assess evacuation routes and provide valuable information to firsthand emergency responders.
Abstract: In March 2021, we witnessed a surge in the Bitcoin price. The cause seemed to be a tweet by Elon Musk. Are other blockchains as sensitive to social media as Bitcoin? And more precisely, could Ethereum's popularity be explained using social media data? This work aims to explore the determinants of Ethereum's popularity. We use data from Etherscan to retrieve the relevant historic Ethereum factors, together with Twitter data. Our sample consists of data ranging from 2015 to 2022. We use Ordinary Least Squares to assess the relationship between these factors (Ethereum characteristics and Twitter data) and Ethereum's popularity. Our findings show that Ethereum's popularity, measured here by the number of daily new addresses, is related to the following elements: the Ether (ETH) price, the transaction fees, and the polarity of tweets related to Ethereum. The results could have multiple practical implications for both researchers and practitioners. First, we believe that they will enable readers to better understand the technology of Ethereum and what is at stake. Second, they will help the community identify pointers for anticipating or explaining the popularity of existing or future platforms. Finally, the results could help in understanding the factors that facilitate the design of future platforms.
Funding: Supported by the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, which funded this research work through project number 959.
Abstract: Social media, like Twitter, is a data repository on which people exchange views on global issues such as the COVID-19 pandemic. Social media has been shown to influence the low acceptance of vaccines. This work aims to identify public sentiments concerning the COVID-19 vaccines and to better understand the individual’s sensitivities and feelings that lead to achievement. This work proposes a method to analyze the opinion expressed in an individual’s tweet about the COVID-19 vaccines. This paper introduces a sigmoidal particle swarm optimization (SPSO) algorithm. First, the performance of SPSO is measured on a set of 12 benchmark problems, and later it is deployed for selecting optimal text features and categorizing sentiment. The proposed method uses TextBlob and VADER for sentiment analysis, and CountVectorizer and term frequency-inverse document frequency (TF-IDF) vectorizers for feature extraction, followed by SPSO-based feature selection. A COVID-19 vaccination tweets dataset was created and used for training, validating, and testing. The proposed approach outperformed the considered algorithms in terms of accuracy. Additionally, we augmented the newly created dataset to balance it and increase performance. A classical support vector machine (SVM) gives better accuracy on the augmented dataset without a feature selection algorithm, which shows that augmentation improves the overall accuracy of tweet analysis. After augmentation, the performance of PSO and SPSO improved by almost 7% and 5%, respectively, and a simple SVM with 10-fold cross-validation improved significantly compared with the primary dataset.
Abstract: One of the main purposes for which people use Twitter is to share emotions with others. Users can easily post a message as a short text when they experience emotions such as pleasure or sadness. Such tweets serve to elicit empathy from followers and can possibly influence others' emotions. In this study, we analyze the influence of emotional behaviors on user relationships based on Twitter data, using two dictionaries of emotional words. Emotion scores are calculated via keyword matching. Moreover, we design three experiments with different settings: calculating the average emotion score of a user with random sampling, calculating the average emotion score using all emotional tweets, and calculating the average emotion score using emotional tweets while excluding users with few emotional tweets. We evaluate the influence of emotional behaviors on user relationships through the Brunner-Munzel test. The results show that, under a specific condition, a positive user is more active than a negative user in constructing user relationships.
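The keyword-matching emotion score mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the two word sets stand in for the study's emotional-word dictionaries (which are not given here), the score is taken to be positive hits minus negative hits, and the `min_emotional` threshold is a hypothetical way to model the third experimental setting, which excludes users with few emotional tweets.

```python
# Hypothetical stand-ins for the two dictionaries of emotional words.
POSITIVE_WORDS = {"happy", "glad", "love"}
NEGATIVE_WORDS = {"sad", "angry", "hate"}

def emotion_hits(tweet):
    """Count positive and negative keyword matches in one tweet."""
    tokens = tweet.lower().split()
    pos = sum(token in POSITIVE_WORDS for token in tokens)
    neg = sum(token in NEGATIVE_WORDS for token in tokens)
    return pos, neg

def average_emotion_score(tweets, min_emotional=1):
    """Average (positive - negative) score over a user's emotional tweets.

    Tweets with no keyword match are ignored. Returns None when the user
    has fewer than `min_emotional` emotional tweets, i.e. the user would
    be excluded from the analysis.
    """
    scores = []
    for tweet in tweets:
        pos, neg = emotion_hits(tweet)
        if pos + neg > 0:  # the tweet contains at least one emotional word
            scores.append(pos - neg)
    if len(scores) < min_emotional:
        return None
    return sum(scores) / len(scores)

user_tweets = ["so happy today", "i hate rain", "love love this", "meeting at noon"]
avg_all = average_emotion_score(user_tweets)
avg_strict = average_emotion_score(user_tweets, min_emotional=5)
```

Here three of the four tweets are emotional, with scores 1, -1, and 2, so the average is 2/3. With `min_emotional=5` the same user is excluded and the function returns `None`.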
Abstract: Health authorities worldwide strive to detect Influenza prevalence as early as possible in order to prepare for it and minimize its impacts. To this end, we address the Influenza prevalence surveillance and prediction problem. In this paper, we develop a new Influenza prevalence prediction model, called Tweetluenza, to predict the spread of Influenza in real time using cross-lingual data harvested from Twitter data streams, with emphasis on the United Arab Emirates (UAE). Based on the features of tweets, Tweetluenza filters the Influenza tweets and classifies them into two classes, reporting and non-reporting. The reporting tweets are used to monitor the growth of Influenza. Furthermore, a linear regression model leverages the reporting tweets to predict future Influenza-related hospital visits. We evaluated Tweetluenza empirically to study its feasibility and compared the results with the actual hospital visits recorded by the UAE Ministry of Health. The results of our experiments demonstrate the practicality of Tweetluenza, which was verified by the high correlation between the Influenza-related Twitter data and hospital visits due to Influenza. Furthermore, the evaluation of the analysis and prediction of Influenza shows that combining English and Arabic tweets improves the correlation results.
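The regression and correlation steps described in the abstract above can be sketched with a single-predictor least-squares fit. This is a minimal illustration, not the Tweetluenza model itself: the weekly counts are made-up numbers, and using one predictor (the number of reporting tweets) is an assumption, since the abstract does not list the model's features.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Made-up weekly data: reporting tweets vs. Influenza-related hospital visits.
reporting_tweets = [10, 20, 30, 40]
hospital_visits = [25, 45, 65, 85]

slope, intercept = fit_line(reporting_tweets, hospital_visits)
predicted_visits = slope * 50 + intercept  # forecast for a week with 50 tweets
correlation = pearson(reporting_tweets, hospital_visits)
```

Because the toy data are exactly linear, the fit recovers a slope of 2 and an intercept of 5, and the correlation is 1. Real tweet and hospital series are noisy, so the correlation would be lower.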
Abstract: With the huge increase in the popularity of Twitter in recent years, the ability to draw information regarding public sentiment from Twitter data has become an area of immense interest. Numerous methods for determining the sentiment of tweets, both in general and in regard to a specific topic, have been developed; however, most of these operate in a batch learning environment where instances may be passed over multiple times. Since Twitter data in real-world situations are far more similar to a stream environment, we propose several algorithms that classify the sentiment of tweets in a data stream. We were able to determine whether a tweet was subjective or objective with an error rate as low as 0.24 and an F-score as high as 0.85. For the determination of positive or negative sentiment in subjective tweets, an error rate as low as 0.23 and an F-score as high as 0.78 were achieved.
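The difference between batch and stream learning noted in the abstract above can be illustrated with a test-then-train (prequential) loop, in which each tweet is seen exactly once. The sketch uses an incremental multinomial Naive Bayes classifier with add-one smoothing as a stand-in; the abstract does not name the authors' algorithms, and the four-tweet stream is invented for illustration.

```python
import math
from collections import Counter, defaultdict

class StreamingNaiveBayes:
    """Incremental multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def predict(self, tokens):
        """Return the most likely label, or None before any training."""
        if not self.class_counts:
            return None
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            # The extra +1 reserves probability mass for unseen words.
            denom = sum(self.word_counts[label].values()) + len(self.vocab) + 1
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def learn(self, tokens, label):
        """Update the counts with one labelled instance."""
        self.class_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

def prequential_error(stream, model):
    """Test on each instance first, then train on it (single pass)."""
    errors = seen = 0
    for tokens, label in stream:
        guess = model.predict(tokens)
        if guess is not None:
            seen += 1
            errors += guess != label
        model.learn(tokens, label)
    return errors / seen if seen else 0.0

# Invented toy stream of (tokens, label) pairs.
stream = [
    ("i love this".split(), "subjective"),
    ("train leaves at noon".split(), "objective"),
    ("i love that".split(), "subjective"),
    ("train arrives at noon".split(), "objective"),
]
model = StreamingNaiveBayes()
error_rate = prequential_error(stream, model)
```

The model never revisits an instance, which matches the stream setting. Its only mistake on this toy stream is the second tweet, seen before any "objective" example was available.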
Abstract: Social media plays a crucial role in the organization of massive social movements. However, the sheer quantity of data generated by these events, as well as the data collection restrictions that researchers encounter, lead to a series of challenges for researchers who want to analyze dynamic public discourse and opinion in response to, and in the creation of, world events. In this paper we present gatherTweet, a Python package that helps researchers efficiently collect social media data for events that are composed of many decentralized actions (across both space and time). The package is useful for studies that require analysis of the organizational or baseline messaging before an action, the action itself, and the effects of the action on subsequent public discourse. By capturing these aspects of world events, gatherTweet enables the study of events and actions such as protests, natural disasters, and elections.
Abstract: Social media is an essential component of our personal and professional lives. We use it extensively to share various things, including our opinions on daily topics and feelings about different subjects. This sharing of posts provides insights into someone’s current emotions. In artificial intelligence (AI) and deep learning (DL), researchers emphasize opinion mining and sentiment analysis, particularly on social media platforms such as Twitter (currently known as X), which has a global user base. This research work compares two popular approaches: lexicon-based and deep learning-based. To conduct this study, we used a Twitter dataset called sentiment140, which contains over 1.5 million data points. The primary focus was the Long Short-Term Memory (LSTM) deep learning sequence model. First, we preprocessed the data. The dataset was then divided into training and test data, and we evaluated the performance of our model using the test data. We also applied the lexicon-based approach to the same test data and recorded the outputs. Finally, we compared the two approaches by creating confusion matrices based on their respective outputs. This allows us to assess their precision, recall, and F1-score, enabling us to determine which approach yields better accuracy. This research achieved 98% model accuracy for the deep learning algorithms and 95% model accuracy for the lexicon-based approach.
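The comparison step in the abstract above, which derives precision, recall, and F1-score from a confusion matrix, can be sketched for the binary case as follows. The label vectors are invented; they are not outputs of the LSTM or lexicon models in the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented labels: 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Running the same function on the predictions of each approach over a shared test set gives directly comparable metrics, which is the kind of side-by-side comparison the abstract describes.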
Abstract: Vendor lock-in can occur at any layer of the cloud stack: Infrastructure-, Platform-, and Software-as-a-Service. This paper covers the vendor lock-in issue at the Platform as a Service (PaaS) level, where applications can be created, deployed, and managed without worrying about the underlying infrastructure. These applications, and the data they persist with one PaaS provider, are not easy to port to another provider. To overcome this issue, we propose a middleware that abstracts the database services and makes them cloud-agnostic. The middleware supports several SQL and NoSQL data stores that can be hosted and ported among disparate PaaS providers. It provides developers with data portability and data migration among relational and NoSQL-based cloud databases. NoSQL databases are fundamental to sustaining Big Data applications, as they support the handling of an enormous volume of highly variable data while assuring fault tolerance, availability, and scalability. The implementation of the middleware shows that using it reduces the effort of rewriting application code when the backend database system changes. A working protocol of a migration tool has been developed using this middleware to facilitate database migration (moving existing data from a database on one cloud to a new database, even on a different cloud). Although the middleware adds some overhead compared with native code for the cloud services being used, the experimental evaluation on a Twitter (a Big Data application) data set shows that this overhead is negligible.
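The idea of a cloud-agnostic data-access layer described in the abstract above can be sketched as a common interface with interchangeable backends. Everything here is hypothetical: the `DataStore` interface, the two in-memory backends (stand-ins for a NoSQL key-value store and a relational table), and the `migrate` helper are illustrative names and do not reflect the paper's actual middleware API.

```python
from abc import ABC, abstractmethod

class DataStore(ABC):
    """Hypothetical provider-neutral interface the application codes against."""

    @abstractmethod
    def put(self, key, record): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def items(self): ...

class KeyValueStore(DataStore):
    """In-memory stand-in for a NoSQL key-value backend."""

    def __init__(self):
        self._data = {}

    def put(self, key, record):
        self._data[key] = dict(record)

    def get(self, key):
        return self._data.get(key)

    def items(self):
        return iter(self._data.items())

class TableStore(DataStore):
    """In-memory stand-in for a relational table with an 'id' column."""

    def __init__(self):
        self._rows = []

    def put(self, key, record):
        # Replace any existing row with the same id, then append the new row.
        self._rows = [row for row in self._rows if row["id"] != key]
        self._rows.append({"id": key, **record})

    def get(self, key):
        for row in self._rows:
            if row["id"] == key:
                return {k: v for k, v in row.items() if k != "id"}
        return None

    def items(self):
        for row in self._rows:
            yield row["id"], {k: v for k, v in row.items() if k != "id"}

def migrate(source, target):
    """Copy every record from one backend to another; return the count."""
    moved = 0
    for key, record in source.items():
        target.put(key, record)
        moved += 1
    return moved

source = TableStore()
source.put("t1", {"text": "hello", "likes": 3})
source.put("t2", {"text": "world", "likes": 5})
target = KeyValueStore()
moved = migrate(source, target)
```

Because application code depends only on `DataStore`, swapping the backend or migrating data requires no change to the calling code. The extra indirection is the kind of overhead the abstract reports measuring.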