Sentiment analysis is becoming increasingly important in today’s digital age, with social media being a significant source of user-generated content. Developing sentiment lexicons that support languages other than English is a challenging task, especially for sentiment analysis of social media reviews. Most existing sentiment analysis systems focus on English, leaving a significant research gap in other languages due to limited resources and tools. This research aims to address this gap by building a sentiment lexicon for local languages, which is then used with a machine learning algorithm for efficient sentiment analysis. In the first step, a lexicon is developed that includes five languages: Urdu, Roman Urdu, Pashto, Roman Pashto, and English. The sentiment scores from SentiWordNet are associated with each word in the lexicon to produce an effective sentiment score. In the second step, a naive Bayesian algorithm is applied to the developed lexicon for efficient sentiment analysis of Roman Pashto. Both the sentiment lexicon and sentiment analysis steps were evaluated using information retrieval metrics, with an accuracy score of 0.89 for the sentiment lexicon and 0.83 for the sentiment analysis. The results showcase the potential for improving software engineering tasks related to user feedback analysis and product development.
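As a rough illustration of the lexicon lookup step described in this abstract (not the authors' implementation, and not covering the naive Bayesian classifier), the following Python sketch averages word-level polarity scores. The lexicon entries, their scores, the threshold, and the function names are hypothetical placeholders, not values taken from SentiWordNet or from the paper.

```python
from typing import Dict, List

# Hypothetical toy lexicon: word -> polarity score in [-1, 1].
# Entries and scores are illustrative only, not SentiWordNet values.
TOY_LEXICON: Dict[str, float] = {
    "good": 0.75,
    "excellent": 1.0,
    "bad": -0.75,
    "terrible": -1.0,
}


def lexicon_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Average the polarity of tokens found in the lexicon (0.0 if none match)."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def label(score: float, threshold: float = 0.1) -> str:
    """Map a numeric score to a polarity label; the threshold is an assumption."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


review = "the camera is good but the battery is terrible".split()
print(label(lexicon_score(review, TOY_LEXICON)))
```

In a multilingual setting such as the one described, the same lookup could be keyed on Urdu, Roman Urdu, Pashto, or Roman Pashto word forms; how the paper combines these scores with the classifier is not specified here.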
Metaphors We Live By holds that metaphor is not only a form of human language but also a fundamental mode of human thought and behavior. This thesis further analyzes vocabulary reflecting sexism from three aspects: animal metaphor, plant metaphor, and food metaphor. In addition, the paper gives a brief analysis of the causes of the phenomenon, so that readers can better understand the language system and the appropriate usage of language.
Named entity recognition (NER) is a fundamental task of information extraction (IE), and it has attracted considerable research attention in recent years. Abundant annotated English NER datasets have significantly advanced NER research for English. By contrast, far less effort has gone into Chinese NER research, especially in the scientific domain, due to the scarcity of Chinese NER datasets. To alleviate this problem, we present a Chinese scientific NER dataset, SciCN, which contains entity annotations of titles and abstracts derived from 3,500 scientific papers. We manually annotate a total of 62,059 entities, and these entities are classified into six types. Compared to English scientific NER datasets, SciCN is larger in scale and more diverse, because it not only contains more paper abstracts but also draws these abstracts from more research fields. To investigate the properties of SciCN and provide baselines for future research, we adapt a number of previous state-of-the-art Chinese NER models to evaluate SciCN. Experimental results show that SciCN is more challenging than other Chinese NER datasets. In addition, previous studies have proven the effectiveness of using lexicons to enhance Chinese NER models. Motivated by this fact, we provide a scientific domain-specific lexicon. Validation results demonstrate that our lexicon delivers better performance gains than lexicons of other domains. We hope that the SciCN dataset and the lexicon will enable benchmarking of the NER task in the Chinese scientific domain and support progress in future research. The dataset and lexicon are available at: https://github.com/yangjingla/SciCN.git.
Social media is an essential component of our personal and professional lives. We use it extensively to share various things, including our opinions on daily topics and feelings about different subjects. This sharing of posts provides insights into someone’s current emotions. In artificial intelligence (AI) and deep learning (DL), researchers emphasize opinion mining and sentiment analysis, particularly on social media platforms such as Twitter (currently known as X), which has a global user base. This research compares two popular approaches: lexicon-based and deep learning-based. The study uses a Twitter dataset called sentiment140, which contains over 1.5 million data points. The primary focus was the Long Short-Term Memory (LSTM) deep learning sequence model. First, we preprocessed the data and divided the dataset into training and test data. We evaluated the performance of our model using the test data. We also applied the lexicon-based approach to the same test data and recorded the outputs. Finally, we compared the two approaches by creating confusion matrices based on their respective outputs. This allows us to assess their precision, recall, and F1-score, and to determine which approach yields better accuracy. This research achieved 98% model accuracy for the deep learning approach and 95% model accuracy for the lexicon-based approach.
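The comparison step in this abstract derives precision, recall, and F1-score from confusion matrices. A minimal Python sketch of that calculation for the binary case follows; the function name and the example counts are hypothetical and are not results from the study.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard metrics from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=8, fp=2, fn=2, tn=8)
print(metrics)
```

Running the same function on the confusion matrix of each approach, computed on the same test split, gives directly comparable numbers.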
Speaker variability is an important source of speech variation that makes continuous speech recognition a difficult task. Adapting automatic speech recognition (ASR) models to speaker variations is a well-known strategy to cope with this challenge. Almost all such techniques focus on developing adaptation solutions within the acoustic models of ASR systems. Although variations in acoustic features constitute an important portion of inter-speaker variations, they do not cover variations at the phonetic level. Phonetic variations are known to form an important part of variations, which are influenced by both micro-segmental and suprasegmental factors. Inter-speaker phonetic variations are influenced by the structure and anatomy of a speaker's articulatory system and also by his or her speaking style, which is driven by many speaker background characteristics such as accent, gender, age, and socioeconomic and educational class. The effect of inter-speaker variations in the feature space may cause explicit phone recognition errors. These errors can be compensated for later by having appropriate pronunciation variants for the lexicon entries, which consider likely phone misclassifications in addition to pronunciation. In this paper, we introduce speaker adaptive dynamic pronunciation models, which generate different lexicons for various speaker clusters and different ranges of speech rate. The models are hybrids of speaker adapted contextual rules and dynamic generalized decision trees, which take into account word phonological structures, rate of speech, unigram probabilities, and stress to generate pronunciation variants of words. Employing the set of speaker adapted dynamic lexicons in a Farsi (Persian) continuous speech recognition task results in word error rate reductions of as much as 10.1% in a speaker-dependent scenario and 7.4% in a speaker-independent scenario.
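The results above are reported as word error rate (WER) reductions. For reference, WER is conventionally the word-level edit distance between reference and hypothesis divided by the reference length; the Python sketch below implements that standard definition. The function name and example sentences are hypothetical, and the abstract does not state whether its reductions are relative or absolute.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion over six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```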
Sentiment analysis is based on the orientation of user attitudes and satisfaction towards services and subjects. Different methods and techniques have been introduced to analyze sentiments with high accuracy. The accuracy of sentiment analysis depends mainly on supervised and unsupervised mechanisms. Supervised mechanisms are based on machine learning algorithms that achieve moderate or high accuracy, but the manual annotation of data is a time-consuming process. In unsupervised mechanisms, a lexicon is constructed for storing polarity terms. The accuracy of the analysis is moderate or low if the lexicon contains few terms. In addition, most research methodologies analyze datasets using only 3-weight polarity, which can strongly affect the performance of the analysis process. Applying both methods to obtain high accuracy and efficiency with low user intervention during the analysis process is challenging. This paper provides a comprehensive evaluation of polarity weights and mechanisms in recent sentiment analysis research. A semi-supervised framework is applied for processing data using both lexicon and machine learning algorithms. An interactive sentiment analysis algorithm is proposed for distributing multi-weight polarities on Arabic lexicons that contain high morphological and linguistic terms. An enhanced scaling algorithm is embedded in the multi-weight algorithm to assign recommended weight polarities automatically. Experiments were conducted on two datasets to measure the overall accuracy of the proposed algorithms, which achieved high results when compared to machine learning algorithms.
The COVID-19 pandemic has spread globally, resulting in financial instability in many countries and reductions in the per capita gross domestic product. Sentiment analysis is a cost-effective method for acquiring sentiments based on household income loss, as expressed on social media. However, limited research has been conducted in this domain using the LexDeep approach. This study aimed to explore social trend analytics using LexDeep, which is a hybrid sentiment analysis technique, on Twitter to capture the risk of household income loss during the COVID-19 pandemic. First, tweet data were collected using Twint with relevant keywords before (9 March 2019 to 17 March 2020) and during (18 March 2020 to 21 August 2021) the pandemic. Subsequently, the tweets were annotated using VADER (lexicon-based) and fed into deep learning classifiers, and experiments were conducted using several embeddings, namely simple embedding, Global Vectors, and Word2Vec, to classify the sentiments expressed in the tweets. The performance of each LexDeep model was evaluated and compared with that of a support vector machine (SVM). Finally, the unemployment rates before and during COVID-19 were analysed to gain insights into the differences in unemployment percentages through social media input and analysis. The results demonstrated that all LexDeep models with simple embedding outperformed the SVM. This confirmed the superiority of the proposed LexDeep model over a classical machine learning classifier in performing sentiment analysis tasks for domain-specific sentiments. In terms of the risk of income loss, the unemployment issue is highly politicised on both the regional and global scales; thus, if a country cannot combat this issue, the global economy will also be affected. Future research should develop a utility maximisation algorithm for household welfare evaluation, given the percentage risk of income loss owing to COVID-19.
People use microblogs and other social media platforms to express their thoughts and feelings about current events, public products, and the latest affairs, on topics including products, news, and blogs. In user reviews and tweets, sentiment analysis is used to discover opinions and feelings. Sentiment polarity describes how sentiment is represented; positive, neutral, and negative are examples of it. This area is still in its infancy and needs several critical upgrades. Slang and hidden emotions can detract from the accuracy of traditional techniques. Existing methods only evaluate the polarity strength of the sentiment words when dividing them into positive and negative categories. Some existing strategies are domain-specific. The proposed model incorporates aspect extraction, association rule mining, and the deep learning technique Bidirectional Encoder Representations from Transformers (BERT). Aspects are extracted using a part-of-speech tagger, and association rule mining is used to associate aspects with opinion words. Classification is then performed using BERT. The proposed approach attained an average of 89.45% accuracy, 88.45% precision, and 85.98% recall on different datasets of products and Twitter. The results showed that the proposed technique performed better than state-of-the-art sentiment analysis techniques.
Millions of people are connecting and exchanging information on social media platforms, where interpersonal interactions are constantly being shared. However, due to inaccurate or misleading information about the COVID-19 pandemic, social media platforms became the scene of tense debates between believers and doubters. Healthcare professionals and public health agencies also use social media to inform the public about COVID-19 news and updates. However, they occasionally have trouble managing massive pandemic-related rumors and frauds. One reason is that people share and engage regardless of the information source, assuming the content is unquestionably true. On Twitter, some users use words and phrases literally to convey their views or opinions. Other users choose idioms or proverbs that are implicit and indirect to make a stronger impression on the audience or to catch their attention. Idioms and proverbs are figurative expressions with a thematically coherent totality that cannot be understood literally. Although more than 10% of tweets contain idioms or slang, most sentiment analysis research focuses on improving the accuracy of various classification algorithms, and little attention has been paid to deciphering the hidden sentiments of idioms expressed in tweets. This paper proposes a novel data expansion strategy for categorizing tweets concerning COVID-19. The benefits of the suggested method are as follows: 1) no transformer fine-tuning is necessary, 2) the technique solves the fundamental challenge of the manual data labeling process by automating the construction and annotation of the sentiment lexicon, 3) the method minimizes the error rate in annotating the lexicon, and drastically improves the accuracy of tweet sentiment classification.
Sentiment Analysis (SA) is one of the Machine Learning (ML) techniques that has been investigated by several researchers in recent years, especially due to the evolution of novel data collection methods focused on social media. The literature reports that more SA data has been created for English than for any other language. It is challenging to perform SA for Arabic Twitter data owing to the informal nature and rich morphology of the Arabic language. An earlier study on SA for Arabic Twitter focused mostly on automatic extraction of features from the text. Neural word embedding has been employed in the literature, since it is less labor-intensive than automatic feature engineering. By ignoring the context of sentiment, most word-embedding models follow the syntactic data of words. The current study presents a new Dragonfly Optimization with Deep Learning Enabled Sentiment Analysis for Arabic Tweets (DFODL-SAAT) model. The aim of the presented DFODL-SAAT model is to distinguish the sentiments in opinions that are tweeted in Arabic. First, data cleaning and pre-processing steps are performed to convert the input tweets into a useful format. In addition, a TF-IDF model is used as a feature extractor to generate the feature vectors. An Attention-based Bidirectional Long Short Term Memory (ABLSTM) technique is then applied for identification and classification of sentiments. Finally, the hyperparameters of the ABLSTM model are optimized using the DFO algorithm. The performance of the proposed DFODL-SAAT model was validated using a benchmark dataset, and the outcomes were investigated under different aspects. The experimental outcomes highlight the superiority of the DFODL-SAAT model over recent approaches.
Automatic extraction of a patient’s health information from unstructured discharge summary data remains challenging. Discharge summary documents contain various aspects of the patient's health condition that can be used to examine the quality of treatment and thereby help improve decision-making in the medical field. Researchers primarily mine semantic text features using a sentiment dictionary and feature engineering. However, choosing and designing features requires a lot of manpower. The proposed approach is an unsupervised deep learning model that learns a set of clusters embedded in the latent space. A composite model including Active Learning (AL), Convolutional Neural Network (CNN), BiGRU, and Multi-Attention, called ACBMA in this research, is designed to measure the quality of treatment based on sentiment detection in discharge summary text. The CNN is used to extract the local features of text vectors. The BiGRU network is then used to extract the text’s global features, addressing two issues: a single CNN cannot obtain global semantic information, and the traditional Recurrent Neural Network (RNN) suffers from vanishing gradients. Experiments demonstrate the effectiveness of the ACBMA method, which achieves results comparable to state-of-the-art methods in sentiment detection and outperforms them on accuracy benchmarks. Finally, several algorithm studies determined that the ACBMA method is more precise for sentiment analysis of discharge summaries.
To improve clustering results and selection among them, ontology semantics is combined with document clustering. A new document clustering algorithm based on WordNet in the document processing phase is proposed. First, after the documents are represented by tf-idf, every word vector is extended with new entities. Then the feature extraction algorithm is applied to the documents. Finally, the ontology aggregation clustering (OAC) algorithm is proposed to improve the result of document clustering. Experiments are based on the data set of Reuters 20 News Group, and the experimental results are compared with the results obtained by mutual information (MI). The results indicate that the proposed ontology-based document clustering algorithm is better than other existing clustering algorithms such as MNB, CLUTO, and co-clustering.
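The first step in this abstract represents documents by tf-idf before extending the vectors. As a hedged illustration, the Python sketch below computes one common tf-idf variant (term frequency normalized by document length, idf = log(N / df)); the abstract does not say which variant the authors used, and the function name and toy documents are hypothetical. The WordNet-based extension and the OAC clustering step are not shown.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append(
            {
                term: (count / total) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return weights


# Toy documents for illustration only.
toy_docs = [["oil", "price", "oil"], ["price", "market"]]
for vector in tf_idf(toy_docs):
    print(vector)
```

With this variant, a term that appears in every document receives weight zero, which is why smoothed idf formulas are often preferred in practice.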
Every language has its own unique ways of expressing negation, and English is no exception. Particular care is needed when a negative English sentence is translated into its Chinese equivalent. Negation in English can be realized in many different ways. First, the different types of negation in English are analyzed. Next, the affixes and lexical items used to denote negation are investigated. The last part concerns idioms and other expressions that denote negative meanings. To make these points clearer, Chinese equivalents of some of the English sentences are provided.
The lexicon is the foundation of any language learning. Neglecting lexical learning directly affects learners' listening, speaking, reading, and translation abilities. In response to the current situation of college English teaching, this essay emphasizes the importance of lexical teaching and examines several teaching theories and approaches to which college English teachers should attach importance in their lexical teaching.
This paper addresses a very general problem, the relationship between implicit and explicit forms of meaning, that is as old as scholarly attention to language in use. It first tries to define the problem. Then it presents some elementary aspects of the way in which the problem has been dealt with in the pragmatic literature. This is followed by an excursion into the world of related natural-language concepts, as reflected in the English metapragmatic lexicon. Finally, the paper tries to contribute to a solution by proposing a three-dimensional matrix to account for what might look like a one-dimensional gradable scale from implicit to explicit. An attempt is made to illustrate the potential usefulness of the suggestions. Conclusions mainly take the form of perspectives for future research.
A considerable amount of sexism has long existed in English, especially in the English lexicon. In this paper, the author discusses some manifestations of sexism in the English lexicon and analyzes some factors that strongly influence the existence of sexism in English. The paper aims to make more people aware of the importance and urgency of eliminating sexism in language.
Funding (multilingual sentiment lexicon study): Researchers Supporting Project Number (RSPD2024R576), King Saud University, Riyadh, Saudi Arabia.
Funding (SciCN dataset study): This research was supported by the National Key Research and Development Program [2020YFB1006302].
Funding (multi-weight polarity Arabic lexicon study): Funded by the Deanship of Scientific Research at Jouf University under Grant No. (DSR-2021-02-0102).
Funding: Funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University, through the Research Groups Program, Grant No. (RGP-1443-0045).
Abstract: The COVID-19 pandemic has spread globally, resulting in financial instability in many countries and reductions in the per capita gross domestic product. Sentiment analysis is a cost-effective method for acquiring sentiments based on household income loss, as expressed on social media. However, limited research has been conducted in this domain using the LexDeep approach. This study aimed to explore social trend analytics using LexDeep, which is a hybrid sentiment analysis technique, on Twitter to capture the risk of household income loss during the COVID-19 pandemic. First, tweet data were collected using Twint with relevant keywords before (9 March 2019 to 17 March 2020) and during (18 March 2020 to 21 August 2021) the pandemic. Subsequently, the tweets were annotated using VADER (lexicon-based) and fed into deep learning classifiers, and experiments were conducted using several embeddings, namely simple embedding, Global Vectors, and Word2Vec, to classify the sentiments expressed in the tweets. The performance of each LexDeep model was evaluated and compared with that of a support vector machine (SVM). Finally, the unemployment rates before and during COVID-19 were analysed to gain insights into the differences in unemployment percentages through social media input and analysis. The results demonstrated that all LexDeep models with simple embedding outperformed the SVM. This confirmed the superiority of the proposed LexDeep model over a classical machine learning classifier in performing sentiment analysis tasks for domain-specific sentiments. In terms of the risk of income loss, the unemployment issue is highly politicised on both the regional and global scales; thus, if a country cannot combat this issue, the global economy will also be affected. Future research should develop a utility maximisation algorithm for household welfare evaluation, given the percentage risk of income loss owing to COVID-19.
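The annotation step in the abstract above (lexicon-based labels that then serve as training targets for a classifier) can be illustrated with a minimal sketch. It does not use VADER itself: the lexicon `TOY_LEXICON`, its scores, the helper names, and the 0.5 threshold are all hypothetical stand-ins chosen for illustration, and the classifier stage is omitted.

```python
# Toy lexicon with hypothetical words and scores; this is NOT the VADER lexicon.
TOY_LEXICON = {
    "lost": -1.0,
    "unemployed": -1.5,
    "struggling": -1.0,
    "hired": 1.5,
    "raise": 1.0,
    "grateful": 1.2,
}


def lexicon_score(text):
    """Sum the lexicon scores of all tokens; unknown tokens contribute 0."""
    tokens = text.lower().split()
    return sum(TOY_LEXICON.get(tok.strip(".,!?"), 0.0) for tok in tokens)


def weak_label(text, threshold=0.5):
    """Map a lexicon score to a coarse label (threshold is an assumed value)."""
    score = lexicon_score(text)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


tweets = [
    "I lost my job and I am unemployed",
    "Finally hired, so grateful!",
    "The weather is mild today",
]

# Each (text, label) pair could then be used as training data for a classifier.
labeled = [(tweet, weak_label(tweet)) for tweet in tweets]
```

In a pipeline of this kind, the lexicon replaces manual annotation, so the quality of the downstream classifier is bounded by how well the lexicon covers the domain vocabulary.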
Abstract: People utilize microblogs and other social media platforms to express their thoughts and feelings regarding current events, public products and the latest affairs. People share their thoughts and feelings about various topics, including products, news, blogs, etc. In user reviews and tweets, sentiment analysis is used to discover opinions and feelings. Sentiment polarity is a term used to describe how sentiment is represented. Positive, neutral and negative are all examples of it. This area is still in its infancy and needs several critical upgrades. Slang and hidden emotions can detract from the accuracy of traditional techniques. Existing methods only evaluate the polarity strength of the sentiment words when dividing them into positive and negative categories. Some existing strategies are domain-specific. The proposed model incorporates aspect extraction, association rule mining and the deep learning technique Bidirectional Encoder Representations from Transformers (BERT). Aspects are extracted using a Part of Speech Tagger, and association rule mining is used to associate aspects with opinion words. Classification is then performed using BERT. The proposed approach attained an average of 89.45% accuracy, 88.45% precision and 85.98% recall on different datasets of products and Twitter. The results showed that the proposed technique performed better than state-of-the-art sentiment analysis techniques.
Funding: This work was supported in part by the UTAR Research Fund (IPSR/RMC/UTARRF/2020-C1/R01).
Abstract: Millions of people are connecting and exchanging information on social media platforms, where interpersonal interactions are constantly being shared. However, due to inaccurate or misleading information about the COVID-19 pandemic, social media platforms became the scene of tense debates between believers and doubters. Healthcare professionals and public health agencies also use social media to inform the public about COVID-19 news and updates. However, they occasionally have trouble managing massive pandemic-related rumors and frauds. One reason is that people share and engage, regardless of the information source, by assuming the content is unquestionably true. On Twitter, users use words and phrases literally to convey their views or opinions. However, other users choose to utilize idioms or proverbs that are implicit and indirect to make a stronger impression on the audience or perhaps to catch their attention. Idioms and proverbs are figurative expressions with a thematically coherent totality that cannot be understood literally. Although more than 10% of tweets contain idioms or slang, most sentiment analysis research focuses on enhancing the accuracy of various classification algorithms, and little attention has been paid to deciphering the hidden sentiments of the idioms expressed in tweets. This paper proposes a novel data expansion strategy for categorizing tweets concerning COVID-19. The benefits of the suggested method are as follows: 1) no transformer fine-tuning is necessary, 2) the technique solves the fundamental challenge of the manual data labeling process by automating the construction and annotation of the sentiment lexicon, 3) the method minimizes the error rate in annotating the lexicon and drastically improves the accuracy of tweet sentiment classification.
Funding: The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the National Research Priorities funding program, support under code number: NU/NRP/SERC/11/3.
Abstract: Sentiment Analysis (SA) is one of the Machine Learning (ML) techniques that has been investigated by several researchers in recent years, especially due to the evolution of novel data collection methods focused on social media. The literature reports that more SA data has been created for English than for any other language. It is challenging to perform SA on Arabic Twitter data owing to the informal nature and rich morphology of the Arabic language. An earlier study on SA for Arabic Twitter focused mostly on automatic extraction of features from the text. Neural word embedding has been employed in the literature, since it is less labor-intensive than automatic feature engineering. By ignoring the context of sentiment, most word-embedding models follow the syntactic data of words. The current study presents a new Dragonfly Optimization with Deep Learning Enabled Sentiment Analysis for Arabic Tweets (DFODL-SAAT) model. The aim of the presented DFODL-SAAT model is to distinguish the sentiments in opinions that are tweeted in Arabic. First, data cleaning and pre-processing steps are performed to convert the input tweets into a useful format. In addition, the TF-IDF model is used as a feature extractor to generate the feature vectors. An Attention-based Bidirectional Long Short Term Memory (ABLSTM) technique is then applied for identification and classification of sentiments. Finally, the hyperparameters of the ABLSTM model are optimized using the DFO algorithm. The performance of the proposed DFODL-SAAT model was validated using the benchmark dataset, and the outcomes were investigated under different aspects. The experimental outcomes highlight the superiority of the DFODL-SAAT model over recent approaches.
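The TF-IDF feature extraction step mentioned in the abstract above can be illustrated with a minimal sketch. This is a textbook formulation (term frequency times log inverse document frequency), not necessarily the exact variant used by the authors; the function name `tfidf`, the whitespace tokenizer, and the sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    Tokenization is a plain whitespace split, which is an assumed simplification.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


docs = ["good service", "bad service"]
vectors = tfidf(docs)
```

With this unsmoothed formulation, a term that occurs in every document (here "service") receives weight zero, which is why it carries no discriminative signal for a downstream classifier.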
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. U1811262).
Abstract: Automatic extraction of the patient’s health information from the unstructured data of the discharge summary remains challenging. Discharge summary documents contain various aspects of the patient's health condition that can be used to examine the quality of treatment and thereby help improve decision-making in the medical field. Researchers primarily mine semantic text features using a sentiment dictionary and feature engineering. However, choosing and designing features requires a lot of manpower. The proposed approach is an unsupervised deep learning model that learns a set of clusters embedded in the latent space. A composite model including Active Learning (AL), Convolutional Neural Network (CNN), BiGRU, and Multi-Attention, called ACBMA in this research, is designed to measure the quality of treatment based on sentiment detection in discharge summary text. The CNN is used to extract the local features of the text vectors. A BiGRU network is then used to extract the text’s global features, addressing two issues: a single CNN cannot obtain global semantic information, and the traditional Recurrent Neural Network (RNN) suffers from vanishing gradients. Experiments demonstrate the effectiveness of the ACBMA method, which achieves results comparable to state-of-the-art methods in sentiment detection and outperforms them on the accuracy benchmarks. Finally, several algorithm studies determined that the ACBMA method is more precise for sentiment analysis of discharge summaries.
Funding: The National Natural Science Foundation of China (No. 60373099), the Natural Science Foundation for Young Scholars of Northeast Normal University (No. 20061005).
Abstract: In order to improve clustering results and the selection among them, ontology semantics is combined with document clustering. A new document clustering algorithm based on WordNet in the phase of document processing is proposed. First, after the documents are represented by tf-idf, every word vector is extended with new entities. Then the feature extraction algorithm is applied to the documents. Finally, the ontology aggregation clustering (OAC) algorithm is proposed to improve the result of document clustering. Experiments are based on the data set of Reuters 20 News Group, and experimental results are compared with the results obtained by mutual information (MI). The conclusion is that the proposed ontology-based document clustering algorithm is better than other existing clustering algorithms such as MNB, CLUTO, co-clustering, etc.
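The idea of extending a tf-idf vector with ontology entities, as described in the abstract above, can be illustrated with a minimal sketch. It does not reproduce the OAC algorithm and does not query WordNet: the mapping `TOY_HYPERNYMS`, the propagation weight of 0.5, and the function names are hypothetical choices made only to show why expansion can bring related documents closer.

```python
import math

# Toy term -> parent-concept map; a hypothetical stand-in for a WordNet lookup.
TOY_HYPERNYMS = {
    "car": "vehicle",
    "truck": "vehicle",
    "apple": "fruit",
    "pear": "fruit",
}


def expand(vector, weight=0.5):
    """Add each term's parent concept with a fraction of the term's weight."""
    expanded = dict(vector)
    for term, value in vector.items():
        parent = TOY_HYPERNYMS.get(term)
        if parent is not None:
            expanded[parent] = expanded.get(parent, 0.0) + weight * value
    return expanded


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc_a = {"car": 1.0}
doc_b = {"truck": 1.0}

before = cosine(doc_a, doc_b)                 # no shared terms
after = cosine(expand(doc_a), expand(doc_b))  # both now share "vehicle"
```

Two documents with no terms in common have zero similarity before expansion; after expansion they overlap on the shared parent concept, which is the effect a clustering step can exploit.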
Abstract: Every language has its own unique ways of expressing negation, and English is no exception. More attention should be paid when a negative English sentence is translated into its Chinese equivalent. Negation in English can be realized in many different ways. First, the different types of negation in English are analyzed. Next, the affixes and lexical items used to denote negation are investigated. The last part mainly concerns idioms and other expressions that denote negative meanings. To make the points clearer, Chinese equivalents of some of the English sentences are offered.
Abstract: The lexicon is the foundation of any language learning. Neglecting lexical learning has a direct influence on learners' abilities in listening, speaking, reading and translation. In response to the current situation of college English teaching, this essay emphasizes the importance of lexical teaching in college English teaching and probes into several teaching theories and approaches to which college English teachers should attach importance in their lexical teaching.
Abstract: This paper addresses a very general problem, the relationship between implicit and explicit forms of meaning, that is as old as scholarly attention to language in use. It first tries to define the problem. Then it presents some elementary aspects of the way in which the problem has been dealt with in the pragmatic literature. This is followed by an excursion into the world of related natural-language concepts, as reflected in the English metapragmatic lexicon. Finally, the paper tries to make a contribution to a solution by proposing a three-dimensional matrix to account for what might look like a one-dimensional gradable scale from implicit to explicit. An attempt is made to illustrate the potential usefulness of the suggestions. Conclusions mainly take the form of perspectives for future research.
Abstract: A considerable amount of sexism has long existed in English, especially in the English lexicon. In this paper, the author discusses some manifestations of sexism in the English lexicon and tries to analyze some factors that have a great influence on the existence of sexism in English. This paper aims to make more people aware of the importance and urgency of eliminating sexism in language.