Funding: The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Collaboration Funding program, grant code NU/RC/SERC/11/5.
Abstract: Thalassemia syndrome is a genetic blood disorder caused by reduced production of normal hemoglobin, which results in smaller red blood cells. In severe forms, it can lead to death. The disorder places a major burden on public health, since patients with severe thalassemia need periodic iron chelation therapy and blood transfusions to survive. Controlling thalassemia is therefore extremely important, and is pursued by promoting screening in the general population, particularly among thalassemia carriers. Today, Twitter is one of the most influential social media platforms for sharing opinions and discussing topics such as people’s health conditions and major public health affairs. Exploring the sentiments expressed in these tweets helps research centers formulate strategies to promote thalassemia screening to the public. This study introduces an effective lexicon-based approach built around a classifier called Valence Aware Dictionary and sEntiment Reasoner (VADER). The three main tools used in the study are the Twitter Intelligence Tool (TWINT), the Natural Language Toolkit (NLTK), and VADER. VADER is a gold-standard sentiment lexicon tailored to attitudes expressed on social media. The contribution of this study is a lexicon-based approach that uses VADER to analyze the sentiment of the general population, particularly thalassemia carriers, on Twitter. The proposed approach achieved a precision of 0.829, a recall of 0.816, and an F-score of 0.818. The tweets were crawled using the search keywords “thalassemia screening,” “thalassemia test,” and “thalassemia diagnosis.” Finally, India and Pakistan ranked highest in mentions in public conversations about thalassemia screening, with 181 and 164 tweets, respectively.
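As a rough illustration of how a lexicon-based classifier of this kind labels a tweet, the sketch below sums word valences and squashes the total into a compound score. The lexicon entries and their valences are hypothetical and are not taken from the real VADER lexicon. The normalization constant (15) and the ±0.05 cut-offs follow the convention commonly documented for VADER, but are assumptions about the study's configuration.

```python
import math

# Hypothetical toy lexicon (word -> valence); NOT the real VADER lexicon.
TOY_LEXICON = {
    "good": 1.9, "great": 3.1, "helpful": 1.8,
    "bad": -2.5, "painful": -2.0, "scared": -1.9,
}

def compound_score(text, alpha=15.0):
    """Sum word valences and normalize the total into the range (-1, 1)."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    total = sum(TOY_LEXICON.get(t, 0.0) for t in tokens)
    return total / math.sqrt(total * total + alpha)

def label(text, threshold=0.05):
    """Map the compound score to positive / negative / neutral."""
    score = compound_score(text)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"
```

A real pipeline would rely on the full VADER lexicon and its handling of punctuation, capitalization, and negation, none of which this sketch reproduces.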
Abstract: With the increasing use of drugs to treat different diseases, drug safety has become crucial over the past few years. Often, medicines from several companies are offered for a single disease, containing the same or similar substances in slightly different formulae. Such diversification is both helpful and dangerous, as a given medicine may prove more effective for some patients and cause side effects in others. Despite clinical trials, side effects are reported once a medicine is used by the general public, and many such experiences are shared on social media platforms. A system capable of analyzing such reviews could help healthcare professionals and companies evaluate the safety of drugs after they have been marketed. Sentiment analysis of drug reviews has large potential for providing valuable insights in these cases. This study therefore proposes an approach for analyzing drug safety reviews using lexicon-based and deep learning techniques. A dataset acquired from ‘Drugs.Com’, containing reviews of drug-related side effects and reactions, is used for the experiments. A lexicon-based approach, TextBlob, is used to extract positive, negative, or neutral sentiment from the review text. Review classification is performed using a novel hybrid deep learning model that combines a convolutional neural network with a long short-term memory network (CNN-LSTM). The CNN is used at the first level to extract the appropriate features, while the LSTM is used at the second level. Several well-known machine learning models, including logistic regression, random forest, decision tree, and AdaBoost, are evaluated using term frequency-inverse document frequency (TF-IDF), bag of words (BoW), a feature union of TF-IDF and BoW, and lexicon-based methods. Performance comparison with machine learning models, standalone LSTM and CNN models, and state-of-the-art approaches indicates that the proposed CNN-LSTM model performs best, with an accuracy of 0.96. A statistical significance t-test was also performed to show the significance of the proposed CNN-LSTM model in comparison with the other approaches.
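The feature union of TF-IDF and BoW mentioned above can be pictured as concatenating two vectors per review. The sketch below is a minimal pure-Python version under stated assumptions: whitespace tokenization, term frequency as count divided by document length, and an unsmoothed IDF of log(N/df). Library implementations typically use smoothed variants, and the function names here are illustrative rather than the authors' code.

```python
import math
from collections import Counter

def build_vocab(docs):
    """Sorted vocabulary over whitespace-tokenized, lowercased documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})

def bow_vector(doc, vocab):
    """Raw term counts in vocabulary order."""
    counts = Counter(doc.lower().split())
    return [float(counts[t]) for t in vocab]

def tfidf_vectors(docs, vocab):
    """TF-IDF with tf = count / doc length and idf = log(N / df)."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = {t: sum(1 for toks in tokenized if t in toks) for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([
            (counts[t] / len(toks)) * math.log(n / df[t]) if toks else 0.0
            for t in vocab
        ])
    return vectors

def feature_union(docs):
    """Concatenate the TF-IDF and BoW vectors of each document."""
    vocab = build_vocab(docs)
    tfidf = tfidf_vectors(docs, vocab)
    return vocab, [tfidf[i] + bow_vector(doc, vocab) for i, doc in enumerate(docs)]
```

Each resulting vector has twice the vocabulary length: the first half holds TF-IDF weights and the second half holds raw counts. A classifier such as logistic regression would then be trained on these vectors.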
Abstract: The collection and analysis of feedback has long been an important subject. Traditional techniques for student feedback analysis rely on questionnaire-based data collection and analysis. However, students also express their feedback on online social media sites, and this feedback needs to be analyzed as well. This study aims to develop a fuzzy-based sentiment analysis system that analyzes student feedback and satisfaction by assigning appropriate sentiment scores to the opinion words and polarity shifters present in the input reviews. The technique computes the sentiment score of student feedback reviews and then applies a fuzzy-logic module to analyze and quantify student satisfaction at a fine-grained level. The experimental results show that the proposed work outperforms the baseline studies as well as state-of-the-art machine learning classifiers.
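One way to picture the two stages described above (scoring opinion words with polarity shifters, then fuzzifying the score) is sketched below. The word lists, the rule that a shifter flips the next opinion word, and the triangular membership functions are all assumptions made for illustration. The abstract does not specify the authors' actual scoring rules or fuzzy sets.

```python
# Hypothetical opinion words and polarity shifters, for illustration only.
OPINION_WORDS = {"good": 0.7, "excellent": 1.0, "boring": -0.6, "poor": -0.8}
SHIFTERS = {"not", "never", "hardly"}

def review_score(text):
    """Average opinion-word score; a shifter flips the next opinion word."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    scores, flip = [], False
    for tok in tokens:
        if tok in SHIFTERS:
            flip = True
        elif tok in OPINION_WORDS:
            value = OPINION_WORDS[tok]
            scores.append(-value if flip else value)
            flip = False
    return sum(scores) / len(scores) if scores else 0.0

def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def satisfaction_memberships(score):
    """Degrees of membership in three assumed satisfaction levels."""
    return {
        "dissatisfied": triangular(score, -2.0, -1.0, 0.0),
        "neutral": triangular(score, -1.0, 0.0, 1.0),
        "satisfied": triangular(score, 0.0, 1.0, 2.0),
    }
```

For a score in [-1, 1], a review can belong partly to two levels at once, which is what allows satisfaction to be quantified at a finer grain than a hard positive/negative label.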
Abstract: Sentiment analysis research in the Malaysian context currently suffers from a lack of available sentiment lexicons. This paper addresses that issue in order to improve the accuracy of sentiment analysis. Following a detailed review of existing approaches, a new bilingual sentiment lexicon known as MELex (Malay-English Lexicon) has been constructed. Constructing MELex involves three activities: seed word selection, polarity assignment, and synonym expansion. The approach differs from previous work in that MELex can analyze text in the two most widely used languages in Malaysia, Malay and English, with an accuracy of 90%. It is evaluated through experimentation and a case study in which affordable housing projects in Malaysia are selected as case projects. This finding indicates that MELex is able to analyze public sentiment in the Malaysian context. The novel aspects of this paper are twofold: first, it introduces a new technique for assigning polarity scores, and second, it improves performance on the classification of mixed-language content.
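The three construction activities named above (seed selection, polarity assignment, synonym expansion) can be sketched as a breadth-first expansion over a synonym table. The seed words, the synonym table, and the rule that a synonym simply inherits its source word's polarity are hypothetical simplifications. The abstract does not describe MELex's actual polarity-assignment technique.

```python
from collections import deque

# Hypothetical bilingual seed words with polarity (+1 positive, -1 negative).
SEEDS = {"good": 1, "bad": -1, "baik": 1, "buruk": -1}

# Hypothetical synonym table mixing English and Malay entries.
SYNONYMS = {
    "good": ["fine", "bagus"],
    "bagus": ["elok"],
    "baik": ["bagus"],
    "bad": ["poor", "teruk"],
    "buruk": ["teruk"],
}

def expand_lexicon(seeds, synonyms):
    """Propagate each seed's polarity to its synonyms, breadth-first."""
    lexicon = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for syn in synonyms.get(word, []):
            if syn not in lexicon:
                lexicon[syn] = lexicon[word]
                queue.append(syn)
    return lexicon

def classify(text, lexicon):
    """Label mixed-language text by the sign of its summed polarities."""
    total = sum(lexicon.get(t.strip(".,!?").lower(), 0) for t in text.split())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

Because Malay and English entries share one lexicon, a sentence that mixes both languages is scored in a single pass.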
Abstract: Online shopping is currently very popular because it is convenient for customers. However, selecting a smartphone from an online shop is somewhat difficult when relying only on pictures and a short description of the item, so customers refer to user reviews and star ratings. Because user reviews are written in natural language, the real meaning of the reviews and the satisfaction of the customers sometimes differ from what the star rating shows. Reading all the reviews is also not feasible, as a smartphone typically receives thousands of reviews on a popular online shopping platform such as Amazon. This work therefore aims to develop a recommendation system for smartphones based on aspects of the phones reviewed by users, such as screen size, resolution, camera quality, and battery life. To that end, a hybrid approach is applied, comprising three lexicon-based methods and three machine learning models, to analyze specific aspects of user reviews and classify the reviews into six categories: best, better, good, or somewhat for positive comments, and bad or not recommended for negative comments. The lexicon-based tool AFINN together with a Random Forest prediction model gives the best classification F1-score of 0.95. The system can be customized to the required aspects of smartphones, and reviews can be classified accordingly.
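To make the aspect-level analysis above concrete, the sketch below scores each phone aspect by summing integer word valences in the sentences that mention it. The valence table is a small hypothetical stand-in in the style of AFINN (integers between -5 and +5), and the aspect keyword lists are assumptions. The Random Forest stage and the six-category mapping are not reproduced, since the abstract gives no thresholds for them.

```python
import re

# Hypothetical AFINN-style integer valences, for illustration only.
TOY_VALENCES = {"amazing": 4, "good": 3, "great": 3,
                "bad": -3, "terrible": -3, "poor": -2}

# Assumed keyword lists for three smartphone aspects.
ASPECTS = {
    "battery": ["battery"],
    "camera": ["camera", "photos"],
    "screen": ["screen", "display"],
}

def aspect_scores(review):
    """Sum sentence valences per aspect mentioned in that sentence."""
    scores = {}
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = re.findall(r"[a-z']+", sentence)
        valence = sum(TOY_VALENCES.get(w, 0) for w in words)
        for aspect, keywords in ASPECTS.items():
            if any(k in words for k in keywords):
                scores[aspect] = scores.get(aspect, 0) + valence
    return scores
```

Per-aspect scores of this kind could then serve as features for a classifier, which is one plausible reading of how the lexicon-based and machine learning parts of the hybrid fit together.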
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018YFC0830900, the National Natural Science Foundation of China under Grant No. 62076068, and the Shanghai Municipal Science and Technology Project under Grant No. 21511102800.
Abstract: Inspired by the concept of content-addressable retrieval from cognitive science, we propose a novel fragment-based Chinese named entity recognition (NER) model augmented with a lexicon-based memory, in which character-level and word-level features are combined to generate better feature representations for possible entity names. Observing that the boundary information of entity names is particularly useful for locating them and classifying them into predefined categories, we introduce position-dependent features, such as prefix and suffix, and take them into account for NER tasks in the form of distributed representations. The lexicon-based memory is built to help generate such position-dependent features and to deal with the problem of out-of-vocabulary words. Experimental results show that the proposed model, called LEMON, achieved state-of-the-art performance, with an increase in F1-score of up to 3.2% over the state-of-the-art models on four different widely used NER datasets.
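The fragment-based view described above can be illustrated by enumerating every character span up to a maximum length and looking each one up in a lexicon, recording its prefix and suffix characters. This is only a sketch of candidate generation. The function name, the `max_len` parameter, and the example lexicon are hypothetical, and the learned distributed representations and memory mechanism of LEMON are not modeled.

```python
def candidate_fragments(sentence, lexicon, max_len=4):
    """Enumerate character spans and attach position-dependent features."""
    candidates = []
    for start in range(len(sentence)):
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 1, last_end + 1):
            fragment = sentence[start:end]
            candidates.append({
                "span": (start, end),          # half-open character offsets
                "text": fragment,
                "prefix": fragment[0],         # first character of the fragment
                "suffix": fragment[-1],        # last character of the fragment
                "in_lexicon": fragment in lexicon,
            })
    return candidates
```

For the sentence 南京市长江大桥, overlapping fragments such as 南京市, 市长, and 长江大桥 are all lexicon hits, so a downstream model has to decide which candidates are real entity names. Boundary features like prefix and suffix are intended to help with exactly that decision.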