3 articles found
1. A Machine Learning-Based Technique with Intelligent WordNet Lemmatize for Twitter Sentiment Analysis
Authors: S. Saranya, G. Usha. Intelligent Automation & Soft Computing (SCIE), 2023, No. 4, pp. 339-352 (14 pages)
Alongside the birth of the Internet, the fast growth of mobile strategies has democratised content production owing to the widespread usage of social media, resulting in an explosion of short informal writings. Twitter is a micro-blogging and social networking service on which millions of short messages are posted. Twitter analysis addresses the topic of interpreting users' tweets in terms of ideas, interests, and views in a range of settings and fields. This type of study can be useful for a variety of academics and applications that need to know people's perspectives on a given topic or event. Although sentiment analysis of these texts is useful for a variety of reasons, it is typically seen as a difficult undertaking because these messages are frequently short, informal, noisy, and rich in linguistic ambiguities such as polysemy. Furthermore, most contemporary sentiment analysis algorithms are based on clean data. In this paper, we offer a machine-learning-based sentiment analysis method that extracts features using Term Frequency and Inverse Document Frequency (TF-IDF) and applies deep intelligent WordNet lemmatization to improve the quality of tweets by removing noise. We also utilise a Random Forest classifier to detect the emotion of a tweet. To validate the performance of the proposed approach, we conduct extensive tests on publicly accessible datasets, and the findings reveal that the suggested technique performs significantly better at sentiment classification on multi-class emotion text data.
Keywords: Random Forest; sentiment analysis; social media; term frequency and inverse document frequency; Twitter; WordNet lemmatize
2. An improved TF-IDF approach for text classification (cited by: 5)
Authors: 张云涛, 龚玲, 王永成. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2005, No. 1, pp. 49-55 (7 pages)
This paper presents a new improved term frequency/inverse document frequency (TF-IDF) approach which uses confidence, support and characteristic words to enhance the recall and precision of text classification. Synonyms defined by a lexicon are processed in the improved TF-IDF approach. We discuss and analyze in detail the relationship among confidence, recall and precision. Experiments based on science and technology gave promising results, showing that the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
Keywords: term frequency/inverse document frequency (TF-IDF); text classification; confidence; support; characteristic words
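For orientation, the conventional TF-IDF weighting that this entry uses as its baseline can be sketched as follows. This is a minimal illustration of one common textbook variant (relative term frequency times the natural log of inverse document frequency), not the improved method from the paper; the function name `tf_idf` and the toy documents are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus, invented for illustration only.
docs = [
    "text classification with term weights".split(),
    "term frequency and document frequency".split(),
    "text mining of biomedical literature".split(),
]
weights = tf_idf(docs)
```

In this variant a term that occurs in every document gets weight zero, which is why TF-IDF favours terms that are frequent in one document but rare across the collection.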
3. Web-Based Biomedical Literature Mining
Authors: 安建福, 薛惠平, 陈瑛, 吴建国, 章鲁. Journal of Shanghai Jiaotong University (Science) (EI), 2012, No. 4, pp. 494-499 (6 pages)
With an upsurge in biomedical literature, using data-mining methods to search for new knowledge in the literature has drawn more attention from scholars. In this study, taking the mining of non-coding gene literature from the network database of PubMed as an example, we first preprocessed the abstract data, next applied the term occurrence frequency (TF) and inverse document frequency (IDF) (TF-IDF) method to select features, and then established a biomedical literature data-mining model based on a Bayesian algorithm. Finally, we assessed the model through the area under the receiver operating characteristic curve (AUC), accuracy, specificity, sensitivity, precision rate and recall rate. When 1000 features are selected, AUC, specificity, sensitivity, accuracy rate, precision rate and recall rate are 0.8683, 84.63%, 89.02%, 86.83%, 89.02% and 98.14%, respectively. These results indicate that our method can effectively identify the targeted literature related to a particular topic.
Keywords: Bayesian algorithm; term occurrence frequency (TF) and inverse document frequency (IDF) (TF-IDF); data mining
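The threshold-based measures named in this abstract can be sketched from a binary confusion matrix as below. This uses the standard textbook definitions, under which sensitivity and recall are the same quantity; the abstract reports them as separate figures, so the paper may define one of them differently, and this sketch does not attempt to reproduce its numbers. The function name `binary_metrics` and the counts are hypothetical, and AUC is omitted because it needs ranked scores, not counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard measures from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives and negatives.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for illustration; not taken from the paper.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```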