43 articles found
An Ensemble-Based Hotel Reviews System Using Naive Bayes Classifier
1
Authors: Joseph Bamidele Awotunde, Sanjay Misra, Vikash Katta, Oluwafemi Charles Adebayo. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 10, pp. 131-154 (24 pages)
The task of classifying opinions conveyed in any form of text online is referred to as sentiment analysis. The emergence and spread of social media have made room for sentiment analysis in our daily lives. Social media applications and websites have become the foremost source of data for sentiment reviews in various fields. A wide range of subject matter can be encountered on social media platforms, such as movie and product reviews, consumer opinions, and testimonies, all of which can be used for sentiment analysis. Rapidly uncovering such web content brings many benefits, of which profit-making is among the most vital. According to a recent study, 81% of consumers conduct online research prior to making a purchase. But the reviews available online are too large and numerous for people to process and analyze. Hence, machine learning classifiers are among the prominent tools used to classify sentiment and extract valuable information for companies such as hotels and game companies. Understanding people's sentiments towards different commodities helps to improve services for contextual promotions, referral systems, and market research. Therefore, this study proposes a sentiment-based detection framework to enable the rapid uncovering of opinionated content in hotel reviews. A Naive Bayes classifier was used to process and analyze the dataset to detect the polarity of the words. The dataset from Datafiniti's Business Database, obtained from Kaggle, was used for the experiments. The performance evaluation of the model shows a test accuracy of 96.08%, an F1-score of 96.00%, a precision of 96.00%, and a recall of 96.00%. The results were compared with state-of-the-art classifiers and showed promising, considerably better performance on these metrics.
Keywords: sentiment analysis; hotel reviews; naive Bayes algorithm; consumer opinions; web 2.0; machine learning
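The four figures reported in the abstract above (accuracy, precision, recall, F1-score) can be made concrete with a small sketch. This is not the authors' evaluation code: the function name `binary_metrics` and the toy labels are hypothetical, and the sketch assumes a binary positive/negative polarity task.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall can diverge sharply on imbalanced review sets, which is one reason to report all four figures rather than accuracy alone.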
Situation assessment for air combat based on novel semi-supervised naive Bayes (cited by: 13)
2
Authors: XU Ximeng, YANG Rennong, FU Ying. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2018, No. 4, pp. 768-779 (12 pages)
A method is proposed to resolve the typical problem of air combat situation assessment. Taking one-to-one air combat as an example, and on the basis of air combat data recorded by the air combat maneuvering instrument, the problem of air combat situation assessment is treated as the situation classification problem of air combat data. The fuzzy C-means clustering algorithm is used to cluster the selected air combat sample data, and the situation classification of the data is determined by data correlation analysis in combination with the clustering results and the pilots' description of the air combat process. On the basis of the semi-supervised naive Bayes classifier, an improved algorithm based on data classification confidence is proposed, through which the situation classification of air combat data is carried out. The simulation results show that the improved algorithm can assess the air combat situation effectively, and that the improvement raises classification performance without significantly affecting the efficiency of the classifier.
Keywords: air combat situation assessment; air combat maneuvering instrument; semi-supervised naive Bayes
A Feature Weighted Mixed Naive Bayes Model for Monitoring Anomalies in the Fan System of a Thermal Power Plant (cited by: 1)
3
Authors: Min Wang, Li Sheng, Donghua Zhou, Maoyin Chen. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, No. 4, pp. 719-727 (9 pages)
With increasing intelligence and integration, a great number of two-valued variables (generally stored as 0 or 1) often exist in large-scale industrial processes. However, these variables cannot be handled effectively by traditional monitoring methods such as linear discriminant analysis (LDA), principal component analysis (PCA), and partial least squares (PLS) analysis. Recently, a mixed hidden naive Bayesian model (MHNBM) was developed for the first time to use both two-valued and continuous variables for abnormality monitoring. Although the MHNBM is effective, it still has some shortcomings. In the MHNBM, variables with greater correlation to other variables have greater weights, which cannot guarantee that greater weights are assigned to the more discriminating variables. In addition, the conditional probability P(x_j | x_j′, y = k) must be computed from historical data. When training data are scarce, the conditional probability between continuous variables tends to be uniformly distributed, which degrades the performance of the MHNBM. Here a novel feature weighted mixed naive Bayes model (FWMNBM) is developed to overcome these shortcomings. In the FWMNBM, the variables that are more correlated with the class have greater weights, so the more discriminating variables contribute more to the model. At the same time, the FWMNBM does not have to calculate the conditional probability between variables, so it is less restricted by the number of training samples. Compared with the MHNBM, the FWMNBM performs better, and its effectiveness is validated through a numerical simulation example and a practical case from the Zhoushan thermal power plant (ZTPP), China.
Keywords: abnormality monitoring; continuous variables; feature weighted mixed naive Bayes model (FWMNBM); two-valued variables; thermal power plant
Spam Filtering: Online Naive Bayes Based on TONE (cited by: 1)
4
Authors: Guanglu Sun, Hongyue Sun, Yingcai Ma, Yuewu Shen. ZTE Communications, 2013, No. 2, pp. 51-54 (4 pages)
The naive Bayes (NB) model has been successfully used to tackle spam, and is very accurate. However, there is still room for improvement. We use a train on or near error (TONE) method in online NB to enhance the performance of NB and reduce the number of training emails. We conducted an experiment to determine the performance of the improved algorithm by plotting (1-ROCA)% curves. The results show that the proposed method improves the performance of the original NB.
Keywords: spam filtering; online naive Bayes; train on or near error
Social Network Rumor Recognition Based on Enhanced Naive Bayes (cited by: 1)
5
Authors: Lei Guo. Journal of New Media, 2021, No. 3, pp. 99-107 (9 pages)
In recent years, with the increasing popularity of social networks, rumors have become more common. At present, rumors in social networks are handled mainly through media censorship and manual reporting, but this approach requires a lot of manpower and material resources, and its cost is relatively high. Therefore, research on the characteristics of rumors and on the automatic identification and classification of network message text is of great significance. This paper uses the Naive Bayes algorithm combined with Laplacian smoothing to identify rumors in social network texts. First, the text is segmented, and stop words are removed once word segmentation is complete. Because Naive Bayes is sensitive to its input data, this paper applies text preprocessing to the input. Then a naive Bayes classifier is constructed, and the Laplacian smoothing method is introduced to address the zero-probability problem that arises when the naive Bayes model estimates probabilities in rumor recognition. Finally, experiments show that the Naive Bayes algorithm combined with Laplace smoothing can effectively improve the accuracy of rumor recognition.
Keywords: rumor recognition; social network; machine learning; naive Bayes; Laplacian smoothing
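The zero-probability problem mentioned in the abstract above can be illustrated with a minimal sketch of a multinomial naive Bayes classifier with add-alpha (Laplace) smoothing. This is not the paper's implementation: the function names `train_nb` and `predict_nb`, the toy documents, and the choice to skip out-of-vocabulary tokens are all assumptions made for illustration, and the input is assumed to be already segmented with stop words removed.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, alpha=1.0):
    """Count classes and words. docs is a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict_nb(model, tokens):
    """Return the label with the highest smoothed log-posterior."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            if tok not in model["vocab"]:
                continue  # assumption: ignore words never seen in training
            # Smoothed likelihood: a word unseen in this class still gets
            # alpha / (total_words + alpha * vocab_size) instead of zero.
            score += math.log((counts[tok] + alpha)
                              / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, already-tokenized toy messages.
docs = [
    (["shocking", "secret", "cure"], "rumor"),
    (["secret", "miracle", "cure"], "rumor"),
    (["weather", "report", "today"], "normal"),
    (["city", "council", "report"], "normal"),
]
model = train_nb(docs)
print(predict_nb(model, ["secret", "report", "cure"]))
```

In the last call, "report" never occurs in a rumor message; without smoothing its likelihood under the rumor class would be zero and would veto the two words that do point to a rumor.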
Mobile SMS Spam Filtering for Nepali Text Using Naive Bayesian and Support Vector Machine (cited by: 2)
6
Authors: Tej Bahadur Shahi, Abhimanu Yadav. International Journal of Intelligence Science, 2014, No. 1, pp. 24-28 (5 pages)
Spam is a universal problem with which everyone is familiar. A number of approaches are used for Spam filtering. The most common technique is content-based filtering, which uses the actual text of a message to determine whether it is Spam or not. The content is very dynamic, and it is very challenging to represent all of its information in a mathematical model of classification. For instance, in content-based Spam filtering, the characteristics used by the filter to identify Spam messages change constantly over time. The Naïve Bayes method represents the changing nature of messages using probability theory, and the support vector machine (SVM) represents it using different features. These two classification methods are efficient in different domains, but the case of Nepali SMS or text classification has not yet been considered, so it is interesting to find out how both methods perform on Nepali text classification. In this paper, Naïve Bayes and SVM-based classification techniques are implemented to classify Nepali SMS as Spam and non-Spam. An empirical analysis of various text cases has been carried out to evaluate the accuracy of the classification methodologies used in this study. The accuracy is found to be 87.15% for SVM and 92.74% for Naïve Bayes.
Keywords: SMS spam filtering; classification; support vector machine; naive Bayes; preprocessing; feature extraction; Nepali SMS datasets
Ensemble Variable Selection for Naive Bayes to Improve Customer Behaviour Analysis
7
Authors: R. Siva Subramanian, D. Prabha. Computer Systems Science & Engineering (SCIE, EI), 2022, No. 4, pp. 339-355 (17 pages)
Executing customer analysis in a systematic way is one possible way for an enterprise to understand consumer behavior patterns efficiently and in depth. Further investigation of customer patterns helps the firm make efficient decisions, which in turn helps to optimize the enterprise's business and maximize consumer satisfaction. To conduct an effective assessment of customers, Naive Bayes (also called Simple Bayes), a machine learning model, is used. However, the efficacy of the simple Bayes model relies entirely on the consumer data used, and uncertain and redundant attributes in that data lead the model to poor predictions because of its assumption about the attributes. In practice, the NB premise does not hold in consumer data, and these redundant attributes lead the simple Bayes model to poor prediction results. In this work, an ensemble attribute selection methodology is used to overcome this problem and to pick a stable, uncorrelated attribute set to model with the NB classifier. In ensemble variable selection, two different strategies are applied: one is based on data perturbation (a homogeneous ensemble, in which the same feature selector is applied to different subsamples derived from the same learning set) and the other is based on function perturbation (a heterogeneous ensemble, in which different feature selectors are applied to the same learning set). Furthermore, the feature sets obtained from both ensemble strategies are applied to NB individually and the outcomes are computed. Finally, the experimental outcomes show that the proposed ensemble strategies perform efficiently in choosing a stable attribute set and in increasing NB classification performance.
Keywords: naive Bayes or simple Bayes; variable selection; homogeneous ensemble; heterogeneous ensemble; customer prediction
Improving naive Bayes classifier by dividing its decision regions (cited by: 3)
8
Authors: Zhi-yong YAN, Gong-fu XU, Yun-he PAN. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2011, No. 8, pp. 647-657 (11 pages)
Classification can be regarded as dividing the data space into decision regions separated by decision boundaries. In this paper we analyze decision tree algorithms and the NBTree algorithm from this perspective. Thus, a decision tree can be regarded as a classifier tree, in which each classifier on a non-root node is trained in decision regions of the classifier on the parent node. Meanwhile, the NBTree algorithm, which generates a classifier tree with the C4.5 algorithm and the naive Bayes classifier as the root and leaf classifiers respectively, can also be regarded as training naive Bayes classifiers in decision regions of the C4.5 algorithm. We propose a second division (SD) algorithm and three soft second division (SD-soft) algorithms to train classifiers in decision regions of the naive Bayes classifier. These four novel algorithms all generate two-level classifier trees with the naive Bayes classifier as the root classifier. The SD and three SD-soft algorithms can make good use of both the information contained in instances near decision boundaries and the information that may be ignored by the naive Bayes classifier. Finally, we conduct experiments on 30 data sets from the UC Irvine (UCI) repository. Experimental results show that the SD algorithm can obtain better generalization abilities than the NBTree and the averaged one-dependence estimators (AODE) algorithms when using the C4.5 algorithm and support vector machine (SVM) as leaf classifiers. Further experiments indicate that our three SD-soft algorithms can achieve better generalization abilities than the SD algorithm when argument values are selected appropriately.
Keywords: naive Bayes classifier; decision region; NBTree; C4.5 algorithm; support vector machine (SVM)
Naive Bayes for value difference metric (cited by: 3)
9
Authors: Chaoqun LI, Liangxiao JIANG, Hongwei LI. Frontiers of Computer Science (SCIE, EI, CSCD), 2014, No. 2, pp. 255-264 (10 pages)
The value difference metric (VDM) is one of the best-known and most widely used distance functions for nominal attributes. This work applies the instance weighting technique to improve VDM. An instance weighted value difference metric (IWVDM) is proposed here. Different from prior work, IWVDM uses naive Bayes (NB) to find weights for training instances. Because early work has shown that there is a close relationship between VDM and NB, some work on NB can be applied to VDM. The weight of a training instance x that belongs to the class c is assigned according to the difference between the conditional probability P(c|x) estimated by NB and the true conditional probability P(c|x), and the weight is adjusted iteratively. Compared with previous work, IWVDM has the advantage of reducing the time complexity of the process of finding weights while simultaneously improving the performance of VDM. Experimental results on 36 UCI datasets validate the effectiveness of IWVDM.
Keywords: value difference metric; instance weighting; naive Bayes; distance-based learning algorithms
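As background for the abstract above, the plain (unweighted) VDM can be sketched in a few lines. The sketch uses the commonly cited simplified form, in which the distance between two values a and b of one nominal attribute is the sum over classes c of |P(c|a) - P(c|b)|^q. It does not implement the instance weighting of IWVDM, and the names `fit_vdm` and `vdm` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_vdm(values, labels):
    """Estimate P(class | attribute value) for one nominal attribute."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    classes = sorted(set(labels))
    probs = {
        value: {c: cnt[c] / sum(cnt.values()) for c in classes}
        for value, cnt in counts.items()
    }
    return probs, classes


def vdm(probs, classes, a, b, q=1):
    """Simplified VDM distance between attribute values a and b."""
    return sum(abs(probs[a][c] - probs[b][c]) ** q for c in classes)


# Hypothetical toy attribute (color) with a binary class label.
values = ["red", "red", "blue", "blue", "green", "green"]
labels = ["yes", "yes", "no", "no", "yes", "no"]
probs, classes = fit_vdm(values, labels)

print(vdm(probs, classes, "red", "blue"))   # values with opposite class profiles
print(vdm(probs, classes, "red", "green"))  # partially overlapping class profiles
```

Two values are close under VDM when they predict similar class distributions, which is the link to naive Bayes that the abstract refers to: both rely on the same per-value class-conditional frequency estimates.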
Naive Bayes Classifier for Debris Flow Disaster Mitigation in Mount Merapi Volcanic Rivers,Indonesia,Using X-band Polarimetric Radar
10
Authors: Ratih Indri Hapsari, Bima Ahida Indaka Sugna, Dandung Novianto, Rosa Andrie Asmara, Satoru Oishi. International Journal of Disaster Risk Science (SCIE, CSCD), 2020, No. 6, pp. 776-789 (14 pages)
Debris flow triggered by rainfall that accompanies a volcanic eruption is a serious secondary impact of a volcanic disaster. The probability of debris flow events can be estimated based on prior information about rainfall from historical and geomorphological data that are presumed to relate to debris flow occurrence. In this study, a debris flow disaster warning system was developed by applying the Naïve Bayes Classifier (NBC). The spatial likelihood of the hazard is evaluated at a small subbasin scale by including high-resolution rainfall measurements from X-band polarimetric weather radar, a topographic factor, and soil type as predictors. The study was conducted in the Gendol River Basin of Mount Merapi, one of the most active volcanoes in Indonesia. Rainfall and debris flow occurrence data were collected for the upper Gendol River from October 2016 to February 2018 and divided into calibration and validation datasets. The NBC was used to estimate the status of debris flow incidences displayed in a susceptibility map based on the posterior probability from the predictors. The system was verified using quantitative dichotomous quality indices along with a contingency table. Using the validation datasets, the advantage of the NBC for estimating debris flow occurrence is confirmed. This work contributes to existing knowledge on estimating debris flow susceptibility through the data mining approach. Despite the existence of predictive uncertainty, the presented system could contribute to the improvement of debris flow countermeasures in volcanic regions.
Keywords: debris flows; Gendol River; Indonesia; Merapi volcano; naive Bayes classifier
A Naive Bayes model on lung adenocarcinoma projection based on tumor microenvironment and weighted gene coexpression network analysis
11
Authors: Zhiqiang Ye, Pingping Song, Degao Zheng, Xu Zhang, Jianhong Wu. Infectious Disease Modelling, 2022, No. 3, pp. 498-509 (12 pages)
Based on the lung adenocarcinoma (LUAD) gene expression data from The Cancer Genome Atlas (TCGA) database, the Stromal score, Immune score, and Estimate score in the tumor microenvironment (TME) were computed by the Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data (ESTIMATE) algorithm. Gene modules significantly related to the three scores were identified by weighted gene coexpression network analysis (WGCNA). Based on the correlation coefficients and P values, 899 key genes affecting the tumor microenvironment were obtained by selecting the two most correlated modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis suggested that these key genes were significantly involved in immune-related or cancer-related terms. Through univariate Cox regression and elastic network analysis, genes associated with the prognosis of LUAD patients were screened out, and their prognostic values were further verified by survival analysis and the University of ALabama at Birmingham CANcer (UALCAN) database. The results indicated that eight genes were significantly related to the overall survival of LUAD. Among them, six genes were found to be differentially expressed between tumor and control samples. Immune infiltration analysis further verified that all six genes were significantly related to tumor purity and immune cells. Therefore, these genes were eventually used to construct a Naive Bayes projection model of LUAD. The model was verified by the receiver operating characteristic (ROC) curve, where the area under the curve (AUC) reached 92.03%, which suggested that the model could accurately discriminate tumor samples from normal ones. Our study provided an effective model for LUAD projection which improved the clinical diagnosis and cure of LUAD. The result also confirmed that the six genes used in the model construction could be potential prognostic biomarkers of LUAD.
Keywords: naive Bayes model; tumor microenvironment; lung adenocarcinoma; weighted gene co-expression network analysis; prognostic biomarkers
Machine Learning Models for Heterogenous Network Security Anomaly Detection
12
Authors: Mercy Diligence Ogah, Joe Essien, Martin Ogharandukun, Monday Abdullahi. Journal of Computer and Communications, 2024, No. 6, pp. 38-58 (21 pages)
The increasing amount and intricacy of network traffic in the modern digital era have made it harder to identify abnormal behaviours that may indicate potential security breaches or operational interruptions. Conventional detection approaches struggle to keep up with the ever-changing strategies of cyber-attacks, resulting in heightened susceptibility and significant harm to network infrastructures. To tackle this urgent issue, this project focused on developing an effective anomaly detection system that uses machine learning technology. The suggested model uses contemporary machine learning algorithms and frameworks to autonomously detect deviations from typical network behaviour. It promptly identifies anomalous activities that may indicate security breaches or performance difficulties. The solution entails a multi-faceted approach encompassing data collection, preprocessing, feature engineering, model training, and evaluation. The model is trained on a wide range of datasets that include both regular and abnormal network traffic patterns, so that it can adapt to numerous scenarios. The main priority is to ensure that the system is functional and efficient, with a particular emphasis on reducing false positives to avoid unwanted alerts. Additionally, efforts are directed toward improving anomaly detection accuracy so that the model can consistently distinguish between potentially harmful and benign activity. This project aims to greatly strengthen network security by addressing emerging cyber threats and improving network resilience and reliability.
Keywords: cyber-security; network anomaly detection; machine learning; random forest; decision tree; Gaussian naive Bayes
Film and Television Website Scores Authenticity Verification Based on the Emotional Analysis
13
Authors: Weiyu Tong. Journal of Computer and Communications, 2024, No. 2, pp. 231-245 (15 pages)
Sentiment analysis is a method to identify and understand the emotion in text through NLP and text analysis. In the era of information technology, there is often a certain discrepancy between the comments on a movie website and the actual score of the movie, and sentiment analysis technology provides a new way to solve this problem. In this paper, Python is used to obtain movie review data from the Douban platform, and models are constructed and trained using naive Bayes and Bi-LSTM. According to the evaluation indices, the better-performing Bi-LSTM model is selected to classify the emotion of users' movie reviews; the reviews are then scored according to the classification results and compared with the real ratings on the website. The feasibility of this technology for scoring film reviews is verified from the error of the final comparison. By applying this technology, the phenomenon of film rating distortion in the information age can be prevented and the rights and interests of film and television works can be safeguarded.
Keywords: Bi-LSTM model; film review; emotion analysis; naive Bayes; Python; data crawl
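Software Component Classification Based on Bayes Networks (基于Bayes网的软件构件分类)
14
Authors: Bai Chenggang (白成刚). Computer Engineering and Applications (《计算机工程与应用》) (CSCD, Peking University Core Journal), 2005, No. 33, pp. 17-19 (3 pages)
Classifying software components helps people develop high-quality software. Naive Bayes networks have been applied successfully to classification. However, they rest on a basic assumption: the feature nodes must be conditionally independent. Unfortunately, this rarely holds in the real world. This paper uses principal component analysis to reduce the correlation among feature nodes, extends the range of application of naive Bayes networks, and applies the result to the classification of software components. A case analysis shows that the new Bayes classification network achieves higher prediction accuracy than the ordinary naive Bayes network.
Keywords: software component; naive Bayes; classifier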
Sentiment Analysis with Tweets Behaviour in Twitter Streaming API (cited by: 1)
15
Authors: Kuldeep Chouhan, Mukesh Yadav, Ranjeet Kumar Rout, Kshira Sagar Sahoo, NZ Jhanjhi, Mehedi Masud, Sultan Aljahdali. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 5, pp. 1113-1128 (16 pages)
Twitter is a vibrant platform that offers a quick and effective way to analyze users' perceptions of activities on social media. Many researchers and industry experts turn to Twitter sentiment analysis to recognize stakeholder groups. Sentiment analysis needs advanced approaches, including the adoption of data sentiment analysis and various machine learning tools. This work assesses sentiment analysis in multiple fields, and how sentiment affects their standing among people in real time, using Naive Bayes and Support Vector Machine (SVM). The paper focuses on analysing distinct sentiment techniques on tweet behaviour datasets for various spheres such as healthcare and behaviour estimation. In addition, the results explore and validate the statistical machine learning classifiers and report the accuracy percentages attained for positive, negative, and neutral tweets. In this work, we used a Twitter Application Programming Interface (API) account and programmed in Python a sentiment analysis approach for the computational measurement of users' perceptions; it extracts a massive number of tweets and provides market value to the Twitter account proprietor. To distinguish the results in terms of performance evaluation, an error analysis investigates the features relevant to various stakeholders, comprising social media analytics researchers, Natural Language Processing (NLP) developers, engineering managers, and experts involved in decision-making.
Keywords: machine learning; naive Bayes; natural language processing; sentiment analysis; social media analytics; support vector machine; Twitter application programming interface
Multi-Tier Sentiment Analysis of Social Media Text Using Supervised Machine Learning
16
Authors: Hameedur Rahman, Junaid Tariq, M. Ali Masood, Ahmad F. Subahi, Osamah Ibrahim Khalaf, Youseef Alotaibi. Computers, Materials & Continua (SCIE, EI), 2023, No. 3, pp. 5527-5543 (17 pages)
Sentiment Analysis (SA) is often referred to as opinion mining. It is defined as the extraction, identification, or characterization of the sentiment from text. Generally, the sentiment of a textual document is classified into binary classes, i.e., positive and negative. However, fine-grained classification provides better insight into the sentiments. The downside is that fine-grained classification is more challenging than binary classification, and performance deteriorates significantly in the case of multi-class classification. In this study, pre-processing techniques and machine learning models for the multi-class classification of sentiments were explored. To improve performance, a multi-layer classification model has been proposed. Owing to its similarity to social media text, the movie reviews dataset has been used for the implementation. Supervised machine learning models, namely Decision Tree, Support Vector Machine, and Naive Bayes, have been implemented for the task of sentiment classification. We have compared single-layer models with the multi-tier model. The results of the multi-tier model show a slight improvement over the single-layer architecture. Moreover, multi-tier models have better recall, which allows our proposed model to learn more context. We have discussed certain shortcomings of the model that will help researchers design multi-tier models with more contextual information.
Keywords: sentiment analysis; machine learning; multi-class classification; SVM; decision tree; naive Bayes
P&T-Inf: A Result Inference Method for Context-Sensitive Tasks in Crowdsourcing
17
Authors: Zhifang Liao, Hao Gu, Shichao Zhang, Ronghui Mo, Yan Zhang. Intelligent Automation & Soft Computing (SCIE), 2023, No. 7, pp. 599-618 (20 pages)
Context-Sensitive Task (CST) is a complex task type in crowdsourcing; examples include handwriting recognition, route planning, and audio transcription. Current result inference algorithms perform well on simple crowdsourcing tasks, but cannot obtain high-quality inference results for CSTs. The conventional way to solve a CST is to divide it into multiple independent simple subtasks for crowdsourcing, but this method ignores the context correlation among subtasks and reduces the quality of result inference. To solve this problem, we propose a result inference algorithm for CSTs based on the Partially ordered set and Tree augmented naive Bayes Infer (P&T-Inf). First, we screen the candidate results of context-sensitive tasks based on the partially ordered set. If there are parallel candidate sets, the conditional mutual information among subtasks is calculated from the context information in external knowledge (such as the Google n-gram corpus, the American Contemporary English corpus, etc.). Combined with the tree augmented naive Bayes (TAN) model, the maximum weighted spanning tree is used to model the dependencies among subtasks in each CST. We collect two crowdsourcing datasets, of handwriting recognition tasks and audio transcription tasks, from a real crowdsourcing platform. The experimental results show that our approach improves the quality of result inference in CSTs and reduces the time cost compared with the latest methods.
Keywords: crowdsourcing; result inference; tree augmented naive Bayes; context-sensitive
Bug Prioritization Using Average One Dependence Estimator
18
Authors: Kashif Saleem, Rashid Naseem, Khalil Khan, Siraj Muhammad, Ikram Syed, Jaehyuk Choi. Intelligent Automation & Soft Computing (SCIE), 2023, No. 6, pp. 3517-3533 (17 pages)
Automation software needs to be continuously updated by addressing the software bugs recorded in its repositories. However, bugs have different levels of importance; hence, it is essential to prioritize bug reports based on their severity and importance. Manually managing the deluge of incoming bug reports is limited by the development team's time and resources and delays the resolution of critical bugs. Therefore, bug report prioritization is vital. This study proposes a new model for bug prioritization based on the average one dependence estimator; it prioritizes bug reports based on severity, which is determined by the number of attributes: the greater the number of attributes, the greater the severity. The proposed model is evaluated using precision, recall, F1-score, accuracy, G-Measure, and Matthews correlation coefficient. Results of the proposed model are compared with those of the support vector machine (SVM) and Naive Bayes (NB) models. Eclipse and Mozilla datasets were used as the sources of bug reports. The proposed model improved bug repository management and outperformed the SVM and NB models. Additionally, the proposed model uses a weaker attribute independence assumption than the other models, thereby improving prediction accuracy at minimal computational cost.
Keywords: bug report triaging; prioritization; support vector machine; naive Bayes
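An Experimental Study of Text Classification on the Weka Platform (基于Weka平台的文本分类实验研究) (cited by: 1)
19
Authors: Li Mei (李梅). Journal of Chuxiong Normal University (《楚雄师范学院学报》), 2020, No. 3, pp. 115-119 (5 pages)
The J48, Naive Bayes Multinomial, and SMO algorithms are commonly used for text classification. Classification experiments were run on the Weka platform using a Reuters dataset. The results of six experiments were analyzed using precision, recall, and the combined F-Measure, together with other text classification evaluation metrics, leading to the conclusion that SMO outperforms the other two algorithms. For the Naive Bayes Multinomial algorithm, the numToSelect value was adjusted to optimize its results. The experiments are intended as a reference for text classification research.
Keywords: text classification; J48 algorithm; Naive Bayes Multinomial algorithm; SMO algorithm; Weka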
Comparison of Text Categorization Algorithms (cited by: 4)
20
Authors: SHI Yong-feng, ZHAO Yan-ping. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 798-804 (7 pages)
This paper summarizes several automatic text categorization algorithms in common use recently, and analyzes and compares their advantages and disadvantages. It provides clues for choosing appropriate automatic classification algorithms in different fields. Finally, some evaluations and summaries of these algorithms are discussed, and directions for further research are pointed out.
CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (70031010) and the Research Foundation of Beijing Institute of Technology. Biography: SHI Yong-feng (1980-), male, Master candidate, research direction: web information mining.
Keywords: text categorization; naive Bayes; KNN; SVM; neural network