Journal articles: 52 results found
Parallel naive Bayes algorithm for large-scale Chinese text classification based on Spark (cited by: 22)
1
Authors: LIU Peng, ZHAO Hui-han, TENG Jia-yu, YANG Yan-yan, LIU Ya-feng, ZHU Zong-wei. Journal of Central South University (SCIE, EI, CAS, CSCD), 2019, No. 1, pp. 1-12 (12 pages)
The sharp increase in the amount of Chinese text data on the Internet has significantly prolonged the time needed to classify these data. To solve this problem, this paper proposes and implements a parallel naive Bayes algorithm (PNBA) for Chinese text classification based on Spark, a parallel in-memory computing platform for big data. The algorithm parallelizes the entire training and prediction process of the naive Bayes classifier, mainly by adopting the resilient distributed dataset (RDD) programming model. For comparison, a PNBA based on Hadoop is also implemented. The test results show that, in the same computing environment and on the same text sets, the Spark PNBA clearly outperforms the Hadoop PNBA on key indicators such as speedup ratio and scalability. Therefore, Spark-based parallel algorithms can better meet the requirements of large-scale Chinese text data mining.
Keywords: Chinese text classification; naive Bayes; Spark; Hadoop; resilient distributed dataset; parallelization
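The parallel training described in the entry above rests on a map-then-reduce pattern: each data partition is counted independently, and the partial counts are merged. The sketch below illustrates only that pattern in plain Python. It does not use Spark or RDDs and is not the paper's implementation; the function names `count_partition` and `merge_counts` and the toy partitions are assumptions made for illustration.

```python
from collections import Counter
from functools import reduce


def count_partition(docs):
    """Map step: count (label, word) pairs and label frequencies in one partition."""
    word_counts, label_counts = Counter(), Counter()
    for label, words in docs:
        label_counts[label] += 1
        for word in words:
            word_counts[(label, word)] += 1
    return word_counts, label_counts


def merge_counts(left, right):
    """Reduce step: add two partial (word_counts, label_counts) pairs."""
    return left[0] + right[0], left[1] + right[1]


# Two hypothetical partitions of a labeled, already segmented corpus.
partitions = [
    [("sports", ["ball", "goal"]), ("tech", ["chip", "code"])],
    [("sports", ["goal", "team"]), ("tech", ["code", "data"])],
]

# Each partition could be counted on a different worker; here map() runs serially.
word_counts, label_counts = reduce(merge_counts, map(count_partition, partitions))
```

Because the merge is a plain addition of counters, it is associative and commutative, which is what allows the per-partition counts to be combined in any order.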
Situation assessment for air combat based on novel semi-supervised naive Bayes (cited by: 15)
2
Authors: XU Ximeng, YANG Rennong, FU Ying. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2018, No. 4, pp. 768-779 (12 pages)
A method is proposed to resolve the typical problem of air combat situation assessment. Taking one-to-one air combat as an example, and on the basis of air combat data recorded by the air combat maneuvering instrument, the problem of air combat situation assessment is treated as equivalent to the situation classification problem of air combat data. The fuzzy C-means clustering algorithm is used to cluster the selected air combat sample data, and the situation classification of the data is determined by data correlation analysis in combination with the clustering results and the pilots' description of the air combat process. On the basis of the semi-supervised naive Bayes classifier, an improved algorithm based on data classification confidence is proposed, through which the situation classification of air combat data is carried out. The simulation results show that the improved algorithm can assess the air combat situation effectively, and that the improvement raises classification performance without significantly affecting the efficiency of the classifier.
Keywords: air combat situation assessment; air combat maneuvering instrument; semi-supervised; naive Bayes
A Feature Weighted Mixed Naive Bayes Model for Monitoring Anomalies in the Fan System of a Thermal Power Plant (cited by: 3)
3
Authors: Min Wang, Li Sheng, Donghua Zhou, Maoyin Chen. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, No. 4, pp. 719-727 (9 pages)
With increasing intelligence and integration, a great number of two-valued variables (generally stored as 0 or 1) often exist in large-scale industrial processes. However, these variables cannot be effectively handled by traditional monitoring methods such as linear discriminant analysis (LDA), principal component analysis (PCA), and partial least squares (PLS) analysis. Recently, a mixed hidden naive Bayesian model (MHNBM) was developed for the first time to use both two-valued and continuous variables for abnormality monitoring. Although the MHNBM is effective, it still has some shortcomings. In the MHNBM, variables with greater correlation to other variables have greater weights, which cannot guarantee that greater weights are assigned to the more discriminating variables. In addition, the conditional probability P(x_j | x_j′, y = k) must be computed from historical data. When training data are scarce, the conditional probability between continuous variables tends to be uniformly distributed, which degrades the performance of the MHNBM. Here, a novel feature weighted mixed naive Bayes model (FWMNBM) is developed to overcome these shortcomings. In the FWMNBM, variables that are more correlated with the class have greater weights, so the more discriminating variables contribute more to the model. At the same time, the FWMNBM does not have to calculate the conditional probability between variables, so it is less restricted by the number of training samples. Compared with the MHNBM, the FWMNBM performs better, and its effectiveness is validated through numerical cases of a simulation example and a practical case of the Zhoushan thermal power plant (ZTPP), China.
Keywords: abnormality monitoring; continuous variables; feature weighted mixed naive Bayes model (FWMNBM); two-valued variables; thermal power plant
Spam Filtering: Online Naive Bayes Based on TONE (cited by: 1)
4
Authors: Guanglu Sun, Hongyue Sun, Yingcai Ma, Yuewu Shen. ZTE Communications, 2013, No. 2, pp. 51-54 (4 pages)
The naive Bayes (NB) model has been successfully used to tackle spam, and is very accurate. However, there is still room for improvement. We use a train on or near error (TONE) method in online NB to enhance the performance of NB and reduce the number of training emails. We conducted an experiment to determine the performance of the improved algorithm by plotting (1-ROCA)% curves. The results show that the proposed method improves the performance of the original NB.
Keywords: spam filtering; online naive Bayes; train on or near error
Social Network Rumor Recognition Based on Enhanced Naive Bayes (cited by: 1)
5
Authors: Lei Guo. Journal of New Media, 2021, No. 3, pp. 99-107 (9 pages)
In recent years, with the increasing popularity of social networks, rumors have become more common. At present, rumors in social networks are handled mainly through media censorship and manual reporting, but this approach requires a lot of manpower and material resources, and its cost is relatively high. Therefore, research on the characteristics of rumors and on the automatic identification and classification of network message text is of great significance. This paper uses the Naive Bayes algorithm combined with Laplacian smoothing to identify rumors in social network texts. Because Naive Bayes is sensitive to its input data, the text is first preprocessed: it is segmented into words, and stop words are removed once segmentation is complete. Then a naive Bayes classifier is constructed, and Laplacian smoothing is introduced to solve the zero-probability problem that arises when the naive Bayes model is used for rumor recognition. Finally, experiments show that the Naive Bayes algorithm combined with Laplace smoothing can effectively improve the accuracy of rumor recognition.
Keywords: rumor recognition; social network; machine learning; naive Bayes; Laplacian smoothing
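The zero-probability problem mentioned in the entry above arises when a word never appears with a class in the training data, which makes the product of conditional probabilities zero. A minimal sketch of a multinomial naive Bayes classifier with add-one (Laplace) smoothing follows. It is a generic illustration, not the paper's code; the toy documents, the labels, and the names `train_nb`, `log_posterior`, and `predict` are assumptions.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, words in docs:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def log_posterior(words, label, model, alpha=1.0):
    """Unnormalized log P(label | words) with additive smoothing (alpha=1 is Laplace)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    total_words = sum(word_counts[label].values())
    score = math.log(label_counts[label] / total_docs)
    for word in words:
        # The +alpha term keeps the probability above zero for unseen words.
        smoothed = (word_counts[label][word] + alpha) / (total_words + alpha * len(vocab))
        score += math.log(smoothed)
    return score


def predict(words, model):
    """Return the label with the highest smoothed log posterior."""
    return max(model[0], key=lambda label: log_posterior(words, label, model))


# Hypothetical, already segmented and stop-word-filtered messages.
docs = [
    ("rumor", ["shocking", "secret", "share"]),
    ("rumor", ["secret", "cure", "share"]),
    ("normal", ["weather", "report", "today"]),
    ("normal", ["match", "report", "today"]),
]
model = train_nb(docs)
label = predict(["secret", "today", "share"], model)
```

Working in log space avoids numeric underflow on long messages, and the smoothing term means a word such as "today", which never occurs in the "rumor" class here, lowers the score instead of zeroing it out.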
Mobile SMS Spam Filtering for Nepali Text Using Naive Bayesian and Support Vector Machine (cited by: 2)
6
Authors: Tej Bahadur Shahi, Abhimanu Yadav. International Journal of Intelligence Science, 2014, No. 1, pp. 24-28 (5 pages)
Spam is a universal problem with which everyone is familiar. A number of approaches are used for spam filtering. The most common technique is content-based filtering, which uses the actual text of a message to determine whether it is spam. The content is very dynamic, and it is very challenging to represent all of the information in a mathematical classification model. For instance, in content-based spam filtering, the characteristics used by the filter to identify spam messages change constantly over time. The Naïve Bayes method represents the changing nature of messages using probability theory, and the support vector machine (SVM) represents it using different features. These two classification methods are efficient in different domains, but the case of Nepali SMS or text classification has not yet been considered, so it is interesting to find out how both methods perform on the problem of Nepali text classification. In this paper, Naïve Bayes and SVM-based classification techniques are implemented to classify Nepali SMS as spam or non-spam. An empirical analysis of various text cases was carried out to evaluate the accuracy of the classification methods used in this study. The accuracy was found to be 87.15% for SVM and 92.74% for Naïve Bayes.
Keywords: SMS spam filtering; classification; support vector machine; naive Bayes; preprocessing; feature extraction; Nepali SMS datasets
An Ensemble-Based Hotel Reviews System Using Naive Bayes Classifier
7
Authors: Joseph Bamidele Awotunde, Sanjay Misra, Vikash Katta, Oluwafemi Charles Adebayo. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 10, pp. 131-154 (24 pages)
The task of classifying opinions conveyed in any form of text online is referred to as sentiment analysis. The emergence and spread of social media usage has given room for sentiment analysis in our daily lives. Social media applications and websites have become the foremost source of review data for sentiment analysis in various fields. Various subject matter can be encountered on social media platforms, such as movie and product reviews, consumer opinions, and testimonies, among others, which can be used for sentiment analysis. The rapid uncovering of this web content brings many benefits, of which profit-making is one of the most vital. According to a recent study, 81% of consumers conduct online research prior to making a purchase. But the reviews available online are too numerous for human brains to process and analyze. Hence, machine learning classifiers are among the prominent tools used to classify sentiment in order to obtain valuable information for companies such as hotels and game companies. Understanding people's sentiments towards different commodities helps to improve services for contextual promotions, referral systems, and market research. Therefore, this study proposes a sentiment-based detection framework to enable the rapid uncovering of opinionated content in hotel reviews. A Naive Bayes classifier was used to process and analyze the dataset to detect the polarity of the words. The dataset from Datafiniti's Business Database, obtained from Kaggle, was used for the experiments in this study. The performance evaluation of the model shows a test accuracy of 96.08%, an F1-score of 96.00%, a precision of 96.00%, and a recall of 96.00%. The results were compared with state-of-the-art classifiers and showed promising, much better performance in terms of the performance metrics.
Keywords: sentiment analysis; hotel reviews; naive Bayes algorithm; consumer opinions; web 2.0; machine learning
Ensemble Variable Selection for Naive Bayes to Improve Customer Behaviour Analysis
8
Authors: R. Siva Subramanian, D. Prabha. Computer Systems Science & Engineering (SCIE, EI), 2022, No. 4, pp. 339-355 (17 pages)
Executing customer analysis in a systematic way is one possible solution for an enterprise to understand consumer behavior patterns efficiently and in depth. Further investigation of customer patterns helps the firm to make efficient decisions, which in turn helps to optimize the enterprise's business and maximize consumer satisfaction. To conduct an effective assessment of customers, Naive Bayes (also called Simple Bayes), a machine learning model, is used. However, the efficacy of the simple Bayes model relies entirely on the consumer data used, and the presence of uncertain and redundant attributes in the consumer data leads the model to poor predictions because of its presumption regarding the attributes. In practice, the NB premise does not hold for consumer data, and analyzing these redundant attributes yields poor prediction results. In this work, an ensemble attribute selection methodology is applied to overcome this problem with consumer data and to pick a stable, uncorrelated attribute set to model with the NB classifier. In ensemble variable selection, two different strategies are applied: one is based on data perturbation (a homogeneous ensemble, in which the same feature selector is applied to different subsamples derived from the same learning set), and the other is based on function perturbation (a heterogeneous ensemble, in which different feature selectors are applied to the same learning set). Furthermore, the feature sets obtained from both ensemble strategies are applied to NB individually, and the outcomes are computed. Finally, the experimental outcomes show that the proposed ensemble strategies are effective in choosing a stable attribute set and in increasing NB classification performance.
Keywords: naive Bayes or simple Bayes; variable selection; homogeneous ensemble; heterogeneous ensemble; customer prediction
Improving naive Bayes classifier by dividing its decision regions (cited by: 3)
9
Authors: Zhi-yong YAN, Gong-fu XU, Yun-he PAN. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2011, No. 8, pp. 647-657 (11 pages)
Classification can be regarded as dividing the data space into decision regions separated by decision boundaries. In this paper we analyze decision tree algorithms and the NBTree algorithm from this perspective. Thus, a decision tree can be regarded as a classifier tree, in which each classifier on a non-root node is trained in decision regions of the classifier on the parent node. Meanwhile, the NBTree algorithm, which generates a classifier tree with the C4.5 algorithm and the naive Bayes classifier as the root and leaf classifiers respectively, can also be regarded as training naive Bayes classifiers in decision regions of the C4.5 algorithm. We propose a second division (SD) algorithm and three soft second division (SD-soft) algorithms to train classifiers in decision regions of the naive Bayes classifier. These four novel algorithms all generate two-level classifier trees with the naive Bayes classifier as the root classifier. The SD and three SD-soft algorithms can make good use of both the information contained in instances near decision boundaries and the information that may be ignored by the naive Bayes classifier. Finally, we conduct experiments on 30 data sets from the UC Irvine (UCI) repository. Experimental results show that the SD algorithm can obtain better generalization abilities than the NBTree and the averaged one-dependence estimators (AODE) algorithms when using the C4.5 algorithm and support vector machine (SVM) as leaf classifiers. Further experiments indicate that our three SD-soft algorithms can achieve better generalization abilities than the SD algorithm when argument values are selected appropriately.
Keywords: naive Bayes classifier; decision region; NBTree; C4.5 algorithm; support vector machine (SVM)
Naive Bayes for value difference metric (cited by: 3)
10
Authors: Chaoqun LI, Liangxiao JIANG, Hongwei LI. Frontiers of Computer Science (SCIE, EI, CSCD), 2014, No. 2, pp. 255-264 (10 pages)
The value difference metric (VDM) is one of the best-known and most widely used distance functions for nominal attributes. This work applies the instance weighting technique to improve VDM. An instance weighted value difference metric (IWVDM) is proposed here. Different from prior work, IWVDM uses naive Bayes (NB) to find weights for training instances. Because early work has shown that there is a close relationship between VDM and NB, some work on NB can be applied to VDM. The weight of a training instance x that belongs to the class c is assigned according to the difference between the conditional probability P(c|x) estimated by NB and the true conditional probability P(c|x), and the weight is adjusted iteratively. Compared with previous work, IWVDM has the advantage of reducing the time complexity of the process of finding weights while simultaneously improving the performance of VDM. Experimental results on 36 UCI datasets validate the effectiveness of IWVDM.
Keywords: value difference metric; instance weighting; naive Bayes; distance-based learning algorithms
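For readers unfamiliar with the value difference metric named in the entry above, the sketch below shows the plain, unweighted VDM for a single nominal attribute: two values are close when they induce similar class distributions. It uses the common textbook form d(a, b) = Σ_c |P(c|a) − P(c|b)|^q and does not implement the instance weighting of IWVDM. The toy data and the names `fit_vdm` and `vdm` are assumptions for illustration.

```python
from collections import Counter, defaultdict


def fit_vdm(values, labels):
    """Estimate P(class | value) for one nominal attribute from training data."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    classes = sorted(set(labels))
    return {
        value: [cnt[c] / sum(cnt.values()) for c in classes]
        for value, cnt in counts.items()
    }


def vdm(a, b, cond_probs, q=1):
    """Distance between attribute values a and b: sum over classes of |P(c|a) - P(c|b)|^q."""
    return sum(abs(pa - pb) ** q for pa, pb in zip(cond_probs[a], cond_probs[b]))


# Hypothetical training column and class labels.
values = ["red", "red", "blue", "blue", "green", "green"]
labels = ["yes", "yes", "no", "no", "yes", "no"]
probs = fit_vdm(values, labels)

d_red_blue = vdm("red", "blue", probs)    # opposite class distributions
d_red_green = vdm("red", "green", probs)  # partially overlapping distributions
```

The conditional probabilities P(c|a) are the same quantities a naive Bayes model estimates from counts, which is the link between VDM and NB that the abstract refers to.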
Naive Bayes Classifier for Debris Flow Disaster Mitigation in Mount Merapi Volcanic Rivers, Indonesia, Using X-band Polarimetric Radar
11
Authors: Ratih Indri Hapsari, Bima Ahida Indaka Sugna, Dandung Novianto, Rosa Andrie Asmara, Satoru Oishi. International Journal of Disaster Risk Science (SCIE, CSCD), 2020, No. 6, pp. 776-789 (14 pages)
Debris flow triggered by rainfall that accompanies a volcanic eruption is a serious secondary impact of a volcanic disaster. The probability of debris flow events can be estimated from prior information on rainfall, drawn from historical and geomorphological data that are presumed to relate to debris flow occurrence. In this study, a debris flow disaster warning system was developed by applying the Naïve Bayes Classifier (NBC). The spatial likelihood of the hazard is evaluated at a small subbasin scale by including high-resolution rainfall measurements from X-band polarimetric weather radar, a topographic factor, and soil type as predictors. The study was conducted in the Gendol River Basin of Mount Merapi, one of the most active volcanoes in Indonesia. Rainfall and debris flow occurrence data were collected for the upper Gendol River from October 2016 to February 2018 and divided into calibration and validation datasets. The NBC was used to estimate the status of debris flow incidences displayed in the susceptibility map, which is based on the posterior probability from the predictors. The system was verified with quantitative dichotomous quality indices along with a contingency table. Using the validation datasets, the advantage of the NBC for estimating debris flow occurrence is confirmed. This work contributes to existing knowledge on estimating debris flow susceptibility through the data mining approach. Despite the existence of predictive uncertainty, the presented system could contribute to the improvement of debris flow countermeasures in volcanic regions.
Keywords: debris flows; Gendol River; Indonesia; Merapi volcano; naive Bayes classifier
A Naive Bayes model on lung adenocarcinoma projection based on tumor microenvironment and weighted gene coexpression network analysis
12
Authors: Zhiqiang Ye, Pingping Song, Degao Zheng, Xu Zhang, Jianhong Wu. Infectious Disease Modelling, 2022, No. 3, pp. 498-509 (12 pages)
Based on the lung adenocarcinoma (LUAD) gene expression data from The Cancer Genome Atlas (TCGA) database, the Stromal score, Immune score, and Estimate score in the tumor microenvironment (TME) were computed by the Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data (ESTIMATE) algorithm. Gene modules significantly related to the three scores were identified by weighted gene coexpression network analysis (WGCNA). Based on the correlation coefficients and P values, 899 key genes affecting the tumor microenvironment were obtained by selecting the two most correlated modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis suggested that these key genes were significantly involved in immune-related or cancer-related terms. Through univariate Cox regression and elastic network analysis, genes associated with the prognosis of LUAD patients were screened out, and their prognostic values were further verified by survival analysis and the University of ALabama at Birmingham CANcer (UALCAN) database. The results indicated that eight genes were significantly related to the overall survival of LUAD. Among them, six genes were found to be differentially expressed between tumor and control samples. Immune infiltration analysis further verified that all six genes were significantly related to tumor purity and immune cells. Therefore, these genes were eventually used to construct a Naive Bayes projection model of LUAD. The model was verified by the receiver operating characteristic (ROC) curve, where the area under the curve (AUC) reached 92.03%, which suggested that the model could accurately discriminate the tumor samples from the normal ones. Our study provided an effective model for LUAD projection that improved the clinical diagnosis and cure of LUAD. The result also confirmed that the six genes used in the model construction could be potential prognostic biomarkers of LUAD.
Keywords: naive Bayes model; tumor microenvironment; lung adenocarcinoma; weighted gene co-expression network analysis; prognostic biomarkers
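The entry above evaluates its classifier by the area under the ROC curve. As a reminder of what that number measures, the sketch below computes AUC through its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc`, the labels, and the scores are illustrative assumptions and are unrelated to the paper's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical tumor (1) / normal (0) labels with classifier scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

This pairwise form is quadratic in the number of samples, which is fine for a small illustration; sorting-based formulations give the same value more efficiently on large datasets.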
Network-based naive Bayes model for social network
13
Authors: Danyang Huang, Guoyu Guan, Jing Zhou, Hansheng Wang. Science China Mathematics (SCIE, CSCD), 2018, No. 4, pp. 627-640 (14 pages)
Naive Bayes (NB) is one of the most popular classification methods. It is particularly useful when the dimension of the predictor is high and data are generated independently. Meanwhile, social network data are becoming increasingly accessible, due to the fast development of various social network services and websites. By contrast, data generated by a social network are most likely to be dependent. The dependency is mainly determined by their social network relationships. How to extend the classical NB method to social network data thus becomes a problem of great interest. To this end, we propose a network-based naive Bayes (NNB) method, which generalizes the classical NB model to social network data. The key advantage of the NNB method is that it takes the network relationships into consideration. Its computational efficiency makes the NNB method feasible even in large-scale social networks. The statistical properties of the NNB model are theoretically investigated. Simulation studies have been conducted to demonstrate its finite sample performance. A real data example is also analyzed for illustration purposes.
Keywords: classification; naive Bayes; Sina Weibo; social network data
FLBS: Fuzzy lion Bayes system for intrusion detection in wireless communication network
14
Authors: NARENDRASINH B Gohil, VDEVYAS Dwivedi. Journal of Central South University (SCIE, EI, CAS, CSCD), 2019, No. 11, pp. 3017-3033 (17 pages)
An important problem in wireless communication networks (WCNs) is that they have a minimum number of resources, which leads to high security threats. One approach to finding and detecting attacks is the intrusion detection system (IDS). In this paper, the fuzzy lion Bayes system (FLBS) is proposed as an intrusion detection mechanism. Initially, the data set is grouped into a number of clusters by the fuzzy clustering algorithm. Here, the naive Bayes classifier is integrated with the lion optimization algorithm, and a new lion naive Bayes (LNB) model is created for optimally generating the probability measures. Then, the LNB model is applied to each data group, and the aggregated data is generated. The LNB model is then applied to the aggregated data, and the abnormal nodes are identified based on the posterior probability function. The performance of the proposed FLBS is evaluated using the KDD Cup 99 data, and a comparative analysis against existing methods is performed on the evaluation metrics accuracy and false acceptance rate (FAR). The experimental results show that the proposed system achieves the best performance, which demonstrates its effectiveness in intrusion detection.
Keywords: intrusion detection; wireless communication network; fuzzy clustering; naive Bayes classifier; lion naive Bayes system
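Software component classification based on Bayes networks
15
Authors: BAI Chenggang (白成刚). Computer Engineering and Applications (《计算机工程与应用》) (CSCD, PKU Core), 2005, No. 33, pp. 17-19 (3 pages)
Classifying software components helps people develop high-quality software. Naive Bayes networks have been applied successfully to classification. However, they rest on a basic assumption: the feature nodes must be conditionally independent. Unfortunately, this rarely holds in the real world. This paper uses principal component analysis to reduce the correlation among the feature nodes, extends the range of application of Naive Bayes networks, and applies the result to classifying software components. A case analysis shows that the new Bayes classification network achieves higher prediction accuracy than the ordinary Naive Bayes network.
Keywords: software component; naive Bayes; classifier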
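Construction of an artificial-intelligence-based information interaction system for efficient wheat breeding
16
Authors: YANG Min'an (杨民安), SUN Yu (孙雨), WANG Fengchao (王凤超), YANG Jing (杨晶), CHEN Jin (陈进). Transactions of the Chinese Society of Agricultural Engineering (《农业工程学报》) (EI, CAS, CSCD, PKU Core), 2024, No. 13, pp. 117-123 (7 pages)
Wheat is one of the most important food resources of human society, so building an efficient breeding information interaction platform based on artificial intelligence is of strategic value for high-quality, high-yield wheat cultivation. The key to building such a platform is the accurate identification and classification of core data. On this basis, this study proposes a Naive Bayes-AdaBoost strategy for classifying and identifying wheat breeding information data and uses it to build the interaction platform. In this strategy, AdaBoost is mainly used to iterate over Naive Bayes weak classifiers to form a strong classifier, while filtering and optimizing the core vocabulary to improve classification and identification accuracy. The results show that the proposed method improves accuracy by 12.2 percentage points over the traditional Naive Bayes method, reaching an identification accuracy of 99.2%, whereas the accuracies of the Naive Bayes, decision tree, and support vector machine methods are 87.0%, 86.6%, and 85.6%, respectively. The results indicate that the proposed method has considerable potential in scenarios involving the classification and identification of complex data.
Keywords: wheat; interaction platform; naive Bayes; AdaBoost; breeding interaction; core vocabulary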
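The entry above describes AdaBoost iterating over Naive Bayes weak classifiers to form a strong classifier. The abstract gives no algorithmic detail, so the sketch below shows only one round of the standard discrete AdaBoost update under stated assumptions: labels are +1/-1, the weak learner's predictions are supplied as a plain list (any classifier, including a naive Bayes one, could produce them), and the function name `adaboost_round` is hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One discrete AdaBoost step: return the learner's vote weight and updated sample weights."""
    # Weighted error of the weak learner on the current sample distribution.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0) and division by zero
    alpha = 0.5 * math.log((1 - err) / err)
    # Increase the weight of misclassified samples, decrease the rest, then renormalize.
    updated = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Four hypothetical samples; the weak learner misclassifies the last one.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, 1, -1, 1]
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After the update, the misclassified sample carries half of the total weight, so the next weak learner is pushed to focus on it. The final strong classifier would take the sign of the alpha-weighted sum of the weak learners' votes.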
DeBERTa-GRU: Sentiment Analysis for Large Language Model
17
Authors: Adel Assiri, Abdu Gumaei, Faisal Mehmood, Touqeer Abbas, Sami Ullah. Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 4219-4236 (18 pages)
Modern technological advancements have made social media an essential component of daily life. Social media allow individuals to share thoughts, emotions, and ideas. Sentiment analysis evaluates whether the sentiment of a text is positive, negative, neutral, or any other personal emotion, in order to understand the sentiment context of the text. Sentiment analysis is essential in business and society because it impacts strategic decision-making. It involves challenges due to lexical variation, unlabeled datasets, and long-distance correlations in text. With sequence models, execution time increases because of sequential processing, whereas calculation times for Transformer models are reduced because of parallel processing. This study uses a hybrid deep learning strategy to combine the strengths of the Transformer and sequence models while avoiding their limitations. In particular, the proposed model integrates the Decoding-enhanced with Bidirectional Encoder Representations from Transformers (BERT) attention (DeBERTa) and the Gated Recurrent Unit (GRU) for sentiment analysis. Using the Decoding-enhanced BERT technique, the words are mapped into a compact, semantic word embedding space, and the Gated Recurrent Unit model can capture long-distance contextual semantics correctly. The proposed hybrid model achieves an F1-score of 97% on the Twitter Large Language Model (LLM) dataset, which is much higher than the performance of recent techniques.
Keywords: DeBERTa; GRU; naive Bayes; LSTM; sentiment analysis; large language model
Fine-Tuning Cyber Security Defenses: Evaluating Supervised Machine Learning Classifiers for Windows Malware Detection
18
作者 Islam Zada Mohammed Naif Alatawi +4 位作者 Syed Muhammad Saqlain Abdullah Alshahrani Adel Alshamran Kanwal Imran Hessa Alfraihi 《Computers, Materials & Continua》 SCIE EI 2024年第8期2917-2939,共23页
Malware attacks on Windows machines pose significant cybersecurity threats,necessitating effective detection and prevention mechanisms.Supervised machine learning classifiers have emerged as promising tools for malwar... Malware attacks on Windows machines pose significant cybersecurity threats,necessitating effective detection and prevention mechanisms.Supervised machine learning classifiers have emerged as promising tools for malware detection.However,there remains a need for comprehensive studies that compare the performance of different classifiers specifically for Windows malware detection.Addressing this gap can provide valuable insights for enhancing cybersecurity strategies.While numerous studies have explored malware detection using machine learning techniques,there is a lack of systematic comparison of supervised classifiers for Windows malware detection.Understanding the relative effectiveness of these classifiers can inform the selection of optimal detection methods and improve overall security measures.This study aims to bridge the research gap by conducting a comparative analysis of supervised machine learning classifiers for detecting malware on Windows systems.The objectives include Investigating the performance of various classifiers,such as Gaussian Naïve Bayes,K Nearest Neighbors(KNN),Stochastic Gradient Descent Classifier(SGDC),and Decision Tree,in detecting Windows malware.Evaluating the accuracy,efficiency,and suitability of each classifier for real-world malware detection scenarios.Identifying the strengths and limitations of different classifiers to provide insights for cybersecurity practitioners and researchers.Offering recommendations for selecting the most effective classifier for Windows malware detection based on empirical evidence.The study employs a structured methodology consisting of several phases:exploratory data analysis,data preprocessing,model training,and evaluation.Exploratory data analysis involves understanding the 
dataset’s characteristics and identifying preprocessing requirements.Data preprocessing includes cleaning,feature encoding,dimensionality reduction,and optimization to prepare the data for training.Model training utilizes various supervised classifiers,and their performance is evaluated using metrics such as accuracy,precision,recall,and F1 score.The study’s outcomes comprise a comparative analysis of supervised machine learning classifiers for Windows malware detection.Results reveal the effectiveness and efficiency of each classifier in detecting different types of malware.Additionally,insights into their strengths and limitations provide practical guidance for enhancing cybersecurity defenses.Overall,this research contributes to advancing malware detection techniques and bolstering the security posture of Windows systems against evolving cyber threats. 展开更多
Keywords: Security and privacy challenges in the context of requirements engineering; supervised machine learning; malware detection; Windows systems; comparative analysis; Gaussian Naive Bayes; K Nearest Neighbors; Stochastic Gradient Descent Classifier; Decision Tree
Machine Learning Models for Heterogenous Network Security Anomaly Detection
19
Authors: Mercy Diligence Ogah, Joe Essien, Martin Ogharandukun, Monday Abdullahi. Journal of Computer and Communications, 2024, Issue 6, pp. 38-58 (21 pages)
The increasing volume and intricacy of network traffic in the modern digital era have made it harder to identify abnormal behaviours that may indicate security breaches or operational interruptions. Conventional detection approaches struggle to keep up with the ever-changing strategies of cyber-attacks, resulting in heightened susceptibility and significant harm to network infrastructures. To tackle this issue, this project focused on developing an effective anomaly detection system based on machine learning. The proposed model uses contemporary machine learning algorithms and frameworks to autonomously detect deviations from typical network behaviour, promptly identifying anomalous activities that may indicate security breaches or performance difficulties. The solution takes a multi-faceted approach encompassing data collection, preprocessing, feature engineering, model training, and evaluation. The model is trained on a wide range of datasets that include both regular and abnormal network traffic patterns, so that it can adapt to numerous scenarios. The main priority is a functional and efficient system, with particular emphasis on reducing false positives to avoid unwanted alerts. Additionally, efforts are directed at improving anomaly detection accuracy so that the model can consistently distinguish between potentially harmful and benign activity. This project aims to greatly strengthen network security by addressing emerging cyber threats and improving the resilience and reliability of network infrastructures.
Keywords: Cyber-Security; Network Anomaly Detection; Machine Learning; Random Forest; Decision Tree; Gaussian Naive Bayes
Using machine learning to identify primary features in choosing electric vehicles based on income levels
20
Authors: Mingjun Ma, Eugene Pinsky. Data Science and Management, 2024, Issue 1, pp. 1-6 (6 pages)
Electric vehicles are becoming a popular choice among vehicle buyers. People are generally impressed by their zero emissions and smooth drive, while unstable battery duration keeps some buyers away. This study tries to identify the primary factors that affect the likelihood of owning an electric vehicle at different income levels. We divide the dataset into three subgroups by household income: $50,000 to $150,000 (low-medium income), $150,000 to $250,000 (medium-high income), and $250,000 or above (high income). We considered several machine learning classifiers, and naive Bayes scored relatively higher than the other algorithms in both overall accuracy and F1 score. Based on the probability analysis, we found that one-way commuting distance is the most important factor at all three income levels.
Keywords: Unbalanced data; Electric vehicle; Machine learning; Sampling with replacement; Supervised learning; Naive Bayes