68 articles found
New Spam Filtering Method with Hadoop Tuning-Based MapReduce Naïve Bayes
1
Authors: Keungyeup Ji, Youngmi Kwon. Computer Systems Science & Engineering (indexed: SCIE, EI), 2023, Issue 4, pp. 201-214, 14 pages
As the importance of email increases, the amount of malicious email is also increasing, so the need for malicious email filtering is growing. Since it is more economical to combine commodity hardware consisting of a medium server or PC with a virtual environment to use as a single server resource and filter malicious email using machine learning techniques, we used a Hadoop MapReduce framework and Naïve Bayes among machine learning methods for malicious email filtering. Naïve Bayes was selected because it is one of the top machine learning methods (Support Vector Machine (SVM), Naïve Bayes, K-Nearest Neighbor (KNN), and Decision Tree) in terms of execution time and accuracy. Malicious email was filtered in two ways: with MapReduce programming using Naïve Bayes, a supervised machine learning method, in a performance-optimized Hadoop framework, and with a Python program applying Naïve Bayes in a bare-metal server environment without Hadoop. According to a comparison of the accuracy and prediction error rates of the two methods, the Hadoop MapReduce Naïve Bayes method improved the accuracy of spam and ham email identification 1.11 times and the prediction error rate 14.13 times compared to the non-Hadoop Python Naïve Bayes method.
Keywords: Hadoop; Hadoop distributed file system (HDFS); MapReduce; configuration parameter; malicious email filtering; Naïve Bayes
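Improving Naïve Bayes by Combining Feature and Non-Feature Information, and Its Application (cited by: 2)
2
Authors: 赵静, 刘培玉, 陈孝礼. 《计算机应用研究》 (Application Research of Computers; indexed: CSCD, PKU Core), 2011, Issue 2, pp. 514-516, 3 pages
Naïve Bayes is a common content-based spam filtering algorithm, but traditional naïve Bayes filtering suffers from uncertainty in judging content and from incomplete representation of emails. The paper analyzes the different properties that email header fields exhibit in legitimate and spam emails, extracts non-feature information, and combines feature and non-feature information to improve the naïve Bayes algorithm. Experimental results show that, compared with using feature information alone, the improved naïve Bayes classification method achieves higher spam recall and precision, highlighting its advantages in covering the email's information and overcoming the shortcomings of content-based judgment.
Keywords: email filtering; non-feature information; feature information; naïve Bayes algorithm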
基于改进的Nave Bayes和BP神经网络的垃圾邮件过滤 被引量:1
3
作者 方莹 《兰州理工大学学报》 CAS 北大核心 2011年第2期98-101,共4页
不同用户对垃圾邮件的判定有所差别,考虑到同一用户的自认垃圾邮件相似度较大,提出对特定用户进行针对性的垃圾邮件过滤方法.系统除重点利用邮件正文信息外,还尝试加入发件人、群发信息和主题相关度信息,改进朴素贝叶斯公式用于邮件正... 不同用户对垃圾邮件的判定有所差别,考虑到同一用户的自认垃圾邮件相似度较大,提出对特定用户进行针对性的垃圾邮件过滤方法.系统除重点利用邮件正文信息外,还尝试加入发件人、群发信息和主题相关度信息,改进朴素贝叶斯公式用于邮件正文的概率计算,基于BP神经网络构造垃圾邮件判别系统.实验表明,改进的朴素贝叶斯公式用于本文的系统是可行的,基于BP神经网络的垃圾邮件过滤系统能有效综合以上四项数值进行全局判别,进而对特定用户的邮件产生不错的过滤效果. 展开更多
关键词 垃圾邮件 朴素贝叶斯 BP神经网络 平滑 过滤
下载PDF
Bayes-Q-Learning Algorithm in Edge Computing for Waste Tracking
4
Authors: D. Palanikkumar, R. Ramesh Kumar, Mehedi Masud, Mrim M. Alnfiai, Mohamed Abouhawwash. Intelligent Automation & Soft Computing (indexed: SCIE), 2023, Issue 5, pp. 2425-2440, 16 pages
The major environmental hazard in this pandemic is the unhygienic disposal of medical waste. If medical waste is not properly managed, it becomes a hazard to the environment and to humans. Managing medical waste is a major issue for cities and municipalities in terms of environment and logistics. An efficient supply chain with edge computing technology is used in managing medical waste. The supply chain operations include waste collection, transportation, and disposal. Many research works have sought to improve waste management. The main issues with existing techniques are that they are ineffective, expensive, and rely on centralized edge computing, which fails to provide security, trustworthiness, and transparency. To overcome these issues, in this paper we implement an efficient Naive Bayes classifier algorithm and a Q-Learning algorithm in decentralized edge computing technology with a binary bat optimization algorithm (NBQ-BBOA). The proposed work is used to track, detect, and manage medical waste. The Q-Learning algorithm is used to minimize the cost of transferring medical waste between nodes. The accuracy obtained is 88% for the Naïve Bayes algorithm, 82% for the Q-Learning algorithm, and 98% for NBQ-BBOA. The Root Mean Square Error (RMSE) and Mean Error (MAE) for the proposed NBQ-BBOA are 0.012 and 0.045.
Keywords: binary bat algorithm; naïve Bayes; supply chain; edge; medical wastage
Attribute Weighted Naïve Bayes Classifier (cited by: 1)
5
Authors: Lee-Kien Foo, Sook-Ling Chua, Neveen Ibrahim. Computers, Materials & Continua (indexed: SCIE, EI), 2022, Issue 4, pp. 1945-1957, 13 pages
The naïve Bayes classifier is one of the commonly used data mining methods for classification. Despite its simplicity, naïve Bayes is effective and computationally efficient. Although the strong attribute independence assumption in the naïve Bayes classifier makes it a tractable method for learning, this assumption may not hold in real-world applications. Many enhancements to the basic algorithm have been proposed to alleviate the violation of the attribute independence assumption. While these methods improve classification performance, they do not necessarily retain the mathematical structure of the naïve Bayes model, and some come at the expense of computational time. One approach to reducing the naïveté of the classifier is to incorporate attribute weights in the conditional probability. In this paper, we propose a method to incorporate attribute weights into naïve Bayes. To evaluate the performance of our method, we used public benchmark datasets and compared our method with standard naïve Bayes and baseline attribute weighting methods. Experimental results show that our method improves classification performance over both standard naïve Bayes and the baseline attribute weighting methods in terms of classification accuracy and F1, especially when the independence assumption is strongly violated, which was validated using the Chi-square test of independence.
Keywords: attribute weighting; naïve Bayes; Kullback-Leibler; information gain; classification
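For illustration, incorporating attribute weights in the conditional probability is often written as an exponent on each conditional term. This is a generic formulation assumed here; the abstract gives neither the paper's exact formula nor how its weights $w_i$ are computed.

```latex
\hat{c}(x) \;=\; \arg\max_{c}\; P(c) \prod_{i=1}^{n} P(x_i \mid c)^{\,w_i}
```

Setting every $w_i = 1$ recovers standard naïve Bayes, so the weights only rescale how much each attribute contributes to the decision.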
Improved Bearing Fault Diagnosis by Feature Extraction Based on GLCM, Fusion of Selection Methods, and Multiclass-Naïve Bayes Classification (cited by: 1)
6
Authors: Mireille Pouyap, Laurent Bitjoka, Etienne Mfoumou, Denis Toko. Journal of Signal and Information Processing, 2021, Issue 4, pp. 71-85, 15 pages
The presence of bearing faults reduces the efficiency of rotating machines and thus increases energy consumption, or even causes the total stoppage of the machine. It becomes essential to correctly diagnose the fault caused by the bearing, hence the importance of determining an effective feature extraction method that best describes the fault. The aim of this paper is to merge feature selection methods in order to define the most relevant features in the texture of the vibration signal images. In this study, the Gray Level Co-occurrence Matrix (GLCM) from texture analysis is applied to the vibration signal represented as images. Feature selection based on merging the PCA (Principal Component Analysis) method and the SFE (Sequential Features Extraction) method is done to obtain the most relevant features. The multiclass-Naïve Bayes classifier is used to test the proposed approach. The success rate of this classification is 98.27%. The relevant features obtained give promising results and are more efficient than the methods observed in the literature.
Keywords: GLCM; PCA; SFE; Naïve Bayes; relevant features
Integration of Expectation Maximization using Gaussian Mixture Models and Naïve Bayes for Intrusion Detection
7
Authors: Loka Raj Ghimire, Roshan Chitrakar. Journal of Computer Science Research, 2021, Issue 2, pp. 1-10, 10 pages
Intrusion detection is the process of investigating information about system activities or data to detect any malicious behavior or unauthorized activity. Most IDSs implement the K-means clustering technique because of its linear complexity and fast computation. Nonetheless, its naïve use of the mean data value for the cluster core presents a major drawback. Two circular clusters with different radii may be centered at the same mean, a condition the K-means algorithm cannot address because the mean values of the clusters are very similar. It also fails if the clusters are not spherical. To overcome this issue, a new hybrid model integrating expectation maximization (EM) clustering with a Gaussian mixture model (GMM) and a naïve Bayes classifier is proposed. In this model, GMMs give more flexibility than K-means in terms of cluster covariance. They also use probability functions and soft clustering, so a single data point can belong to multiple clusters. In a GMM, the cluster shape is defined by two parameters, the mean and the standard deviation, so a cluster can take any kind of elliptical shape. EM-GMM is used to cluster data by activity into the corresponding category.
Keywords: anomaly detection; clustering; EM classification; expectation maximization (EM); Gaussian mixture model (GMM); GMM classification; intrusion detection; naïve Bayes classification
Naïve Bayes Algorithm for Large Scale Text Classification
8
Authors: Pirunthavi SIVAKUMAR, Jayalath EKANAYAKE. Instrumentation, 2021, Issue 4, pp. 55-62, 8 pages
This paper proposes an improved Naïve Bayes classifier for sentiment analysis on large-scale datasets such as YouTube. YouTube contains a large volume of unstructured and unorganized comments and reactions, which carry important information. Organizing large amounts of data and extracting useful information is a challenging task. The extracted information can be considered new knowledge and can be used for decision-making. We extract comments on YouTube videos, categorize them by domain, and then apply the Naïve Bayes classifier with improved techniques. Our method achieved 80% accuracy in classifying those comments. This experiment shows that the proposed method adapts well to large-scale text classification.
Keywords: Naïve Bayes; Text Classification; YouTube; Sentimental Analysis
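An Incremental Bayes Text Classification Algorithm (cited by: 3)
9
Authors: 高洁, 吉根林. 《南京师范大学学报(工程技术版)》 (Journal of Nanjing Normal University, Engineering and Technology Edition; indexed: CAS), 2004, Issue 3, pp. 49-52, 4 pages
Automatic text classification is a very important research area in data mining and machine learning. To address the difficulty of obtaining large labeled training sets, an incremental Bayes text classification algorithm based on a small labeled corpus is proposed. The algorithm handles two cases. In the first, the new samples carry class labels, so the conditional probabilities of samples belonging to a class can be recomputed directly. In the second, the new samples are unlabeled, so the existing classifier is used to assign them class labels, and the new samples are then used to revise the classifier. Experimental results show that the algorithm is feasible and effective, and that it achieves higher accuracy than the Naïve Bayes text classification algorithm.
Keywords: text classification; incremental learning; Naïve Bayes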
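The two cases described in this abstract (labeled new samples update the probabilities directly; unlabeled ones are first labeled by the current classifier and then absorbed) can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the class name `IncrementalNB`, the multinomial word-count model, and the Laplace smoothing are all assumptions made here.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Hypothetical multinomial naive Bayes stored as raw counts,
    so it can be updated one sample at a time."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()

    def update(self, tokens, label):
        """Case 1: the new sample is labeled, so only the counts change."""
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        """Return the class with the highest log-posterior score."""
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            # Laplace (add-one) smoothing: an assumption of this sketch.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def update_unlabeled(self, tokens):
        """Case 2: label the sample with the current model, then absorb it."""
        label = self.predict(tokens)
        self.update(tokens, label)
        return label


clf = IncrementalNB()
clf.update(["cheap", "pills", "offer"], "spam")
clf.update(["meeting", "agenda", "notes"], "ham")
guessed = clf.update_unlabeled(["cheap", "offer", "today"])
```

Keeping counts rather than probabilities is the design choice that makes both update paths cheap: no pass over earlier training data is needed. A practical variant would likely absorb an unlabeled sample only when the prediction is confident, since wrong self-assigned labels reinforce themselves.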
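Research on Data Mining in Incomplete Information Systems Based on a Weighted Bayes Classification Algorithm
10
Authors: 李莉, 赵晋强. 《电脑知识与技术》 (Computer Knowledge and Technology), 2007, Issue 9, pp. 1408-1409, 1480, 3 pages
Based on the similarity rough set model, the weighted naïve Bayes algorithm is extended, and the traditional method for filling in missing information in incomplete information systems is improved. On this basis, a weighted Bayes classification algorithm for incomplete information systems is proposed, its significance for data mining in incomplete systems is explained, and the effectiveness of the method is verified through computer simulation experiments.
Keywords: rough set theory; weighted naïve Bayes; incomplete information system; data mining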
Perspicacious Apprehension of HDTbNB Algorithm Opposed to Security Contravention
11
Authors: Shyla, Vishal Bhatnagar. Intelligent Automation & Soft Computing (indexed: SCIE), 2023, Issue 2, pp. 2431-2447, 17 pages
The exponential pace of the spread of the digital world has served as one of the assisting forces in generating an enormous amount of information flowing over the network. The data will always remain under threat, as intruders and hackers consistently try to breach security systems to gain insight into personal information. In this paper, the authors propose the HDTbNB (Hybrid Decision Tree-based Naïve Bayes) algorithm to find the essential features without data scaling, to maximize the model’s performance by reducing the false alarm rate and training period, to reduce zero frequency with enhanced accuracy of the IDS (Intrusion Detection System), and to further analyze the performance of distinct machine learning algorithms such as Naïve Bayes, Decision Tree, K-Nearest Neighbors, and Logistic Regression on the KDD 99 dataset. The performance of the algorithm is evaluated through a comparative analysis of computed parameters such as accuracy, macro average, and weighted average. The findings are a percentage increase in accuracy, precision, sensitivity, and specificity, and a decrease in misclassification, of 9.3%, 6.4%, 12.5%, 5.2%, and 81%, respectively.
Keywords: Naïve Bayes; decision tree; k-nearest neighbors; logistic regression; neighbors classifier
Deep Learning-Based Classification of Rotten Fruits and Identification of Shelf Life
12
Authors: S. Sofana Reka, Ankita Bagelikar, Prakash Venugopal, V. Ravi, Harimurugan Devarajan. Computers, Materials & Continua (indexed: SCIE, EI), 2024, Issue 1, pp. 781-794, 14 pages
The freshness of fruits is considered one of the essential characteristics for consumers in determining their quality, flavor, and nutritional value. The primary need for identifying rotten fruits is to ensure that only fresh and high-quality fruits are sold to consumers. Rotten fruits can foster harmful bacteria, molds, and other microorganisms that can cause food poisoning and other illnesses in consumers. The overall purpose of the study is to classify rotten fruits, which can affect the taste, texture, and appearance of other fresh fruits, thereby reducing their shelf life. The agriculture and food industries are increasingly adopting computer vision technology to detect rotten fruits and forecast their shelf life. Hence, this research work mainly focuses on the Convolutional Neural Network (CNN) deep learning model, which helps in the classification of rotten fruits. The proposed methodology involves real-time analysis of a dataset of various types of fruits, including apples, bananas, oranges, papayas, and guavas. Similarly, machine learning models such as Gaussian Naïve Bayes (GNB) and random forest are used to predict the fruit’s shelf life. The results obtained from the various pre-trained models for rotten fruit detection are analysed based on an accuracy score to determine the best model. In comparison to other pre-trained models, the visual geometry group16 (VGG16) obtained a higher accuracy score of 95%. Likewise, the random forest model delivers a better accuracy score of 88% when compared with GNB in forecasting the fruit’s shelf life. By developing an accurate classification model, only fresh and safe fruits reach consumers, reducing the risks associated with contaminated produce. The proposed approach will thereby have a significant impact on the food industry for efficient fruit distribution and also help customers purchase fresh fruits.
Keywords: rotten fruit detection; shelf life; deep learning; convolutional neural network; machine learning; Gaussian naïve Bayes; random forest; visual geometry group16
RP-NBSR: A Novel Network Attack Detection Model Based on Machine Learning (cited by: 2)
13
Authors: Zihao Shen, Hui Wang, Kun Liu, Peiqian Liu, Menglong Ba, MengYao Zhao. Computer Systems Science & Engineering (indexed: SCIE, EI), 2021, Issue 4, pp. 121-133, 13 pages
The rapid progress of the Internet has exposed networks to an increased number of threats. Intrusion detection technology can effectively protect network security against malicious attacks. In this paper, we propose a ReliefF-P-Naive Bayes and softmax regression (RP-NBSR) model based on machine learning for network attack detection to improve the false detection rate and F1 score of unknown intrusion behavior. In the proposed model, the Pearson correlation coefficient is introduced to compensate for deficiencies in the correlation analysis between features by the ReliefF feature selection algorithm, and a ReliefF-Pearson correlation coefficient (ReliefF-P) algorithm is proposed. Then, the ReliefF-P algorithm is used to preprocess the UNSW-NB15 dataset to remove irrelevant features and obtain a new feature subset. Finally, a naïve Bayes and softmax regression (NBSR) classifier is constructed by cascading the naïve Bayes classifier and the softmax regression classifier, and an attack detection model based on RP-NBSR is established. The experimental results on the UNSW-NB15 dataset show that the attack detection model based on RP-NBSR has a lower false detection rate and higher F1 score than other detection models.
Keywords: naïve Bayes; softmax regression; machine learning; ReliefF-P; attack detection
Cyberbullying Detection and Recognition with Type Determination Based on Machine Learning
14
Authors: Khalid M.O. Nahar, Mohammad Alauthman, Saud Yonbawi, Ammar Almomani. Computers, Materials & Continua (indexed: SCIE, EI), 2023, Issue 6, pp. 5307-5319, 13 pages
Social media networks are becoming essential to our daily activities, and many issues arise from this great involvement in our lives. Cyberbullying is a social media network issue, a global crisis affecting the victims and society as a whole. It results from a misunderstanding regarding freedom of speech. In this work, we propose a methodology for detecting such behaviors (bullying, harassment, and hate-related texts) using supervised machine learning algorithms (SVM, Naïve Bayes, logistic regression, and random forest) and for predicting a topic associated with these text data using unsupervised natural language processing, such as latent Dirichlet allocation. In addition, we used accuracy, precision, recall, and F1 score to assess these classifiers. Results show that logistic regression, support vector machine, random forest, and Naïve Bayes have 95%, 94.97%, 94.66%, and 93.1% accuracy, respectively.
Keywords: cyberbullying; social media; naïve Bayes; support vector machine; natural language processing; LDA
Lung Cancer Prediction from Elvira Biomedical Dataset Using Ensemble Classifier with Principal Component Analysis
15
Author: Teresa Kwamboka Abuya. Journal of Data Analysis and Information Processing, 2023, Issue 2, pp. 175-199, 25 pages
Machine learning algorithms (MLs) can potentially improve disease diagnostics, leading to early detection and treatment of these diseases. As a malignant tumor whose primary focus is located in the bronchial mucosal epithelium, lung cancer has the highest mortality and morbidity among cancer types, threatening the health and life of patients suffering from the disease. Machine learning algorithms such as Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Naïve Bayes (NB) have been used for lung cancer prediction. However, they still face challenges such as high dimensionality of the feature space, over-fitting, high computational complexity, noise and missing data, low accuracies, low precision, and high error rates. Ensemble learning, which combines classifiers, may help boost prediction on new data. However, current ensemble ML techniques rarely consider comprehensive evaluation metrics to evaluate the performance of individual classifiers. The main purpose of this study was to develop an ensemble classifier that improves lung cancer prediction. An ensemble machine learning algorithm is developed based on RF, SVM, NB, and KNN. Feature selection is done based on Principal Component Analysis (PCA) and Analysis of Variance (ANOVA). This algorithm is then executed on lung cancer data and evaluated using execution time, true positives (TP), true negatives (TN), false positives (FP), false negatives (FN), false positive rate (FPR), recall (R), precision (P), and F-measure (FM). Experimental results show that the proposed ensemble classifier has the best classification of 0.9825% with the lowest error rate of 0.0193. This is followed by SVM, in which the probability of having the best classification is 0.9652% at an error rate of 0.0206. On the other hand, NB had the worst performance of 0.8475% classification at 0.0738 error rate.
Keywords: accuracy; false positive rate; Naïve Bayes; random forest; lung cancer prediction; principal component analysis; support vector machine; K-nearest neighbor
Predicting crash injury severity at unsignalized intersections using support vector machines and naïve Bayes classifiers
16
Authors: Stephen A. Arhin, Adam Gatiba. Transportation Safety and Environment (indexed: EI), 2020, Issue 2, pp. 120-132, 13 pages
The Washington, DC crash statistics report for the period from 2013 to 2015 shows that the city recorded about 41789 crashes at unsignalized intersections, which resulted in 14168 injuries and 51 fatalities. The economic cost of these fatalities has been estimated to be in the millions of dollars. It is therefore necessary to investigate the predictability of the occurrence of these crashes, based on pertinent factors, in order to provide mitigating measures. This research focused on the development of models to predict the injury severity of crashes using support vector machines (SVMs) and Gaussian naïve Bayes classifiers (GNBCs). The models were developed based on 3307 crashes that occurred from 2008 to 2015. Eight SVM models and a GNBC model were developed. The most accurate model was the SVM with a radial basis kernel function. This model predicted the severity of an injury sustained in a crash with an accuracy of approximately 83.2%. The GNBC produced the worst-performing model, with an accuracy of 48.5%. These models will enable transport officials to identify crash-prone unsignalized intersections and provide the necessary countermeasures beforehand.
Keywords: crashes; unsignalized intersection; support vector machines; Gaussian naïve Bayes classifier; injury severity
Identification of maize(Zea mays L.)progeny genotypes based on two probabilistic approaches:Logistic regression and naïve Bayes
17
Authors: D. Seka, B.S. Bonny, A.N. Yoboué, S.R. Sié, B.A. Adopo-Gourène. Artificial Intelligence in Agriculture, 2019, Issue 1, pp. 9-13, 5 pages
We used two probabilistic methods, Gaussian Naïve Bayes and Logistic Regression, to predict the genotypes of the offspring of two maize strains, the BLC and the JNE genotypes, based on the phenotypic traits of the parents. We determined the prediction performance of the two models with the overall accuracy and the area under the receiver operating curve (AUC). The overall accuracy for both models ranged between 82% and 87%. The values of the area under the receiver operating curve were 0.90 or higher for Logistic Regression models, and 0.85 or higher for Gaussian Naïve Bayes models. These statistics indicated that the two models were very effective in predicting the genotypes of the offspring. Furthermore, both models predicted the BLC genotype with higher accuracy than they did the JNE genotype. The BLC genotype appeared more homogeneous and more predictable. A Chi-square test for the homogeneity of the confusion matrices showed that in all cases the two models produced similar prediction results. That finding was in line with the assertion by Mitchell (2010), who theoretically showed that the two models are essentially the same. With logistic regression, each subset of the original data or its corresponding principal components produced exactly the same prediction results. The AUC value may be viewed as a criterion for parent-offspring resemblance for each set of phenotypic traits considered in the analysis.
Keywords: Gaussian naïve Bayes; logistic regression; maize genotype; prediction; selection
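View-Independent Action Recognition Based on Action Graphs (cited by: 5)
18
Authors: 杨跃东, 郝爱民, 褚庆军, 赵沁平, 王莉莉. 《软件学报》 (Journal of Software; indexed: EI, CSCD, PKU Core), 2009, Issue 10, pp. 2679-2691, 13 pages
For view-independent action recognition, a weighted dictionary vector description method and an action graph recognition model are proposed. Local interest point features and global shape descriptions in the video are combined to form the weighted dictionary vector description, which keeps the strong noise robustness of interest points while overcoming their inability to recognize static actions. Energy curves are constructed from 3D motion data such as motion capture and point clouds, key poses are extracted, and basic motion units are generated. These units are linked by three connection types (self-connection, forward connection, and backward connection) into a directed graph called the essential graph. The essential graph is projected in all directions, and the directed graph built according to a node-neighbor rule is called the action graph. The action graph model is trained with Naïve Bayes, the Viterbi algorithm is used to compute the matching degree between a video and the action graph, and video sequences are labeled according to the maximum matching degree. Because the action graph features multi-angle projection and smooth transitions between projections, it can recognize video sequences from arbitrary angles and with arbitrary motion directions. Experimental results show that the algorithm recognizes actions well and can handle monocular, multi-view, and multi-action videos.
Keywords: action recognition; view independence; action graph; interest point; Naïve Bayes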
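A Rough-Set-Based Weighted Naïve Bayes Email Filtering Method (cited by: 21)
19
Authors: 邓维斌, 王国胤, 洪智勇. 《计算机科学》 (Computer Science; indexed: CSCD, PKU Core), 2011, Issue 2, pp. 218-221, 4 pages
Email filtering has two key problems: how to select an effective email feature set, and how to design a good filtering algorithm. Based on an analysis of email characteristics, a new method for extracting the email feature set is given that combines the main representative features of the email header and the email content. The importance of each attribute is measured from the information view of rough sets and used as the weight for weighted naïve Bayes spam filtering, which effectively addresses the conditional dependence problem in naïve Bayes classification. Tests on Chinese and English email corpora demonstrate the effectiveness of the proposed filtering method.
Keywords: spam filtering; feature selection; rough set; weighted naïve Bayes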
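Research on Uyghur Text Classification Based on Machine Learning (cited by: 20)
20
Authors: 阿力木江·艾沙, 吐尔根·依布拉音, 艾山·吾买尔, 马尔哈巴·艾力. 《计算机工程与应用》 (Computer Engineering and Applications; indexed: CSCD), 2012, Issue 5, pp. 110-112, 3 pages
With the rapid growth of Uyghur-language information on the Internet, Uyghur text classification has become a key technology for processing and organizing this large volume of text data. The paper studies techniques and methods for Uyghur text classification. To address the high dimensionality of Uyghur text under the vector space model (VSM) representation, stemming is combined with IG to reduce the dimensionality of the representation space. Classification experiments on a Uyghur text corpus are carried out with machine-learning classifiers (kNN and Naïve Bayes), and the results are analyzed.
Keywords: text classification; naïve Bayes method; k-nearest neighbor method (kNN); Uyghur; feature selection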