262 articles found
Parallel naive Bayes algorithm for large-scale Chinese text classification based on Spark (cited by: 22)
1
Authors: LIU Peng, ZHAO Hui-han, TENG Jia-yu, YANG Yan-yan, LIU Ya-feng, ZHU Zong-wei. Journal of Central South University (SCIE, EI, CAS, CSCD), 2019, No. 1, pp. 1-12 (12 pages).
The sharp increase in the amount of Chinese text data on the Internet has significantly prolonged the processing time needed to classify these data. To solve this problem, this paper proposes and implements a parallel naive Bayes algorithm (PNBA) for Chinese text classification based on Spark, a parallel in-memory computing platform for big data. The algorithm parallelizes the entire training and prediction process of the naive Bayes classifier, mainly by adopting the resilient distributed dataset (RDD) programming model. For comparison, a PNBA based on Hadoop is also implemented. The test results show that, in the same computing environment and for the same text sets, the Spark PNBA is clearly superior to the Hadoop PNBA on key indicators such as speedup ratio and scalability. Spark-based parallel algorithms can therefore better meet the requirements of large-scale Chinese text data mining.
Keywords: Chinese text classification; naive Bayes; Spark; Hadoop; resilient distributed dataset; parallelization
Decision Tree and Naive Bayes Algorithm for Classification and Generation of Actionable Knowledge for Direct Marketing
2
Authors: Masud Karim, Rashedur M. Rahman. Journal of Software Engineering and Applications, 2013, No. 4, pp. 196-206 (11 pages).
Many companies, such as those in the credit card, insurance, banking, and retail industries, require direct marketing. Data mining can help those institutions set marketing goals. Data mining techniques show good prospects for targeting audiences and improving the likelihood of response. In this work we investigate two data mining techniques: the Naive Bayes and the C4.5 decision tree algorithms. The goal of this work is to predict whether a client will subscribe to a term deposit. We also make a comparative study of the performance of the two algorithms. Publicly available UCI data is used to train and test the algorithms. In addition, we extract actionable knowledge from the decision tree that supports interesting and important decisions in the business area.
Keywords: CRM; actionable knowledge; data mining; C4.5; naive Bayes; ROC; classification
DDoS Attack Detection Using Heuristics Clustering Algorithm and Naive Bayes Classification
3
Authors: Sharmila Bista, Roshan Chitrakar. Journal of Information Security, 2018, No. 1, pp. 33-44 (12 pages).
In recent times, among the multitude of attacks present in network systems, DDoS attacks have emerged as the attacks with the most devastating effects. The main objective of this paper is to propose a system that effectively detects DDoS attacks appearing in any networked system, using the clustering technique of data mining followed by classification. The method uses a Heuristics Clustering Algorithm (HCA) to cluster the available data and Naïve Bayes (NB) classification to classify the data and detect attacks in the system based on network attributes of the data packets. The clustering algorithm is based on an unsupervised learning technique and sometimes fails to detect some attack instances and a few normal instances; classification techniques are therefore used along with clustering to overcome this problem and to enhance accuracy. Naïve Bayes classifiers are based on very strong independence assumptions, with a fairly simple construction for deriving the conditional probability of each relationship. A series of experiments is performed using “The CAIDA UCSD DDoS Attack 2007 Dataset” and “DARPA 2000 Dataset”, and the proposed system is evaluated on the following performance parameters: Accuracy, Detection Rate, and False Positive Rate. The results show that the proposed system achieves enhanced accuracy and detection rate with a low false positive rate.
Keywords: DDoS attacks; heuristic clustering algorithm; naive Bayes classification; CAIDA UCSD; DARPA 2000
Research on anomaly detection of database user behavior based on K-means and naive Bayes (cited by: 8)
4
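Authors: 王旭仁, 冯安然, 何发镁, 马慧珍, 杨杰. 《计算机应用研究》 (CSCD, PKU Core), 2020, No. 4, pp. 1128-1131 (4 pages).
To address database leaks caused by anomalous database user behavior, a database user anomaly detection method based on the K-means and naive Bayes algorithms is proposed. First, users are grouped by K-means clustering over the query statements and query results recorded in the database's historical audit logs; then a user anomaly detection model is built with the naive Bayes classification algorithm. Compared with a model built with naive Bayes classification alone, the method simplifies the representation of user behavior profiles during data preprocessing, which reduces computational redundancy and cuts training time by 81%; using the user groups obtained by K-means clustering raises detection precision by 7.06% and the F1 score by 3.33%. Experiments show that the proposed method greatly reduces training time and achieves good detection results.
Keywords: database; user behavior; anomaly detection; K-means clustering; naive Bayes classification algorithm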
CLIF_NB: a text classification learning method based on Naive Bayes (cited by: 1)
5
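Authors: 刘丽珍, 宋瀚涛, 陆玉昌. 《小型微型计算机系统》 (CSCD, PKU Core), 2005, No. 9, pp. 1575-1577 (3 pages).
Because the conditional independence assumption of the Naive Bayes method often contradicts reality, the CLIF-NB text classification learning method is proposed. It uses mutual information theory to compute the maximum correlation probability between feature attributes, replaces linearly inseparable attributes with combinations of variable sets to relax the restriction of the conditional independence assumption, and learns a series of classifiers to reduce classification errors on the training set; these are combined into a CLIF-NB classifier with higher classification accuracy.
Keywords: text classification; naive Bayes; conditional independence assumption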
A Naive Bayes based Uyghur text classification algorithm and its performance analysis (cited by: 7)
6
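Authors: 艾海麦提江.阿布来提, 吐尔地.托合提, 艾斯卡尔.艾木都拉. 《计算机应用与软件》 (CSCD, PKU Core), 2012, No. 12, pp. 27-29 (3 pages).
Against the background of research on automatic classification of large-scale Uyghur web text, a modular Uyghur text classification system is designed. After an in-depth survey, the Naive Bayes algorithm is chosen as the classification engine, and the system is implemented in C#. In preprocessing, stemming is introduced in line with the morphological characteristics of Uyghur, which greatly reduces the feature dimensionality. Classification results are reported on a fairly large text corpus of more than 3,000 documents in 10 categories; different numbers of features are then selected with the χ² statistic, and classification results are reported for each. The results show that only 1%-3% of the features in the preprocessed Uyghur feature space are optimal, so it is possible to further identify which features are optimal or to further reduce the dimensionality of the feature space.
Keywords: Uyghur; text classification; naive Bayes; stemming; stop words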
Mobile SMS Spam Filtering for Nepali Text Using Naive Bayesian and Support Vector Machine (cited by: 2)
7
Authors: Tej Bahadur Shahi, Abhimanu Yadav. International Journal of Intelligence Science, 2014, No. 1, pp. 24-28 (5 pages).
Spam is a universal problem with which everyone is familiar. A number of approaches are used for spam filtering. The most common technique is content-based filtering, which uses the actual text of a message to determine whether it is spam. The content is very dynamic, and it is very challenging to represent all of the information in a mathematical classification model. For instance, in content-based spam filtering, the characteristics used by the filter to identify spam messages change constantly over time. The Naïve Bayes method represents the changing nature of messages using probability theory, and the support vector machine (SVM) represents it using different features. These two classification methods are efficient in different domains, and the case of Nepali SMS or text classification has not yet been considered; it is therefore interesting to find out how both methods perform on the problem of Nepali text classification. In this paper, Naïve Bayes and SVM-based classification techniques are implemented to classify Nepali SMS messages as spam or non-spam. An empirical analysis over various text cases has been carried out to evaluate the accuracy of the classification methods used in this study. Accuracy is found to be 87.15% for SVM and 92.74% for Naïve Bayes.
Keywords: SMS; spam filtering; classification; support vector machine; naive Bayes; preprocessing; feature extraction; Nepali SMS datasets
A study of P2P platform reviews based on Naive Bayes (cited by: 1)
8
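Authors: 曾政多. 《现代计算机》, 2019, No. 20, pp. 10-13 (4 pages).
With the development of online finance products such as Alipay's Yu'e Bao and Tencent's Licaitong, investors' enthusiasm for online investment has grown year by year, and a large number of high-yield P2P lending and investment platforms have appeared. Because the sector has grown too fast and platform quality is uneven, many related problems have arisen. To study this phenomenon, the work starts from the user reviews of each platform: reviews reflect not only how much attention the public pays to financial platforms but also the sentiments and attitudes the public expresses. Based on a naive Bayes classifier and the SnowNLP library for Python, an existing data set is processed, a model is built, and data mining and analysis are applied to study the user opinions in the reviews, offering advice to P2P investors and a reference for the supervision and risk prediction of P2P platforms.
Keywords: naive Bayes classification; Python; sentiment analysis of Chinese reviews; P2P platform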
A big-data text classification method using distributed NBC under the Spark framework (cited by: 6)
9
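Authors: 臧艳辉, 赵雪章, 席运江. 《计算机应用研究》 (CSCD, PKU Core), 2019, No. 12, pp. 3705-3708, 3712 (5 pages).
To address the challenges that existing big-data computing frameworks face in scalable machine learning research, a distributed naive Bayes text classification method based on the MapReduce and Apache Spark frameworks is proposed. The naive Bayes classifier (NBC) is explored by studying the suitability of the MapReduce and Apache Spark frameworks, and existing big-data computing frameworks are examined. First, based on the naive Bayes text classification model, the training data set is divided into m classes. Then, in the training phase, the output of each MapReduce job serves as the input of the next, and four MapReduce jobs produce the model. This design makes full use of the parallelism of MapReduce. Finally, when the classifier is tested, the class label with the maximum value is returned. In experiments on the Newgroups data set, classification on all five news groups achieved results above 99%, higher than the comparison algorithms, which demonstrates the accuracy of the proposed method.
Keywords: text classification; MapReduce; Spark framework; distributed; naive Bayes classifier; machine learning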
NBCC: an algorithm for mining changes in data streams (cited by: 1)
10
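Authors: 马瑞民, 王小龙. 《计算机工程与应用》 (CSCD, PKU Core), 2006, No. 7, pp. 166-168 (3 pages).
For the problem of mining changes in data streams, the NBCC algorithm is proposed. It first builds a synopsis data structure over the data stream by exact sampling, then borrows the idea of classical naive Bayes classification and divides the training sample set into classes Ci, i = 1, 2, …, m. A threshold is set for the test sample set: when P(Ci|X) is below the threshold, that is, when the probability that sample X belongs to any known class Ci is below the threshold, a change has occurred. The change is retained and recorded as a new class C(m+1), and the procedure is applied repeatedly.
Keywords: data stream; change; synopsis data structure; exact sampling; naive Bayes classification; threshold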
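The thresholding step described in the NBCC abstract can be illustrated with a small sketch. This is not the authors' implementation: the categorical naive Bayes with Laplace smoothing, the function names `train`, `posteriors`, and `is_change`, and the toy data are all assumptions made for illustration, and the synopsis-building and exact-sampling stages are left out.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Count class and per-feature value frequencies from (features, label) pairs."""
    class_counts = Counter(label for _, label in samples)
    value_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    values = defaultdict(set)            # feature index -> values seen in training
    for features, label in samples:
        for i, v in enumerate(features):
            value_counts[(label, i)][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values

def posteriors(model, features):
    """Return normalized P(Ci | X) for every known class Ci."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, v in enumerate(features):
            # Laplace smoothing, with one extra slot reserved for unseen values.
            count = value_counts[(label, i)][v]
            score += math.log((count + 1) / (n + len(values[i]) + 1))
        log_scores[label] = score
    top = max(log_scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnormalized.values())
    return {c: u / z for c, u in unnormalized.items()}

def is_change(model, features, threshold):
    """Flag a change when no known class reaches the threshold (assumed rule)."""
    return max(posteriors(model, features).values()) < threshold

# Hypothetical toy data: two known classes with two categorical features each.
model = train([(("a", "x"), "C1")] * 3 + [(("b", "y"), "C2")] * 3)
print(is_change(model, ("a", "x"), threshold=0.8))
print(is_change(model, ("a", "y"), threshold=0.8))
```

Because the posteriors over m known classes sum to 1, the largest is always at least 1/m, so in this sketch the threshold only has an effect when it is set above 1/m.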
An improved Bayes-based spam filtering model (cited by: 2)
11
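Authors: 龚伟. 《微计算机信息》 (PKU Core), 2007, No. 3, pp. 104-106 (3 pages).
The article first analyzes how spam arises and introduces several common spam filtering techniques. Then, starting from the theoretical basis of naive Bayes and addressing the shortcomings of the spam filtering systems currently used in important commercial settings, it designs a new model that applies a multi-level mail policy. Comparative experiments show that the new model improves the recall and precision of the spam filtering system to some extent.
Keywords: spam; filtering; real-time blacklist; naive Bayes; mail grading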
Network anomaly classification fusing NBC and PNN
12
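Authors: 周明伟, 刘渊. 《计算机工程与应用》 (CSCD), 2013, No. 17, pp. 89-93 (5 pages).
Classifying network anomalies helps administrators manage networks better, but a single classifier gives uneven and incomplete classification performance across anomaly classes. To address this, after studying two algorithms commonly used for classification, the Probability Neural Network (PNN) and the Naive Bayes Classifier (NBC), a network anomaly classification model that fuses NBC and PNN is proposed. The model uses the classification accuracy of PNN and NBC on each class of network anomaly as weights, computes the probability that unknown traffic belongs to each class, and takes the class with the maximum probability as the prediction. The model was tested on the KDD99 data set, and the experimental results show that, compared with a single classifier using only PNN or NBC, the new model classifies the anomaly classes more evenly and with higher accuracy.
Keywords: network anomaly; probabilistic neural network; naive Bayes classifier; fusion; anomaly classification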
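The accuracy-weighted fusion described in the NBC and PNN abstract can be sketched as follows. The abstract does not give the exact combination rule, so the weighted sum used here is an assumption, as are the function name `fuse`, the class labels, and every number in the example.

```python
def fuse(p_nbc, p_pnn, acc_nbc, acc_pnn):
    """Combine two classifiers' class probabilities, weighting each by that
    classifier's per-class accuracy (an assumed weighted-sum rule), and return
    the predicted class together with the normalized fused probabilities."""
    scores = {c: acc_nbc[c] * p_nbc[c] + acc_pnn[c] * p_pnn[c] for c in p_nbc}
    total = sum(scores.values())
    fused = {c: s / total for c, s in scores.items()}
    return max(fused, key=fused.get), fused

# Hypothetical outputs of the two classifiers for one unknown traffic record.
p_nbc = {"dos": 0.6, "probe": 0.4}
p_pnn = {"dos": 0.3, "probe": 0.7}
# Hypothetical per-class accuracies measured on a validation set.
acc_nbc = {"dos": 0.9, "probe": 0.5}
acc_pnn = {"dos": 0.6, "probe": 0.95}

label, fused = fuse(p_nbc, p_pnn, acc_nbc, acc_pnn)
print(label, fused)
```

In this toy case NBC prefers `dos`, but PNN's higher accuracy on `probe` outweighs it, which is the balancing effect the abstract attributes to the fused model.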
Automatic Classification of Swedish Metadata Using Dewey Decimal Classification: A Comparison of Approaches (cited by: 2)
13
Authors: Koraljka Golub, Johan Hagelback, Anders Ardo. Journal of Data and Information Science (CSCD), 2020, No. 1, pp. 18-38 (21 pages).
Purpose: With more and more digital collections of various information resources becoming available, the challenge of assigning subject index terms and classes from quality knowledge organization systems is also increasing. While the ultimate purpose is to understand the value of automatically produced Dewey Decimal Classification (DDC) classes for Swedish digital collections, the paper aims to evaluate the performance of six machine learning algorithms as well as a string-matching algorithm based on characteristics of DDC. Design/methodology/approach: State-of-the-art machine learning algorithms require at least 1,000 training examples per class. The complete data set at the time of research involved 143,838 records, which had to be reduced to the top three hierarchical levels of DDC in order to provide sufficient training data (totaling 802 classes in the training and testing sample, out of 14,413 classes at all levels). Findings: Evaluation shows that Support Vector Machine with linear kernel outperforms other machine learning algorithms as well as the string-matching algorithm on average; the string-matching algorithm outperforms machine learning for specific classes when characteristics of DDC are most suitable for the task. Word embeddings combined with different types of neural networks (simple linear network, standard neural network, 1D convolutional neural network, and recurrent neural network) produced worse results than Support Vector Machine, but reach close results, with the benefit of a smaller representation size. Impact of features in machine learning shows that using keywords or combining titles and keywords gives better results than using only titles as input. Stemming only marginally improves the results. Removing stop-words reduced accuracy in most cases, while removing less frequent words increased it marginally. The greatest impact is produced by the number of training examples: 81.90% accuracy on the training set is achieved when at least 1,000 records per class are available in the training set, and 66.13% when too few records (often less than 100 per class) on which to train are available, and these hold only for the top 3 hierarchical levels (803 instead of 14,413 classes). Research limitations: Having to reduce the number of hierarchical levels to the top three levels of DDC because of the lack of training data for all classes skews the results so that they work in experimental conditions but barely for end users in operational retrieval systems. Practical implications: In conclusion, for operative information retrieval systems applying purely automatic DDC does not work, either using machine learning (because of the lack of training data for the large number of DDC classes) or using the string-matching algorithm (because DDC characteristics perform well for automatic classification only in a small number of classes). Over time, more training examples may become available, and DDC may be enriched with synonyms in order to enhance the accuracy of automatic classification, which may also benefit information retrieval performance based on DDC. In order for quality information services to reach the objective of highest possible precision and recall, automatic classification should never be implemented on its own; instead, machine-aided indexing that combines the efficiency of automatic suggestions with the quality of human decisions at the final stage should be the way for the future. Originality/value: The study explored machine learning on a large classification system of over 14,000 classes which is used in operational information retrieval systems. Due to the lack of sufficient training data across the entire set of classes, an approach complementing machine learning, that of string matching, was applied. This combination should be explored further since it provides the potential for real-life applications with large target classification systems.
Keywords: LIBRIS; Dewey Decimal Classification; automatic classification; machine learning; Support Vector Machine; multinomial naive Bayes; simple linear network; standard neural network; 1D convolutional neural network; recurrent neural network; word embeddings; string matching
Automatically Constructing an Effective Domain Ontology for Document Classification (cited by: 2)
14
Authors: Yi-Hsing Chang. Computer Technology and Application, 2011, No. 3, pp. 182-189 (8 pages).
This paper proposes an effective, automatically constructed domain ontology. The main concept is to use Formal Concept Analysis to establish the domain ontology automatically. The ontology then serves as the base for a Naive Bayes classifier in order to prove the effectiveness of the domain ontology for document classification. A set of 1752 documents divided into 10 categories is used to assess the effectiveness of the ontology, with 1252 training documents and 500 testing documents. The F1-measure is the assessment criterion, and the following three results are obtained. The average recall of the Naive Bayes classifier is 0.94, so in terms of recall the classifier performs excellently on the automatically constructed ontology. The average precision of the Naive Bayes classifier is 0.81, so in terms of precision the classifier performs well on the automatically constructed ontology. The average F1-measure over the 10 categories is 0.86, so the classifier is effective on the automatically constructed ontology in terms of F1-measure. Thus, the automatically constructed domain ontology can indeed serve as the document categories and is effective for document classification.
Keywords: naive Bayes classifier; ontology; formal concept analysis; document classification
Classification of epilepsy using computational intelligence techniques (cited by: 3)
15
Authors: Khurram I. Qazi, H.K. Lam, Bo Xiao, Gaoxiang Ouyang, Xunhe Yin. CAAI Transactions on Intelligence Technology, 2016, No. 2, pp. 137-149 (13 pages).
This paper deals with a real-life application of epilepsy classification, where three phases of absence seizure, namely pre-seizure, seizure and seizure-free, are classified using real clinical data. Artificial neural network (ANN) and support vector machines (SVMs) combined with supervised learning algorithms, and k-means clustering (k-MC) combined with unsupervised techniques are employed to classify the three seizure phases. Different techniques to combine binary SVMs, namely One Vs One (OvO), One Vs All (OVA) and Binary Decision Tree (BDT), are employed for multiclass classification. Comparisons are performed with two traditional classification methods, namely, k-Nearest Neighbour (k-NN) and Naive Bayes classifier. It is concluded that SVM-based classifiers outperform the traditional ones in terms of recognition accuracy and robustness property when the original clinical data is distorted with noise. Furthermore, SVM-based classifier with OvO provides the highest recognition accuracy, whereas ANN-based classifier overtakes by demonstrating maximum accuracy in the presence of noise.
Keywords: absence seizure; discrete wavelet transform; epilepsy classification; feature extraction; k-means clustering; k-nearest neighbours; naive Bayes; neural networks; support vector machines
Automatic classification of Chinese web pages based on incremental Bayes
16
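Authors: 高洁, 赵俊荣. 《电脑知识与技术》, 2006, No. 5, pp. 45-46, 68 (3 pages).
This paper proposes an incremental Bayes automatic classification algorithm based on unlabeled Chinese web pages; experimental results show that the algorithm is feasible and effective.
Keywords: Chinese web page classification; incremental learning; naive Bayes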
Real and fake news classification based on Naïve Bayes and TF-IDF
17
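Authors: 蔡扬, 付小斌. 《电脑知识与技术》, 2018, No. 2, pp. 184-186 (3 pages).
In the era of information explosion, a flood of news fills our daily lives and shapes how people judge events in society. Misleading fake news has a negative effect on society, so this paper proposes a method for classifying real and fake news. The method combines the TF-IDF algorithm with the naive Bayes algorithm: terms in the news are weighted, the naive Bayes classifier is then redefined, and the news is classified. Finally, multiple groups of experiments were run, and their average was taken as the final result of the experiment.
Keywords: real and fake news; TF-IDF; naive Bayes; classification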
Prediction of stiffness degradation of the spindle/tool-holder interface based on dual-signal fusion
18
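Authors: 吴石, 张勇, 王宇鹏, 王春风. 《中国机械工程》 (EI, CAS, CSCD, PKU Core), 2024, No. 8, pp. 1449-1461 (13 pages).
To predict the degree of stiffness degradation of the spindle/tool-holder interface, a prediction method based on the fusion of excitation and response signals is proposed. First, side-milling experiments on rectangular titanium-alloy workpieces are carried out; instantaneous milling force signals and response vibration signals near the spindle/tool-holder interface are collected to build a database that reflects the stiffness degradation of the interface. Then, from the time-domain, frequency-domain, and time-frequency-domain features of the instantaneous milling force and vibration signals in each direction, three features are selected by correlation analysis: the time-domain mean, the frequency-domain center frequency, and the first-order wavelet packet energy in the time-frequency domain. The selected feature matrix is processed by dual-channel convolution and pooling, using a low-frequency filter kernel and a high-frequency filter kernel, to obtain a deeply fused feature vector of the interface's stiffness degradation. Finally, the probability output of a support vector machine (SVM) model is converted into the conditional probability of a naive Bayes classifier (NBC) to build a hybrid classifier model (NBC-SVM), which improves classification performance. On the basis of the stiffness degradation database, the dual-channel convolution-pooling feature fusion method (CP-FF) and the NBC-SVM model predict the degree of stiffness degradation of the spindle/tool-holder interface with an accuracy of 96%.
Keywords: spindle/tool-holder interface; stiffness degradation; feature fusion; naive Bayes classifier; support vector machine model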
An improved naive Bayes method for intelligent triage (cited by: 1)
19
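Authors: 鲍琪琪, 孙超仁. 《现代医院》, 2024, No. 3, pp. 424-427 (4 pages).
When the naive Bayesian model (NBM) classification method is applied to intelligent outpatient triage, it cannot effectively account for the fact that different types of symptoms span different ranges of medical specialties. An improved naive Bayes classification algorithm is therefore proposed that introduces an IDF factor to give each symptom type a corresponding weight. First, diagnostics-related corpora are collected from authoritative medical literature as the training data set. Then prior probabilities and class-conditional probabilities are computed with the naive Bayes classification method, and IDF factors for the different symptoms are generated in training. Finally, at classification time the IDF factors are applied to the symptom combination to balance the importance of different symptom types. In a comparison of triage accuracy, the recall of the improved algorithm is about 11% higher, clearly above that of the naive Bayes classification method.
Keywords: intelligent triage; naive Bayes; IDF; multi-class classification; supervised learning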
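The idea of weighting symptoms by an IDF factor inside naive Bayes, as outlined in the triage abstract, can be sketched like this. The abstract does not state the formula, so everything here is an assumption: IDF is computed over departments (a symptom seen in fewer departments gets a larger weight), a `+ 1.0` offset keeps ubiquitous symptoms from being ignored, the weight multiplies each symptom's log-likelihood, and the names `train` and `classify` and the toy records are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(records):
    """records: list of (symptoms, department) pairs."""
    dept_counts = Counter(dept for _, dept in records)
    symptom_counts = defaultdict(Counter)  # department -> records containing symptom
    for symptoms, dept in records:
        symptom_counts[dept].update(set(symptoms))
    vocab = {s for counts in symptom_counts.values() for s in counts}
    n_depts = len(dept_counts)
    idf = {}
    for s in vocab:
        # Number of departments in which the symptom occurs at least once.
        df = sum(1 for d in dept_counts if symptom_counts[d][s] > 0)
        idf[s] = math.log(n_depts / df) + 1.0  # assumed smoothing offset
    return dept_counts, symptom_counts, idf

def classify(model, symptoms):
    """Return the department with the highest IDF-weighted log score."""
    dept_counts, symptom_counts, idf = model
    total = sum(dept_counts.values())
    best, best_score = None, -math.inf
    for dept, n in dept_counts.items():
        score = math.log(n / total)  # log prior
        for s in symptoms:
            if s not in idf:
                continue  # symptom never seen in training: skip it
            # Class-conditional probability with Laplace smoothing.
            p = (symptom_counts[dept][s] + 1) / (n + 2)
            score += idf[s] * math.log(p)
        if score > best_score:
            best, best_score = dept, score
    return best

# Hypothetical toy training records.
records = (
    [(["fever", "cough"], "respiratory")] * 2
    + [(["fever", "chest pain"], "cardiology")] * 2
)
model = train(records)
print(classify(model, ["fever", "cough"]))
```

In the toy data `fever` occurs in both departments and receives the minimum weight, while `cough` and `chest pain` each occur in one department and weigh more, so the department-specific symptom drives the decision.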
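A redundancy removal method for land-use spatiotemporal change information bases based on an extended Bayesian classification algorithm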
20
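Authors: 陈少玲, 吴继万. 《工程勘察》, 2024, No. 12, pp. 69-74 (6 pages).
This paper proposes a redundancy removal method for land-use spatiotemporal change information bases based on an extended Bayesian classification algorithm. Land-use images are acquired by remote sensing and, combined with manual visual interpretation, indicators of spatiotemporal land-use change are analyzed to build the information base. A naive Bayes classification algorithm based on attribute correlation reduces the information attributes and removes redundant ones, thereby removing redundancy from the information base. Extended arcs are added between parent and child nodes of different classes to form the extended Bayesian classification algorithm, which improves the accuracy of redundancy removal. Experimental results show that the method effectively removes redundant information from the land-use spatiotemporal change information base, saves storage overhead, provides complete land-use information, and has good classification performance.
Keywords: extended Bayes; classification algorithm; land use; spatiotemporal change; information base redundancy removal; naive Bayes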