988 articles found
Mobile SMS Spam Filtering for Nepali Text Using Naive Bayesian and Support Vector Machine (Cited by: 2)
1
Authors: Tej Bahadur Shahi, Abhimanu Yadav. 《International Journal of Intelligence Science》, 2014, No. 1, pp. 24-28 (5 pages)
Spam is a universal problem with which everyone is familiar. A number of approaches are used for spam filtering. The most common is content-based filtering, which uses the actual text of a message to determine whether it is spam. The content is very dynamic, and it is challenging to represent all of the information in a mathematical classification model. For instance, in content-based spam filtering, the characteristics that the filter uses to identify spam messages change constantly over time. The Naïve Bayes method represents the changing nature of messages using probability theory, while the support vector machine (SVM) represents it using different features. These two classification methods are efficient in different domains, but Nepali SMS or text classification has not yet been considered, so it is interesting to find out how both methods perform on Nepali text classification. In this paper, Naïve Bayes and SVM-based classification techniques are implemented to classify Nepali SMS messages as spam or non-spam. An empirical analysis over various text cases is carried out to evaluate the accuracy of the classification methods used in this study. The accuracy is found to be 87.15% for SVM and 92.74% for Naïve Bayes.
Keywords: SMS Spam Filtering; Classification; Support Vector Machine; Naive Bayes; Preprocessing; Feature Extraction; Nepali SMS Datasets
DDoS Attack Detection Using Heuristics Clustering Algorithm and Naive Bayes Classification
2
Authors: Sharmila Bista, Roshan Chitrakar. 《Journal of Information Security》, 2018, No. 1, pp. 33-44 (12 pages)
Among the multitude of attacks on network systems, DDoS attacks have recently emerged as those with the most devastating effects. The main objective of this paper is to propose a system that effectively detects DDoS attacks in any networked system using a data-mining clustering technique followed by classification. The method uses a Heuristics Clustering Algorithm (HCA) to cluster the available data and Naïve Bayes (NB) classification to classify the data and detect attacks based on network attributes of the data packets. The clustering algorithm is an unsupervised learning technique and sometimes fails to identify some attack instances and a few normal instances correctly; classification is therefore used alongside clustering to overcome this problem and to improve accuracy. Naïve Bayes classifiers rest on very strong independence assumptions and have a fairly simple construction for deriving the conditional probability of each relationship. A series of experiments is performed using "The CAIDA UCSD DDoS Attack 2007 Dataset" and the "DARPA 2000 Dataset", and the proposed system is evaluated on three performance parameters: accuracy, detection rate, and false positive rate. The results show that the proposed system achieves improved accuracy and detection rate with a low false positive rate.
Keywords: DDoS Attacks; Heuristic Clustering Algorithm; Naive Bayes Classification; CAIDA UCSD; DARPA 2000
Ensemble Variable Selection for Naive Bayes to Improve Customer Behaviour Analysis
3
Authors: R. Siva Subramanian, D. Prabha. 《Computer Systems Science & Engineering》, SCIE EI, 2022, No. 4, pp. 339-355 (17 pages)
Systematic customer analysis is one way for an enterprise to understand consumer behavior patterns efficiently and in depth. Further investigation of customer patterns helps the firm make effective decisions, which in turn helps to optimize the enterprise's business and maximize consumer satisfaction. To assess customers effectively, Naive Bayes (also called Simple Bayes), a machine learning model, is used. However, the efficacy of the simple Bayes model depends entirely on the consumer data used, and uncertain and redundant attributes in that data lead the model to poor predictions because of its presumption regarding the attributes. In practice, the NB premise does not hold for consumer data, and analyzing these redundant attributes gives the simple Bayes model poor prediction results. In this work, an ensemble attribute selection methodology is applied to overcome this problem and to pick a stable, uncorrelated attribute set for modeling with the NB classifier. Two ensemble variable selection strategies are applied: one based on data perturbation (a homogeneous ensemble, in which the same feature selector is applied to different subsamples derived from the same learning set) and one based on function perturbation (a heterogeneous ensemble, in which different feature selectors are applied to the same learning set). The feature sets obtained from both ensemble strategies are then applied to NB individually and the outcomes are computed. The experimental results show that the proposed ensemble strategies are effective in choosing a stable attribute set and in increasing NB classification performance.
Keywords: Naive Bayes or simple Bayes; variable selection; homogeneous ensemble; heterogeneous ensemble; customer prediction
Situation assessment for air combat based on novel semi-supervised naive Bayes (Cited by: 15)
4
Authors: XU Ximeng, YANG Rennong, FU Ying. 《Journal of Systems Engineering and Electronics》, SCIE EI CSCD, 2018, No. 4, pp. 768-779 (12 pages)
A method is proposed to resolve the typical problem of air combat situation assessment. Taking one-to-one air combat as an example, and using air combat data recorded by the air combat maneuvering instrument, the problem of air combat situation assessment is treated as equivalent to a situation classification problem on air combat data. The fuzzy C-means clustering algorithm is used to cluster the selected air combat sample data, and the situation class of the data is determined by data correlation analysis combined with the clustering results and the pilots' description of the air combat process. On the basis of the semi-supervised naive Bayes classifier, an improved algorithm based on data classification confidence is proposed and used to classify the situation of air combat data. The simulation results show that the improved algorithm can assess the air combat situation effectively and that the improvement raises classification performance without significantly affecting the efficiency of the classifier.
Keywords: air combat situation assessment; air combat maneuvering instrument; semi-supervised naive Bayes
Parallel naive Bayes algorithm for large-scale Chinese text classification based on spark (Cited by: 22)
5
Authors: LIU Peng, ZHAO Hui-han, TENG Jia-yu, YANG Yan-yan, LIU Ya-feng, ZHU Zong-wei. 《Journal of Central South University》, SCIE EI CAS CSCD, 2019, No. 1, pp. 1-12 (12 pages)
The sharp increase in the amount of Chinese text data on the Internet has significantly prolonged the processing time needed to classify it. To solve this problem, this paper proposes and implements a parallel naive Bayes algorithm (PNBA) for Chinese text classification based on Spark, a parallel in-memory computing platform for big data. The algorithm parallelizes the entire training and prediction process of the naive Bayes classifier, mainly by adopting the resilient distributed dataset (RDD) programming model. For comparison, a PNBA based on Hadoop is also implemented. The test results show that, in the same computing environment and for the same text sets, the Spark PNBA is clearly superior to the Hadoop PNBA on key indicators such as speedup ratio and scalability. Spark-based parallel algorithms can therefore better meet the requirements of large-scale Chinese text data mining.
Keywords: Chinese text classification; naive Bayes; Spark; Hadoop; resilient distributed dataset; parallelization
An Ensemble-Based Hotel Reviews System Using Naive Bayes Classifier
6
Authors: Joseph Bamidele Awotunde, Sanjay Misra, Vikash Katta, Oluwafemi Charles Adebayo. 《Computer Modeling in Engineering & Sciences》, SCIE EI, 2023, No. 10, pp. 131-154 (24 pages)
The task of classifying opinions conveyed in any form of online text is referred to as sentiment analysis. The emergence and spread of social media has made room for sentiment analysis in our daily lives. Social media applications and websites have become the foremost source of data for sentiment analysis of reviews in various fields. A wide range of subject matter appears on social media platforms, such as movie and product reviews, consumer opinions, and testimonies, all of which can be used for sentiment analysis. Rapidly uncovering this web content brings many benefits, of which profit-making is one of the most important. According to a recent study, 81% of consumers conduct online research prior to making a purchase. But the reviews available online are too numerous for humans to process and analyze. Hence, machine learning classifiers are among the prominent tools used to classify sentiment in order to obtain valuable information for companies such as hotels and game companies. Understanding people's sentiments towards different commodities helps to improve services for contextual promotions, referral systems, and market research. This study therefore proposes a sentiment-based detection framework to enable the rapid uncovering of opinionated content in hotel reviews. A Naive Bayes classifier was used to process and analyze the dataset and detect the polarity of the words. The dataset, from Datafiniti's Business Database and obtained from Kaggle, was used for the experiments in this study. The performance evaluation of the model shows a test accuracy of 96.08%, an F1-score of 96.00%, a precision of 96.00%, and a recall of 96.00%. The results were compared with state-of-the-art classifiers and showed promising and much better performance in terms of the performance metrics.
Keywords: sentiment analysis; hotel reviews; naive Bayes algorithm; consumer opinions; web 2.0; machine learning
A Feature Weighted Mixed Naive Bayes Model for Monitoring Anomalies in the Fan System of a Thermal Power Plant (Cited by: 5)
7
Authors: Min Wang, Li Sheng, Donghua Zhou, Maoyin Chen. 《IEEE/CAA Journal of Automatica Sinica》, SCIE EI CSCD, 2022, No. 4, pp. 719-727 (9 pages)
With increasing intelligence and integration, a great number of two-valued variables (generally stored as 0 or 1) often exist in large-scale industrial processes. However, these variables cannot be handled effectively by traditional monitoring methods such as linear discriminant analysis (LDA), principal component analysis (PCA), and partial least squares (PLS) analysis. Recently, a mixed hidden naive Bayesian model (MHNBM) was developed for the first time to use both two-valued and continuous variables for abnormality monitoring. Although the MHNBM is effective, it still has some shortcomings. In the MHNBM, variables that are more strongly correlated with other variables receive greater weights, which cannot guarantee that greater weights are assigned to the more discriminating variables. In addition, the conditional probability P(x_j | x_j′, y = k) must be computed from historical data. When training data is scarce, the conditional probability between continuous variables tends to be uniformly distributed, which degrades the performance of the MHNBM. Here a novel feature weighted mixed naive Bayes model (FWMNBM) is developed to overcome these shortcomings. In the FWMNBM, variables that are more strongly correlated with the class receive greater weights, so the more discriminating variables contribute more to the model. At the same time, the FWMNBM does not have to calculate the conditional probability between variables, so it is less restricted by the number of training samples. Compared with the MHNBM, the FWMNBM performs better, and its effectiveness is validated through numerical cases from a simulation example and a practical case from the Zhoushan thermal power plant (ZTPP), China.
Keywords: Abnormality monitoring; continuous variables; feature weighted mixed naive Bayes model (FWMNBM); two-valued variables; thermal power plant
Social Network Rumor Recognition Based on Enhanced Naive Bayes (Cited by: 1)
8
Author: Lei Guo. 《Journal of New Media》, 2021, No. 3, pp. 99-107 (9 pages)
In recent years, with the increasing popularity of social networks, rumors have become more common. At present, rumors in social networks are handled mainly through media censorship and manual reporting, but this approach requires a lot of manpower and material resources, and its cost is relatively high. Research on the characteristics of rumors and on the automatic identification and classification of network message text is therefore of great significance. This paper uses the Naive Bayes algorithm combined with Laplacian smoothing to identify rumors in social network texts. The text is first segmented into words, and stop words are removed after segmentation. Because Naive Bayes is sensitive to its data, this paper preprocesses the input text. A naive Bayes classifier is then constructed, and Laplacian smoothing is introduced to solve the zero-probability problem that arises when the naive Bayes model is used for rumor recognition. Finally, experiments show that the Naive Bayes algorithm combined with Laplace smoothing can effectively improve the accuracy of rumor recognition.
Keywords: Rumor recognition; social network; machine learning; naive Bayes; Laplacian smoothing
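The zero-probability issue mentioned in the abstract above can be illustrated with a minimal sketch of multinomial naive Bayes with add-one (Laplace) smoothing. This is not the paper's implementation: the function names `train_nb` and `predict_nb`, the toy documents, and the labels are all hypothetical, and the input is assumed to be already segmented and stripped of stop words.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Fit multinomial naive Bayes with add-alpha (Laplace) smoothing.

    docs: list of token lists; labels: list of class names (hypothetical data).
    """
    vocab = {w for doc in docs for w in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        word_counts[y].update(doc)
    log_prior = {y: math.log(n / len(labels)) for y, n in class_counts.items()}
    log_lik = {}
    for y in class_counts:
        total = sum(word_counts[y].values())
        # Adding alpha to every count keeps unseen (word, class) pairs
        # from getting probability zero.
        denom = total + alpha * len(vocab)
        log_lik[y] = {w: math.log((word_counts[y][w] + alpha) / denom)
                      for w in vocab}
    return log_prior, log_lik, vocab

def predict_nb(model, doc):
    """Return the class with the highest log posterior; unknown words are skipped."""
    log_prior, log_lik, vocab = model
    scores = {y: log_prior[y] + sum(log_lik[y][w] for w in doc if w in vocab)
              for y in log_prior}
    return max(scores, key=scores.get)

# Toy, made-up training data.
docs = [["free", "prize", "click"], ["meeting", "tomorrow"],
        ["free", "click", "now"], ["lunch", "tomorrow", "meeting"]]
labels = ["rumor", "normal", "rumor", "normal"]
model = train_nb(docs, labels)
print(predict_nb(model, ["free", "meeting", "click"]))
```

Without smoothing, the word "free" never occurs in the "normal" class, so its likelihood there would be zero and would wipe out the whole product; with alpha = 1 it receives a small positive probability instead.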
Research on Database User Behavior Anomaly Detection Based on K-means and Naive Bayes (Cited by: 8)
9
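Authors: 王旭仁, 冯安然, 何发镁, 马慧珍, 杨杰. 《计算机应用研究》, CSCD, PKU Core, 2020, No. 4, pp. 1128-1131 (4 pages)
To address database leakage caused by abnormal database user behavior, this paper proposes a database user anomaly detection method based on the K-means and naive Bayes algorithms. First, users are grouped by K-means clustering, using the query statements and query results recorded in historical database audit logs. Then, a user anomaly detection model is built with the naive Bayes classification algorithm. Compared with a model built with naive Bayes alone, the method simplifies the representation of user behavior profiles during data preprocessing, reduces computational redundancy, and cuts training time by 81%; using the user groups obtained by K-means clustering raises detection precision by 7.06% and the F1 score by 3.33%. Experiments show that the proposed method greatly reduces training time and achieves good detection performance.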
Keywords: database; user behavior; anomaly detection; K-means clustering; naive Bayes classification algorithm
Spam Filtering: Online Naive Bayes Based on TONE (Cited by: 1)
10
Authors: Guanglu Sun, Hongyue Sun, Yingcai Ma, Yuewu Shen. 《ZTE Communications》, 2013, No. 2, pp. 51-54 (4 pages)
The naive Bayes (NB) model has been successfully used to tackle spam and is very accurate. However, there is still room for improvement. We use a train on or near error (TONE) method in online NB to enhance the performance of NB and reduce the number of training emails. We conducted an experiment to determine the performance of the improved algorithm by plotting (1-ROCA)% curves. The results show that the proposed method improves the performance of the original NB.
Keywords: spam filtering; online naive Bayes; train on or near error
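As a rough illustration of the train on or near error idea named in the abstract above, the sketch below decides whether an online filter should update on a message: it trains when the prediction is wrong, or when the score falls within a margin of the decision threshold. The function name `should_train`, the threshold of 0.5, and the margin of 0.1 are assumptions made for illustration and are not taken from the paper.

```python
def should_train(p_spam, is_spam, threshold=0.5, margin=0.1):
    """Return True if an online filter should train on this message.

    p_spam: the filter's current spam score for the message, in [0, 1].
    is_spam: the true label supplied by user feedback.
    threshold, margin: hypothetical values chosen only for this sketch.
    """
    predicted_spam = p_spam >= threshold
    is_error = predicted_spam != is_spam          # train on error
    is_near = abs(p_spam - threshold) < margin    # train near the threshold
    return is_error or is_near

# A confident, correct prediction is skipped, which is how the number of
# training emails is reduced.
print(should_train(0.95, True))   # confident and correct
print(should_train(0.55, True))   # correct but close to the threshold
print(should_train(0.20, True))   # misclassified
```

Only the second and third messages would trigger a model update under these assumed settings.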
Application of Minimum-Risk Naive Bayes in an Anti-Spam System (Cited by: 3)
11
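Authors: 张健, 陈拓, 韩益亮, 畅雄杰, 李彩霞, 潘峰. 《微电子学与计算机》, CSCD, PKU Core, 2005, No. 12, pp. 139-141 (3 pages)
This paper proposes an improved scheme for filtering Chinese spam email. The scheme combines rule-based filtering, which exploits regularities in spam, with a minimum-risk Naive Bayes content-filtering algorithm, and makes the modifications needed to suit the characteristics of spam. Most of the scheme's functionality has been implemented in software on Linux/Solaris platforms. Tests on a production email server show that the scheme achieves good filtering results. Minimum-risk Naive Bayes is currently one of the most important anti-spam techniques.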
Keywords: spam email; rule-based filtering; naive Bayes
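The minimum-risk decision rule referred to above can be sketched as follows, under the common assumption that wrongly blocking a legitimate email costs more than letting a spam email through. The function name `min_risk_label` and the cost values (9 and 1) are hypothetical; the abstract does not state the costs the authors used.

```python
def min_risk_label(p_spam, cost_false_positive=9.0, cost_false_negative=1.0):
    """Pick the label with the lower expected loss.

    p_spam: posterior probability of spam from a naive Bayes model.
    cost_false_positive: assumed loss for labeling a legitimate email as spam.
    cost_false_negative: assumed loss for labeling a spam email as legitimate.
    """
    risk_if_spam = cost_false_positive * (1.0 - p_spam)  # wrong when email is legitimate
    risk_if_ham = cost_false_negative * p_spam           # wrong when email is spam
    return "spam" if risk_if_spam < risk_if_ham else "ham"

print(min_risk_label(0.80))            # below the implied 0.9 threshold
print(min_risk_label(0.95))            # above the implied 0.9 threshold
print(min_risk_label(0.80, 1.0, 1.0))  # equal costs reduce to the usual 0.5 rule
```

With these costs, an email is labeled spam only when its posterior exceeds 9 / (9 + 1) = 0.9, which trades some missed spam for fewer blocked legitimate emails.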
CLIF_NB: A Text Classification Learning Method Based on Naive Bayes (Cited by: 1)
12
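Authors: 刘丽珍, 宋瀚涛, 陆玉昌. 《小型微型计算机系统》, CSCD, PKU Core, 2005, No. 9, pp. 1575-1577 (3 pages)
Because the conditional independence assumption of Naive Bayes is often violated in practice, this paper proposes the CLIF-NB text classification learning method. Using mutual information theory, the method computes the maximum correlation probability between feature attributes and replaces linearly inseparable attributes with combinations of variable sets, which relaxes the conditional independence assumption. By learning a series of classifiers, it reduces the classification errors on the training set and combines them into a CLIF-NB classifier with higher classification accuracy.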
Keywords: text classification; naive Bayes; conditional independence assumption
A Text Classifier Based on Fuzzy Clustering and the Naive Bayes Method (Cited by: 1)
13
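Authors: 杨岳湘, 田艳芳, 王韶红. 《计算机工程与科学》, CSCD, 2002, No. 5, pp. 18-21 (4 pages)
This paper proposes a new text classification method that combines fuzzy clustering with a Naive Bayes-based EM classification algorithm. The combination greatly improves the accuracy of the EM classification algorithm and resolves the incompleteness and inaccuracy caused by character matching. The method first specifies some keywords for each class and uses these keywords as cluster centers for clustering.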
Keywords: fuzzy clustering; naive Bayes method; text classifier; cluster centers; neural network
Research on Chinese Personal Name Recognition Based on Naive Bayes (Cited by: 2)
14
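Authors: 曾辉, 王俊, 熊李艳. 《科学技术与工程》, PKU Core, 2015, No. 6, pp. 83-86, 98 (5 pages)
Building on the traditional Naive Bayes classification algorithm, which counts only the characters used in personal names, this work incorporates the context boundaries of names. It uses the frequencies of name characters and boundary templates gathered from a large-scale corpus to delimit names, and then recalls missed names through a diffusion operation. The method is simple to apply and achieves good results. Experiments show that its F-score reaches 93.28%.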
Keywords: naive Bayes; classification algorithm; boundary templates; personal name recognition
Automatic Gender Identification of Tibetan Personal Names Based on Naive Bayes (Cited by: 2)
15
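Authors: 贡保才让, 色差甲, 慈祯嘉措, 桑杰端珠, 才让加. 《青海师范大学学报(自然科学版)》, 2017, No. 4, pp. 11-15 (5 pages)
Tibetan anaphora resolution is an important and difficult part of Tibetan information processing. This paper uses a Naive Bayes model to identify the gender of Tibetan personal names automatically, in support of personal pronoun resolution. Based on the structure and syllable (character) information of names, the Naive Bayes model is trained and then tested on open data consisting of 3,463 Tibetan personal names; the accuracy on the combined male and female names reaches 99.31%.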
Keywords: Tibetan personal names; machine learning; naive Bayes; automatic identification
Research on Soybean Disease Diagnosis Based on the Naive Bayes Algorithm
16
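Authors: 时雷, 虎晓红, 席磊. 《安徽农业科学》, CAS, PKU Core, 2009, No. 11, pp. 5320, 5323 (2 pages)
This paper introduces the basic theory of the Naive Bayes algorithm. Using the soybean data set from the UCI repository as an example, it studies the application of Naive Bayes to soybean disease diagnosis. Experimental results show that the prediction accuracy of Naive Bayes is higher than that of the C4.5 decision tree algorithm and the nearest-neighbor (1NN) algorithm.
Keywords: naive Bayes; soybean; disease diagnosis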
Predicting Binding Residues of Metal Ion Ligands with a Naive Bayes Classifier Based on Twenty-Dimensional Combined Features
17
Authors: 刘柳, 张晓瑾, 胡秀珍, 王珊. 《内蒙古工业大学学报(自然科学版)》, 2018, No. 5, pp. 325-331 (7 pages)
Interactions between proteins and metal ion ligands play a very important role in life processes. Predicting the binding residues of metal ion ligands is important for understanding cellular mechanisms and designing molecular drugs. This paper uses a Naive Bayes classifier to predict the binding residues of ten metal ion ligands, Zn²⁺, Fe²⁺, Fe³⁺, Cu²⁺, Mn²⁺, Co²⁺, Ca²⁺, Mg²⁺, Na⁺ and K⁺, and obtains good prediction results under five-fold cross-validation.
Keywords: naive Bayes classifier; metal ion ligands; binding residues; feature parameters
Bayesian and hierarchical Bayesian analysis of response-time data with concomitant variables
18
Author: Dinesh Kumar. 《Journal of Biomedical Science and Engineering》, 2010, No. 7, pp. 711-718 (8 pages)
This paper considers the Bayes and hierarchical Bayes approaches for analyzing clinical data on response times when values are available for one or more concomitant variables. Response times are assumed to follow simple exponential distributions, with a different parameter for each patient. The analyses are carried out for the case of progressive censoring, assuming a squared error loss function and gamma distributions as priors and hyperpriors. The possibility of using the methodology in more general situations, such as dose-response modeling, is also explored. The Bayesian estimators derived in this paper are applied to a lung cancer data set with concomitant variables.
Keywords: Bayes Estimator; Bayesian Posterior Density; Gamma Prior Density (GPD); Hierarchical Bayes Estimator; Hyperprior; Noninformative Prior Quasi-Density (NPQD); Progressive Censoring; Squared Error Loss Function (SELF); Whittaker Function W s1 s2 (.)
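For orientation, the textbook conjugate case behind this kind of analysis can be written out as below. This is a simplified sketch, not the paper's model: it assumes a single common rate θ for all response times (the paper uses a different parameter per patient and concomitant variables), and the symbols d (number of observed failures), T (total time on test under censoring), and the gamma prior parameters a (shape) and b (rate) are notation introduced here.

```latex
% Exponential likelihood under censoring: d observed failures, total time on test T
L(\theta) \propto \theta^{d} e^{-\theta T}

% Gamma prior with shape a and rate b
\pi(\theta) \propto \theta^{a-1} e^{-b\theta}

% Posterior is again gamma, with updated shape and rate
\pi(\theta \mid \text{data}) \propto \theta^{a+d-1} e^{-(b+T)\theta}
\quad\Longrightarrow\quad
\theta \mid \text{data} \sim \mathrm{Gamma}(a+d,\; b+T)

% Under squared error loss the Bayes estimator is the posterior mean
\hat{\theta}_{\mathrm{Bayes}} = \mathbb{E}[\theta \mid \text{data}] = \frac{a+d}{b+T}
```

The estimator shrinks the maximum likelihood estimate d/T toward the prior mean a/b, with the prior acting like a pseudo-observations of a failures over b units of time.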
Application of an Improved Naive Bayes Technique in an Anti-Spam System (Cited by: 6)
19
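Authors: 段宏斌, 张健. 《西北大学学报(自然科学版)》, CAS, CSCD, PKU Core, 2006, No. 5, pp. 737-740 (4 pages)
Objective: To simplify and improve a Chinese anti-spam system. Methods: The Naive Bayes algorithm is improved and combined with rule-based filtering that exploits regularities in spam; the scheme is implemented in software on Linux/Solaris platforms. Results: The scheme was tested on an email server, and the results show that it achieves good filtering performance. Conclusion: The improved Naive Bayes technique can increase the speed and efficiency of anti-spam filters.
Keywords: rule-based filtering; content filtering; improved naive Bayes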
A Naive Bayes-Based Uyghur Text Classification Algorithm and Its Performance Analysis (Cited by: 7)
20
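Authors: 艾海麦提江.阿布来提, 吐尔地.托合提, 艾斯卡尔.艾木都拉. 《计算机应用与软件》, CSCD, PKU Core, 2012, No. 12, pp. 27-29 (3 pages)
Against the background of research on automatic classification of large-scale Uyghur web text, a modular Uyghur text classification system is designed. After a thorough survey, the Naive Bayes algorithm is chosen as the classification engine, and the system is implemented in C#. In preprocessing, stemming is introduced in light of the morphological characteristics of Uyghur, which greatly reduces the feature dimensionality. Classification results are reported on a fairly large text corpus of more than 3,000 documents in 10 categories, and further results are reported for different numbers of features selected with the χ² statistic. The results indicate that only 1%-3% of the features in the preprocessed Uyghur feature space are optimal, so it is possible to identify the best features more precisely or to reduce the dimensionality of the feature space further.
Keywords: Uyghur; text classification; naive Bayes; stemming; stop words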