期刊文献+
共找到115篇文章
< 1 2 6 >
每页显示 20 50 100
Pruned fuzzy K-nearest neighbor classifier for beat classification 被引量:2
1
作者 Muhammad Arif Muhammad Usman Akram Fayyaz-ul-Afsar Amir Minhas 《Journal of Biomedical Science and Engineering》 2010年第4期380-389,共10页
Arrhythmia beat classification is an active area of research in ECG based clinical decision support systems. In this paper, Pruned Fuzzy K-nearest neighbor (PFKNN) classifier is proposed to classify six types of beats... Arrhythmia beat classification is an active area of research in ECG based clinical decision support systems. In this paper, Pruned Fuzzy K-nearest neighbor (PFKNN) classifier is proposed to classify six types of beats present in the MIT-BIH Arrhythmia database. We have tested our classifier on ~ 103100 beats for six beat types present in the database. Fuzzy KNN (FKNN) can be implemented very easily but large number of training examples used for classification can be very time consuming and requires large storage space. Hence, we have proposed a time efficient Arif-Fayyaz pruning algorithm especially suitable for FKNN which can maintain good classification accuracy with appropriate retained ratio of training data. By using Arif-Fayyaz pruning algorithm with Fuzzy KNN, we have achieved a beat classification accuracy of 97% and geometric mean of sensitivity of 94.5% with only 19% of the total training examples. The accuracy and sensitivity is comparable to FKNN when all the training data is used. Principal Component Analysis is used to further reduce the dimension of feature space from eleven to six without compromising the accuracy and sensitivity. PFKNN was found to robust against noise present in the ECG data. 展开更多
关键词 ARRHYTHMIA ECG k-nearest neighbor PRUNING FUZZY classification
下载PDF
Enhancing Cancer Classification through a Hybrid Bio-Inspired Evolutionary Algorithm for Biomarker Gene Selection 被引量:1
2
作者 Hala AlShamlan Halah AlMazrua 《Computers, Materials & Continua》 SCIE EI 2024年第4期675-694,共20页
In this study,our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization(GWO)with Harris Hawks Optimization(HHO)for feature selec... In this study,our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization(GWO)with Harris Hawks Optimization(HHO)for feature selection.Themotivation for utilizingGWOandHHOstems fromtheir bio-inspired nature and their demonstrated success in optimization problems.We aimto leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification.We selected leave-one-out cross-validation(LOOCV)to evaluate the performance of both two widely used classifiers,k-nearest neighbors(KNN)and support vector machine(SVM),on high-dimensional cancer microarray data.The proposed method is extensively tested on six publicly available cancer microarray datasets,and a comprehensive comparison with recently published methods is conducted.Our hybrid algorithm demonstrates its effectiveness in improving classification performance,Surpassing alternative approaches in terms of precision.The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification,thereby advancing the development ofmore efficient treatment strategies.The proposed hybridmethod offers a promising solution to the gene selection problem in microarray-based cancer classification.It improves the accuracy and efficiency of cancer diagnosis and treatment,and its superior performance compared to other methods highlights its potential applicability in realworld cancer classification tasks.By harnessing the complementary search mechanisms of GWO and HHO,we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment. 展开更多
关键词 Bio-inspired algorithms BIOINFORMATICS cancer classification evolutionary algorithm feature selection gene expression grey wolf optimizer harris hawks optimization k-nearest neighbor support vector machine
下载PDF
Characteristics,classification and KNN-based evaluation of paleokarst carbonate reservoirs:A case study of Feixianguan Formation in northeastern Sichuan Basin,China
3
作者 Yang Ren Wei Wei +3 位作者 Peng Zhu Xiuming Zhang Keyong Chen Yisheng Liu 《Energy Geoscience》 2023年第3期113-126,共14页
The Feixianguan Formation reservoirs in northeastern Sichuan are mainly a suite of carbonate platform deposits.The reservoir types are diverse with high heterogeneity and complex genetic mechanisms.Pores,vugs and frac... The Feixianguan Formation reservoirs in northeastern Sichuan are mainly a suite of carbonate platform deposits.The reservoir types are diverse with high heterogeneity and complex genetic mechanisms.Pores,vugs and fractures of different genetic mechanisms and scales are often developed in association,and it is difficult to classify reservoir types merely based on static data such as outcrop observation,and cores and logging data.In the study,the reservoirs in the Feixianguan Formation are grouped into five types by combining dynamic and static data,that is,karst breccia-residual vuggy type,solution-enhanced vuggy type,fractured-vuggy type,fractured type and matrix type(non-reservoir).Based on conventional logging data,core data and formation microscanner image(FMI)data of the Qilibei block,northeastern Sichuan Basin,the reservoirs are classified in accordance with fracture-vug matching relationship.Based on the principle of cluster analysis,K-Nearest Neighbor(KNN)classification templates are established,and the applicability of the model is verified by using the reservoir data from wells uninvolved in modeling.Following the analysis of the results of reservoir type discrimination and the production of corresponding reservoir intervals,the contributions of various reservoir types to production are evaluated and the reliability of reservoir type classification is verified.The results show that the solution-enhanced vuggy type is of high-quality sweet spot reservoir in the study area with good physical property and high gas production,followed by the fractured-vuggy type,and the fractured and karst breccia-residual vuggy types are the least promising. 展开更多
关键词 Carbonate reservoir Reservoir type Cluster analysis k-nearest neighbor(knn) Feixianguan Formation Sichuan basin
下载PDF
Using Deep Learning for Soybean Pest and Disease Classification in Farmland 被引量:3
4
作者 Si Meng-min Deng Ming-hui Han Ye 《Journal of Northeast Agricultural University(English Edition)》 CAS 2019年第1期64-72,共9页
To accurately identify soybean pests and diseases, in this paper, a kind of deep convolution network model was used to determine whether or not a soybean crop possessed pests and diseases. The proposed deep convolutio... To accurately identify soybean pests and diseases, in this paper, a kind of deep convolution network model was used to determine whether or not a soybean crop possessed pests and diseases. The proposed deep convolution network could learn the highdimensional feature representation of images by using their depth. An inception module was used to construct a neural network. In the inception module, multiscale convolution kernels were used to extract the distributed characteristics of soybean pests and diseases at different scales and to perform cascade fusion. The model then trained the SoftMax classifier in a uniformed framework. This realized the model of soybean pests and diseases so as to verify the effectiveness of this method. In this study, 800 images of soybean leaf images were taken as the experimental objects. Of these 800 images, 400 were selected for network training, and the remaining 400 images were used for the network test. Furthermore, the classical convolutional neural network was optimized. The accuracies before and after optimization were 96.25% and 95.81%, respectively, in terms of extracting image features. This type of research might be applied to achieve a degree of automation in agricultural field management. 展开更多
关键词 deep learning support VECTOR machine(SVM) k-nearest neighbor(knn) SOYBEAN PEST and disease
下载PDF
Computational Intelligence Prediction Model Integrating Empirical Mode Decomposition,Principal Component Analysis,and Weighted k-Nearest Neighbor 被引量:2
5
作者 Li Tang He-Ping Pan Yi-Yong Yao 《Journal of Electronic Science and Technology》 CAS CSCD 2020年第4期341-349,共9页
On the basis of machine leaning,suitable algorithms can make advanced time series analysis.This paper proposes a complex k-nearest neighbor(KNN)model for predicting financial time series.This model uses a complex feat... On the basis of machine leaning,suitable algorithms can make advanced time series analysis.This paper proposes a complex k-nearest neighbor(KNN)model for predicting financial time series.This model uses a complex feature extraction process integrating a forward rolling empirical mode decomposition(EMD)for financial time series signal analysis and principal component analysis(PCA)for the dimension reduction.The information-rich features are extracted then input to a weighted KNN classifier where the features are weighted with PCA loading.Finally,prediction is generated via regression on the selected nearest neighbors.The structure of the model as a whole is original.The test results on real historical data sets confirm the effectiveness of the models for predicting the Chinese stock index,an individual stock,and the EUR/USD exchange rate. 展开更多
关键词 Empirical mode decomposition(EMD) k-nearest neighbor(knn) principal component analysis(PCA) time series
下载PDF
Comparison of wrist motion classification methods using surface electromyogram 被引量:1
6
作者 JEONG Eui-chul KIM Seo-jun +1 位作者 SONG Young-rok LEE Sang-min 《Journal of Central South University》 SCIE EI CAS 2013年第4期960-968,共9页
The Gaussian mixture model (GMM), k-nearest neighbor (k-NN), quadratic discriminant analysis (QDA), and linear discriminant analysis (LDA) were compared to classify wrist motions using surface electromyogram (EMG). Ef... The Gaussian mixture model (GMM), k-nearest neighbor (k-NN), quadratic discriminant analysis (QDA), and linear discriminant analysis (LDA) were compared to classify wrist motions using surface electromyogram (EMG). Effect of feature selection in EMG signal processing was also verified by comparing classification accuracy of each feature, and the enhancement of classification accuracy by normalization was confirmed. EMG signals were acquired from two electrodes placed on the forearm of twenty eight healthy subjects and used for recognition of wrist motion. Features were extracted from the obtained EMG signals in the time domain and were applied to classification methods. The difference absolute mean value (DAMV), difference absolute standard deviation value (DASDV), mean absolute value (MAV), root mean square (RMS) were used for composing 16 double features which were combined of two channels. In the classification methods, the highest accuracy of classification showed in the GMM. The most effective combination of classification method and double feature was (MAV, DAMV) of GMM and its classification accuracy was 96.85%. The results of normalization were better than those of non-normalization in GMM, k-NN, and LDA. 展开更多
关键词 Gaussian mixture model k-nearest neighbor quadratic discriminant analysis linear discriminant analysis electromyogram (EMG) pattern classification feature extraction
下载PDF
Active learning accelerated Monte-Carlo simulation based on the modified K-nearest neighbors algorithm and its application to reliability estimations
7
作者 Zhifeng Xu Jiyin Cao +2 位作者 Gang Zhang Xuyong Chen Yushun Wu 《Defence Technology(防务技术)》 SCIE EI CAS CSCD 2023年第10期306-313,共8页
This paper proposes an active learning accelerated Monte-Carlo simulation method based on the modified K-nearest neighbors algorithm.The core idea of the proposed method is to judge whether or not the output of a rand... This paper proposes an active learning accelerated Monte-Carlo simulation method based on the modified K-nearest neighbors algorithm.The core idea of the proposed method is to judge whether or not the output of a random input point can be postulated through a classifier implemented through the modified K-nearest neighbors algorithm.Compared to other active learning methods resorting to experimental designs,the proposed method is characterized by employing Monte-Carlo simulation for sampling inputs and saving a large portion of the actual evaluations of outputs through an accurate classification,which is applicable for most structural reliability estimation problems.Moreover,the validity,efficiency,and accuracy of the proposed method are demonstrated numerically.In addition,the optimal value of K that maximizes the computational efficiency is studied.Finally,the proposed method is applied to the reliability estimation of the carbon fiber reinforced silicon carbide composite specimens subjected to random displacements,which further validates its practicability. 展开更多
关键词 Active learning Monte-carlo simulation k-nearest neighbors Reliability estimation classification
下载PDF
LF-CNN:Deep Learning-Guided Small Sample Target Detection for Remote Sensing Classification
8
作者 Chengfan Li Lan Liu +1 位作者 Junjuan Zhao Xuefeng Liu 《Computer Modeling in Engineering & Sciences》 SCIE EI 2022年第4期429-444,共16页
Target detection of small samples with a complex background is always difficult in the classification of remote sensing images.We propose a new small sample target detection method combining local features and a convo... Target detection of small samples with a complex background is always difficult in the classification of remote sensing images.We propose a new small sample target detection method combining local features and a convolutional neural network(LF-CNN)with the aim of detecting small numbers of unevenly distributed ground object targets in remote sensing images.The k-nearest neighbor method is used to construct the local neighborhood of each point and the local neighborhoods of the features are extracted one by one from the convolution layer.All the local features are aggregated by maximum pooling to obtain global feature representation.The classification probability of each category is then calculated and classified using the scaled expected linear units function and the full connection layer.The experimental results show that the proposed LF-CNN method has a high accuracy of target detection and classification for hyperspectral imager remote sensing data under the condition of small samples.Despite drawbacks in both time and complexity,the proposed LF-CNN method can more effectively integrate the local features of ground object samples and improve the accuracy of target identification and detection in small samples of remote sensing images than traditional target detection methods. 展开更多
关键词 Small samples local features convolutional neural network(CNN) k-nearest neighbor(knn) target detection
下载PDF
Diagnosis of Disc Space Variation Fault Degree of Transformer Winding Based on K-Nearest Neighbor Algorithm
9
作者 Song Wang Fei Xie +3 位作者 Fengye Yang Shengxuan Qiu Chuang Liu Tong Li 《Energy Engineering》 EI 2023年第10期2273-2285,共13页
Winding is one of themost important components in power transformers.Ensuring the health state of the winding is of great importance to the stable operation of the power system.To efficiently and accurately diagnose t... Winding is one of themost important components in power transformers.Ensuring the health state of the winding is of great importance to the stable operation of the power system.To efficiently and accurately diagnose the disc space variation(DSV)fault degree of transformer winding,this paper presents a diagnostic method of winding fault based on the K-Nearest Neighbor(KNN)algorithmand the frequency response analysis(FRA)method.First,a laboratory winding model is used,and DSV faults with four different degrees are achieved by changing disc space of the discs in the winding.Then,a series of FRA tests are conducted to obtain the FRA results and set up the FRA dataset.Second,ten different numerical indices are utilized to obtain features of FRA curves of faulted winding.Third,the 10-fold cross-validation method is employed to determine the optimal k-value of KNN.In addition,to improve the accuracy of the KNN model,a comparative analysis is made between the accuracy of the KNN algorithm and k-value under four distance functions.After getting the most appropriate distance metric and kvalue,the fault classificationmodel based on theKNN and FRA is constructed and it is used to classify the degrees of DSV faults.The identification accuracy rate of the proposed model is up to 98.30%.Finally,the performance of the model is presented by comparing with the support vector machine(SVM),SVM optimized by the particle swarmoptimization(PSO-SVM)method,and randomforest(RF).The results show that the diagnosis accuracy of the proposed model is the highest and the model can be used to accurately diagnose the DSV fault degrees of the winding. 展开更多
关键词 Transformer winding frequency response analysis(FRA)method k-nearest neighbor(knn) disc space variation(DSV)
下载PDF
Efficient Parallel Processing of k-Nearest Neighbor Queries by Using a Centroid-based and Hierarchical Clustering Algorithm
10
作者 Elaheh Gavagsaz 《Artificial Intelligence Advances》 2022年第1期26-41,共16页
The k-Nearest Neighbor method is one of the most popular techniques for both classification and regression purposes.Because of its operation,the application of this classification may be limited to problems with a cer... The k-Nearest Neighbor method is one of the most popular techniques for both classification and regression purposes.Because of its operation,the application of this classification may be limited to problems with a certain number of instances,particularly,when run time is a consideration.However,the classification of large amounts of data has become a fundamental task in many real-world applications.It is logical to scale the k-Nearest Neighbor method to large scale datasets.This paper proposes a new k-Nearest Neighbor classification method(KNN-CCL)which uses a parallel centroid-based and hierarchical clustering algorithm to separate the sample of training dataset into multiple parts.The introduced clustering algorithm uses four stages of successive refinements and generates high quality clusters.The k-Nearest Neighbor approach subsequently makes use of them to predict the test datasets.Finally,sets of experiments are conducted on the UCI datasets.The experimental results confirm that the proposed k-Nearest Neighbor classification method performs well with regard to classification accuracy and performance. 展开更多
关键词 classification k-nearest neighbor Big data CLUSTERING Parallel processing
下载PDF
基于粗糙集的快速KNN文本分类算法 被引量:22
11
作者 孙荣宗 苗夺谦 +1 位作者 卫志华 李文 《计算机工程》 CAS CSCD 北大核心 2010年第24期175-177,共3页
传统K最近邻一个明显缺陷是样本相似度的计算量很大,在具有大量高维样本的文本分类中,由于复杂度太高而缺乏实用性。为此,将粗糙集理论引入到文本分类中,利用上下近似概念刻画各类训练样本的分布,并在训练过程中计算出各类上下近似的范... 传统K最近邻一个明显缺陷是样本相似度的计算量很大,在具有大量高维样本的文本分类中,由于复杂度太高而缺乏实用性。为此,将粗糙集理论引入到文本分类中,利用上下近似概念刻画各类训练样本的分布,并在训练过程中计算出各类上下近似的范围。在分类过程中根据待分类文本向量在样本空间中的分布位置,改进算法可以直接判定一些文本的归属,缩小K最近邻搜索范围。实验表明,该算法可以在保持K最近邻分类性能基本不变的情况下,显著提高分类效率。 展开更多
关键词 文本分类 K最近邻 粗糙集
下载PDF
改进型加权KNN算法的不平衡数据集分类 被引量:26
12
作者 王超学 潘正茂 +2 位作者 马春森 董丽丽 张涛 《计算机工程》 CAS CSCD 2012年第20期160-163,168,共5页
K最邻近(KNN)算法对不平衡数据集进行分类时分类判决总会倾向于多数类。为此,提出一种加权KNN算法GAK-KNN。定义新的权重分配模型,综合考虑类间分布不平衡及类内分布不均匀的不良影响,采用基于遗传算法的K-means算法对训练样本集进行聚... K最邻近(KNN)算法对不平衡数据集进行分类时分类判决总会倾向于多数类。为此,提出一种加权KNN算法GAK-KNN。定义新的权重分配模型,综合考虑类间分布不平衡及类内分布不均匀的不良影响,采用基于遗传算法的K-means算法对训练样本集进行聚类,按照权重分配模型计算各训练样本的权重,通过改进的KNN算法对测试样本进行分类。基于UCI数据集的大量实验结果表明,GAK-KNN算法的识别率和整体性能都优于传统KNN算法及其他改进算法。 展开更多
关键词 不平衡数据集 分类 K最邻近算法 权重分配模型 遗传算法 K-MEANS算法
下载PDF
基于KNN的特征自适应加权自然图像分类研究 被引量:17
13
作者 侯玉婷 彭进业 +1 位作者 郝露微 王瑞 《计算机应用研究》 CSCD 北大核心 2014年第3期957-960,共4页
针对自然图像类型广泛、结构复杂、分类精度不高的实际问题,提出了一种为自然图像不同特征自动加权值的K-近邻(K-nearest neighbors,KNN)分类方法。通过分析自然图像的不同特征对于分类结果的影响,采用基因遗传算法求得一组最优分类权... 针对自然图像类型广泛、结构复杂、分类精度不高的实际问题,提出了一种为自然图像不同特征自动加权值的K-近邻(K-nearest neighbors,KNN)分类方法。通过分析自然图像的不同特征对于分类结果的影响,采用基因遗传算法求得一组最优分类权值向量解,利用该最优权值对自然图像纹理和颜色两个特征分别进行加权,最后用自适应加权K-近邻算法实现对自然图像的分类。实验结果表明,在用户给定分类精度需求和低时间复杂度的约束下,算法能快速、高精度地进行自然图像分类。提出的自适应加权K-近邻分类方法对于门类繁多的自然图像具有普遍适用性,可以有效地提高自然图像的分类性能。 展开更多
关键词 K-近邻算法 基因算法 自然图像分类 特征加权
下载PDF
基于密度的kNN分类器训练样本裁剪方法的改进 被引量:13
14
作者 熊忠阳 杨营辉 张玉芳 《计算机应用》 CSCD 北大核心 2010年第3期799-801,817,共4页
在文本分类中,训练集的分布状态会直接影响k-近邻(kNN)分类器的效率和准确率。通过分析基于密度的kNN文本分类器训练样本的裁剪方法,发现它存在两大不足:一是裁剪之后的均匀状态只是以ε为半径的球形区域意义上的均匀状态,而非最理想的... 在文本分类中,训练集的分布状态会直接影响k-近邻(kNN)分类器的效率和准确率。通过分析基于密度的kNN文本分类器训练样本的裁剪方法,发现它存在两大不足:一是裁剪之后的均匀状态只是以ε为半径的球形区域意义上的均匀状态,而非最理想的均匀状态即两两样本之间的距离相等;二是未对低密度区域的样本做任何处理,裁剪之后仍存在大量不均匀的区域。针对这两处不足,提出了以下两点改进:一是优化了裁剪策略,使裁剪之后的训练集更趋于理想的均匀状态;二是实现了对低密度区域样本的补充。通过实验对比,改进后的方法在稳定性和准确率方面都有明显提高。 展开更多
关键词 文本分类 K-近邻 快速分类 样本裁剪 样本补充
下载PDF
一种基于中心文档的KNN中文文本分类算法 被引量:17
15
作者 鲁婷 王浩 姚宏亮 《计算机工程与应用》 CSCD 北大核心 2011年第2期127-130,共4页
在浩瀚的数据资源中,为了实现对特定主题的搜索或提取,文本自动分类技术已经成为目前研究的热点。KNN是一种重要的文本自动分类方法,KNN能够处理大规模数据,且具有较高的稳定性,但面临分类速度较慢的问题。以KNN方法为基础,引入特征项... 在浩瀚的数据资源中,为了实现对特定主题的搜索或提取,文本自动分类技术已经成为目前研究的热点。KNN是一种重要的文本自动分类方法,KNN能够处理大规模数据,且具有较高的稳定性,但面临分类速度较慢的问题。以KNN方法为基础,引入特征项间的语义关系,并根据语义关系进行聚类生成中心文档,减少了KNN要搜索的文档数,提高了分类速度。仿真实验表明,该算法在不损失分类精度的情况下,显著提高了分类的速度。 展开更多
关键词 中文文本分类 k最邻近 中心文档 语义相似度 聚类
下载PDF
基于k-最近邻图的小样本KNN分类算法 被引量:27
16
作者 刘应东 牛惠民 《计算机工程》 CAS CSCD 北大核心 2011年第9期198-200,共3页
提出一种基于k-最近邻图的小样本KNN分类算法。通过划分k-最近邻图,形成多个相似度较高的簇,根据簇内已有标记的数据对象来标识同簇中未标记的数据对象,同时剔除原样本集中的噪声数据,从而扩展样本集,利用该新样本集对类标号未知数据对... 提出一种基于k-最近邻图的小样本KNN分类算法。通过划分k-最近邻图,形成多个相似度较高的簇,根据簇内已有标记的数据对象来标识同簇中未标记的数据对象,同时剔除原样本集中的噪声数据,从而扩展样本集,利用该新样本集对类标号未知数据对象进行类别标识。采用标准数据集进行测试,结果表明该算法在小样本情况下能够提高KNN的分类精度,减小最近邻阈值k对分类效果的影响。 展开更多
关键词 knn算法 k-最近邻图 小样本 图划分 分类算法
下载PDF
一种改进的KNN Web文本分类方法 被引量:9
17
作者 吴春颖 王士同 《计算机应用研究》 CSCD 北大核心 2008年第11期3275-3277,共3页
KNN方法存在两个不足:a)计算量巨大,它要求计算未知文本与所有训练样本间的相似度进而得到k个最近邻样本;b)当类别间有较多共性,即训练样本间有较多特征交叉现象时,KNN分类的精度将下降。针对这两个问题,提出了一种改进的KNN方法,该方... KNN方法存在两个不足:a)计算量巨大,它要求计算未知文本与所有训练样本间的相似度进而得到k个最近邻样本;b)当类别间有较多共性,即训练样本间有较多特征交叉现象时,KNN分类的精度将下降。针对这两个问题,提出了一种改进的KNN方法,该方法先通过Rocchio分类快速得到k0个最有可能的候选类别;然后在k0个类别训练文档中抽取部分代表样本采用KNN算法;最后由一种改进的相似度计算方法决定最终的文本所属类别。实验表明,改进的KNN方法在Web文本分类中能够获得较好的分类效果。 展开更多
关键词 WEB文本分类 K最近邻 快速分类
下载PDF
一种高效的K值自适应的SA-KNN算法 被引量:6
18
作者 孙可 龚永红 邓振云 《计算机工程与科学》 CSCD 北大核心 2015年第10期1965-1970,共6页
传统的K近邻(KNN)分类算法在实际应用过程中存在一些缺陷:没有考虑去除噪声样本,也没有考虑到在样本数据空间变换过程中保持样本数据本身的流形学结构,并且没有使用样本间属性的相关性。为此,提出引入稀疏学习理论,利用训练样本重构测... 传统的K近邻(KNN)分类算法在实际应用过程中存在一些缺陷:没有考虑去除噪声样本,也没有考虑到在样本数据空间变换过程中保持样本数据本身的流形学结构,并且没有使用样本间属性的相关性。为此,提出引入稀疏学习理论,利用训练样本重构测试样本的方法,重构过程使用了样本间的相关性,也用到局部保持投影LPP保持数据结构不变,同时引入l2,1范数用于去除噪声样本的方法来寻找投影变换矩阵W,进而利用W确定KNN算法中K值的SA-KNN算法。在UCI数据集上的仿真实验结果表明,该方法比传统的KNN分类算法和Entropy-KNN算法有更高的分类准确度。 展开更多
关键词 K近邻分类 相关性 去除噪声样本 局部保持投影 稀疏学习
下载PDF
分布式KNN算法在微信公众号分类中的应用 被引量:4
19
作者 肖斌 王锦阳 任启强 《计算机应用》 CSCD 北大核心 2017年第A01期295-299,共5页
针对微信公众号数据量大幅增长与从事微信活动的人们对其有效信息获取效率低下的问题,提出对微信公众号信息进行梳理并快速并行化分类以及打标签的方法。首先,该方法在介绍微信公众号实际应用的前提下,以经典K最近邻(KNN)分类算法为基础... 针对微信公众号数据量大幅增长与从事微信活动的人们对其有效信息获取效率低下的问题,提出对微信公众号信息进行梳理并快速并行化分类以及打标签的方法。首先,该方法在介绍微信公众号实际应用的前提下,以经典K最近邻(KNN)分类算法为基础,实践并分析了单机KNN算法在效率上的不足;然后,采用Hadoop平台实现了基于MapReduce模型的KNN算法,对比了单机与分布式的效率以及对K值的调优,实验中的样本训练集通过人为指定,文本相似度的判别分为分词、特征词提取、权重计算、测试向量与训练向量夹角计算等步骤。在24个类别基础上,通过对1 000万条公众号数据分类实验,为每个公众号打上了单标签或多标签,优化后的分类准确率达到82%,其中与生活相关的公众号数量占比达70%以上。研究表明使用分类后的结果,信息针对特定人群传播,传播的转化率有所提升;分布式KNN算法在微信公众号数据处理方面比单机算法具有更高的效率和鲁棒性。 展开更多
关键词 微信公众号 HADOOP平台 MAPREDUCE模型 K最近邻 分类
下载PDF
离散型增强烟花算法和kNN在特征选择中的研究 被引量:4
20
作者 黄欣 莫海淼 +1 位作者 赵志刚 曾敏 《计算机工程与应用》 CSCD 北大核心 2020年第16期112-117,共6页
特征选择是从原始特征集中选取特征子集,并且降低特征维度和减少冗余信息,从而达到提高分类准确度的效果。为了达到此效果,提出了新的特征选择算法。该算法使用经过离散化处理之后的增强烟花算法来搜索特征子集,同时将特征子集和经过惩... 特征选择是从原始特征集中选取特征子集,并且降低特征维度和减少冗余信息,从而达到提高分类准确度的效果。为了达到此效果,提出了新的特征选择算法。该算法使用经过离散化处理之后的增强烟花算法来搜索特征子集,同时将特征子集和经过惩罚因子处理之后约束条件融入到目标函数中,然后将搜索到的特征子集的数据放到kNN分类器进行训练和预测,最后使用十折交叉验证来检验分类的准确性。使用UCI数据进行仿真实验,仿真结果表明:与引导型烟花算法、烟花算法、蝙蝠算法、乌鸦算法、自适应粒子群算法相比,所提算法的总体性能优于其他五种算法。 展开更多
关键词 离散型增强烟花算法 特征选择 降维 分类 k近邻(knn)
下载PDF
上一页 1 2 6 下一页 到第
使用帮助 返回顶部