285 articles found
Active learning accelerated Monte-Carlo simulation based on the modified K-nearest neighbors algorithm and its application to reliability estimations
1
Authors: Zhifeng Xu, Jiyin Cao, Gang Zhang, Xuyong Chen, Yushun Wu. 《Defence Technology(防务技术)》 SCIE EI CAS CSCD, 2023, No. 10, pp. 306-313 (8 pages)
This paper proposes an active learning accelerated Monte-Carlo simulation method based on the modified K-nearest neighbors algorithm. The core idea of the proposed method is to judge whether or not the output of a random input point can be postulated through a classifier implemented through the modified K-nearest neighbors algorithm. Compared to other active learning methods resorting to experimental designs, the proposed method is characterized by employing Monte-Carlo simulation for sampling inputs and saving a large portion of the actual evaluations of outputs through an accurate classification, which makes it applicable to most structural reliability estimation problems. Moreover, the validity, efficiency, and accuracy of the proposed method are demonstrated numerically. In addition, the optimal value of K that maximizes the computational efficiency is studied. Finally, the proposed method is applied to the reliability estimation of carbon fiber reinforced silicon carbide composite specimens subjected to random displacements, which further validates its practicability.
Keywords: active learning; Monte-Carlo simulation; k-nearest neighbors; reliability estimation; classification
GHM-FKNN:a generalized Heronian mean based fuzzy k-nearest neighbor classifier for the stock trend prediction
2
Authors: 吴振峰, WANG Mengmeng, LAN Tian, ZHANG Anyuan. 《High Technology Letters》 EI CAS, 2023, No. 2, pp. 122-129 (8 pages)
Stock trend prediction is a challenging problem because it involves many variables. Some existing machine learning techniques, such as random forest (RF), probabilistic random forest (PRF), k-nearest neighbor (KNN), and fuzzy KNN (FKNN), have difficulty in accurately predicting the stock trend (uptrend or downtrend) for a given date. To address this problem, a generalized Heronian mean (GHM) based FKNN predictor named GHM-FKNN was proposed. GHM-FKNN combines the GHM aggregation function with the ideas of the classical FKNN approach. The comparison results showed that GHM-FKNN outperformed the best existing methods RF, PRF, KNN and FKNN on independent test datasets corresponding to three stocks, namely AAPL, AMZN and NFLX. Compared with RF, PRF, KNN and FKNN, GHM-FKNN achieved the best performance, with accuracy of 62.37% for AAPL, 58.25% for AMZN, and 64.10% for NFLX.
Keywords: stock trend prediction; Heronian mean; fuzzy k-nearest neighbor (FKNN)
Diagnosis of Disc Space Variation Fault Degree of Transformer Winding Based on K-Nearest Neighbor Algorithm
3
Authors: Song Wang, Fei Xie, Fengye Yang, Shengxuan Qiu, Chuang Liu, Tong Li. 《Energy Engineering》 EI, 2023, No. 10, pp. 2273-2285 (13 pages)
Winding is one of the most important components in power transformers. Ensuring the health state of the winding is of great importance to the stable operation of the power system. To efficiently and accurately diagnose the disc space variation (DSV) fault degree of transformer winding, this paper presents a diagnostic method for winding faults based on the K-Nearest Neighbor (KNN) algorithm and the frequency response analysis (FRA) method. First, a laboratory winding model is used, and DSV faults of four different degrees are created by changing the disc space of the discs in the winding. Then, a series of FRA tests is conducted to obtain the FRA results and set up the FRA dataset. Second, ten different numerical indices are utilized to obtain features of the FRA curves of the faulted winding. Third, the 10-fold cross-validation method is employed to determine the optimal k-value of KNN. In addition, to improve the accuracy of the KNN model, a comparative analysis is made of how the accuracy of the KNN algorithm varies with the k-value under four distance functions. After the most appropriate distance metric and k-value are obtained, the fault classification model based on KNN and FRA is constructed and used to classify the degrees of DSV faults. The identification accuracy rate of the proposed model is up to 98.30%. Finally, the performance of the model is assessed by comparison with the support vector machine (SVM), SVM optimized by particle swarm optimization (PSO-SVM), and random forest (RF). The results show that the diagnosis accuracy of the proposed model is the highest and that the model can be used to accurately diagnose the DSV fault degrees of the winding.
Keywords: transformer winding; frequency response analysis (FRA) method; k-nearest neighbor (KNN); disc space variation (DSV)
A LAW OF THE ITERATED LOGARITHM FOR NEAREST NEIGHBOR ESTIMATION OF MULTIVARIATE DENSITY FUNCTION
4
Authors: 洪圣岩, 陈规景, 孔繁超, 高集体. 《Acta Mathematica Scientia》 SCIE CSCD, 1992, No. 4, pp. 472-478 (7 pages)
Let X be a d-dimensional random vector with unknown density function f(z) = f(z_1, ..., z_d), and let f_n be the nearest neighbor estimator of f proposed by Loftsgaarden and Quesenberry (1965). In this paper, we establish the law of the iterated logarithm of f_n for the general case of d ≥ 1, which gives the exact pointwise strong convergence rate of f_n.
Keywords: A LAW OF THE ITERATED LOGARITHM FOR NEAREST neighbor estimation OF MULTIVARIATE DENSITY FUNCTION exp
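For orientation, one common textbook form of the nearest neighbor density estimator named in the abstract above is sketched here. The symbols k_n, R_n(x) and c_d are notation assumed for this note and are not taken from the paper; some references use k_n - 1 in the numerator.

```latex
% Sketch of a k-nearest-neighbor density estimate (notation assumed, not the paper's):
%   X_1, ..., X_n : i.i.d. sample from the unknown density f on R^d
%   k_n           : number of neighbors used at sample size n
%   R_n(x)        : Euclidean distance from x to its k_n-th nearest sample point
%   c_d           : volume of the unit ball in R^d
\[
  f_n(x) \;=\; \frac{k_n}{n \, c_d \, R_n(x)^{d}},
  \qquad
  c_d \;=\; \frac{\pi^{d/2}}{\Gamma\!\left(\tfrac{d}{2} + 1\right)} .
\]
```

The estimate is large where the k_n-th neighbor is close (small R_n(x)) and small where the sample is sparse.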
k-Nearest Neighbor Query Algorithm Based on an Irregular Region Partitioning Method (cited by: 1)
5
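Authors: 张清清, 李长云, 李旭, 周玲芳, 胡淑新, 邹豪杰. 《计算机系统应用》, 2015, No. 9, pp. 186-190 (5 pages)
As more and more data accumulate, the demands on data processing and analysis capabilities keep rising. The traditional k-Nearest Neighbor (kNN) query algorithm is limited by its regular region partitioning method, which tends to unbalance the overall computational load, and by the low data processing capacity of a single-process or single-machine runtime environment. This paper proposes and describes in detail an improved kNN query algorithm based on an irregular region partitioning method, and implements it with MapReduce, a model for distributed parallel computation over large-scale datasets. Experimental results and analysis show that, under the MapReduce framework, the kNN query algorithm based on irregular region partitioning achieves high data processing efficiency and supports efficient queries in big data environments well.
Keywords: k-nearest neighbor (kNN) query algorithm; irregular region partitioning method; MapReduce; big data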
Fault prediction of fighter based on nonparametric density estimation (cited by: 3)
6
Authors: Zhang Zhengdao, Hu Shousong. 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD, 2005, No. 4, pp. 831-836 (6 pages)
Fighters and other complex engineering systems share characteristics such as difficult modeling and testing, multiple working situations, and high cost. Aiming at these points, a new kind of real-time fault predictor is designed based on an improved k-nearest neighbor method, which needs neither a mathematical model of the system nor training data or prior knowledge. It can learn and predict while the system is running, so it overcomes the difficulty of data acquisition. In addition, the predictor is fast, and its false alarm rate and missing alarm rate can be adjusted arbitrarily. The method is simple and generalizable. Simulation results on an F-16 fighter demonstrate its efficiency.
Keywords: fighter; fault prediction; k-nearest neighbor method
Mapping aboveground biomass by integrating geospatial and forest inventory data through a k-nearest neighbor strategy in North Central Mexico (cited by: 3)
7
Authors: Carlos A AGUIRRE-SALADO, Eduardo J TREVIÑO-GARZA, Oscar A AGUIRRE-CALDERÓN, Javier JIMÉNEZ-PÉREZ, Marco A GONZÁLEZ-TAGLE, José R VALDEZ-LAZALDE, Guillermo SÁNCHEZ-DÍAZ, Reija HAAPANEN, Alejandro I AGUIRRE-SALADO, Liliana MIRANDA-ARAGÓN. 《Journal of Arid Land》 SCIE CSCD, 2014, No. 1, pp. 80-96 (17 pages)
As climate change negotiations progress, monitoring biomass and carbon stocks is becoming an important part of current forest research. Therefore, national governments are interested in developing forest-monitoring strategies using geospatial technology. Among statistical methods for mapping biomass, there is a nonparametric approach called k-nearest neighbor (kNN). We compared four variations of distance metrics of the kNN for the spatially explicit estimation of aboveground biomass in a portion of the Mexican north border of the intertropical zone. Satellite-derived, climatic, and topographic predictor variables were combined with the Mexican National Forest Inventory (NFI) data for this purpose. Performance of the distance metrics applied in the kNN algorithm was evaluated using leave-one-out cross-validation. The results indicate that the Most Similar Neighbor (MSN) approach maximizes the correlation between predictor and response variables (r=0.9). Our results are in agreement with those reported in the literature. These findings confirm the predictive potential of the MSN approach for mapping forest variables at pixel level under the policy of Reducing Emission from Deforestation and Forest Degradation (REDD+).
Keywords: k-nearest neighbor; Mahalanobis; most similar neighbor; MODIS BRDF-adjusted reflectance; forest inventory; the policy of Reducing Emission from Deforestation and Forest Degradation
Real-Time Spreading Thickness Monitoring of High-core Rockfill Dam Based on K-nearest Neighbor Algorithm (cited by: 4)
8
Authors: Denghua Zhong, Rongxiang Du, Bo Cui, Binping Wu, Tao Guan. 《Transactions of Tianjin University》 EI CAS, 2018, No. 3, pp. 282-289 (8 pages)
During the storehouse surface rolling construction of a core rockfill dam, the spreading thickness of the dam face is an important factor that affects the construction quality of the dam storehouse's rolling surface and the overall quality of the entire dam. Currently, the method used to monitor and control spreading thickness during the dam construction process is a manual sampling check after spreading, which makes it difficult to monitor the entire dam storehouse surface. In this paper, we present an in-depth study based on real-time monitoring and control theory of storehouse surface rolling construction and obtain the rolling compaction thickness by analyzing the construction track of the rolling machine. By comparison, the traditional method can only analyze the rolling thickness of the dam storehouse surface after it has been compacted and cannot determine the thickness of the dam storehouse surface in real time. To solve these problems, our system monitors the construction progress of the leveling machine and employs a real-time spreading thickness monitoring model based on the K-nearest neighbor algorithm. Taking the LHK core rockfill dam in Southwest China as an example, we performed real-time monitoring of the spreading thickness and conducted real-time interactive queries regarding the spreading thickness. This approach provides a new method for controlling the spreading thickness of the core rockfill dam storehouse surface.
Keywords: core rockfill dam; dam storehouse surface construction; spreading thickness; k-nearest neighbor algorithm; real-time monitor
Real-time road traffic states estimation based on kernel-KNN matching of road traffic spatial characteristics (cited by: 2)
9
Author: XU Dong-wei. 《Journal of Central South University》 SCIE EI CAS CSCD, 2016, No. 9, pp. 2453-2464 (12 pages)
The accurate estimation of road traffic states can support decision making for travelers and traffic managers. In this work, an algorithm based on kernel-k nearest neighbor (KNN) matching of road traffic spatial characteristics is presented to estimate road traffic states. Firstly, the representative road traffic state data were extracted to establish the reference sequences of road traffic running characteristics (RSRTRC). Secondly, the spatial road traffic state data sequence was selected and the kernel function was constructed, with which the spatial road traffic data sequence could be mapped into a high dimensional feature space. Thirdly, the referenced and current spatial road traffic data sequences were extracted and the Euclidean distances in the feature space between them were obtained. Finally, the road traffic states were estimated from weighted averages of the selected k road traffic states, which corresponded to the nearest Euclidean distances. Several typical links in Beijing were adopted for case studies. The final results of the experiments show that the accuracy of this algorithm for estimating speed and volume is 95.27% and 91.32% respectively, which shows that this road traffic state estimation approach based on kernel-KNN matching of road traffic spatial characteristics is feasible and can achieve high accuracy.
Keywords: road traffic; kernel function; k nearest neighbor (KNN); state estimation; spatial characteristics
Pruned fuzzy K-nearest neighbor classifier for beat classification (cited by: 2)
10
Authors: Muhammad Arif, Muhammad Usman Akram, Fayyaz-ul-Afsar Amir Minhas. 《Journal of Biomedical Science and Engineering》, 2010, No. 4, pp. 380-389 (10 pages)
Arrhythmia beat classification is an active area of research in ECG-based clinical decision support systems. In this paper, a Pruned Fuzzy K-nearest neighbor (PFKNN) classifier is proposed to classify six types of beats present in the MIT-BIH Arrhythmia database. We have tested our classifier on approximately 103,100 beats for the six beat types present in the database. Fuzzy KNN (FKNN) can be implemented very easily, but classification with a large number of training examples can be very time consuming and requires large storage space. Hence, we have proposed a time-efficient Arif-Fayyaz pruning algorithm, especially suitable for FKNN, which can maintain good classification accuracy with an appropriate retained ratio of training data. By using the Arif-Fayyaz pruning algorithm with Fuzzy KNN, we have achieved a beat classification accuracy of 97% and a geometric mean of sensitivity of 94.5% with only 19% of the total training examples. The accuracy and sensitivity are comparable to those of FKNN when all the training data is used. Principal Component Analysis is used to further reduce the dimension of the feature space from eleven to six without compromising accuracy and sensitivity. PFKNN was found to be robust against noise present in the ECG data.
Keywords: arrhythmia; ECG; k-nearest neighbor; pruning; fuzzy; classification
Computational Intelligence Prediction Model Integrating Empirical Mode Decomposition, Principal Component Analysis, and Weighted k-Nearest Neighbor (cited by: 2)
11
Authors: Li Tang, He-Ping Pan, Yi-Yong Yao. 《Journal of Electronic Science and Technology》 CAS CSCD, 2020, No. 4, pp. 341-349 (9 pages)
On the basis of machine learning, suitable algorithms can perform advanced time series analysis. This paper proposes a complex k-nearest neighbor (KNN) model for predicting financial time series. This model uses a complex feature extraction process integrating a forward rolling empirical mode decomposition (EMD) for financial time series signal analysis and principal component analysis (PCA) for dimension reduction. The information-rich features are extracted and then input to a weighted KNN classifier where the features are weighted with PCA loadings. Finally, the prediction is generated via regression on the selected nearest neighbors. The structure of the model as a whole is original. The test results on real historical data sets confirm the effectiveness of the model for predicting the Chinese stock index, an individual stock, and the EUR/USD exchange rate.
Keywords: empirical mode decomposition (EMD); k-nearest neighbor (KNN); principal component analysis (PCA); time series
A Short-Term Traffic Flow Forecasting Method Based on a Three-Layer K-Nearest Neighbor Non-Parametric Regression Algorithm (cited by: 7)
12
Authors: Xiyu Pang, Cheng Wang, Guolin Huang. 《Journal of Transportation Technologies》, 2016, No. 4, pp. 200-206 (7 pages)
Short-term traffic flow forecasting is one of the core technologies for realizing traffic flow guidance. In this article, in view of the repetitive nature of traffic flow changes, a short-term traffic flow forecasting method based on a three-layer K-nearest neighbor non-parametric regression algorithm is proposed. Specifically, two screening layers based on shape similarity were introduced into the K-nearest neighbor non-parametric regression method, and the forecasting results were output using weighted averaging on the reciprocal values of the shape similarity distances and the most-similar-point distance adjustment method. According to the experimental results, the proposed algorithm improves on the predictive ability of the traditional K-nearest neighbor non-parametric regression method and greatly enhances the accuracy and real-time performance of short-term traffic flow forecasting.
Keywords: three-layer; traffic flow forecasting; k-nearest neighbor; non-parametric regression
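The reciprocal-distance weighted averaging mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's three-layer algorithm: plain Euclidean distance stands in for its shape similarity distance, the screening layers and the most-similar-point adjustment are omitted, and the name `knn_regress` and the `eps` guard are assumptions made for this illustration.

```python
import math


def knn_regress(train_x, train_y, query, k=3, eps=1e-12):
    """Predict a value for `query` as the reciprocal-distance-weighted
    mean of the targets of its k nearest training points.

    Assumption: Euclidean distance replaces the paper's shape
    similarity distance; `eps` only guards against division by zero.
    """
    # Pair each training target with its distance to the query,
    # then keep the k closest pairs.
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    # Closer neighbors receive larger weights (1 / distance).
    weights = [1.0 / (d + eps) for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical toy data: one feature, target equal to the feature.
xs = [(0.0,), (1.0,), (2.0,), (10.0,)]
ys = [0.0, 1.0, 2.0, 10.0]
prediction = knn_regress(xs, ys, (0.5,), k=2)
```

With the two nearest points equally distant from the query, the weights are equal and the prediction is their plain average.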
Discharge estimation based on machine learning
13
Authors: Zhu JIANG, Hui-yan WANG, Wen-wu SONG. 《Water Science and Engineering》 EI CAS CSCD, 2013, No. 2, pp. 145-152 (8 pages)
To overcome the limitations of the traditional stage-discharge models in describing the dynamic characteristics of a river, a machine learning method of non-parametric regression, the locally weighted regression method, was used to estimate discharge. With the purpose of improving the precision and efficiency of river discharge estimation, a novel machine learning method is proposed: the clustering-tree weighted regression method. First, the training instances are clustered. Second, the k-nearest neighbor method is used to assign new stage samples to the best-fit cluster. Finally, the daily discharge is estimated. In the estimation process, the interference of irrelevant information can be avoided, so that the precision and efficiency of daily discharge estimation are improved. Observed data from the Luding Hydrological Station were used for testing. The simulation results demonstrate that the precision of this method is high. This provides a new, effective method for discharge estimation.
Keywords: stage-discharge relationship; discharge estimation; locally weighted regression; clustering-tree weighted regression; k-nearest neighbor method
Estimation of premature forests in Georgia (USA) using U.S.Forest Service FIA data and Landsat imagery
14
Authors: Hojung Kim, Chris J. Cieszewski, Roger C. Lowe. 《Journal of Forestry Research》 SCIE CAS CSCD, 2017, No. 6, pp. 1241-1252 (12 pages)
We used geographic information system applications and statistical analyses to classify young, premature forest areas in southeastern Georgia using combined data from Landsat TM 5 satellite imagery and ground inventory data. We defined premature stands as forests with trees up to 15 years old. We estimated the premature forest areas using three methods: maximum likelihood classification (MLC), regression analysis, and k-nearest neighbor (kNN) modeling. Overall accuracy (OA) of classifying the premature forest using MLC was 82% and the Kappa coefficient of agreement was 0.63, which was the highest among the methods that we tested. The kNN approach ranked second in accuracy with an OA of 61% and a Kappa coefficient of agreement of 0.22. Regression analysis yielded an OA of 57% and a Kappa coefficient of 0.14. We conclude that Landsat imagery can be effectively used for estimating premature forest areas in combination with image processing classifiers such as MLC.
Keywords: Landsat; maximum likelihood classification; regression analysis; k-nearest neighbor
Estimation of Travel Times on Signalized Arterials
15
Authors: Ivana Cavar, Zvonko Kavran, Rino Bosnjak. 《Journal of Civil Engineering and Architecture》, 2013, No. 9, pp. 1141-1149 (9 pages)
This paper describes a procedure for estimating travel time on signalized arterial roads based on multiple data sources with the application of dimensionality reduction. The travel time estimation approach incorporates forecasts of transportation node impedance and travel time on network links. The forecasting period is two hours, and the estimation is based on historical data and real-time data on traffic conditions. Travel time estimation combines multivariate regression, principal component analysis, KNN (k-nearest neighbours), cross validation and EWMA (exponentially weighted moving average) methods. When the estimation methodologies were compared, the KNN method achieved notably better results than the EWMA method. This holds for every time interval considered except the evening interval, when signalized arterial roads were uncongested.
Keywords: intelligent transportation systems; travel time estimation; signalised arterial roads; exponentially weighted moving average; k-nearest neighbours
Propagation Path Loss Models at 28 GHz Using K-Nearest Neighbor Algorithm
16
Authors: Vu Thanh Quang, Dinh Van Linh, To Thi Thao. 《通讯和计算机(中英文版)》, 2022, No. 1, pp. 1-8 (8 pages)
In this paper, we develop and apply the K-Nearest Neighbor algorithm to propagation path loss regression. The path loss models present the dependency of attenuation value on distance using machine learning algorithms based on the experimental data. The algorithm selects the k nearest points and uses the training dataset to find the optimal k value. The proposed method is applied to improve and adjust the path loss model at 28 GHz in the Keangnam area, Hanoi, Vietnam. Experiments in both line-of-sight and non-line-of-sight scenarios, using many combinations of transmit and receive antennas at different transmit antenna heights and random receive antenna locations, were carried out using Wireless InSite software. The results were compared with the 3GPP and NYU Wireless path loss models in order to verify the performance of the proposed approach.
Keywords: k-nearest neighbor regression; 5G; millimeter waves; path loss
Wireless Communication Signal Strength Prediction Method Based on the K-nearest Neighbor Algorithm
17
Authors: Zhao Chen, Ning Xiong, Yujue Wang, Yong Ding, Hengkui Xiang, Chenjun Tang, Lingang Liu, Xiuqing Zou, Decun Luo. 《国际计算机前沿大会会议论文集》, 2019, No. 1, pp. 238-240 (3 pages)
Existing interference protection systems lack automatic evaluation methods to provide scientific, objective and accurate assessment results. To address this issue, this paper develops a layout scheme by geometrically modeling the actual scene, so that a hand-held full-band spectrum analyzer can collect signal field strength values for complex indoor scenes. An improved prediction algorithm based on K-nearest neighbor non-parametric kernel regression is proposed to predict the signal field strengths for the whole plane before and after shielding. The data set with the highest accuracy can then be selected by comparison. The experimental results show that the improved prediction algorithm based on K-nearest neighbor non-parametric kernel regression can scientifically and objectively predict the signal strength of complex indoor scenes and evaluate the interference protection with high accuracy.
Keywords: interference protection; k-nearest neighbor algorithm; non-parametric kernel regression; signal field strength
Efficient Parallel Processing of k-Nearest Neighbor Queries by Using a Centroid-based and Hierarchical Clustering Algorithm
18
Author: Elaheh Gavagsaz. 《Artificial Intelligence Advances》, 2022, No. 1, pp. 26-41 (16 pages)
The k-Nearest Neighbor method is one of the most popular techniques for both classification and regression purposes. Because of the way it operates, its application may be limited to problems with a certain number of instances, particularly when run time is a consideration. However, the classification of large amounts of data has become a fundamental task in many real-world applications. It is logical to scale the k-Nearest Neighbor method to large-scale datasets. This paper proposes a new k-Nearest Neighbor classification method (KNN-CCL) which uses a parallel centroid-based and hierarchical clustering algorithm to separate the samples of the training dataset into multiple parts. The introduced clustering algorithm uses four stages of successive refinements and generates high-quality clusters. The k-Nearest Neighbor approach subsequently makes use of them to predict the test datasets. Finally, sets of experiments are conducted on the UCI datasets. The experimental results confirm that the proposed k-Nearest Neighbor classification method performs well with regard to classification accuracy and performance.
Keywords: classification; k-nearest neighbor; big data; clustering; parallel processing
EDGEWORTH EXPANSION FOR NEAREST NEIGHBOR-KERNEL ESTIMATE AND RANDOM WEIGHTING APPROXIMATION OF CONDITIONAL DENSITY
19
Author: Yu Zhaoping (Institute of Electronic Technique, Zhengzhou 450004). 《Applied Mathematics(A Journal of Chinese Universities)》 SCIE CSCD, 2000, No. 2, pp. 167-172 (6 pages)
In this paper, the Edgeworth expansion for the nearest neighbor-kernel estimate and the random weighting approximation of the conditional density are given, and the consistency and convergence rate are proved.
Keywords: random weighting method; Edgeworth expansion; nearest neighbor-kernel estimate
Outlier Detection and Explanation Method Based on an Improved DPC Clustering Algorithm
20
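Authors: 周玉, 夏浩, 裴泽宣. 《哈尔滨工业大学学报》 EI CAS CSCD PKU Core, 2024, No. 8, pp. 68-85 (18 pages)
To address the inability of global outlier detection methods to detect local outliers, and the performance degradation of the local outlier factor when faced with many local outliers, this paper uses k-nearest neighbors (KNN) and kernel density estimation (KDE) to propose an outlier detection and explanation method based on an improved clustering by fast search and find of density peaks algorithm (KDPC). The method analyzes data points both globally and locally. First, the local density of each data point is computed with k-nearest neighbors and kernel density estimation, replacing the cutoff-distance-based local density of the traditional DPC algorithm. Second, the sum of a data point's k-nearest-neighbor distances is taken as its global outlier value, and the KDPC clustering algorithm is used to compute cluster densities and the local outlier value of each data point. Finally, the product of a point's global and local outlier values is taken as its final outlier score, the Top-n points with the highest scores are selected as outliers, and a global-local outlier value decision graph is constructed to explain global and local outliers. Experiments on synthetic datasets and UCI datasets compare the method with 10 commonly used outlier detection methods. The results show that the method achieves high detection accuracy and performance for both global and local outliers, and that its AUC is only slightly affected by the value of k. In addition, the method is applied to NBA player data, which further demonstrates its practicality and effectiveness.
Keywords: outlier detection; clustering; density peaks; k-nearest neighbors; kernel density estimation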
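The global outlier value described in the abstract above, the sum of a point's k-nearest-neighbor distances, can be sketched as follows. This covers only that one ingredient: the KDE-based local density, the KDPC clustering step and the global-local product are not reproduced, Euclidean distance is an assumption, and the function name and toy data are hypothetical.

```python
import math


def knn_distance_scores(points, k=2):
    """For each point, return the sum of Euclidean distances to its
    k nearest other points (a simple global outlier value)."""
    scores = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, smallest first.
        dists = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(sum(dists[:k]))
    return scores


# Hypothetical toy data: four corners of a unit square plus one far point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = knn_distance_scores(points, k=2)
# Index of the point with the largest score, i.e. the top-1 outlier candidate.
outlier = max(range(len(points)), key=scores.__getitem__)
```

In this toy set each corner has two neighbors at distance 1, while the far point is several units from everything, so it receives the largest score.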