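Abstract: To address the long running time of linear K-nearest neighbor (KNN) search on 3D laser point clouds, this paper proposes a method that uses a multi-processor system on chip (MPSoC) field-programmable gate array (FPGA) to perform fast KNN search on 3D laser point clouds. First, an MPSoC FPGA implementation framework for the 3D laser point cloud KNN algorithm is given. The design approach and implementation of each module are then described in detail. Finally, a test platform is built from an MZU15A development board and a Tianmou (天眸) 16-line rotating mechanical lidar, and the MPSoC FPGA acceleration of the 3D laser point cloud KNN algorithm is tested and verified. The experimental results show that the MPSoC FPGA implementation reduces the time spent on neighbor search while preserving neighbor search accuracy.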
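For orientation, the linear KNN search that the abstract takes as its slow baseline can be sketched in a few lines of Python. This is only an illustrative software version under assumed names (`knn_linear_search`, the toy `cloud`); it says nothing about how the FPGA modules are built.

```python
import heapq
import math


def knn_linear_search(points, query, k):
    """Return the k points closest to `query` by scanning every point once."""
    # heapq.nsmallest keeps at most k candidates while scanning,
    # and returns them sorted from nearest to farthest.
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


# Toy 3D point cloud (made-up coordinates, not lidar data).
cloud = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 2.0, 0.0),
    (3.0, 3.0, 3.0),
    (0.5, 0.5, 0.0),
]
neighbors = knn_linear_search(cloud, (0.0, 0.0, 0.1), 2)
print(neighbors)
```

Every query touches every point, so the cost grows linearly with the size of the cloud, which is the bottleneck the abstract sets out to reduce.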
Funding: supported in part by the Shaanxi Natural Science Foundation Project (2023-JC-QN-0438) and in part by the Fundamental Research Funds for the Central Universities (2452021050).
Abstract: The winding is one of the most important components of a power transformer, and ensuring its health is of great importance to the stable operation of the power system. To efficiently and accurately diagnose the degree of disc space variation (DSV) faults in transformer windings, this paper presents a winding fault diagnostic method based on the K-Nearest Neighbor (KNN) algorithm and frequency response analysis (FRA). First, a laboratory winding model is used, and DSV faults of four different degrees are created by changing the spacing between discs in the winding. A series of FRA tests is then conducted to obtain the FRA results and build the FRA dataset. Second, ten different numerical indices are used to extract features from the FRA curves of the faulted winding. Third, 10-fold cross-validation is employed to determine the optimal k value of KNN. In addition, to improve the accuracy of the KNN model, the accuracy of the KNN algorithm is compared across k values under four distance functions. After the most appropriate distance metric and k value are obtained, the fault classification model based on KNN and FRA is constructed and used to classify the degrees of DSV faults. The identification accuracy of the proposed model reaches up to 98.30%. Finally, the model is compared with the support vector machine (SVM), SVM optimized by particle swarm optimization (PSO-SVM), and random forest (RF). The results show that the proposed model has the highest diagnostic accuracy and can accurately diagnose the DSV fault degrees of the winding.
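The model-selection step described above (picking k and a distance function by 10-fold cross-validation) might look roughly like the following sketch. Everything here is an assumption for illustration: the abstract does not name its four distance functions, so three common ones are used, and the data are synthetic 2D clusters standing in for the FRA feature vectors.

```python
import math
import random
from collections import Counter

# Assumed candidate distance functions (the source does not list its four).
METRICS = {
    "euclidean": lambda a, b: math.dist(a, b),
    "manhattan": lambda a, b: sum(abs(u - v) for u, v in zip(a, b)),
    "chebyshev": lambda a, b: max(abs(u - v) for u, v in zip(a, b)),
}


def knn_predict(train_X, train_y, x, k, dist):
    """Majority vote among the k training samples nearest to x."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, k, dist, folds=10):
    """Fraction of samples classified correctly over `folds` held-out folds."""
    idx = list(range(len(X)))
    correct = 0
    for f in range(folds):
        test = idx[f::folds]  # every folds-th sample forms one fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(knn_predict(train_X, train_y, X[i], k, dist) == y[i] for i in test)
    return correct / len(X)


# Synthetic stand-in data: four well-separated classes of 25 samples each.
rng = random.Random(0)
X, y = [], []
for label, (cx, cy) in enumerate([(0, 0), (5, 5), (0, 5), (5, 0)]):
    for _ in range(25):
        X.append((cx + rng.gauss(0, 0.8), cy + rng.gauss(0, 0.8)))
        y.append(label)

# Grid over (distance function, odd k) and keep the best-scoring pair.
results = {
    (name, k): cv_accuracy(X, y, k, dist)
    for name, dist in METRICS.items()
    for k in range(1, 10, 2)
}
best = max(results, key=results.get)
print(best, results[best])
```

The grid search simply keeps the (metric, k) pair with the highest cross-validated accuracy; on real FRA features the ranking of metrics could of course differ from this toy run.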
Funding: the Deputyship for Research and Innovation, “Ministry of Education” in Saudi Arabia, funded this research (IFKSUOR3-014-3).
Abstract: In this study, our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for using GWO and HHO stems from their bio-inspired nature and their demonstrated success in optimization problems. We aim to leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification. We selected leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is extensively tested on six publicly available cancer microarray datasets, and a comprehensive comparison with recently published methods is conducted. Our hybrid algorithm demonstrates its effectiveness in improving classification performance, surpassing alternative approaches in terms of precision. The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification, thereby advancing the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification. It improves the accuracy and efficiency of cancer diagnosis and treatment, and its superior performance compared to other methods highlights its potential applicability in real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.
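As a rough illustration of how LOOCV can score a candidate gene subset with a KNN classifier, consider the sketch below. It is an assumption that such a score would serve as the evaluation inside a feature-selection search; the abstract does not spell out its fitness function, and the function name and toy data are hypothetical.

```python
import math


def loocv_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    correct = 0
    for i in range(len(X)):
        xi = [X[i][f] for f in features]
        # Hold out sample i and rank all remaining samples by distance to it.
        others = [j for j in range(len(X)) if j != i]
        others.sort(key=lambda j: math.dist(xi, [X[j][f] for f in features]))
        votes = [y[j] for j in others[:k]]
        pred = max(set(votes), key=votes.count)
        correct += pred == y[i]
    return correct / len(X)


# Toy data: feature 0 separates the classes, feature 1 is misleading.
X = [(0.1, 9.0), (0.2, 1.0), (0.3, 5.0), (0.9, 8.5), (1.0, 1.2), (1.1, 5.1)]
y = [0, 0, 0, 1, 1, 1]

print(loocv_accuracy(X, y, features=[0]))  # informative feature only
print(loocv_accuracy(X, y, features=[1]))  # misleading feature only
```

A search procedure would call such a function once per candidate subset and prefer subsets with higher held-out accuracy.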
Funding: supported by the Innovative Research Groups of the National Natural Science Foundation of China (No. 51621092), the National Basic Research Program of China ("973" Program, No. 2013CB035904), and the National Natural Science Foundation of China (No. 51439005).
Abstract: During the storehouse surface rolling construction of a core rockfill dam, the spreading thickness of the dam face is an important factor that affects the construction quality of the dam storehouse's rolling surface and the overall quality of the entire dam. Currently, spreading thickness is monitored and controlled during dam construction by manual sampling checks after spreading, which makes it difficult to monitor the entire dam storehouse surface. In this paper, we present an in-depth study based on the real-time monitoring and control theory of storehouse surface rolling construction and obtain the rolling compaction thickness by analyzing the construction track of the rolling machine. By comparison, the traditional method can only analyze the rolling thickness of the dam storehouse surface after it has been compacted and cannot determine the thickness of the dam storehouse surface in real time. To solve these problems, our system monitors the construction progress of the leveling machine and employs a real-time spreading thickness monitoring model based on the K-nearest neighbor algorithm. Taking the LHK core rockfill dam in Southwest China as an example, we performed real-time monitoring of the spreading thickness and conducted real-time interactive queries regarding the spreading thickness. This approach provides a new method for controlling the spreading thickness of the core rockfill dam storehouse surface.
Abstract: Compositional data, which carry relative information, are a crucial aspect of machine learning and related fields. They are typically recorded as closed data, that is, data that sum to a constant such as 100%. The statistical linear model is the most widely used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data are present. The linear regression model is a commonly used statistical modeling technique applied in various settings to find relationships between variables of interest. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested as a solution for situations involving missing data. The EM algorithm iteratively finds the best estimates of parameters in statistical models that depend on unobserved variables or data; these are the maximum likelihood or maximum a posteriori (MAP) estimates. Using the present estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both robust least squares and ordinary least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation (), in terms of Aitchison distances and covariance.
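Since the comparison is reported in terms of Aitchison distances, a small sketch of that metric may help. It uses the standard definition (Euclidean distance between centered log-ratio transforms); the function names are illustrative, and the abstract does not say how the study computed it in practice.

```python
import math


def clr(x):
    """Centered log-ratio transform of a composition with strictly positive parts."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x, y):
    """Aitchison distance: Euclidean distance between clr-transformed compositions."""
    return math.dist(clr(x), clr(y))


original = (0.2, 0.3, 0.5)
rescaled = (20.0, 30.0, 50.0)   # same composition expressed in percent
imputed = (0.25, 0.30, 0.45)    # hypothetical imputed version of `original`

print(aitchison_distance(original, rescaled))
print(aitchison_distance(original, imputed))
```

Because only ratios between parts matter, rescaling a composition (proportions versus percentages) leaves the distance unchanged, which is why this metric suits closed data better than the plain Euclidean distance.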
Funding: the National Natural Science Foundation of China under projects 61772150 and 61862012; the Guangxi Key R&D Program under project AB17195025; the Guangxi Natural Science Foundation under grants 2018GXNSFDA281054 and 2018GXNSFAA281232; the National Cryptography Development Fund of China under project MMJJ20170217; the Guangxi Science and Technology Base and Special Talents Program AD18281044; the Innovation Project of GUET Graduate Education under project 2017YJCX46; the Guangxi Young Teachers’ Basic Ability Improvement Program under grant 2018KY0194; and the open program of Guangxi Key Laboratory of Cryptography and Information Security under projects GCIS201621 and GCIS201702.
Abstract: Existing interference protection systems lack automatic evaluation methods that provide scientific, objective and accurate assessment results. To address this issue, this paper develops a layout scheme by geometrically modeling the actual scene, so that a hand-held full-band spectrum analyzer can collect signal field strength values in complex indoor scenes. An improved prediction algorithm based on K-nearest neighbor non-parametric kernel regression is proposed to predict the signal field strength over the whole plane before and after shielding. The most accurate set of data can then be selected by comparison. The experimental results show that the improved prediction algorithm can scientifically and objectively predict the signal strength of complex indoor scenes and evaluate the interference protection with high accuracy.
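To make the idea of K-nearest neighbor kernel regression concrete, here is a plain baseline sketch, not the improved algorithm of the paper. The Gaussian kernel, the choice of the k-th neighbor distance as bandwidth, and the sample values are all assumptions made for illustration.

```python
import math


def knn_kernel_regress(samples, query, k):
    """Kernel-weighted average of the values at the k samples nearest to `query`.

    `samples` is a list of ((x, y), value) pairs, e.g. measured field strengths.
    """
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    # Adaptive bandwidth: distance to the k-th neighbor (guarded against zero).
    h = math.dist(nearest[-1][0], query) or 1e-12
    weights = [math.exp(-0.5 * (math.dist(p, query) / h) ** 2) for p, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Made-up field-strength readings at known positions.
readings = [
    ((0.0, 0.0), -40.0),
    ((1.0, 0.0), -50.0),
    ((0.0, 1.0), -60.0),
    ((5.0, 5.0), -90.0),
]
estimate = knn_kernel_regress(readings, (0.2, 0.2), k=3)
print(estimate)
```

Evaluating the function on a grid of query points would give a predicted field-strength map over the plane, which could then be compared before and after shielding.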
Abstract: Although k-nearest neighbors (KNN) is a popular fingerprint matching algorithm because of its simplicity and accuracy, it is sensitive to environmental conditions, so a fuzzy c-means (FCM) clustering algorithm is applied to improve it. A KNN-based two-step FCM weighted (KTFW) algorithm for indoor positioning in wireless local area networks (WLAN) is therefore presented in this paper. In the KTFW algorithm, the k reference points (RPs) chosen by KNN are clustered through FCM based on received signal strength (RSS) and location coordinates. The appropriate clusters are chosen according to rules, so that three sets of RPs are formed, including the set of k RPs chosen by KNN, and each set is given a different weight. RPs expected to contribute more to positioning accuracy are given larger weights to improve the positioning accuracy. Simulation results indicate that KTFW generally outperforms KNN and that its complexity is greatly reduced by providing initial clustering centers for FCM.
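The KNN fingerprinting baseline that KTFW builds on can be sketched as follows. This is a generic inverse-distance weighted KNN locator, not the KTFW algorithm itself (the FCM clustering and cluster-selection rules are omitted); the function name, weighting scheme and RSS values are assumptions.

```python
import math


def wknn_locate(fingerprints, rss, k, eps=1e-9):
    """Estimate (x, y) from the k reference points whose stored RSS best match `rss`.

    `fingerprints` is a list of (rss_vector, (x, y)) pairs collected offline.
    """
    nearest = sorted(fingerprints, key=lambda fp: math.dist(fp[0], rss))[:k]
    # Closer matches in signal space get larger weights (inverse distance).
    weights = [1.0 / (math.dist(ref_rss, rss) + eps) for ref_rss, _ in nearest]
    total = sum(weights)
    x = sum(w * loc[0] for w, (_, loc) in zip(weights, nearest)) / total
    y = sum(w * loc[1] for w, (_, loc) in zip(weights, nearest)) / total
    return x, y


# Made-up radio map: RSS from two access points at two reference points.
radio_map = [
    ((-40.0, -70.0), (0.0, 0.0)),
    ((-70.0, -40.0), (10.0, 0.0)),
]
print(wknn_locate(radio_map, (-55.0, -55.0), k=2))
```

A reading that matches both reference points equally well lands midway between them; KTFW refines this step by reweighting reference points according to their FCM clusters.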
Funding: supported by the Social Science Foundation of China under Grant No. 17BGL231.
Abstract: Machine learning offers algorithms well suited to advanced time series analysis. This paper proposes a complex k-nearest neighbor (KNN) model for predicting financial time series. The model uses a complex feature extraction process that integrates forward rolling empirical mode decomposition (EMD) for financial time series signal analysis and principal component analysis (PCA) for dimension reduction. The information-rich features are extracted and then input to a weighted KNN classifier in which the features are weighted with the PCA loadings. Finally, the prediction is generated via regression on the selected nearest neighbors. The structure of the model as a whole is original. Test results on real historical data sets confirm the effectiveness of the model for predicting the Chinese stock index, an individual stock, and the EUR/USD exchange rate.
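The final stage described above, a feature-weighted KNN followed by regression on the selected neighbors, could be sketched like this. The feature weights here are hypothetical placeholders for the PCA loadings, the regression is a simple mean of neighbor targets, and the EMD and PCA stages are not shown.

```python
import math


def weighted_knn_regress(train, query, weights, k):
    """Predict by averaging the targets of the k nearest samples under a weighted distance.

    `train` is a list of (feature_vector, target) pairs; `weights` scales each feature.
    """
    def weighted_dist(a, b):
        return math.sqrt(sum(w * (u - v) ** 2 for w, u, v in zip(weights, a, b)))

    nearest = sorted(train, key=lambda s: weighted_dist(s[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Made-up samples: (features, next-step value).
history = [
    ((0.0, 100.0), 1.0),
    ((0.1, -100.0), 1.2),
    ((5.0, 0.0), 9.0),
    ((5.1, 1.0), 9.4),
]
print(weighted_knn_regress(history, (0.05, 0.0), weights=(1.0, 0.0), k=2))
print(weighted_knn_regress(history, (0.05, 0.0), weights=(0.0, 1.0), k=2))
```

Changing the weights changes which samples count as neighbors, and therefore the prediction, which is the role the PCA loadings play in the weighted classifier.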
Funding: supported by the Taif University Researchers Supporting Project (Number TURSP-2020/122), Taif University, Taif, Saudi Arabia.
Abstract: In this paper, Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) based methods are applied to fault diagnosis in a robot manipulator. The work presents a comparative study of the two classifiers in terms of successfully detecting and isolating seven classes of sensor faults. For both classifiers, the torque, the position and the speed of the manipulator are used as the input vector. It should be noted, however, that a large database is needed and used for the training and testing phases. The SVM method used in this paper is based on the Gaussian kernel with the parameter γ and the penalty margin parameter “C”, which were adjusted via the PSO algorithm to achieve maximum diagnostic accuracy. Simulations were carried out on a model of a Selective Compliance Assembly Robot Arm (SCARA) robot manipulator, and the results showed that Particle Swarm Optimization (PSO) increased the performance of the SVM algorithm, giving 96.95% accuracy, while the KNN algorithm achieved a correlation of up to 94.62%. These results showed that the SVM algorithm with PSO was more precise than the KNN algorithm when used for fault diagnosis on a robot manipulator.
Abstract: The Feixianguan Formation reservoirs in northeastern Sichuan are mainly a suite of carbonate platform deposits. The reservoir types are diverse, with high heterogeneity and complex genetic mechanisms. Pores, vugs and fractures of different genetic mechanisms and scales are often developed in association, and it is difficult to classify reservoir types based merely on static data such as outcrop observations, cores and logging data. In this study, the reservoirs in the Feixianguan Formation are grouped into five types by combining dynamic and static data: karst breccia-residual vuggy type, solution-enhanced vuggy type, fractured-vuggy type, fractured type and matrix type (non-reservoir). Based on conventional logging data, core data and formation microscanner image (FMI) data from the Qilibei block, northeastern Sichuan Basin, the reservoirs are classified in accordance with the fracture-vug matching relationship. Based on the principle of cluster analysis, K-Nearest Neighbor (KNN) classification templates are established, and the applicability of the model is verified using reservoir data from wells not involved in modeling. Following analysis of the reservoir type discrimination results and the production of the corresponding reservoir intervals, the contributions of the various reservoir types to production are evaluated and the reliability of the reservoir type classification is verified. The results show that the solution-enhanced vuggy type is the high-quality sweet spot reservoir in the study area, with good physical properties and high gas production, followed by the fractured-vuggy type, while the fractured and karst breccia-residual vuggy types are the least promising.