Funding: Project (60835005) supported by the National Natural Science Foundation of China
Abstract: High-dimensional data clustering, with the inherent sparsity of the data and the presence of noise, is a serious challenge for clustering algorithms. A new linear manifold clustering method is proposed to address this problem. The basic idea is to search for the line manifold clusters hidden in a dataset and then fuse some of them to construct higher-dimensional manifold clusters. The orthogonal distance and the tangent distance are used together as the linear manifold distance metric. Spatial neighbor information is used both to construct the initial line manifold and to optimize line manifolds during the cluster search. Experiments on real and synthetic datasets show that the proposed method outperforms several competing clustering methods in accuracy and computation time. It achieves high clustering accuracy across datasets of different sizes, manifold dimensions, and noise ratios, which confirms its robustness to noise on high-dimensional data.
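To make the orthogonal distance mentioned above concrete, here is a minimal sketch of measuring how far a point lies from a line manifold and assigning it to the closest line. The function names and the (origin, direction) representation of a line are assumptions made for illustration; the abstract does not give the paper's formulas, and the tangent distance is not reproduced here.

```python
import math

def orthogonal_distance(point, origin, direction):
    """Distance from `point` to the line {origin + t * direction}."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    offset = [p - o for p, o in zip(point, origin)]
    # Signed length of the offset along the line.
    t = sum(a * u for a, u in zip(offset, unit))
    # What remains after removing the along-line component.
    residual = [a - t * u for a, u in zip(offset, unit)]
    return math.sqrt(sum(r * r for r in residual))

def assign_to_nearest_line(point, lines):
    """Index of the (origin, direction) pair whose line is closest to `point`."""
    distances = [orthogonal_distance(point, o, d) for o, d in lines]
    return min(range(len(lines)), key=distances.__getitem__)
```

A point far along a line still has a small orthogonal distance to it, which is why an orthogonal measure alone cannot bound a cluster's extent along the manifold; the abstract pairs it with a second metric for that reason.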
Funding: Supported by the National Natural Science Foundation of China (No. 61202261, No. 61173102), the NSFC Guangdong Joint Fund (No. U0935004), and the Opening Foundation of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education of China (No. 93K172012K02)
Abstract: Meshed surfaces are ubiquitous in digital geometry processing and computer graphics. The attributes associated with each vertex, such as location, curvature, temperature, pressure, or saliency, can be regarded as data living on a manifold surface, so interpolating and approximating such data is of general interest. This paper presents two approaches to manifold data interpolation and approximation based on the properties of the Laplace-Beltrami operator (the Laplace operator defined on a manifold surface). The first uses the Laplace operator to minimize the membrane energy of a scalar function defined on a manifold. The second uses the bi-Laplace operator to minimize the thin-plate energy of a scalar function defined on a manifold. Both approaches can process data living on high-genus meshed surfaces. The Laplace-based approach is better suited to manifold data approximation and can be applied to manifold data smoothing, while the bi-Laplace-based approach is better suited to manifold data interpolation and can be applied to computing the extremal envelopes of images. All the application examples demonstrate that the procedures are robust and efficient.
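The membrane-energy idea can be illustrated on a plain graph: minimizing the sum of squared differences across edges, with some vertex values held fixed, leaves every free vertex equal to the mean of its neighbors. The sketch below assumes a uniform graph Laplacian as a stand-in for the Laplace-Beltrami operator (mesh processing typically uses cotangent weights instead) and a simple Gauss-Seidel relaxation as the solver; `harmonic_interpolate` is a hypothetical name, not the paper's procedure.

```python
def harmonic_interpolate(neighbors, fixed, iterations=200):
    """Fill in free vertex values by relaxing each toward its neighbor mean.

    neighbors: dict mapping each vertex to a non-empty list of adjacent vertices
    fixed:     dict mapping constrained vertices to their prescribed values
    """
    values = {v: fixed.get(v, 0.0) for v in neighbors}
    free = [v for v in neighbors if v not in fixed]
    for _ in range(iterations):
        for v in free:
            adjacent = neighbors[v]
            # The uniform Laplacian vanishes when a value equals the neighbor mean.
            values[v] = sum(values[u] for u in adjacent) / len(adjacent)
    return values

# Usage: a five-vertex path with both ends constrained.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = harmonic_interpolate(path, {0: 0.0, 4: 4.0})
```

On this path the free values settle on the straight line between the two constraints. The thin-plate (bi-Laplace) variant is not shown.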
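Abstract: Manifold data consist of arc-shaped or ring-shaped clusters, in which the distances between samples of the same cluster vary widely. The density peaks clustering (DPC) algorithm cannot effectively identify the centers of manifold clusters, and when it assigns the remaining samples it is prone to chained misassignment. To address this, the paper proposes a density peaks clustering algorithm based on shared nearest neighbors for manifold datasets (DPC-SNN). A sample similarity based on shared nearest neighbors is defined so that samples in the same manifold cluster are as similar as possible. Local density is then defined from this similarity without ignoring the density contribution of samples far from the cluster center, which better separates the centers of manifold clusters from other samples. The remaining samples are assigned according to their similarity, which avoids chained misassignment. Comparative experiments against DPC, FKNNDPC, FNDPC, DPCSA, and IDPC-FA show that DPC-SNN effectively finds the cluster centers of manifold data and clusters them accurately, and that it also performs well on real-world and face datasets.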
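As a rough illustration of a shared-nearest-neighbor similarity, the sketch below counts how many of their k nearest neighbors two samples have in common. This plain count is an assumption chosen for simplicity; the abstract does not state the exact similarity or density definitions used by DPC-SNN, and the function names are hypothetical.

```python
import math

def k_nearest(points, k):
    """For each point, the set of indices of its k nearest other points."""
    result = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(set(others[:k]))
    return result

def snn_similarity(points, k):
    """Matrix whose (i, j) entry counts the neighbors shared by samples i and j."""
    knn = k_nearest(points, k)
    n = len(points)
    return [
        [len(knn[i] & knn[j]) if i != j else 0 for j in range(n)]
        for i in range(n)
    ]
```

Two samples on the same arc can be far apart in Euclidean distance yet connected through a chain of pairs with shared neighbors, which is the property a neighbor-based similarity is meant to exploit.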
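Abstract: In response to increasingly frequent attacks on radio signals, this paper uses deep neural networks (DNNs), grounded in data manifold theory, to detect adversarial examples of radio signals and the attack methods that produced them. First, five different attack methods are applied to radio signals to generate adversarial examples. Next, three different neural networks are used to detect the adversarial examples. Finally, a residual neural network (ResNet) is used to identify the attack method behind each adversarial example. Experiments on radio signal data with signal-to-noise ratios (SNR) of 30 dB and 20 dB show that the ResNet achieves a detection accuracy close to 100%, and its accuracy remains above 90% on data with an SNR of 10 dB. The results indicate that the ResNet used here can effectively detect adversarial examples of radio signals and their attack methods.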
Abstract: Dimensionality reduction and data visualization are useful and important processes in pattern recognition, and many techniques have been developed in recent years. The self-organizing map (SOM) can be an efficient method for this purpose. This paper reviews recent advances in this area and related approaches such as multidimensional scaling (MDS), nonlinear PCA, and principal manifolds, as well as the connections of the SOM and its recent variant, the visualization-induced SOM (ViSOM), with these approaches. The SOM is shown to produce a quantized, qualitative scaling, while the ViSOM produces a quantitative or metric scaling and approximates a principal curve or surface. The SOM can also be regarded as a generalized MDS that relates two metric spaces by forming a topological mapping between them. The relationships among recently proposed techniques such as ViSOM, Isomap, LLE, and eigenmap are discussed and compared.
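For readers unfamiliar with the SOM, a single online training step can be sketched as follows: find the node closest to the sample, then pull that node and its grid neighbors toward the sample. The one-dimensional chain of nodes, the Gaussian neighborhood, and the parameter values are assumptions made for illustration; the ViSOM's additional distance-regularizing term is not shown.

```python
import math

def som_step(weights, sample, learning_rate=0.5, radius=1.0):
    """One online SOM update on a 1-D chain of nodes; returns the winner index.

    weights: list of weight vectors, one per node, updated in place
    sample:  input vector of the same dimension as each weight vector
    """
    # Best-matching unit: the node whose weight vector is closest to the sample.
    winner = min(range(len(weights)), key=lambda i: math.dist(weights[i], sample))
    for i, w in enumerate(weights):
        # Neighborhood strength depends on grid distance, not data-space distance.
        h = math.exp(-((i - winner) ** 2) / (2.0 * radius ** 2))
        weights[i] = [wi + learning_rate * h * (si - wi) for wi, si in zip(w, sample)]
    return winner
```

Because neighbors on the grid move together, nearby nodes end up representing nearby regions of the data, which is the topological mapping the abstract refers to.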
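Abstract: The gas collector is an important component of coking gas production. Keeping the gas collector pressure stable improves the efficiency of coking gas production and reduces the environmental pollution caused by the gases it generates. With the application of data mining theory in industry, the support vector machine (SVM) has achieved good results in gas collector pressure control, but its performance on nonlinear data is not significant. To solve this problem, a smooth support vector machine model is proposed: a system model that unifies data acquisition, data smoothing, and nonlinear approximation. Smoothness is used to denoise the data, and the smoothed data are then used for predictive control with a regression model. The proposed method is tested in simulation on real data from a steel enterprise in Tangshan. The results show that the smooth support vector model yields a small root-mean-square error in gas collector pressure control and that the control effect is significant.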