Funding: This work was supported in part by the National Key Research and Development Program of China (2018AAA0100100), the National Natural Science Foundation of China (61822301, 61876123, 61906001), the Collaborative Innovation Program of Universities in Anhui Province (GXXT-2020-051), the Hong Kong Scholars Program (XJ2019035), and the Anhui Provincial Natural Science Foundation (1908085QF271).
Abstract: Over the last three decades, evolutionary algorithms (EAs) have proven effective at solving complex optimization problems, especially those with multiple objectives and non-differentiable landscapes. However, because of their stochastic search strategies, the performance of most EAs deteriorates drastically when handling a large number of decision variables. To tackle the curse of dimensionality, this work proposes an efficient EA for solving super-large-scale multi-objective optimization problems with sparse optimal solutions. The proposed algorithm estimates the sparse distribution of optimal solutions by optimizing a binary vector for each solution, and provides a fast clustering method that greatly reduces the dimensionality of the search space. More importantly, all operations on the decision variables consist only of a few matrix calculations, which can be directly accelerated by GPUs. While existing EAs can handle fewer than 10,000 real variables, the proposed algorithm is verified to be effective with 1,000,000 real variables. Furthermore, since the proposed algorithm handles the large number of variables via accelerated matrix calculations, its runtime can be reduced to less than 10% of the runtime of existing EAs.
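The abstract does not spell out how the binary vector enters a solution. A minimal sketch, assuming the common encoding in which each decision vector is the element-wise product of a binary mask and a real-valued vector, might look as follows; `decode`, `sparsity` and `flip_one_bit` are hypothetical names chosen for illustration, not the authors' operators.

```python
import random

def decode(mask, values):
    """Element-wise product: a variable whose mask bit is 0 is forced to zero."""
    return [v if m else 0.0 for m, v in zip(mask, values)]

def sparsity(mask):
    """Fraction of decision variables switched off by the mask."""
    return 1.0 - sum(mask) / len(mask)

def flip_one_bit(mask, rng):
    """Hypothetical mutation: flip one randomly chosen mask bit."""
    child = list(mask)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child

rng = random.Random(0)
values = [0.7, -1.2, 3.4, 0.05, 2.0, -0.3]   # real-valued part (made-up numbers)
mask = [1, 0, 0, 0, 1, 0]                    # binary part: only two variables active
x = decode(mask, values)                     # the sparse decision vector
child = flip_one_bit(mask, rng)              # searching over masks changes the sparsity pattern
```

Because `decode` is a plain element-wise product, the same step over a whole population is a single matrix operation, which is the kind of calculation the abstract says can be offloaded to a GPU.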
Funding: Supported by the National Key R&D Program of China [2016YFB0600802] and the National Natural Science Foundation of China [51390492, 51325601].
Abstract: A three-dimensional (3D) fast fluidized bed with a riser 3.0 m high and 0.1 m in inner diameter was built to study experimentally the cluster behavior of Geldart B particles. Five sizes of quartz sand particles (dp = 0.100, 0.139, 0.177, 0.250 and 0.375 mm; ρp = 2480 kg·m^(-3)) were investigated in turn, with the total mass of bed material kept at 10 kg. The superficial gas velocity in the riser ranged from 2.486 to 5.594 m·s^(-1) and the solid mass flux from 30 to 70 kg·m^(-2)·s^(-1). Cluster characteristics and evolution at different positions in the riser were captured by cluster visualization systems and analyzed with self-developed binary image processing. Four typical cluster structures were found in the riser: the macro stripe-shaped cluster, the saddle-shaped cluster, the U-shaped cluster and the micro cluster. Increasing the superficial gas velocity or the particle size increases the average cluster size and decreases the cluster time fraction, while increasing the solid mass flux has the opposite effect on both. Additionally, clusters in the upper region of the riser often have a larger size and time fraction than those in the lower region. All these effects of operating conditions on clusters become less obvious when the particle size is less than 0.100 mm.
Funding: Supported by the National Natural Science Foundation of China (61401475).
Abstract: The key challenge of the extended target probability hypothesis density (ET-PHD) filter is to reduce computational complexity by using a subset to approximate the full set of partitions. In this paper, the influence of different partitions on the tracking results is analyzed, and the form of the most informative partition is obtained. A fast density peak-based clustering (FDPC) partitioning algorithm is then applied to partition the measurement set. Since only one partition of the measurement set is used, the ET-PHD filter based on FDPC partitioning has lower computational complexity than other ET-PHD filters. Because FDPC partitioning can remove spatially close clutter-generated measurements, the filter also tracks well in scenarios with more clutter-generated measurements. Simulation results show that the proposed algorithm obtains the most informative partition and markedly reduces the computational burden without losing tracking performance. As the number of clutter-generated measurements increases, the ET-PHD filter based on FDPC partitioning tracks better than other ET-PHD filters. The FDPC algorithm is expected to play an important role in the engineering realization of multiple extended target tracking filters.
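The abstract names density peak-based clustering without giving its steps. A minimal sketch of the generic density-peak idea (a local density `rho` and a distance `delta` to the nearest denser point), assuming a simple cutoff kernel, is shown below; the cutoff `dc`, the sample measurements and the function name are illustrative assumptions, and the paper's FDPC variant may differ.

```python
import math

def density_peaks(points, dc):
    """Return (rho, delta) for each point using a cutoff kernel of radius dc."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho[i]: number of other points closer than the cutoff distance dc
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # By convention, the densest point takes its largest distance to any point
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Two made-up groups of measurements plus one isolated, clutter-like point
measurements = [(0.0, 0.0), (0.3, 0.0), (-0.3, 0.0),
                (5.0, 5.0), (5.3, 5.0), (4.7, 5.0), (5.0, 5.3),
                (20.0, 20.0)]
rho, delta = density_peaks(measurements, dc=0.4)
gamma = [r * d for r, d in zip(rho, delta)]          # large gamma suggests a cluster center
centers = sorted(sorted(range(len(gamma)), key=lambda i: -gamma[i])[:2])
```

In this toy data the isolated point has a large `delta` but zero `rho`, so its `gamma` is zero and it is never picked as a center, which is consistent with the abstract's remark that clutter-generated measurements can be filtered out.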
Abstract: This work demonstrates that the so-called PCAC (Protein principal Component Analysis Clustering) method, which clusters large-scale decoy protein structures in protein structure prediction based on principal component analysis (PCA), is an ultra-fast clustering method with low memory requirements. It can be two orders of magnitude faster than the commonly used pairwise rmsd-clustering (pRMSD) when an enormous number of decoys is involved. Instead of the N(N - 1)/2 least-squares-fitting rmsd calculations and N^2 memory units needed to store the pairwise rmsd values in pRMSD, PCAC requires only N rmsd calculations and N × P memory units, where N is the number of structures to be clustered and P is the number of preserved eigenvectors. Furthermore, PCAC based on the Cartesian covariance matrix generates essentially the same result as the reference rmsd-clustering (rRMSD). In a test on 41 protein decoy sets, when the eigenvectors that together account for 90% of the total eigenvalue sum are preserved, PCAC reproduces the near-native selections of rRMSD.
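To make the cost comparison concrete, the following sketch simply evaluates the counts quoted in the abstract and shows one way to pick P from a 90% eigenvalue threshold. The decoy count, the value of P and the eigenvalue spectrum are made-up illustrations, not figures from the paper.

```python
def prmsd_cost(n):
    """(rmsd calculations, stored values) for pairwise rmsd clustering."""
    return n * (n - 1) // 2, n * n

def pcac_cost(n, p):
    """(rmsd calculations, stored values) for PCAC with p preserved eigenvectors."""
    return n, n * p

def components_for(eigenvalues, percent=90):
    """Smallest number of leading eigenvalues covering `percent` of their sum."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running * 100 >= percent * total:
            return count
    return len(ordered)

n_decoys = 10_000                              # hypothetical decoy set size
p = components_for([50, 30, 10, 5, 3, 2])      # hypothetical eigenvalue spectrum
pairwise = prmsd_cost(n_decoys)
reduced = pcac_cost(n_decoys, p)
```

For these assumed inputs the pairwise approach needs about fifty million rmsd fits, while the PCA-based count grows only linearly with the number of decoys.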
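Abstract: To explore how topographic factors affect plant species diversity and spatial distribution around the Five-hundred-meter Aperture Spherical radio Telescope (FAST), this study took three typical plant communities (tree layer, shrub layer and liana layer) in the karst peak-cluster depressions around FAST as research objects, and used analysis of variance and canonical correspondence analysis (CCA) to study the species diversity and spatial distribution of the plant communities along gradients of different topographic factors (elevation, slope, aspect and slope position). The results showed that: (1) the α diversity index of plant communities around FAST followed the trend shrub layer > tree layer > liana layer; the α diversity indices of the tree and liana layers increased with elevation (P<0.05), while topographic factors had no significant effect on the α diversity of the shrub layer. (2) The spatial distribution of plant species around FAST was most strongly affected by elevation, followed by slope (P<0.05). (3) The Jaccard similarity index of the three plant communities around FAST tended to increase with elevation, and first rose and then fell as slope increased. In summary, species differ in their habitat selection, and elevation and slope are the key factors affecting the spatial distribution of plant communities in the karst peak-cluster depressions around FAST.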
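For readers unfamiliar with the Jaccard similarity index used in result (3), it is the size of the intersection of two species sets divided by the size of their union. A minimal sketch with hypothetical species lists for two plots:

```python
def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (defined as 1.0 for two empty sets)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical species recorded in two plots at different elevations
plot_low = {"sp_a", "sp_b", "sp_c", "sp_d"}
plot_high = {"sp_c", "sp_d", "sp_e"}
similarity = jaccard(plot_low, plot_high)   # 2 shared species out of 5 in total
```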
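Abstract: Objective: To address the difficulty of extracting fault-signal features, the scarcity of labeled data and the low diagnostic accuracy in rotating machinery fault diagnosis, an unsupervised diagnosis method is proposed that combines adaptive variational mode decomposition (AVMD) with a fuzzy C-means algorithm optimized by density peak clustering (Clustering by Fast Search and Find of Density Peaks Optimizing Fuzzy C-Means, DPC-FCM). Methods: First, a composite coefficient combining multiscale permutation entropy and kurtosis is used as the fitness function to optimize the penalty factor alpha and the number of modes K of the VMD algorithm; the mean sample entropy and mean fuzzy entropy of the decomposed intrinsic mode functions (IMFs) are then extracted and fed into the clustering algorithm. Second, the density peak clustering algorithm is used to determine the initial cluster centers of FCM, which reduces the randomness of the clustering results. Results: The proposed unsupervised fault diagnosis model was applied to rolling bearing test signals and achieved accurate fault diagnosis. Conclusion: AVMD is superior for fault feature extraction, and the DPC algorithm effectively improves the accuracy of unsupervised FCM clustering; combining the two enables effective intelligent classification of rotating machinery faults.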
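The second step of the method, seeding fuzzy C-means with fixed initial centers instead of random ones, can be sketched with the textbook FCM updates. The feature values, the fuzzifier `m = 2` and the choice of seeds below are assumptions for illustration; in the paper the seeds would come from the density peak step and the features from the IMF entropies.

```python
import math

def fcm(points, centers, m=2.0, iters=50, eps=1e-12):
    """Textbook fuzzy C-means started from the given initial centers."""
    centers = [tuple(c) for c in centers]
    dims = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u[i][k] is proportional to d(i, k) ** (-2 / (m - 1))
        u = []
        for p in points:
            d = [max(math.dist(p, c), eps) for c in centers]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
            s = sum(inv)
            u.append([v / s for v in inv])
        # Center update: mean of the points weighted by membership ** m
        new_centers = []
        for k in range(len(centers)):
            w = [row[k] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[dim] for wi, p in zip(w, points)) / total
                for dim in range(dims)))
        centers = new_centers
    return centers, u

# Hypothetical 2-D features, e.g. (mean sample entropy, mean fuzzy entropy)
features = [(0.10, 0.20), (0.12, 0.22), (0.11, 0.18),
            (0.80, 0.90), (0.82, 0.88), (0.79, 0.91)]
seeds = [features[0], features[3]]   # stand-in for centers found by density peaks
centers, memberships = fcm(features, seeds)
labels = [row.index(max(row)) for row in memberships]
```

Starting from deterministic seeds makes repeated runs return the same partition, which is the reduction in randomness the abstract refers to.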
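Abstract: To provide more efficient solutions for processing high-dimensional matrices and applying singular value decomposition in big-data environments, and thereby speed up data analysis and processing, this work studies the computation of eigenvalues and eigenvectors (singular values and singular vectors) of high-dimensional matrices under random projection and Krylov subspace projection theory, summarizes six efficient computational methods, and compares their related applications. The results show that, in spectral clustering, reducing the complexity of the core SVD (Singular Value Decomposition) step yields an optimized algorithm whose accuracy is close to that of the original spectral clustering algorithm but whose running time is much shorter: on 1200-dimensional data it is more than 10 times faster than the original algorithm. The method was also applied to image compression, where it effectively improves the efficiency of the original algorithm, giving a 1 to 5 times improvement in running efficiency with no loss of accuracy.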
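The abstract does not name the six methods it surveys. As a hedged illustration of the underlying idea only, the sketch below uses plain power iteration on AᵀA from a random start, the simplest relative of Krylov subspace methods, to estimate the largest singular value without a full SVD. The matrix and the function names are hypothetical.

```python
import math
import random

def matvec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def top_singular_value(a, iters=100, seed=0):
    """Estimate the largest singular value of `a` by power iteration on A^T A."""
    rng = random.Random(seed)
    at = transpose(a)
    v = [rng.gauss(0.0, 1.0) for _ in range(len(a[0]))]   # random starting vector
    for _ in range(iters):
        w = matvec(at, matvec(a, v))                      # one application of A^T A
        scale = norm(w)
        v = [x / scale for x in w]
    return norm(matvec(a, v)), v

a = [[4.0, 0.0],
     [3.0, -5.0]]        # A^T A has eigenvalues 40 and 10
sigma, right_vector = top_singular_value(a)
```

Each iteration costs only two matrix-vector products, which is why projection-type methods scale to matrices where a dense SVD is too expensive.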
Funding: Supported by the National Natural Science Foundation of China (61103157).
Abstract: This paper proposes a method of environment mapping using laser-based light detection and ranging (LIDAR). The method not only detects well over a wide range of detection angles, but also facilitates the detection of dynamic and hollowed-out obstacles. Building on this method, an improved clustering algorithm based on fast search and discovery of density peaks (CBFD) is presented to extract the various obstacles in the environment map. Compared with other clustering algorithms, CBFD can determine a suitable number of clusters automatically. Furthermore, experiments show that CBFD is better and more robust in functionality and performance than K-means and the iterative self-organizing data analysis techniques algorithm (ISODATA).
Funding: This work was supported by the Hundred Talents fund of the Chinese Academy of Sciences, the National Natural Science Foundation of China (No. 20703048, No. 20803083 and No. 20933008), the Center for Molecular Science Foundation of the Institute of Chemistry, Chinese Academy of Sciences (No. CMS-CX200803), and the National Basic Research Programs of China (No. 2006CB932100 and No. 2006CB806200).
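Abstract: DBSCAN (density-based spatial clustering of applications with noise) is one of the most widely used density-based clustering algorithms. However, its time complexity is too high (O(n^2)) for it to handle large-scale data. Accelerating it has therefore become a research hotspot, and many fruitful works keep emerging. In terms of acceleration goals, these works fall broadly into two classes: reducing redundant computation and parallelization. In terms of specific acceleration techniques, they fall into six main categories: distributed, sampling-based, approximate or fuzzy, fast nearest-neighbor, space-partitioning and GPU-accelerated techniques. Following this classification, existing work is reviewed in depth and cross-compared, with three findings: hybrid acceleration algorithms that combine multiple techniques outperform those using a single technique; approximation, parallelization and distribution are currently the most effective means; and high-dimensional data remain hard to handle. In addition, applications of fast DBSCAN algorithms in several fields are tracked and reported. Finally, future directions for the field are discussed.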
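As an illustration of the space-partitioning category, the sketch below replaces the all-pairs neighborhood search behind the O(n^2) cost with a uniform grid of cell width `eps`, so each query inspects only the 3 × 3 block of cells around a point. This is a generic 2-D sketch with hypothetical names and data, not any specific algorithm from the survey.

```python
import math
from collections import defaultdict

def build_grid(points, eps):
    """Bucket point indices into square cells of side eps."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(idx)
    return grid

def region_query(points, grid, eps, i):
    """Indices of all points within eps of point i (including i itself)."""
    x, y = points[i]
    cx, cy = math.floor(x / eps), math.floor(y / eps)
    found = []
    # Any point within eps lies in the same cell or one of its 8 neighbors
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((cx + dx, cy + dy), ()):
                if math.dist(points[i], points[j]) <= eps:
                    found.append(j)
    return sorted(found)

def brute_force_query(points, eps, i):
    """Reference all-pairs search, used here only to check the grid version."""
    return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

points = [(0.0, 0.0), (0.4, 0.1), (0.9, 0.9), (3.0, 3.0), (3.2, 3.1), (-0.3, -0.2)]
eps = 0.5
grid = build_grid(points, eps)
neighbors_of_first = region_query(points, grid, eps, 0)
```

The grid keeps queries cheap in low dimensions, but the number of adjacent cells grows as 3^d, which is one reason high-dimensional data remain difficult, as the abstract notes.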