UMAP-Assisted Fuzzy C-Clustering Method for Recognition of Terahertz Spectrum
Abstract Terahertz (THz) waves are low in energy, transient, and well suited to spectral analysis, which gives them broad application prospects in material identification. Existing THz-based identification methods have achieved some success, but they tend to fall into local optima, which lowers identification accuracy. Uniform Manifold Approximation and Projection (UMAP) is a nonlinear dimensionality reduction method that assumes the data are uniformly distributed on a Riemannian manifold and can model manifolds with a fuzzy topological structure. UMAP reduces dimensionality by minimizing the cross-entropy between two topological representations, thereby optimizing the layout of the data representation in the low-dimensional space. In the traditional fuzzy C-clustering method (FCM), the initial cluster centers are usually chosen at random; when they are chosen poorly, the algorithm easily falls into a local optimum and produces wrong clusters. To address this, the paper proposes a UMAP-assisted fuzzy C-clustering algorithm. First, UMAP reduces the dimensionality of the input THz sample matrix. Next, suitable initial cluster centers are selected according to the principle of maximizing the distance between classes. Finally, fuzzy C-means clustering is performed. The proposed method relieves overcrowding between classes during clustering and reflects inter-class distance information, which makes it easier to select suitable initial cluster centers. To verify its reliability, four types of genetically modified cotton seeds (Lu Mianyan 28, Lu Mianyan 29, Lu Mianyan 36, and Zhongmian 28) were measured by THz time-domain spectroscopy, and the UMAP-assisted fuzzy C-clustering algorithm was applied to their absorbance spectra. The four types were successfully separated with an overall accuracy of 0.9833, which indicates that the proposed algorithm has good application prospects for THz spectral identification of materials.
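The last two steps of the pipeline described in the abstract (distance-based selection of initial centers, then fuzzy C-means) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact selection rule, so a greedy max-min (farthest-point) rule stands in for "maximizing the distance between classes". The UMAP step is assumed to have been run already (for example with the umap-learn package), and the `embedding` list is made-up toy data standing in for a 2-D UMAP output. The function names `farthest_point_centers` and `fuzzy_c_means` are hypothetical.

```python
import math

def farthest_point_centers(points, c):
    """Pick c initial centers that are mutually far apart (greedy max-min rule).

    Assumption: this rule is a stand-in; the paper's exact criterion is not
    stated in the abstract.
    """
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    # Deterministic start: the point farthest from the data mean.
    centers = [max(points, key=lambda p: math.dist(p, mean))]
    while len(centers) < c:
        # Next center: the point whose nearest chosen center is farthest away.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, v) for v in centers))
        )
    return [list(v) for v in centers]

def fuzzy_c_means(points, centers, m=2.0, tol=1e-6, max_iter=200):
    """Standard FCM: alternate membership and center updates until centers settle."""
    c, dim = len(centers), len(points[0])
    u = []
    for _ in range(max_iter):
        # Membership of each point in each cluster, from inverse distances.
        u = []
        for p in points:
            dists = [max(math.dist(p, v), 1e-12) for v in centers]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in dists]
            total = sum(inv)
            u.append([x / total for x in inv])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            w_sum = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / w_sum
                 for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # u holds the memberships computed in the final iteration.
    return centers, u

# Toy stand-in for a 2-D UMAP embedding: four tight groups of five points each.
offsets = [(0.0, 0.0), (0.3, 0.1), (-0.2, 0.25), (0.1, -0.3), (-0.1, -0.1)]
anchors = [(0, 0), (10, 0), (0, 10), (10, 10)]
embedding = [(ax + dx, ay + dy) for ax, ay in anchors for dx, dy in offsets]

init = farthest_point_centers(embedding, c=4)
centers, u = fuzzy_c_means(embedding, init)
labels = [max(range(4), key=lambda k: row[k]) for row in u]
```

Because the initial centers are spread across the data instead of drawn at random, each toy group starts with its own center, which is the failure mode of random initialization that the abstract describes. The fuzzifier `m=2.0` is a common default and not a value taken from the paper.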
Authors YI Can-can; TUO Shuai; TU Shan; ZHANG Wen-tao (Key Laboratory of Metallurgical Equipment and Control Technology of Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China; Hubei Provincial Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan 430081, China; Precision Manufacturing Institute, Wuhan University of Science and Technology, Wuhan 430081, China; School of Physical Science and Technology, Guangxi Normal University, Guilin 541004, China; School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China)
Source Spectroscopy and Spectral Analysis (光谱学与光谱分析; indexed in SCIE, EI, CAS, CSCD, PKU Core), 2022, No. 9, pp. 2694-2701 (8 pages)
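Funding Supported by the National Natural Science Foundation of China (51805382), the Guangxi Key Research and Development Program (2020AB44003), the Director's Fund of the Guangxi Key Laboratory of Optoelectronic Information Processing (GD18206), and the National Science and Technology Major Project (2017ZX02101007-003).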
Keywords Terahertz time-domain spectroscopy; Substance identification; Transgenic cotton seeds; UMAP; Dimensionality reduction; Fuzzy C-clustering method