Research on Feature Extraction of Near-Infrared Spectroscopy Based on Joint Matrix Local Preserving Projection

Cited by: 6
Abstract: Near-infrared (NIR) spectra are high-dimensional, noisy, overlapping and nonlinear, which seriously degrades modeling accuracy. To address this, a feature extraction method based on joint matrix local preserving projection (JMLPP) is proposed. First, cluster-based spectral feature selection extracts effective features from the raw NIR data. Using several indicators that correlate strongly with the classification, the samples are grouped under several different clustering schemes. Following the clustering principle of strong intra-class correlation and large inter-class difference, the intra-class and inter-class thresholds are set by tuning an intra-class parameter and an inter-class parameter. Spectral feature intervals are then screened separately under each clustering scheme to give the indicator feature matrices, and a union operation over these matrices generates the joint matrix. This step discards features with low intra-class correlation and high inter-class correlation, and so removes noise from the spectra. Second, the local preserving projection (LPP) algorithm is improved in two respects: geodesic distance is used to construct the neighborhood distance matrix, which represents the topology among high-dimensional sample points better than Euclidean distance; and the edge weight matrix is modified, which resolves the uncertainty caused by sparse samples and avoids losing useful information. Finally, the improved LPP algorithm reduces the dimensionality of the joint matrix to yield the optimal spectral feature subset. To validate JMLPP, it was first compared with PCA and LPP in terms of spectral projection. JMLPP distinguished the grades better: the tobacco leaf samples were clearly separated in the projection space, markedly better than with PCA or LPP. Classification accuracy was then compared using tobacco leaf grade classification models built on the full spectrum and on features reduced by PCA, LPP and JMLPP. The JMLPP-based model reached an accuracy of 93.8%; its sensitivities for the five tobacco leaf grades were 95.2%, 93.1%, 94.2%, 92.1% and 92.5%, and its specificities were 99.3%, 98.4%, 98.6%, 97.5% and 97%, respectively. Accuracy, sensitivity and specificity were all markedly higher than those of the other three methods. By combining cluster-based feature extraction with the improved LPP algorithm, the method extracts grading-relevant features effectively and preserves the local linear relationships of the original data, so the resulting model is stable and accurate.
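The geodesic-distance step of the improved LPP can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the modified edge weight matrix, so a standard heat-kernel weight is substituted, and the cluster-based feature selection that builds the joint matrix is omitted. The function names `geodesic_distances` and `geodesic_lpp`, the kernel-width heuristic and the regularization term `reg` are all assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist


def geodesic_distances(X, k=5):
    """Approximate geodesic distances as shortest paths on a k-NN graph."""
    d = cdist(X, X)                              # pairwise Euclidean distances
    n = d.shape[0]
    nn = np.argsort(d, axis=1)[:, 1:k + 1]       # k nearest neighbours, self skipped
    rows = np.repeat(np.arange(n), k)
    graph = np.zeros((n, n))                     # zero entries mean "no edge"
    graph[rows, nn.ravel()] = d[rows, nn.ravel()]
    graph = np.maximum(graph, graph.T)           # make the graph symmetric
    G = shortest_path(graph, method="D", directed=False)
    if np.isinf(G).any():
        raise ValueError("k-NN graph is disconnected; try a larger k")
    return G


def geodesic_lpp(X, n_components=2, k=5, t=None, reg=1e-6):
    """LPP with geodesic neighbourhoods and heat-kernel weights (assumed form)."""
    n, dim = X.shape
    G = geodesic_distances(X, k)
    if t is None:
        t = np.median(G[G > 0]) ** 2             # heuristic kernel width (assumption)
    nn = np.argsort(G, axis=1)[:, 1:k + 1]       # neighbours ranked by geodesic distance
    rows = np.repeat(np.arange(n), k)
    W = np.zeros((n, n))
    W[rows, nn.ravel()] = np.exp(-G[rows, nn.ravel()] ** 2 / t)
    W = np.maximum(W, W.T)                       # symmetric edge weight matrix
    D = np.diag(W.sum(axis=1))                   # degree matrix
    L = D - W                                    # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X
    B = B + reg * np.trace(B) / dim * np.eye(dim)  # keep B positive definite
    _, vecs = eigh(A, B)                         # eigenvalues in ascending order
    P = vecs[:, :n_components]                   # directions with smallest eigenvalues
    return X @ P, P
```

Here `X` would be the joint matrix with one sample per row; `geodesic_lpp(X, n_components=2)` returns the projected samples together with the projection matrix. The regularization term matters for spectra, where the number of wavelengths usually exceeds the number of samples and `X.T @ D @ X` would otherwise be singular.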
Authors: HU Shan-ke (胡善科); QIN Yu-hua (秦玉华); DUAN Ru-min (段如敏); WU Li-jun (吴丽君); GONG Hui-li (宫会丽) (College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China; Technical Research Center, China Tobacco Yunnan Industrial Co., Ltd., Kunming 650024, China; College of Information Science and Engineering, China Ocean University, Qingdao 266100, China)
Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2020, No. 12, pp. 3772-3777 (6 pages). Indexed in: SCIE, EI, CAS, CSCD, Peking University Core Journals (北大核心).
Funding: Supported by the National Key R&D Program of China (2018YFB1701704) and a project of China Tobacco Yunnan Industrial Co., Ltd. (2018JC01).
Keywords: Feature extraction; Joint matrix; Geodesic distance; Local preservation projection algorithm; Near-infrared spectroscopy