Abstract
To capture the fine structure and complex information in the spectral features of seeds with different vigor, this study explored the continuous wavelet transform for extracting spectral information from delinted cotton seeds of different vigor, and proposed a method for screening wavelet features (WFs) based on correlation and feature importance. Delinted cotton seeds of different vigor levels were obtained through an artificial aging test, and their hyperspectral images were collected. The raw spectra were preprocessed with Savitzky-Golay smoothing, multiplicative scatter correction, first-order differentiation, and second-order differentiation. The WFs extracted with wavelet basis functions such as gauss4, mexh, and bior6.8 were then compared. Principal component analysis was used to reduce the dimensionality of the spectral features (SFs) and the WFs, and seed vigor detection models were built with support vector machines (SVM), random forest (RF), extreme learning machines (ELM), and back propagation neural networks (BPNN) so that the modeling accuracy of SFs and WFs could be compared. To further extract the fine spectral information in the WFs, three WF feature sets were derived from correlation analysis and random forest feature importance evaluation and fed into the above machine learning models: the WFs whose correlation with seed vigor ranked in the top 1% (1%|R|-WFs), the WFs whose feature importance for seed vigor recognition ranked in the top 1% (1%Importance-WFs), and the fusion of the two (1%|R|+1%Importance-WFs). The results showed that: 1) the bior6.8 function extracted WFs of delinted cotton seeds of different vigor comparatively well, whereas the other functions produced an obvious ringing effect when extracting WFs. 2) In all machine learning models for every variety, the modeling accuracy of the WF principal components was higher than that of the SF principal components, and the models based on 1%|R|+1%Importance-WFs achieved the highest accuracy. 3) The optimal model for seed vigor detection of both Jinke 21 and Jinke 20 was 1%|R|+1%Importance-WFs+ELM; for Xinluzao 64, the optimal models were 1%|R|+1%Importance-WFs combined with any of the machine learning models, and PCA-WFs+ELM/BPNN. The training-set and test-set accuracies of the optimal model were 99.63% and 98.28% for Jinke 21, and both 100% for Jinke 20 and Xinluzao 64. These results indicate that the proposed method based on correlation and feature importance can effectively extract the spectral difference information of delinted cotton seeds of different vigor, offering an approach to spectral feature analysis for hyperspectral detection of seed vigor.
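The screening step described in the abstract (top 1% by correlation, top 1% by feature importance, and their fusion) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that |R| is the absolute Pearson correlation (the abstract only says "correlation analysis"), that the importance vector comes from a separately fitted random forest (for example scikit-learn's `feature_importances_` attribute), and that the fusion is a simple union of indices. The function names `abs_pearson`, `top_fraction`, and `fuse_feature_sets`, as well as the synthetic data, are hypothetical.

```python
import numpy as np

def abs_pearson(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; report zero correlation for them.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, np.abs(Xc.T @ yc) / safe, 0.0)

def top_fraction(scores, fraction=0.01):
    """Sorted indices of the highest-scoring features (at least one)."""
    k = max(1, int(np.ceil(fraction * scores.size)))
    return np.sort(np.argsort(scores)[::-1][:k])

def fuse_feature_sets(X, y, importance, fraction=0.01):
    """Return the |R| set, the importance set, and their union."""
    r_idx = top_fraction(abs_pearson(X, y), fraction)
    imp_idx = top_fraction(importance, fraction)
    return r_idx, imp_idx, np.union1d(r_idx, imp_idx)

# Synthetic stand-in for a (samples x wavelet features) matrix.
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=200).astype(float)   # assumed vigor grades 0..3
X = rng.normal(size=(200, 300))
X[:, 7] += 2.0 * y                               # feature 7 tracks vigor

# Hypothetical importance vector; in practice it would come from a
# fitted random forest rather than being written by hand.
importance = rng.random(300) * 0.01
importance[42] = 1.0

r_idx, imp_idx, fused = fuse_feature_sets(X, y, importance)
print("top |R| features:", r_idx)
print("top importance features:", imp_idx)
print("fused feature set:", fused)
```

The fused index array can then be used to slice the feature matrix (`X[:, fused]`) before training any of the classifiers named in the abstract.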
Authors
DU Wenling (杜文玲)
GUO Peng (郭鹏)
LIU Xiao (刘笑)
College of Science, Shihezi University, Shihezi 832003, China; Key Laboratory of Oasis Town and Basin System Ecological Corps, Shihezi 832003, China
Source
Transactions of the Chinese Society of Agricultural Engineering (《农业工程学报》)
Indexed in: EI, CAS, CSCD, PKU Core (北大核心)
2024, Issue 20, pp. 174-186 (13 pages)
Funding
National Natural Science Foundation of China (U2003109).