Abstract: In this study, a seed origin discrimination model for Clinacanthus nutans was developed. First, 81 C. nutans samples from three seed origin locations were collected, and their near-infrared (NIR) spectra were acquired. Next, principal component analysis (PCA) was performed on the NIR spectra of the 81 samples. Then, four spectral pre-treatments, namely multiplicative scatter correction (MSC), standard normal variate (SNV), first derivative, and second derivative, were applied to the spectra and each combined with the support vector machine (SVM) algorithm for modelling and analysis. Among these, the first-derivative pre-treatment gave the best SVM model, with a training set accuracy of 93.44% (57/61) and a test set accuracy of 85.00% (17/20). To further improve the discrimination accuracy, three optimization algorithms, grid search (GS), genetic algorithm (GA), and particle swarm optimization (PSO), were used to identify the best c and g parameters for the SVM model. PSO yielded the best parameters, c = 0.8343 and g = 57.8741, with a corresponding training set accuracy of 98.36% (60/61) and a test set accuracy of 95.00% (19/20). Therefore, a seed origin classification model for C. nutans based on NIR spectroscopy combined with chemometrics is feasible and has the advantages of being simple, rapid, and green.
Funding: This work was financially supported by the National Wild Plant Germplasm Resources Infrastructure, as follow-up work to the project "Standardization and Community for the Collection and Preservation of Important Wild Plant Germplasm Resources" (2005DKA21006).
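The spectral pre-treatments named in the C. nutans abstract above (SNV, MSC, and the first derivative) can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the function names are hypothetical, the derivative is assumed to be a plain adjacent-point difference (NIR studies often use a smoothed Savitzky-Golay derivative instead), and MSC is assumed to use the mean spectrum as its reference.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own sample standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction.

    Each spectrum is regressed on the mean spectrum (assumed reference),
    s ~ a + b * ref, and then corrected as (s - a) / b.
    """
    n_points = len(spectra[0])
    ref = [mean(s[i] for s in spectra) for i in range(n_points)]
    ref_mean = mean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Ordinary least-squares slope and intercept against the reference.
        b = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        a = s_mean - b * ref_mean
        corrected.append([(x - a) / b for x in s])
    return corrected


def first_derivative(spectrum):
    """First-order difference between adjacent wavelength points
    (an unsmoothed stand-in for a Savitzky-Golay derivative)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]
```

Each pre-treated matrix would then be passed to the classifier. In the abstract, the c and g parameters of the SVM are the values tuned by GS, GA, and PSO.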
Abstract: Emblic medicine is a popular natural source worldwide owing to its outstanding healthcare and therapeutic functions. Our preliminary results indicated that the quality of emblic medicines might vary markedly by region, yet no rapid and effective geographical traceability system has been designed. To trace the geographical origins so that quality can be controlled, an integrated spectroscopic strategy comprising spectral pretreatment, outlier diagnosis, feature selection, data fusion, and a machine learning algorithm was proposed. A featured data matrix (245 × 220) was successfully generated, and a carefully tuned random forest (RF) machine learning algorithm was used to develop the geographical traceability model. The results demonstrate that the proposed strategy is effective and can be generalized. Sensitivity (SEN), specificity (SPE), and accuracy (ACC) of 97.65%, 99.85%, and 97.63% for the calibration set, as well as 100.00% predictive efficiency, were obtained using this spectroscopic analysis strategy. Our study has created an integrated analysis process for multiple spectral data, which can achieve rapid, nondestructive, and green quality detection for emblic medicines originating from seventeen geographical origins.
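The SEN, SPE, and ACC figures reported in the emblic medicine abstract above are standard classification metrics. The sketch below shows one common way to compute them for a multi-class problem from a confusion matrix, using a one-vs-rest breakdown per class. The abstract does not state how the per-class values were aggregated, so the function name, the one-vs-rest approach, and the toy matrix in the usage note are all assumptions for illustration only.

```python
def one_vs_rest_metrics(confusion):
    """Per-class sensitivity and specificity plus overall accuracy.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. Every class is assumed to have
    at least one true sample, otherwise sensitivity is undefined.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c missed
        fp = sum(confusion[r][c] for r in range(k)) - tp  # others called c
        tn = total - tp - fn - fp
        per_class.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    accuracy = sum(confusion[c][c] for c in range(k)) / total
    return per_class, accuracy
```

For example, `one_vs_rest_metrics([[9, 1, 0], [0, 10, 0], [1, 0, 9]])` treats a made-up three-origin problem with 30 samples. A seventeen-origin model such as the one in the abstract would use a 17 × 17 matrix in the same way.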