High Dimension Multivariate Data Analysis for Small Group Samples of Chemical Volatile Profiles of African Nightshade Species
Authors: Lorna Chepkemoi, Daisy Salifu, Lucy Kananu Murungi, Henri E. Z. Tonnang. Journal of Data Analysis and Information Processing, 2024, Issue 2, pp. 210-231 (22 pages)
Quantitative headspace analysis of volatiles emitted by plants or other living organisms in chemical ecology studies generates large multidimensional data sets that require extensive mining and refining to extract useful information. Often the number of variables (the quantified volatile compounds) exceeds the number of observations or samples, so many traditional statistical methods become inefficient. Here, we employed a machine learning algorithm, random forest (RF), in combination with a distance-based procedure, similarity percentage (SIMPER), as preprocessing steps to reduce the dimensionality of the chemical profiles of volatiles from three African nightshade plant species before subjecting the data to non-metric multidimensional scaling (NMDS). In addition, the non-parametric methods permutational multivariate analysis of variance (PERMANOVA) and analysis of similarities (ANOSIM) were applied to test the hypothesis of differences among the African nightshade species based on their volatile profiles and to confirm the patterns revealed by the NMDS plots. Our results revealed significant differences among the African nightshade species when the dimension of the data was reduced using RF variable importance and SIMPER, as also supported by NMDS plots showing S. scabrum separated from S. villosum and S. sarrachoides based on the reduced set of variables. The novelty of our work lies in showing the merits of data reduction techniques for revealing group differences that might not have been detected had the analysis been performed on the entire original data matrix, which is characterized by small samples.
The R code used in the analysis is shared herein so that interested researchers can customise it for their own data of a similar nature.
Keywords: Random Forest; Similarity Percentage; PERMANOVA; ANOSIM; Non-Metric Multi-Dimensional Scaling
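To make the PERMANOVA step named in the abstract more concrete, the following is a minimal Python sketch of a one-way permutation test on Bray-Curtis dissimilarities. It is not the authors' R code. The choice of Bray-Curtis, the function names, and the toy `profiles` matrix (9 samples, 4 compounds, 3 groups) are all assumptions made for illustration, and the numbers are invented.

```python
import random
from itertools import combinations

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = bray_curtis(samples[i], samples[j])
    return d

def pseudo_f(d, labels):
    """One-way PERMANOVA pseudo-F from a distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same quantity computed inside each group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(d[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(samples, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    d = distance_matrix(samples)
    f_obs = pseudo_f(d, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels, keep distances fixed
        if pseudo_f(d, shuffled) >= f_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return f_obs, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: rows are samples, columns are compound abundances.
profiles = [
    [10, 2, 0, 1], [12, 3, 1, 0], [11, 2, 0, 2],  # group A
    [1, 9, 8, 0], [0, 10, 7, 1], [2, 8, 9, 0],    # group B
    [1, 10, 7, 1], [2, 9, 8, 1], [0, 9, 9, 0],    # group C
]
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3

f_obs, p_value = permanova(profiles, labels, n_perm=199, seed=1)
print(f"pseudo-F = {f_obs:.2f}, p = {p_value:.3f}")
```

In this sketch only the group labels are permuted while the distance matrix stays fixed, which is what makes the test usable when the compounds outnumber the samples. With very few samples per group the number of distinct label arrangements is small, which limits the smallest p-value the test can return.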