2 articles found
1. Variable importance-weighted Random Forests (cited by: 4)
Authors: Yiyi Liu, Hongyu Zhao. Frontiers of Electrical and Electronic Engineering in China (CSCD), 2017, Issue 4, pp. 338-351 (14 pages)
Background: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates when the number of features increases. To address this limitation, feature elimination Random Forests, which uses only the features with the largest variable importance scores, was proposed. Yet the performance of this method is unsatisfactory, possibly because of its rigid feature selection and the increased correlation between the trees of the forest. Methods: We propose variable importance-weighted Random Forests. Instead of sampling features with equal probability at each node when building trees, the method samples features according to their variable importance scores and then selects the best split from the randomly selected features. Results: We evaluate the performance of our method through comprehensive simulation and real data analyses, for both regression and classification. Compared with the standard Random Forests and the feature elimination Random Forests methods, our proposed method has improved performance in most cases. Conclusions: By incorporating the variable importance scores into the random feature selection step, our method can make better use of more informative features without completely ignoring less informative ones, and hence has improved prediction accuracy in the presence of weak signals and substantial noise. We have implemented an R package "viRandomForests" based on the original R package "randomForest"; it can be freely downloaded from http://zhaocenter.org/software.
Keywords: Random Forests; variable importance score; classification; regression
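The node-level step described in the Methods of this abstract (sample candidate features in proportion to their importance scores, then pick the best split among them) might be sketched as follows. This is an illustrative Python sketch, not the code of the "viRandomForests" R package: the function names, the treatment of negative scores as zero weight, the uniform fallback, and the squared-error split criterion are all assumptions made here for the example.

```python
import random


def sample_candidate_features(importance, mtry, rng):
    """Draw up to `mtry` distinct feature indices, weighted by importance."""
    # Assumption: negative importance scores are treated as zero weight.
    weights = [max(float(s), 0.0) for s in importance]
    if sum(weights) == 0.0:
        weights = [1.0] * len(weights)  # assumption: fall back to uniform sampling
    remaining = [j for j, w in enumerate(weights) if w > 0.0]
    chosen = []
    # Sequential weighted draws without replacement.
    while remaining and len(chosen) < mtry:
        pick = rng.choices(remaining, weights=[weights[j] for j in remaining], k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y, features):
    """Return (feature, threshold, cost) with the lowest summed child SSE."""
    best = (None, None, float("inf"))
    for j in features:
        levels = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between adjacent distinct values,
        # so both children are always non-empty.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            cost = sse(left) + sse(right)
            if cost < best[2]:
                best = (j, t, cost)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    X = [[0, 5], [1, 5], [2, 5], [3, 5]]  # made-up toy data
    y = [0, 0, 10, 10]
    candidates = sample_candidate_features([3.0, 1.0], mtry=2, rng=rng)
    print(best_split(X, y, candidates))
```

Because every feature with a positive score keeps a nonzero sampling probability, weakly informative features are down-weighted rather than discarded, which is the contrast with feature elimination that the abstract draws.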
2. Estimation of Sunflower Seed Yield Using Partial Least Squares Regression and Artificial Neural Network Models (cited by: 5)
Authors: ZENG Wenzhi, XU Chi, Gang ZHAO, WU Jingwei, HUANG Jiesheng. Pedosphere (SCIE, CAS, CSCD), 2018, Issue 5, pp. 764-774 (11 pages)
Statistical models can efficiently establish the relationships between crop growth and environmental conditions while explicitly quantifying uncertainties. This study aimed to test the efficiency of statistical models established using partial least squares regression (PLSR) and artificial neural network (ANN) in predicting seed yields of sunflower (Helianthus annuus). Two-year field trial data on sunflower growth under different salinity levels and nitrogen (N) application rates in the Yichang Experimental Station in Hetao Irrigation District, Inner Mongolia, China, were used to calibrate and validate the statistical models. The variable importance in projection score was calculated in order to select the sensitive crop indices for seed yield prediction. We found that when the most sensitive indices were used as inputs for seed yield estimation, the PLSR could attain an accuracy (root mean square error (RMSE) = 0.93 t ha^-1, coefficient of determination (R^2) = 0.69) comparable to that obtained when using all measured indices (RMSE = 0.81 t ha^-1, R^2 = 0.77). The ANN model outperformed the PLSR for yield prediction with different combinations of inputs for both microplot and field data. The results indicated that sunflower seed yield could be reasonably estimated by using a small number of crop characteristic indices under complex environmental conditions and management options (e.g., saline soils and N application). Since leaf area index and plant height were found to be the most sensitive crop indices for sunflower seed yield prediction, remotely sensed data and the ANN model may be combined for regional crop yield simulation.
Keywords: leaf area index; microplot experiment; plant height; remote sensing; salinization; variable importance in projection score
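The two accuracy measures quoted in this abstract, RMSE and R^2, can be computed as in the short Python sketch below. It assumes the common definition R^2 = 1 - SS_res / SS_tot; the abstract does not say which definition the authors used (a squared correlation coefficient is another frequent choice). The yield values are made up for illustration and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical seed yields in t ha^-1 (not taken from the paper).
    observed = [2.0, 3.0, 4.0, 5.0]
    predicted = [2.5, 2.5, 4.5, 4.5]
    print(f"RMSE = {rmse(observed, predicted):.2f} t ha^-1")
    print(f"R^2  = {r_squared(observed, predicted):.2f}")
```

RMSE stays in the units of yield, so it reads directly as a typical prediction error, while R^2 is unitless and describes the share of observed variance that the model accounts for.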