Funding: Supported by the National Basic Research Program of China (Grant Nos. 2010CB950400 and 2013CB430301), the National Natural Science Foundation of China (Grant Nos. 41276025 and 41176023), and the R&D Special Fund for Public Welfare Industry (Meteorology) (Grant No. GYHY201106036). The OFES simulation was conducted on the Earth Simulator under the support of JAMSTEC. Also supported by the Data Sharing Infrastructure of Earth System Science-Data Sharing Service Center of the South China Sea and adjacent regions.
Abstract: In this study, the upper ocean heat content (OHC) variations in the South China Sea (SCS) during 1993-2006 were investigated by examining ocean temperatures in seven datasets: World Ocean Atlas 2009 (WOA09) (climatology), Ishii datasets, Ocean General Circulation Model for the Earth Simulator (OFES), Simple Ocean Data Assimilation system (SODA), Global Ocean Data Assimilation System (GODAS), China Oceanic ReAnalysis system (CORA), and an ocean reanalysis dataset for the joining area of Asia and Indian-Pacific Ocean (AIPO1.0). Among these datasets, two were independent of any numerical model, four relied on data assimilation, and one was generated without any data assimilation. The annual cycles revealed by the seven datasets were similar, but the interannual variations were different. Vertical structures of temperatures along the 18°N, 12.75°N, and 120°E sections were compared with data collected during open cruises in 1998 and 2005-08. The results indicated that Ishii, OFES, CORA, and AIPO1.0 were more consistent with the observations. Through systematic comparisons, we found that each dataset had its own shortcomings and advantages in presenting the upper OHC in the SCS.
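As a point of reference, upper OHC is commonly computed by integrating temperature over depth and scaling by seawater density and specific heat. The sketch below illustrates that standard definition only; it is not taken from the paper. The function name, the 300 m integration depth, and the constant density and specific heat values are all assumptions chosen for illustration, and temperature is taken in degrees Celsius, so the result is relative to 0 °C.

```python
import numpy as np

RHO = 1025.0  # assumed reference seawater density, kg m^-3
CP = 3985.0   # assumed specific heat of seawater, J kg^-1 K^-1


def ocean_heat_content(temp_c, depth_m, max_depth_m=300.0):
    """Heat content per unit area (J m^-2) of one temperature profile.

    temp_c  : temperatures in degrees Celsius, ordered from surface downward
    depth_m : matching depths in metres (positive downward, increasing)
    max_depth_m : hypothetical lower limit of the "upper ocean" layer
    """
    temp_c = np.asarray(temp_c, dtype=float)
    depth_m = np.asarray(depth_m, dtype=float)

    # Keep only the levels inside the chosen upper layer.
    mask = depth_m <= max_depth_m
    t, z = temp_c[mask], depth_m[mask]

    # Trapezoidal rule: mean temperature of each layer times its thickness.
    layer_mean = 0.5 * (t[1:] + t[:-1])
    thickness = np.diff(z)
    return RHO * CP * np.sum(layer_mean * thickness)


if __name__ == "__main__":
    # A made-up profile, not data from any of the seven datasets.
    depths = [0.0, 50.0, 100.0, 200.0, 300.0, 500.0]
    temps = [28.0, 26.0, 22.0, 16.0, 12.0, 9.0]
    print(ocean_heat_content(temps, depths))
```

Applying the same function to profiles from each dataset on a common grid would make their OHC values directly comparable, which is the kind of intercomparison the abstract describes.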
Funding: National Natural Science Foundation of China, Grant/Award Number: 72271237; Building World-class Universities of Renmin University of China, Grant/Award Number: 21XNF037.
Abstract: Deep learning has been increasingly popular in omics data analysis. Recent works incorporating variable selection into deep learning have greatly enhanced the model’s interpretability. However, because deep learning requires a large sample size, the existing methods may result in uncertain findings when the dataset has a small sample size, as is commonly seen in omics data analysis. With the explosion and availability of omics data from multiple populations/studies, the existing methods naively pool them into one dataset to enhance the sample size while ignoring that variable structures can differ across datasets, which might lead to inaccurate variable selection results. We propose a penalized integrative deep neural network (PIN) to simultaneously select important variables from multiple datasets. PIN directly aggregates multiple datasets as input and considers both homogeneity and heterogeneity situations among multiple datasets in an integrative analysis framework. Results from extensive simulation studies and applications of PIN to gene expression datasets from elders with different cognitive statuses or ovarian cancer patients at different stages demonstrate that PIN outperforms existing methods with considerably improved performance among multiple datasets. The source code is freely available on GitHub (rucliyang/PINFunc). We speculate that the proposed PIN method will promote the identification of disease-related important variables based on multiple studies/datasets from diverse origins.
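The abstract does not state the form of PIN's penalty, so the following is not the authors' method. It is a generic sketch of one way variable selection across several datasets can be expressed: a group-lasso-style penalty on first-layer network weights, where all weights attached to one input variable, across all datasets, form a single group. The function names, the `lam` and `tol` parameters, and the one-weight-matrix-per-dataset layout are hypothetical.

```python
import numpy as np


def integrative_group_penalty(first_layer_weights, lam=1.0):
    """Group-lasso-style penalty shared across datasets (illustrative only).

    first_layer_weights : list of arrays, one per dataset, each of shape
        (n_variables, n_hidden); row j holds the weights leaving variable j.
    lam : hypothetical tuning parameter controlling sparsity.
    """
    # Put the weights of variable j from every dataset side by side.
    stacked = np.concatenate(first_layer_weights, axis=1)
    # One L2 norm per variable; summing the norms favours zeroing whole rows.
    row_norms = np.linalg.norm(stacked, axis=1)
    return lam * row_norms.sum()


def selected_variables(first_layer_weights, tol=1e-8):
    """Indices of variables whose stacked weight norm exceeds tol."""
    stacked = np.concatenate(first_layer_weights, axis=1)
    return np.flatnonzero(np.linalg.norm(stacked, axis=1) > tol)


if __name__ == "__main__":
    # Two toy "datasets", two variables, two hidden units each.
    w1 = np.array([[3.0, 0.0], [0.0, 0.0]])
    w2 = np.array([[4.0, 0.0], [0.0, 0.0]])
    print(integrative_group_penalty([w1, w2]))
    print(selected_variables([w1, w2]))
```

Grouping across datasets in this way treats a variable as either selected in all datasets or in none, which corresponds to a homogeneity assumption; allowing dataset-specific selection would need an additional within-group penalty, which this sketch omits.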
Funding: Supported by the National Key Research and Development Program of China (2018YFC1506501, 2018YFA0605503, and 2016YFB0501502), the Special Program of Gaofen Satellites (04-Y30B01-9001-18/20-3-1), and the National Natural Science Foundation of China (41871230 and 41871231).
Abstract: High spatial resolution and high temporal frequency fractional vegetation cover (FVC) products have been increasingly in demand to monitor and research land surface processes. This paper develops an algorithm to estimate FVC at a 30-m/15-day resolution over China by taking advantage of the spatial and temporal information from different types of sensors: the 30-m resolution sensor on the Chinese environment satellite (HJ-1) and the 1-km Moderate Resolution Imaging Spectroradiometer (MODIS). The algorithm was implemented for each main vegetation class and each land cover type over China. First, the high spatial resolution and high temporal frequency normalized difference vegetation index (NDVI) was acquired by using the continuous correction (CC) data assimilation method. Then, FVC was generated with a nonlinear pixel unmixing model. Model coefficients were obtained by statistical analysis of the MODIS NDVI. The proposed method was evaluated based on in situ FVC measurements and a global FVC product (GEOV1 FVC). Direct validation using in situ measurements at 97 sampling plots per half month in 2010 showed that the annual mean errors (MEs) of forest, cropland, and grassland were -0.025, 0.133, and 0.160, respectively, indicating that the FVCs derived from the proposed algorithm were consistent with ground measurements [R² = 0.809, root-mean-square deviation (RMSD) = 0.065]. An intercomparison between the proposed FVC and GEOV1 FVC demonstrated that the two products had good spatial–temporal consistency and similar magnitude (RMSD of approximately 0.1). Overall, the approach provides a new operational way to estimate high spatial resolution and high temporal frequency FVC from multiple remote sensing datasets.
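The validation statistics quoted above (ME, RMSD, R²) have standard definitions, sketched below for paired estimated and in situ FVC values. The abstract does not say how R² was computed; the squared Pearson correlation used here is an assumption, as are the function name and the sample numbers.

```python
import numpy as np


def validation_metrics(estimated, observed):
    """Mean error, root-mean-square deviation and R^2 for paired samples.

    estimated, observed : equal-length sequences of FVC values in [0, 1].
    R^2 is taken as the squared Pearson correlation (an assumption).
    """
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    diff = est - obs

    me = diff.mean()                     # signed bias: positive = overestimate
    rmsd = np.sqrt(np.mean(diff ** 2))   # typical size of the deviation
    r = np.corrcoef(est, obs)[0, 1]      # Pearson correlation coefficient
    return {"ME": me, "RMSD": rmsd, "R2": r ** 2}


if __name__ == "__main__":
    # Made-up values for illustration, not measurements from the study.
    estimated_fvc = [0.20, 0.45, 0.60, 0.85]
    in_situ_fvc = [0.15, 0.50, 0.55, 0.80]
    print(validation_metrics(estimated_fvc, in_situ_fvc))
```

ME keeps the sign of the bias while RMSD does not, which is why the two are usually reported together.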