Funding: This research is supported by the National Natural Science Foundation of China (10901162, 10926073), the President Fund of GUCAS, and the China Postdoctoral Science Foundation.
Abstract: In this paper, the model checking problem is considered for the general linear model when covariables are measured with error and an independent validation data set is available. Without assuming any error model structure between the true variable and the surrogate variable, the author first applies a nonparametric method to model the relationship between the true variable and the surrogate variable with the help of the validation sample. Then the author constructs a score-type test statistic through model adjustment. The large-sample behavior of the score-type test statistic is investigated. It is shown that the test is consistent and can detect alternative hypotheses that approach the null hypothesis at the rate n^(-r) with 0 ≤ r ≤ 1/2. Simulation results indicate that the proposed method works well. Key words: General linear model, measurement error, model checking, validation data.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11561006 and 11471127; the Master Foundation of Guangxi University of Technology under Grant No. 070235; the Doctoral Foundation of Guangxi University of Science and Technology under Grant No. 14Z07; the Research Projects of Colleges and Universities in Guangxi under Grant No. KY2015YB171; and the Open Fund Project of Guangxi Colleges and Universities Key Laboratory of Mathematics and Statistical Model under Grant No. 2016GXKLMS005.
Abstract: This paper considers partial functional linear models of the form Y = ∫X(t)β(t)dt + g(T) with Y measured with error. The authors propose an estimation procedure for the case where the basis functions are data driven, as with functional principal components. Estimators of β(t) and g(t) based on the primary data and the validation data are presented, and some asymptotic results are given. Finite-sample properties are investigated through a simulation study and a real data application.
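As a hedged illustration of what a data-driven basis means in this setting, the standard truncated functional principal component expansion reduces the functional term to a finite linear combination of scores. The truncation level m, the eigenfunctions φ_j, the coefficients b_j, and the scores ξ_j are notation assumed here for illustration; the abstract does not specify them.

```latex
\beta(t) \approx \sum_{j=1}^{m} b_j \phi_j(t), \qquad
\xi_j = \int X(t)\,\phi_j(t)\,dt, \qquad
\int X(t)\,\beta(t)\,dt \approx \sum_{j=1}^{m} b_j \xi_j .
```

Under this approximation the model reads Y ≈ Σ_j b_j ξ_j + g(T), with the φ_j estimated from the sample covariance of X rather than fixed in advance.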
Funding: The authors acknowledge the Agência Nacional de Águas (ANA), Companhia de Pesquisa de Recursos Minerais (CPRM), Instituto Nacional de Meteorologia (INMET), Fundação Superintendência Estadual de Rios e Lagoas (SERLA), and Light Serviços de Eletricidade S/A for kindly providing the data used to compose the rainfall time series, and CAPES for granting the doctorate scholarship. The second author thanks the Brazilian National Council for Scientific and Technological Development (CNPq) for granting the Research Productivity Fellowship level 2 (309681/2019-7).
Abstract: The goal was to perform gap filling, consistency checking, and processing of the rainfall time series data from 1943 to 2013 in five regions of the state. Data were obtained from several sources (ANA, CPRM, INMET, SERLA, and LIGHT), totaling 23 stations. The raw time series contained gaps that were filled with data from the TRMM satellite via the 3B43 product and with the climatological normal from INMET. The 3B43 product was used from 1998 to 2013 and the climatological normal over the 1947-1997 period. Data from the 23 stations were submitted to descriptive and exploratory analysis, parametric tests (Shapiro-Wilk and Bartlett), cluster analysis (CA), and a Box-Cox transformation. Descriptive analysis of the raw data consistency showed a probability of occurrence above 75% (high time variability). Through the CA, two homogeneous rainfall groups (G1 and G2) were defined. Groups G1 and G2 represent 77.01% and 22.99% of the rainfall occurring in SRJ, respectively. The Box-Cox transformation was effective in stabilizing the normality of the residuals and the homogeneity of variance of the monthly rainfall time series of the five regions of the state. Data from the 3B43 product and the climatological normal can be used as an alternative source of quality data for gap filling.
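For readers unfamiliar with the Box-Cox transformation mentioned above, the following is a minimal sketch of the standard formula, not the study's actual processing code. The function name `box_cox`, the rainfall values, and the choice λ = 0.5 are all hypothetical; the abstract does not report the λ that was used.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox transform to strictly positive values.

    y = (x**lam - 1) / lam  when lam != 0
    y = log(x)              when lam == 0
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Hypothetical monthly rainfall totals in mm (made-up numbers, not from the study).
monthly_rain_mm = [12.0, 48.5, 130.2, 210.7, 95.3, 20.1]

# lam = 0.5 is an arbitrary illustrative choice.
transformed = box_cox(monthly_rain_mm, 0.5)
```

In practice λ is usually chosen by maximum likelihood, and months with zero rainfall need a small positive shift before the transform can be applied.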
Abstract: Machine learning advancements in healthcare have made data collected through smartphones and wearable devices a vital source of public health and medical insights. While wearable device data help to monitor, detect, and predict diseases and health conditions, some data owners hesitate to share such sensitive data with companies or researchers due to privacy concerns. Moreover, wearable devices have only recently become available as commercial products; thus large, diverse, and representative datasets are not available to most researchers. In this article, the authors propose an open marketplace where wearable device users securely monetize their wearable device records by sharing data with consumers (e.g., researchers), making wearable device data more available to healthcare researchers. To secure the data transactions in a privacy-preserving manner, the authors take a decentralized approach using Blockchain and Non-Fungible Tokens (NFTs). To ensure data originality and integrity with secure validation, the marketplace uses Trusted Execution Environments (TEE) in wearable devices to verify the correctness of health data. The marketplace also allows researchers to train models using Federated Learning with a TEE-backed secure aggregation of data that users may not be willing to share. To ensure user participation, the authors model incentive mechanisms for the Federated Learning-based and anonymized data-sharing approaches using NFTs. The authors also propose using payment channels and batching to reduce smart contract gas fees and optimize user profits. If widely adopted, the authors believe that TEE and Blockchain-based incentives will promote the ethical use of machine learning with validated wearable device data in healthcare and improve user participation.
Funding: Supported by the University of Stavanger, Norway; SINTEF; the Center for Integrated Operations in the Petroleum Industry; and the management of National Oilwell Varco IntelliServ.
Abstract: Wired drill pipe (WDP) technology is one of the most promising data acquisition technologies in today's oil and gas industry. For the first time, it allows sensors to be positioned along the drill string, which enables collecting and transmitting valuable data not only from the bottom hole assembly (BHA) but also along the entire length of the wellbore to the drill floor. The technology has received industry acceptance as a viable alternative to the typical logging while drilling (LWD) method. Recently, more and more WDP applications can be found in challenging drilling environments around the world, leading to many innovations in the industry. Nevertheless, most of the data acquired from WDP can be noisy and, in some circumstances, of very poor quality. Diverse factors contribute to the poor data quality. The most common sources include miscalibrated sensors, sensor drift, errors during data transmission, and abnormal conditions in the well. The challenge of improving the data quality has attracted increasing attention from researchers during the past decade. This paper proposes a solution that addresses this challenge by correcting the raw WDP data and estimating unmeasurable parameters to reveal downhole behavior. An advanced data processing method, data validation and reconciliation (DVR), is employed. It makes use of the redundant data from multiple WDP sensors to filter the noise from the measurements and ensures the coherence of all sensors and models. Moreover, it can distinguish accurate measurements from inaccurate ones. In addition, the data with improved quality can be used to estimate crucial drilling-process parameters that cannot be measured directly, and hence to provide better model calibrations for integrated well planning and real-time operations.
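The core idea behind data reconciliation can be sketched as a weighted least-squares adjustment: redundant measurements are nudged, in proportion to their variances, until they satisfy a known model constraint. The sketch below is a generic textbook form with a single linear constraint, not the paper's drilling model; the function name `reconcile`, the flow-balance constraint, and all numbers are hypothetical.

```python
def reconcile(measured, variances, constraint):
    """Weighted least-squares reconciliation under one linear constraint.

    Minimizes sum((x_i - measured_i)**2 / variances_i)
    subject to sum(constraint_i * x_i) == 0.
    Closed form: x = m - V a (a.m) / (a' V a), with V diagonal.
    """
    residual = sum(a * m for a, m in zip(constraint, measured))
    scale = sum(a * a * v for a, v in zip(constraint, variances))
    return [m - v * a * residual / scale
            for m, v, a in zip(measured, variances, constraint)]


# Hypothetical balance: flow_in = flow_out_1 + flow_out_2.
measured = [10.5, 6.0, 4.0]     # raw readings violate the balance by 0.5
variances = [1.0, 1.0, 1.0]     # equal sensor uncertainty (assumed)
constraint = [1.0, -1.0, -1.0]  # coefficients of the balance equation

reconciled = reconcile(measured, variances, constraint)
```

A sensor with a smaller variance is adjusted less, which is how this kind of scheme lets trustworthy measurements dominate the reconciled estimate.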
Funding: Supported by the National Natural Science Foundation of China (4187417, 42104159), the National Key R&D Program of China (2018YFC1503501), the APSCO Earthquake Research Project Phase II, and the Dragon 5 cooperation 2020-2024 (ID. 59236).
Abstract: This report briefly introduces the current status of the CSES (China Seismo-Electromagnetic Satellite) mission, which includes the first satellite, CSES 01, in orbit (launched in February 2018), and the second satellite, CSES 02 (to be launched in 2023), under development. CSES 01 has been operating steadily in orbit for over four years, providing abundant global geophysical field data, including the background geomagnetic field, the electromagnetic field and waves, the plasma (in-situ and profile data), and the energetic particles in the ionosphere. The CSES 01 platform and the scientific instruments generally perform well. Data validation and calibration are vital for CSES 01 because it aims to monitor earthquakes by extracting very weak seismic precursors from a relatively disturbed space electromagnetic environment. For this purpose, we are making dedicated efforts to validate data quality comprehensively. From the CSES 01 observations, we have obtained many scientific results on the ionospheric electromagnetic environment, seismo-ionospheric disturbance phenomena, space weather processes, and the Lithosphere-Atmosphere-Ionosphere coupling mechanism.
Funding: This work was supported by the National Natural Science Foundation of China (Key Grant 10231030, Special Grant 10241001) and Humboldt-Universität Berlin, Sonderforschungsbereich 373.
Abstract: In this paper, linear errors-in-response models are considered in the presence of validation data on the responses. A semiparametric dimension reduction technique is employed to define an asymptotically normal estimator of β, together with the estimated empirical log-likelihoods and the adjusted empirical log-likelihoods for the vector of regression coefficients and for linear combinations of the regression coefficients. The estimated empirical log-likelihoods are shown to be asymptotically distributed as weighted sums of independent χ²_1 variables, and the adjusted empirical log-likelihoods are proved to be asymptotically distributed as standard chi-squares.
Abstract: This paper presents the building process of an interactive instrument called the Colombian Solar Atlas, which can easily visualize meteorological data and also assess the current and future potential of solar photovoltaic generation throughout the whole territory of Colombia, South America. This new tool is based on two different meteorological databases. The first is built from historical data extracted from satellite imagery, and the other corresponds to data derived from regional-scale climate change projection models. The satellite database was validated against different in-situ measurements. The Colombian Solar Atlas uses basic and advanced photovoltaic generation models to estimate the generation of a custom solar installation. With this tool, a user selects a point on the map and directly obtains pertinent information for finding an optimal location, with a spatial resolution of 4 km². This tool is the first open interactive online tool particularly adapted to studying the photovoltaic power potential in Colombia, considering the country's needs and native language.
Abstract: Malaria infection continues to be a major problem in many parts of the world, including Africa. Environmental variables are known to significantly affect the population dynamics and abundance of insects, major catalysts of vector-borne diseases, but the exact extent and consequences of this sensitivity are not yet well established. To assess the impact of the variability in temperature and rainfall on the transmission dynamics of malaria in a population, we propose a model consisting of a system of non-autonomous deterministic equations that incorporates the effect of both temperature and rainfall on the dispersion and mortality rate of adult mosquitoes. The model has been validated using epidemiological data collected from the western region of Ethiopia by considering the trends in malaria cases and the climate variation in the region. Further, a mathematical analysis is performed to assess the impact of temperature and rainfall change on the transmission dynamics of the model. The periodic variation of seasonal variables as well as the non-periodic variation due to long-term climate variation have been incorporated and analyzed. In both the periodic and non-periodic cases, it is shown that the disease-free solution of the model is globally asymptotically stable when the basic reproduction ratio is less than unity in the periodic system and when the threshold function is less than unity in the non-periodic system. The disease is uniformly persistent when the basic reproduction ratio is greater than unity in the periodic system and when the threshold function is greater than unity in the non-periodic system.