In car insurance claim assessment, the claim rate follows a highly skewed probability distribution, which is typically modeled with a Tweedie distribution. The traditional approach to fitting a Tweedie regression model trains on a centralized dataset; when the data are held by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to compute the intersection of the entities held by the two parties. After determining which entities are shared, the participants train locally on the shared-entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption is then used to exchange and update these intermediate parameters, so that the parties collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The results approach those of a Tweedie regression model learned from centralized data and outperform those of a Tweedie regression model learned independently by a single party.
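The split of work between the two parties can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a log link, unit dispersion, and a power parameter 1 < p < 2, and it replaces the homomorphically encrypted exchange with plain values, so it only shows why each party can update its own weights from its own feature columns plus one shared per-sample term. All function names and the toy data are hypothetical.

```python
import math

def eta_gradient(y, eta, p=1.5):
    """d/d(eta) of the Tweedie negative log-likelihood
    (assumed: dispersion 1, log link, 1 < p < 2)."""
    mu = math.exp(eta)
    return (mu - y) * mu ** (1.0 - p)

def partial_predictor(rows, weights):
    """One party's share of the linear predictor, from its own columns only."""
    return [sum(x * w for x, w in zip(row, weights)) for row in rows]

def local_gradient(rows, d):
    """Average gradient for one party's weights, given the per-sample term d."""
    n = len(rows)
    return [sum(d_i * row[j] for row, d_i in zip(rows, d)) / n
            for j in range(len(rows[0]))]

def federated_gradients(x_a, x_b, y, w_a, w_b, p=1.5):
    eta_a = partial_predictor(x_a, w_a)  # computed by party A
    eta_b = partial_predictor(x_b, w_b)  # computed by party B
    # Assumption: plaintext stand-in for the encrypted exchange of
    # intermediate parameters described in the abstract.
    d = [eta_gradient(y_i, a + b, p) for y_i, a, b in zip(y, eta_a, eta_b)]
    return local_gradient(x_a, d), local_gradient(x_b, d)

def centralized_gradient(x, y, w, p=1.5):
    """Reference: the same gradient computed on the pooled feature matrix."""
    d = [eta_gradient(y_i, eta, p)
         for y_i, eta in zip(y, partial_predictor(x, w))]
    return local_gradient(x, d)

# Toy data (made up): party A holds two features; party B holds one
# feature and the claim amounts for the same four shared entities.
X_A = [[1.0, 0.2], [1.0, -0.5], [1.0, 1.3], [1.0, 0.0]]
X_B = [[0.7], [-1.1], [0.4], [2.0]]
Y = [0.0, 0.0, 3.2, 1.5]
W_A, W_B = [0.1, -0.2], [0.3]

grad_a, grad_b = federated_gradients(X_A, X_B, Y, W_A, W_B)
grad_central = centralized_gradient(
    [a + b for a, b in zip(X_A, X_B)], Y, W_A + W_B)
print(grad_a + grad_b)   # federated gradient, A's weights then B's
print(grad_central)      # pooled-data gradient for comparison
```

The two printed vectors agree up to floating-point rounding, which is the property that lets a vertically partitioned model approach the centralized one. A real protocol would additionally need private entity alignment and encryption of `eta_a`, `eta_b`, and `d`.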
In recent years, Konosirus punctatus has accounted for a large portion of the catch composition and has become an important economic species in the South Yellow Sea. However, the distribution of its early life stages is still poorly understood. In this study, generalized additive models with a Tweedie distribution were used to analyze the relationships between K. punctatus ichthyoplankton and environmental factors (longitude and latitude, sea surface temperature (SST), sea surface salinity (SSS), and depth) and to predict the distribution of K. punctatus spawning and nursing grounds, based on samples collected in 6 months during 2014–2017. The results showed that the spawning grounds were mainly distributed in the central and northern parts of the study area (from 33.0°N to 37.0°N). By comparison, the nursing grounds shifted southward and were located approximately along the central and southern coast of the study area (from 31.7°N to 35.5°N). The optimal models identified the suitable SST, SSS, and depth for eggs as 19–26 °C, 25–30, and 9–23 m, respectively. The suitable SSS for larvae was 29–31. The spawning habit of K. punctatus might have changed in the past decades in response to increasing SST and fishing pressure, which needs to be confirmed in further study. The study provides a reference for the conservation and exploitation of K. punctatus.
A probabilistic precipitation forecasting model using generalized additive models (GAMs) and Bayesian model averaging (BMA) was proposed in this paper. GAMs were used to fit spatial-temporal precipitation models to individual ensemble member forecasts. The distributions of precipitation occurrence and cumulative precipitation amount were represented simultaneously by a single Tweedie distribution. BMA was then used as a post-processing method to combine the individual models into a more skillful probabilistic forecasting model. The mixing weights were estimated using the expectation-maximization algorithm. Residual diagnostics were used to examine whether the fitted BMA forecasting model had fully captured the spatial and temporal variations of precipitation. The proposed method was applied to daily observations in the Yishusi River basin for July 2007 using the National Centers for Environmental Prediction ensemble forecasts. Verified with scoring rules, the BMA forecasts performed better than the empirical probabilistic ensemble forecasts, particularly for extreme precipitation. Finally, possible improvements and the application of this method to the downscaling of climate change scenarios were discussed.
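The weight-estimation step can be sketched in Python as follows. This is a generic expectation-maximization update for mixture weights with fixed component models, not the paper's implementation: it assumes the density of each observation under each ensemble member's fitted model has already been evaluated, and the function names and toy numbers are hypothetical.

```python
import math

def mixture_log_likelihood(likelihoods, weights):
    """Log-likelihood of the weighted mixture over all observations."""
    return sum(math.log(sum(w * l for w, l in zip(weights, row)))
               for row in likelihoods)

def em_mixing_weights(likelihoods, n_iter=200, tol=1e-10):
    """Estimate BMA mixing weights by EM with the member models held fixed.

    likelihoods[i][k] is the density of observation i under member k.
    Assumption: every row contains at least one positive density.
    """
    n, k = len(likelihoods), len(likelihoods[0])
    weights = [1.0 / k] * k  # start from equal weights
    for _ in range(n_iter):
        # E-step: responsibility of each member for each observation.
        resp = []
        for row in likelihoods:
            weighted = [w * l for w, l in zip(weights, row)]
            total = sum(weighted)
            resp.append([v / total for v in weighted])
        # M-step: a member's new weight is its average responsibility.
        new_weights = [sum(r[j] for r in resp) / n for j in range(k)]
        converged = max(abs(a - b)
                        for a, b in zip(new_weights, weights)) < tol
        weights = new_weights
        if converged:
            break
    return weights

# Toy densities (made up): 4 observations, 3 ensemble members.
LIK = [
    [0.30, 0.10, 0.05],
    [0.25, 0.20, 0.02],
    [0.40, 0.05, 0.10],
    [0.10, 0.30, 0.05],
]
WEIGHTS = em_mixing_weights(LIK)
print(WEIGHTS)
```

In this toy input the third member is less likely than the first for every observation, so its weight stays below the first member's throughout the iterations. In the forecasting setting, each density would come from the Tweedie GAM fitted to the corresponding ensemble member.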
Funding (federated Tweedie regression study): This research was funded by the National Natural Science Foundation of China (No. 62272124); the National Key Research and Development Program of China (No. 2022YFB2701401); the Guizhou Province Science and Technology Plan Project (Grant No. Qiankehe Platform Talent [2020]5017); the Research Project of Guizhou University for Talent Introduction (No. [2020]61); the Cultivation Project of Guizhou University (No. [2019]56); and the Open Fund of Key Laboratory of Advanced Manufacturing Technology, Ministry of Education (GZUAMT2021KF[01]).
Funding (Konosirus punctatus study): The Public Science and Technology Research Funds Projects of Ocean under contract No. 201305030; the National Natural Science Foundation of China under contract No. 41930535.
Funding (precipitation forecasting study): Supported by the National Basic Research and Development (973) Program of China (2010CB428402) and the China Meteorological Administration Special Public Welfare Research Fund (GYHY200706001).