In car insurance claim assessment, the claim rate follows a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to fitting a Tweedie regression model trains on a centralized dataset; when the data is held by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting across data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to compute the intersection of the entities held by the two parties. After determining which entities are shared, the participants train locally on the shared entity data to obtain the intermediate parameters of their local generalized linear models. Homomorphic encryption is then used to exchange and update these intermediate parameters, so that the parties jointly train the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively produce Tweedie regression models that leverage the data of both parties without exchanging it. The assessment results of the scheme approach those of a Tweedie regression model learned from centralized data and outperform those of a Tweedie regression model learned independently by a single party.
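The joint training loop described in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's algorithm: it assumes a log link, a fixed power parameter 1 < p < 2, and two parties that hold disjoint feature columns for entities that are already aligned. The private set intersection and homomorphic encryption steps are omitted, so the intermediate term `d` is exchanged in plaintext here. All names (`train`, `partial_eta`, `local_gradient`, `XA`, `XB`, `Y`) and the toy data are hypothetical.

```python
import math

P = 1.5  # Tweedie power parameter, assumed fixed with 1 < P < 2


def tweedie_loss(y, mu, p=P):
    """Mean negative Tweedie log-likelihood, dropping terms that do not depend on mu."""
    return sum(-yi * m ** (1 - p) / (1 - p) + m ** (2 - p) / (2 - p)
               for yi, m in zip(y, mu)) / len(y)


def partial_eta(rows, w):
    """One party's share of the linear predictor, computed on its own features."""
    return [sum(x * wj for x, wj in zip(row, w)) for row in rows]


def local_gradient(rows, d):
    """Gradient for one party's weights, given the shared intermediate term d."""
    n = len(rows)
    return [sum(row[j] * di for row, di in zip(rows, d)) / n
            for j in range(len(rows[0]))]


def train(xa, xb, y, steps=300, lr=0.1):
    wa = [0.0] * len(xa[0])  # party A's weights
    wb = [0.0] * len(xb[0])  # party B's weights (B also holds the labels y)
    losses = []
    for _ in range(steps):
        eta_a = partial_eta(xa, wa)  # computed locally by party A
        eta_b = partial_eta(xb, wb)  # computed locally by party B
        mu = [math.exp(a + b) for a, b in zip(eta_a, eta_b)]  # log link
        losses.append(tweedie_loss(y, mu))
        # Intermediate term: derivative of the loss w.r.t. the linear predictor.
        # A real protocol would exchange this under homomorphic encryption.
        d = [(m - yi) * m ** (1 - P) for m, yi in zip(mu, y)]
        ga = local_gradient(xa, d)
        gb = local_gradient(xb, d)
        wa = [w - lr * g for w, g in zip(wa, ga)]
        wb = [w - lr * g for w, g in zip(wb, gb)]
    return wa, wb, losses


# Toy data for six shared entities (hypothetical values).
XA = [[0.2], [0.9], [0.4], [0.7], [0.1], [0.6]]  # party A: one feature
XB = [[1.0, 0.5], [1.0, 0.1], [1.0, 0.8],
      [1.0, 0.3], [1.0, 0.9], [1.0, 0.2]]        # party B: intercept + one feature
Y = [0.0, 2.5, 0.0, 1.2, 0.0, 0.8]               # claim amounts, with exact zeros
```

Calling `train(XA, XB, Y)` returns each party's weights and the loss history. Because the gradient of each weight block depends only on that party's own features and on `d`, neither party needs to see the other's raw columns.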
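A GLM-Kriging model for the statistical downscaling of daily precipitation is developed by combining a generalized linear model (GLM) based on the Tweedie distribution with a Kriging model. First, the GLM is used to fit the relationship between daily precipitation in the study region and the physical quantities output by a numerical model that influence local precipitation, so that the spatial correlation of daily precipitation is reflected in the model residuals; then a Kriging model is fitted to the GLM's randomized quantile residuals (RQ residuals). The model is applied, together with NCEP reanalysis data, to daily precipitation observations from 42 stations in the Yishusi River basin (沂沭泗流域) in July 2007. The results show that the GLM-Kriging downscaling model reproduces the main precipitation processes well and achieves high overall accuracy. It can be used for climate change impact assessment or for interpreting numerical weather prediction products, and can be further extended into a spatio-temporal statistical model of daily precipitation.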
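The randomized quantile residual step mentioned in this abstract can be illustrated as follows. This is a sketch of the general definition for a distribution with a point mass at zero, not the paper's code: the Tweedie CDF is replaced by a hypothetical stand-in (a zero mass plus an exponential tail), the Kriging step is not shown, and the names `rq_residual`, `toy_cdf`, `P_ZERO`, and `RATE` are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def rq_residual(y, cdf, p_zero, rng):
    """Randomized quantile residual for a distribution with a point mass at zero."""
    if y == 0:
        # Draw u uniformly from (0, P(Y = 0)] so that u is never exactly 0.
        u = p_zero * (1.0 - rng.random())
    else:
        # Continuous part: use the fitted CDF value directly.
        u = cdf(y)
    # Map the uniform value to the standard normal scale.
    return NormalDist().inv_cdf(u)


# Stand-in distribution (an assumption, not the Tweedie CDF):
# Y = 0 with probability P_ZERO, otherwise exponential with rate RATE.
P_ZERO = 0.5
RATE = 1.0


def toy_cdf(y):
    return P_ZERO + (1.0 - P_ZERO) * (1.0 - math.exp(-RATE * y))


rng = random.Random(0)
r_wet = rq_residual(math.log(2.0), toy_cdf, P_ZERO, rng)  # a day with precipitation
r_dry = rq_residual(0.0, toy_cdf, P_ZERO, rng)            # a dry day
```

If the fitted model is correct, residuals built this way are approximately standard normal even for dry days, which is what makes them a reasonable input to a Gaussian spatial model such as Kriging.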
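[Objective] To improve the accuracy of predicting the cumulative loss of insurance policies. The traditional Tweedie regression model can build a regression model only for the nonzero mean, not for the zero probability, so its fit is unsatisfactory. [Methods] Policy loss data often contain a large number of zero claims and can therefore be regarded as semicontinuous data. On this basis, three cumulative loss prediction models are proposed using semicontinuous two-part models that account for the distribution type of the nonzero continuous part of the cumulative loss, and the models are compared on a set of real loss data. [Results] Compared with the Tweedie regression model, the proposed semicontinuous two-part regression models have smaller Akaike information criterion (AIC) and Bayesian information criterion (BIC) values and fit the data better. [Conclusion] The results can serve as a reference for predicting the cumulative loss of insurance policies.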
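The comparison criteria named in this abstract follow standard formulas: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of parameters, n the number of observations, and ln L the maximized log-likelihood. The sketch below implements them alongside an intercept-only two-part summary (zero probability and mean of the positive part). It is an illustration only; the paper's regression models and data are not reproduced, and the function names are hypothetical.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


def two_part_mean(losses):
    """Intercept-only two-part summary of semicontinuous losses.

    Returns (zero probability, mean of the positive part, overall expected loss).
    """
    n = len(losses)
    positive = [x for x in losses if x > 0]
    p_zero = (n - len(positive)) / n
    mean_positive = sum(positive) / len(positive) if positive else 0.0
    return p_zero, mean_positive, (1 - p_zero) * mean_positive
```

For both criteria, the model with the smaller value is preferred; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7, since ln n > 2 there.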
This paper studies a model for predicting Incurred But Not Reported (IBNR) claims reserves. Double generalized linear models are applied to fit the claim number data and the average claim size data, respectively. The mean square error of prediction is also given. The model generalizes Tweedie's compound Poisson model. Moreover, an example on Swiss Motor Insurance data is presented, in which the model is shown to be more efficient.
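The link between claim numbers, average claim sizes, and Tweedie's compound Poisson model can be made concrete with the standard parameter mapping. The sketch below assumes Poisson claim counts with rate `lam` and gamma claim sizes with shape `alpha` and scale `theta`; these names and the example values are assumptions for illustration and are not taken from the paper.

```python
def tweedie_from_compound_poisson(lam, alpha, theta):
    """Map a compound Poisson-gamma model to Tweedie parameters (mu, phi, p)."""
    p = (alpha + 2.0) / (alpha + 1.0)  # power parameter, always in (1, 2)
    mu = lam * alpha * theta           # mean: expected count times mean claim size
    phi = lam ** (1.0 - p) * (alpha * theta) ** (2.0 - p) / (2.0 - p)  # dispersion
    return mu, phi, p


def compound_poisson_variance(lam, alpha, theta):
    """Var[Y] = lam * E[X^2] for gamma(shape=alpha, scale=theta) claim sizes X."""
    return lam * alpha * (alpha + 1.0) * theta ** 2


# Hypothetical example values.
mu, phi, p = tweedie_from_compound_poisson(lam=2.0, alpha=3.0, theta=0.5)
```

With these values the Tweedie variance `phi * mu ** p` matches the compound Poisson variance from `compound_poisson_variance`, which is the sense in which the two parameterizations describe the same distribution.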
Funding: This research was funded by the National Natural Science Foundation of China (No. 62272124), the National Key Research and Development Program of China (No. 2022YFB2701401), the Guizhou Province Science and Technology Plan Project (Grant No. Qiankehe Platform Talent [2020]5017), the Research Project of Guizhou University for Talent Introduction (No. [2020]61), the Cultivation Project of Guizhou University (No. [2019]56), and the Open Fund of Key Laboratory of Advanced Manufacturing Technology, Ministry of Education (GZUAMT2021KF[01]).