Funding: Supported by the National Natural Science Foundation of China under Grants 61821004, 62250056, 62350710214, U23A20325, and 62350055; the Natural Science Foundation of Shandong Province, China (ZR2021ZD14, ZR2021JQ24); the High-level Talent Team Project of Qingdao West Coast New Area, China (RCTD-JC-2019-05); the Key Research and Development Program of Shandong Province, China (2020CXGC01208); and the Science and Technology Project of Qingdao West Coast New Area, China (2019-32, 2020-20, 2020-1-4).
Abstract: This paper considers the rational expectations model with multiplicative noise and input delay, where the system dynamics rely on the conditional expectations of future states. The main contribution is a sufficient condition for the exact controllability of the rational expectations model. In particular, we derive a sufficient Gramian matrix condition and a rank condition for the delay-free case. The key is the solvability of the backward stochastic difference equations with input delay, which are derived from the forward and backward stochastic system.
Abstract: In this paper, we consider the dual risk model in which periodic taxes are paid according to a loss-carry-forward system and dividends are paid under a threshold strategy. We give an analytical approach to derive the expression of gδ(u) (i.e., the Laplace transform of the first upper exit time). We discuss the expected discounted tax payments for this model and obtain the corresponding integro-differential equations. Finally, for the Erlang(2) inter-innovation distribution, closed-form expressions for the expected discounted tax payments are given.
Abstract: The objective of this study is to propose a statistical model for predicting the Expected Path Length (EPL) in a cyber-attack, that is, the expected number of steps the attacker will take from the initial state to compromise the security goal. The model is based on vulnerability information together with a host-centric attack graph. Using the model, one can identify the interactions among the vulnerabilities and the individual variables (risk factors) that drive the Expected Path Length. A better understanding of the vulnerabilities and their interactions gives security administrators a clearer view of their security status. In addition, we rank the attributable variables by their contribution to estimating the Expected Path Length. One can use this ranking to take precautions and actions to minimize the Expected Path Length.
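To make the predicted quantity concrete, the expected number of steps to a goal state can be computed when an attack graph is treated as an absorbing Markov chain. This is a minimal sketch under that assumption only; the abstract does not say how the authors compute EPL, and the three-state graph, its transition probabilities, and the function name `expected_path_length` are hypothetical.

```python
def expected_path_length(transitions, goal, tol=1e-12, max_iter=100_000):
    """Expected number of steps to reach `goal` from each state.

    transitions: dict mapping state -> {next_state: probability}.
    Solves E[s] = 1 + sum_t P(s, t) * E[t] with E[goal] = 0 by
    fixed-point iteration (assumes the goal is reachable from every state).
    """
    expected = {s: 0.0 for s in transitions}
    for _ in range(max_iter):
        delta = 0.0
        for s, nxt in transitions.items():
            if s == goal:
                continue  # the goal is absorbing: zero further steps
            new = 1.0 + sum(p * expected[t] for t, p in nxt.items())
            delta = max(delta, abs(new - expected[s]))
            expected[s] = new
        if delta < tol:
            break
    return expected


# Hypothetical host-centric attack graph: attacker -> web server -> database (goal).
graph = {
    "attacker": {"attacker": 0.5, "web": 0.5},
    "web": {"attacker": 0.2, "web": 0.3, "db": 0.5},
    "db": {"db": 1.0},
}
epl = expected_path_length(graph, goal="db")
```

In this toy graph, lowering the probability of the `web` to `db` transition (for example by patching a vulnerability) raises the expected number of steps from `attacker`, which is the kind of sensitivity a ranking of risk factors is meant to expose.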
Funding: Supported by National Health and Medical Research Council of Australia Leadership Investigator Grants (NHMRC APP1194679). The ACPCC has received equipment and a funding contribution from Roche Molecular Diagnostics USA. Co-PI on a major implementation programme, Elimination of Cervical Cancer in the Western Pacific, which has received support from the Minderoo Foundation.
Abstract: Objective: Improvement in cancer survival over recent decades has not been accompanied by a narrowing of socioeconomic disparities. This study aimed to quantify the loss of life expectancy (LOLE) resulting from a cancer diagnosis and examine disparities in LOLE based on area-level socioeconomic status (SES). Methods: Data were collected for all people between 50 and 89 years of age who were diagnosed with cancer, registered in the NSW Cancer Registry between 2001 and 2019, and underwent mortality follow-up evaluations until December 2020. Flexible parametric survival models were fitted to estimate the LOLE by gender and area-level SES for 12 common cancers. Results: Of 422,680 people with cancer, 24% and 18% lived in the most and least disadvantaged areas, respectively. Patients from the most disadvantaged areas had a significantly greater average LOLE than patients from the least disadvantaged areas for cancers with high survival rates, including prostate [2.9 years (95% CI: 2.5-3.2 years) vs. 1.6 years (95% CI: 1.3-1.9 years)] and breast cancer [1.6 years (95% CI: 1.4-1.8 years) vs. 1.2 years (95% CI: 1.0-1.4 years)]. The highest average LOLE occurred in males residing in the most disadvantaged areas with pancreatic [16.5 years (95% CI: 16.1-16.8 years) vs. 16.2 years (95% CI: 15.7-16.7 years)] and liver cancer [15.5 years (95% CI: 15.0-16.0 years) vs. 14.7 years (95% CI: 14.0-15.5 years)]. Females residing in the least disadvantaged areas with thyroid cancer [0.9 years (95% CI: 0.4-1.4 years) vs. 0.6 years (95% CI: 0.2-1.0 years)] or melanoma [0.9 years (95% CI: 0.8-1.1 years) vs. 0.7 years (95% CI: 0.5-0.8 years)] had the lowest average LOLE. Conclusions: Patients from the most disadvantaged areas had the highest LOLE, with SES-based differences greatest for patients diagnosed with cancer at an early stage or with cancers with higher survival rates, suggesting the need to prioritise early detection and reduce treatment-related barriers and survivorship challenges to improve life expectancy.
Funding: The NSF (11201217) of China; the NSF (20132BAB211010) of Jiangxi Province.
Abstract: In this paper, we consider a risk model in which two types of individual claims, main claims and by-claims, are defined. Every by-claim is induced randomly by a main claim and may be delayed for one time period with a certain probability. A dividend policy under which a certain amount of dividends is paid as long as the surplus is greater than a constant dividend barrier is also introduced into this delayed claims risk model. By means of probability generating functions, formulae for the expected present value of total dividend payments prior to ruin are obtained for discrete-type individual claims. Explicit expressions for the corresponding results are derived for K_n claim amount distributions. Numerical illustrations are also given.
Abstract: In the current environment of increasingly fierce competition in the tourism industry, service quality has become crucial for enhancing the competitiveness of scenic spots. This paper uses the SERVQUAL model to design a service quality evaluation questionnaire that captures the gap between tourists' expectations and perceptions, using the Ciqikou Scenic Spot as a case study. Data collected from field surveys are used to evaluate the service quality of the Ciqikou Scenic Spot. The analysis shows that the scenic spot, with its unique folk culture experience and beautiful ecological environment, has certain advantages in terms of service quality. However, significant deficiencies exist in infrastructure and environmental hygiene. Accordingly, targeted improvement suggestions are proposed to further enhance the service quality of the Ciqikou Scenic Spot and meet the increasingly diverse and personalized needs of tourists. This study provides not only a specific service quality improvement strategy for the Ciqikou Scenic Spot but also a valuable reference for other tourist attractions.
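The gap between expectations and perceptions that SERVQUAL measures is usually summarized as a per-dimension score, perception minus expectation. The sketch below illustrates that arithmetic only; the Likert responses, the three dimensions shown, and the function name `servqual_gaps` are hypothetical and are not taken from the study's survey.

```python
from statistics import mean


def servqual_gaps(expectations, perceptions):
    """Per-dimension gap score: mean perception minus mean expectation."""
    return {
        dim: mean(perceptions[dim]) - mean(expectations[dim])
        for dim in expectations
    }


# Hypothetical 1-5 Likert responses from three respondents per dimension.
expectations = {"tangibles": [5, 4, 5], "reliability": [4, 4, 4], "empathy": [4, 5, 3]}
perceptions = {"tangibles": [3, 3, 4], "reliability": [4, 5, 4], "empathy": [4, 4, 4]}

gaps = servqual_gaps(expectations, perceptions)
weakest = min(gaps, key=gaps.get)  # dimension with the most negative gap
```

A negative gap means the service fell short of what respondents expected, so the most negative dimension is the natural first target for improvement.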
Abstract: Classical survival analysis assumes that all subjects will experience the event of interest, but in some cases a portion of the population may never encounter the event. These methods further assume independent survival times, which is not valid for honey bees, which live in nests. The study introduces a semi-parametric marginal proportional hazards mixture cure (PHMC) model with an exchangeable correlation structure, using generalized estimating equations for survival data analysis. The model was tested on clustered right-censored bee survival data with a cured fraction, in which two bee species were exposed to different entomopathogens to test the effect of the entomopathogens on the survival of the bee species. The Expectation-Solution algorithm is used to estimate the parameters. The study notes a weak positive association between cure statuses (ρ1 = 0.0007) and between survival times for uncured bees (ρ2 = 0.0890), emphasizing their importance. The odds of being uncured are higher for A. mellifera than for M. ferruginea. A. mellifera is more susceptible to the entomopathogens icipe 7, icipe 20, and icipe 69. The Cox-Snell residuals show that the proposed semi-parametric PH model generally fits the data well compared with a model that assumes an independent correlation structure. Thus, the semi-parametric marginal proportional hazards mixture cure model is a parsimonious model for correlated bee survival data.
Abstract: Compositional data, which carry relative information, are important in machine learning and related fields. They are typically recorded as closed data, that is, parts that sum to a constant such as 100%. The linear regression model is the most widely used statistical technique for identifying hidden relationships between random variables of interest. However, data quality is a significant challenge in machine learning, especially when data are missing. When estimating linear regression parameters, which are useful for tasks such as prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. Many datasets contain missing observations, and recovering the data can be costly and time-consuming. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current parameter estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both robust least squares and ordinary least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
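The closure constraint and the Aitchison distance used as an evaluation metric can be sketched as follows. This is an illustration of the standard definitions (closure to a constant sum, centred log-ratio transform, Euclidean distance between transforms), not the study's code; the example compositions and the function names are assumptions.

```python
import math


def closure(parts, total=1.0):
    """Rescale positive parts so they sum to `total` (e.g. 1 or 100)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in parts]
    centre = sum(logs) / len(logs)
    return [v - centre for v in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr transforms of two compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


# Hypothetical true composition and an imputed version of it, closed to 100%.
original = closure([20, 30, 50], total=100)
imputed = closure([22, 28, 50], total=100)
d = aitchison_distance(original, imputed)
```

Because the clr transform removes the overall scale, the distance depends only on the ratios between parts, which is why it suits comparing an imputed composition with the true one.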
Funding: Supported by the National Natural Science Foundation of China (No. 12171471) and the Natural Science Foundation of Jiangsu Province (No. BK20221543).
Abstract: This article analyzes Pareto optimal allocations, agreeable trades, and agreeable bets under the maxmin Choquet expected utility (MCEU) model. We provide several useful characterizations of Pareto optimal allocations for risk-averse agents. We derive formal descriptions of the non-existence of agreeable trades or agreeable bets for risk-neutral agents. We establish relationships between the ex-ante stage and the interim stage for agreeable trades or bets when new information arrives.
Abstract: Depression in later life is an underrepresented yet important research area. The aim of the study was to explore depressed older persons' need for and expectations of improved health services one year after implementation of the Chronic Care Model (CCM). A qualitative evaluative design was used. Data were collected through individual interviews with older persons living in Norway. The qualitative content analysis revealed two themes: the need to be safeguarded, and the expectation of being considered valuable and capable. Evaluation of the improvement in care with a focus on the CCM components showed that the most important components for improving the depressed older person's daily life were delivery system re-design, self-management support, productive interaction, and a well-informed active patient. The findings highlight the need for health services designed for persons suffering from chronic ill-health, where the CCM could serve as a framework for policy change and support the redesign of the existing healthcare system. We conclude that older persons with depression need attention, especially those who have been suffering for many years. The identified components may have implications for health professionals in the promotion of mental healthcare.
Abstract: In this paper, the effects of exposure time for four endocrine disruptors on teleost fishes were evaluated using the reduced life expectancy (RLE) model, based on effect concentration (EC50) values from the available published literature. Regression analysis over different exposure times demonstrated that the EC50 of hepatic biomarkers falls with increasing exposure time in a predictable manner. The slopes of the regression equations reflect the strength of the toxic effects on the various teleost fish. The reduction in EC50 over time can be interpreted in terms of the bioconcentration process, which can be used to understand the transfer routes of the compounds from water to the fish body. The RLE model also provides useful information for assessing the toxic effects of the compounds on fish life expectancy.
Abstract: At present, there are significant regional differences in average life expectancy among the countries of the world. Not only is there a great disparity in average life expectancy, but the gender difference can also be positive or negative, and life expectancy follows a bipolar pattern of "long life in rich countries and short life in poor countries". This paper analyzes the factors affecting the life expectancy level using an ordered multivariate discrete choice model combined with the 2017 average life expectancy data of countries around the world. The results show that: 1) growth in per capita GDP, the elderly dependency ratio, and the proportion of people using at least basic drinking water services can effectively raise the level of life expectancy; 2) the birth rate has an inhibitory effect on average life expectancy; 3) model comparison indicates that the probit model is more suitable than the logit model for this kind of problem, and the properties of the resulting model are better.
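For readers unfamiliar with ordered discrete choice models, the sketch below shows how an ordered probit turns a linear index and a set of cutpoints into probabilities for each ordered category. It is a generic illustration of the textbook formula, not the paper's fitted model; the index value, the cutpoints, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(linear_index, cutpoints):
    """Category probabilities P(y = k) = Phi(c_k - xb) - Phi(c_{k-1} - xb).

    linear_index: the value of x'beta for one observation.
    cutpoints: increasing thresholds separating the ordered categories.
    """
    cdf = [normal_cdf(c - linear_index) for c in cutpoints]
    bounds = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


# Hypothetical example: three life expectancy grades (low, medium, high).
probs = ordered_probit_probs(linear_index=0.4, cutpoints=[-0.5, 1.0])
```

An ordered logit differs only in replacing `normal_cdf` with the logistic function, which is why the two specifications can be compared directly on the same data.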
Abstract: Intrusion detection is the process of examining information about system activities or data to detect malicious behavior or unauthorized activity. Most intrusion detection systems (IDS) implement the K-means clustering technique because of its linear complexity and fast computation. Nonetheless, its naive use of the mean data value as the cluster center is a major drawback. Two circular clusters with different radii can be centered at the same mean, a situation the K-means algorithm cannot handle because the mean values of the clusters are very similar. K-means also fails when the clusters are not spherical. To overcome this issue, a new hybrid model integrating expectation-maximization (EM) clustering using a Gaussian mixture model (GMM) with a naive Bayes classifier is proposed. In this model, the GMM gives more flexibility than K-means in terms of cluster covariance. GMMs also use probability functions and soft clustering, so a single data point can belong to multiple clusters. In a GMM, the cluster shape is defined by two parameters: the mean and the standard deviation. With these two parameters, a cluster can take any kind of elliptical shape. EM-GMM is used to cluster data by activity into the corresponding category.
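The case of two clusters sharing a mean but differing in spread can be illustrated with the E-step of a one-dimensional Gaussian mixture, which computes soft assignments. This is a minimal sketch with hand-picked parameters, not the proposed hybrid model; the weights, means, standard deviations, and function names are assumptions, and a full EM fit would alternate this step with parameter updates.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def responsibilities(x, weights, means, stds):
    """E-step for one point: posterior probability of each mixture component."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


# Two hypothetical clusters with the same mean but different spread.
weights, means, stds = [0.5, 0.5], [0.0, 0.0], [1.0, 5.0]

near = responsibilities(0.2, weights, means, stds)  # point close to the shared mean
far = responsibilities(6.0, weights, means, stds)   # point far from the shared mean
```

K-means would see a single centre here, whereas the mixture assigns the nearby point mostly to the narrow component and the distant point almost entirely to the wide one.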
Abstract: Cyber losses, measured as the number of records breached in cyber incidents, commonly feature a significant proportion of zeros together with distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses using a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that can simultaneously model zero, moderate, and large losses and account for heterogeneous effects in the mixture components. To apply the proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis and use a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses to capture the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. Combined with our frequency model (a generalized linear mixed model) for data breaches, aggregate loss distributions are investigated, and applications to cyber insurance pricing and risk management are discussed.