Collaborative filtering (CF) is a leading approach to building recommender systems and has seen considerable development and popularity. A predominant CF approach is rating prediction, which aims to predict a user's rating for items the user has not yet rated. However, as the number of items and users grows, the data become sparse, and it is difficult to detect close latent relations among items or users for predicting user behavior. In this paper, we enhance the rating prediction approach, substantially improving prediction accuracy, by categorizing items according to movie genres. The probabilities that users are interested in each genre are then computed and used to integrate the predictions of the genre clusters. A novel probabilistic approach based on sentiment analysis of user reviews is also proposed to give intuitive explanations of why an item is recommended. To test the recommendation approach, a new corpus of user reviews on movies obtained from the Internet Movie Database (IMDB) was generated. Experimental results show that the proposed framework is effective and achieves better prediction performance.
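The abstract does not give the formula used to integrate the genre-cluster predictions. A minimal sketch, assuming the final rating is a weighted average of the per-genre predictions with the user's genre-interest probabilities as weights, might look like the following. The function name, the renormalization over available genres, and the sample numbers are illustrative assumptions, not the authors' method.

```python
def integrate_genre_predictions(genre_predictions, genre_interest):
    """Combine per-genre rating predictions into a single rating.

    genre_predictions: dict mapping genre -> rating predicted by that genre cluster
    genre_interest:    dict mapping genre -> probability the user is interested in it

    Assumption: the combined rating is the interest-weighted average over the
    genres for which a prediction exists; weights are renormalized to sum to 1.
    """
    usable = [g for g in genre_predictions if genre_interest.get(g, 0.0) > 0.0]
    if not usable:
        raise ValueError("no genre with a positive interest probability")
    total = sum(genre_interest[g] for g in usable)
    return sum(genre_interest[g] / total * genre_predictions[g] for g in usable)


# Hypothetical example: a movie tagged as drama and comedy.
predictions = {"drama": 4.0, "comedy": 3.0}
interest = {"drama": 0.6, "comedy": 0.2, "horror": 0.2}
rating = integrate_genre_predictions(predictions, interest)
print(round(rating, 2))
```

Renormalizing over the genres that actually apply to the item keeps the result on the original rating scale even when the user's interest is spread over genres the item does not belong to.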
[Objective] The research aimed to analyze the explanation effect of the European numerical prediction on temperature. [Method] Based on the CMSVM regression method, using 850 hPa grid point data of the European numerical prediction from 2003 to 2009 and observed maximum and minimum temperatures at 8 automatic stations in Qingyang City, a temperature prediction model was established, and its operational performance from 2008 to 2010 was tested and evaluated. [Result] The method provided very good guidance for real-time operational temperature prediction. Test and evaluation found that as the forecast time lengthened, the prediction accuracies of the maximum and minimum temperatures declined. When the temperature anomaly was higher (actual temperature higher than the historical mean), prediction accuracy increased. The influence of the European numerical prediction was larger. [Conclusion] Compared with other methods, the prediction method was convenient to operate, modeling was automatic, running time was short, the system was stable, and prediction accuracy was high. It is suitable for explanation work on numerical prediction products at meteorological stations.
Rainfall-induced landslides and debris flows are major disasters in China, as well as in Europe, South America, Japan and Australia. This paper proposes a new type of joint probability prediction model, the Double Layer Nested Multivariate Compound Extreme Value Distribution (DLNMCEVD), to predict landslides and debris flows triggered by rainfall. The outer layer of DLNMCEVD predicts the joint probabilities of different combinations of rainfall characteristics, air temperature and humidity. These are treated as external load factors, with geological and geotechnical characteristics as resistance factors, for the reliability analysis of slope stability in the inner layer of the model. For the reliability and consequence analysis of rainfall-induced slope failure, Global Uncertainty Analysis and Global Sensitivity Analysis (GUA & GSA) should be taken into account for input-output iterations. Finally, based on the statistical prediction by DLNMCEVD, geological hazard prevention alarms and regionalization can be provided.
With the integration of global economic development and the rapid growth of scientific knowledge and technology, people's consumption needs are increasingly personalized and diversified. This market background makes sales forecasting an indispensable part of enterprise management and development. Sales forecasting means that an enterprise, based on the sales of the past few years, uses systematic forecasting models to estimate the quantity and amount of all or some specific products and services sold in a specific future period. Accurate sales forecasting can help enterprises improve future revenue and can also encourage them to build and keep an efficient sales management team. This paper analyzes traditional sales forecasting methods and sales forecasting methods based on big data models from the perspective of machine learning, and then compares them. The research shows that both kinds of methods have their own advantages and disadvantages. In the future, enterprises can adopt the two kinds of sales forecasting methods in parallel to maximize the benefit of sales forecasting.
The quality of pulp produced in continuous digesters is monitored by measuring the kappa number, which is an indicator of residual lignin. Control of the kappa number is carried out mainly at the top of the digester, so it is important to get some indication of this analysis beforehand. In this context, the aim of this work was to obtain a model that predicts the kappa number in advance of the laboratory results. This paper proposes a new approach using the Box & Jenkins methodology to develop a dynamic model for predicting the kappa number of a Kamyr continuous digester at a eucalyptus Kraft pulp mill in Brazil. With a database of 1500 observations over a period of 30 days of operation, several ARMA models were studied, leading to the choice of ARMA(1, 2) as the best forecasting model. After fitting the model, we performed validation with a new set of data from 30 days of operation, achieving a mean absolute percentage error of 2.7%.
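For readers unfamiliar with the notation, an ARMA(1, 2) model has the standard form y_t = c + phi1*y_(t-1) + e_t + theta1*e_(t-1) + theta2*e_(t-2), and the mean absolute percentage error averages |actual - predicted| / |actual| over the validation points. The sketch below shows one-step-ahead forecasting and the error metric under that standard definition. The coefficients and kappa readings are made-up placeholders, since the abstract does not report the fitted parameters, and residuals before the first observation are assumed to be zero.

```python
def arma_1_2_forecasts(series, c, phi1, theta1, theta2):
    """One-step-ahead forecasts for an ARMA(1, 2) model.

    Returns forecasts for series[1:], each using only earlier observations.
    Assumption: residuals before the start of the series are zero.
    """
    forecasts = []
    e_prev1 = 0.0  # residual at t-1
    e_prev2 = 0.0  # residual at t-2
    for t in range(1, len(series)):
        pred = c + phi1 * series[t - 1] + theta1 * e_prev1 + theta2 * e_prev2
        forecasts.append(pred)
        # Shift the residual history after observing the actual value.
        e_prev2, e_prev1 = e_prev1, series[t] - pred
    return forecasts


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical kappa readings and hypothetical coefficients, for illustration only.
kappa = [10.0, 12.0, 11.0]
preds = arma_1_2_forecasts(kappa, c=1.0, phi1=0.5, theta1=0.5, theta2=0.25)
print(preds, mape(kappa[1:], preds))
```

In practice the coefficients would be estimated from the training window rather than supplied by hand, and the error would be computed on the separate validation window described in the abstract.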
Owing to the convenience of online loans, an increasing number of people are borrowing money on online platforms. With the emergence of machine learning technology, predicting loan defaults has become a popular topic. However, machine learning models have a black-box problem that cannot be disregarded. To make the rules of the prediction model more understandable and thereby increase the user's trust in the model, an explanatory model must be used. Logistic regression, decision tree, XGBoost, and LightGBM models are employed to predict loan default. The prediction results show that LightGBM and XGBoost outperform the logistic regression and decision tree models in predictive ability. The area under the curve for LightGBM is 0.7213. The accuracies of LightGBM and XGBoost exceed 0.8, and their precisions exceed 0.55. In addition, we employed the local interpretable model-agnostic explanations (LIME) approach to carry out an explainable analysis of the prediction results. The results show that factors such as the loan term, loan grade, credit rating, and loan amount affect the predicted outcomes.
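The three metrics quoted above (accuracy, precision, and area under the ROC curve) can be computed directly from labels and model scores. The following is a small self-contained sketch using their standard definitions. The labels, scores, and the 0.5 decision threshold are invented for illustration and are unrelated to the paper's data or models.

```python
def accuracy_and_precision(y_true, y_pred):
    """Accuracy and precision for binary labels (1 = default, 0 = no default)."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


def auc(y_true, scores):
    """Area under the ROC curve: the probability that a randomly chosen positive
    is scored above a randomly chosen negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted default probabilities.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
acc, prec = accuracy_and_precision(y_true, y_pred)
print(acc, prec, round(auc(y_true, scores), 4))
```

Accuracy and precision depend on the chosen threshold, whereas the area under the curve is threshold-free, which is one reason all three are commonly reported together.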
Funding (collaborative filtering study): Supported in part by the National Science Foundation of China under Grants No. 61303105 and 61402304; the Humanity & Social Science general project of the Ministry of Education under Grant No. 14YJAZH046; the Beijing Natural Science Foundation under Grant No. 4154065; the Beijing Educational Committee Science and Technology Development Plan under Grant No. KM201410028017; and Academic Degree Graduate Courses group projects.
Funding (loan default study): Supported by the Fundamental Research Funds for the Central Universities (WUT: 2022IVA067).