Federated learning enables data sharing with privacy protection by continuously exchanging model parameters between multiple clients and a central server. In practical applications, an accurate global model requires the participation of multiple reliable, high-quality clients, but because the clients are independent, the central server cannot fully control their behavior. The central server cannot verify the correctness of the model parameters that each client submits in a given round, so clients may deliberately or unwittingly submit anomalous data and behave abnormally, for example as malicious attackers or defective clients. To limit the negative consequences, it is crucial to detect these anomalies quickly and to incentivize the clients accordingly. In this paper, we propose a Federated Learning framework for Detecting and Incentivizing Abnormal Clients (FL-DIAC) to achieve efficient and secure federated learning. For the abnormal-client detection problem, we build a detector based on an auto-encoder and use it to identify anomalies and prevent abnormal clients from participating. Before the model parameters are input to the detector, we apply a Fourier transform-based method for dimensionality reduction to reduce the computational complexity. Additionally, we create a credit score-based incentive mechanism to encourage clients to participate actively in training. Experiments use three training models (CNN, MLP, and ResNet-18) and three datasets (MNIST, Fashion MNIST, and CIFAR-10). According to theoretical analysis and experimental findings, FL-DIAC is more effective than other federated learning schemes of the same type.
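As a rough illustration of the pipeline described in this abstract, the sketch below reduces each client's flattened parameter vector to a few low-frequency Fourier magnitudes and then flags clients whose reduced features lie far from the median client. This is not the FL-DIAC implementation: the abstract's detector is an auto-encoder, and here a simple median-distance rule is swapped in to keep the example short. The function names, the `keep` and `factor` parameters, and the threshold rule are all assumptions.

```python
import cmath
import math
from statistics import median


def dft_features(params, keep):
    """Magnitudes of the first `keep` DFT coefficients of a flat parameter vector."""
    n = len(params)
    feats = []
    for k in range(keep):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(params))
        feats.append(abs(coeff) / n)
    return feats


def flag_abnormal(client_params, keep=4, factor=3.0):
    """Flag clients whose reduced features are far from the per-feature median.

    Stand-in for the auto-encoder detector: a client is flagged when its
    distance to the median feature vector exceeds `factor` times the median
    distance (both `keep` and `factor` are hypothetical settings).
    """
    feats = [dft_features(p, keep) for p in client_params]
    centre = [median(col) for col in zip(*feats)]
    dists = [math.dist(f, centre) for f in feats]
    cutoff = factor * median(dists) + 1e-12
    return [d > cutoff for d in dists]
```

Keeping only the first few coefficients is what reduces the dimensionality: a parameter vector of any length is summarised by `keep` numbers before it reaches the detector.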
Federated learning has been used extensively in business innovation scenarios across various industries. This research adopts the federated learning approach for the first time to address bank-enterprise information asymmetry in the credit assessment scenario. First, it designs a credit risk assessment model for micro and small enterprises (MSEs), based on federated learning and feature selection, using multi-dimensional enterprise data and multi-perspective enterprise information. The proposed model includes four main processes: encrypted entity alignment, hybrid feature selection, secure multi-party computation, and global model updating. Second, a two-step feature selection algorithm based on wrapper and filter methods is designed to construct the optimal feature set from multi-source heterogeneous data, providing high accuracy and interpretability. In addition, a local update screening strategy is proposed to select trustworthy model parameters for each aggregation, ensuring the quality of the global model. The results show that the model's error rate is reduced by 6.22% and the recall rate is improved by 11.03% compared with the algorithms commonly used in credit risk research, significantly improving the ability to identify defaulters. Finally, the business operations of commercial banks are used to confirm the potential of the proposed model for real-world implementation.
Credit card fraud remains a significant challenge, with financial losses and consumer protection at stake. This study addresses the need for practical, real-time fraud detection methodologies. Using a Kaggle credit card dataset, I tackle class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) to enhance modeling efficiency. I compare several machine learning algorithms, including Logistic Regression, Linear Discriminant Analysis, K-nearest Neighbors, Classification and Regression Tree, Naive Bayes, Support Vector, Random Forest, XGBoost, and Light Gradient-Boosting Machine, to classify transactions as fraud or genuine. Rigorous evaluation metrics, such as AUC, PRAUC, F1, KS, Recall, and Precision, identify the Random Forest as the best performer in detecting fraudulent activities. The Random Forest model successfully identifies approximately 92% of transactions scoring 90 and above as fraudulent, equating to a detection rate of over 70% for all fraudulent transactions in the test dataset. Moreover, the model captures more than half of the fraud in each bin of the test dataset. SHAP values provide model explainability, with the SHAP summary plot highlighting the global importance of individual features, such as “V12” and “V14”. SHAP force plots offer local interpretability, revealing the impact of specific features on individual predictions. This study demonstrates the potential of machine learning, particularly the Random Forest model, for real-time credit card fraud detection, offering a promising approach to mitigate financial losses and protect consumers.
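The oversampling step mentioned in this abstract can be sketched as follows. SMOTE creates each synthetic minority sample by picking a minority point, choosing one of its k nearest minority neighbours, and interpolating at a random position between the two. The code is a minimal standard-library illustration, not the study's code (which is not shown in the abstract); the function name and the `k` and `seed` defaults are assumptions, and a real project would more likely use an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create `n_new` synthetic minority samples by neighbour interpolation.

    `minority` is a list of equal-length numeric tuples with at least two
    entries; `k` and `seed` are illustrative defaults.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation position in [0, 1)
        synthetic.append(tuple(b + gap * (q - b)
                               for b, q in zip(base, neighbour)))
    return synthetic
```

Because every synthetic point lies on a segment between two real fraud samples, the minority class grows without exact duplicates, which is the property that distinguishes SMOTE from plain random oversampling.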
The purpose of this paper is to argue for the effectiveness of self-regulated learning in English classroom instruction at Chinese colleges. A study is presented to show whether introducing self-regulated learning can help Chinese college students improve their English learning and perform better in the national English test CET-4 (College English Test Level 4).
The proliferation of digital payment methods facilitated by various online platforms and applications has led to a surge in financial fraud, particularly in credit card transactions. Advanced technologies such as machine learning have been widely employed to enhance the early detection and prevention of losses arising from potentially fraudulent activities. However, a prevalent approach in the existing literature uses extensive data sampling and feature selection algorithms as a precursor to subsequent investigations. While sampling techniques can significantly reduce computational time, the resulting dataset relies on generated data and on the accuracy of the pre-processing machine learning models employed. Such datasets often lack true representativeness of real-world data, potentially introducing secondary issues that affect the precision of the results. For instance, under-sampling may result in the loss of critical information, while over-sampling can lead to overfitting of machine learning models. In this paper, we propose a classification study of credit card fraud using fundamental machine learning models, without applying any sampling techniques, on all the features present in the original dataset. The results indicate that the Support Vector Machine (SVM) consistently achieves classification performance exceeding 90% across various evaluation metrics. This finding serves as a valuable reference for future research, encouraging comparative studies on original datasets without reliance on sampling techniques. Furthermore, we explore hybrid machine learning techniques, such as ensemble learning built on SVM, K-Nearest Neighbor (KNN), and decision tree, highlighting their potential in the field. The study demonstrates that the proposed machine learning models yield promising results, suggesting that pre-processing the dataset with a sampling algorithm or an additional machine learning technique may not always be necessary. This research contributes to the field of credit card fraud detection by emphasizing the potential of employing machine learning models directly on original datasets, thereby simplifying the workflow and potentially improving the accuracy and efficiency of fraud detection systems.
This study investigates the need for credit supervision as conducted by on-site banking supervisors. It builds on a real bank on-site credit examination to compare the performance of a hypothetical self-supervision approach, in which banks themselves assess their loan portfolios without external intervention, with the on-site banking supervision approach of the Central Bank of Brazil. The experiment develops two machine learning classification models: the first model is based on good and bad ratings informed by banks, and the second model is based on past on-site credit portfolio examinations conducted by banking supervision. The findings show that the overall performance of the on-site supervision approach is consistently higher than the performance of the self-supervision approach, justifying the need for on-site credit portfolio examination as conducted by the Central Bank.
Personal credit risk assessment is an important part of the development of financial enterprises. Big data credit investigation is an inevitable trend in personal credit risk assessment, but some data are missing and the amount of data is small, which makes training difficult. At the same time, different financial platforms need different models, trained according to the characteristics of their current samples, which is time-consuming. In view of these two problems, this paper uses the idea of transfer learning to build a transferable personal credit risk model based on Instance-based Transfer Learning (Instance-based TL). The model balances the weights of the samples in the source domain, migrates the existing large-dataset samples to the small-sample target domain, and finds the commonalities between them. We also conducted extensive experiments on the selection of base learners, including traditional machine learning algorithms and ensemble learning algorithms such as decision tree, logistic regression, and xgboost. The datasets come from a P2P platform and a bank. The results show that the AUC value of Instance-based TL is 24% higher than that of the traditional machine learning model, which indicates that the model has good application value. The model is evaluated using AUC, precision, recall, and F1, and these criteria support its application value from several perspectives. At present, we are trying to apply this model to more fields to improve its robustness and applicability; we are also conducting more in-depth research on domain adaptation to enrich the model.
This study investigates the learning curve of commercial banks regarding the efficiency of credit and value creation. However, current empirical methods for assessing the learning curve in organizations are not suitable for use in financial institutions. Considering bank-specific characteristics, we introduce a dynamic learning curve using a cost function adjusted to capture learning-by-doing in banks. Using the model, we test several hypotheses on the impact of bank intermediary experience (learning) on the efficiency of credit and value creation in Japanese commercial banks. The findings show that bank intermediary learning significantly improves the cost efficiency gain in the gross value created, total credit created, and investment. However, bank intermediary experience has no significant effect on the efficiency of the economic value created for all the banks analyzed. These findings have practical implications for evaluating cost dynamics in bank credit and value creation, risk management, lending to the real sector, and shareholder value creation.
Implementing new machine learning (ML) algorithms for credit default prediction is associated with better predictive performance; however, it also generates new model risks, particularly concerning the supervisory validation process. Recent industry surveys often mention that uncertainty about how supervisors might assess these risks could be a barrier to innovation. In this study, we propose a new framework to quantify model risk-adjustments to compare the performance of several ML methods. To address this challenge, we first harness the internal ratings-based approach to identify up to 13 risk components that we classify into 3 main categories: statistics, technology, and market conduct. Second, to evaluate the importance of each risk category, we collect a series of regulatory documents related to three potential use cases (regulatory capital, credit scoring, or provisioning), and we compute the weight of each category according to the intensity of their mentions, using natural language processing and a risk terminology based on expert knowledge. Finally, we test our framework using popular ML models in credit risk, and a publicly available database, to quantify some proxies of a subset of risk factors that we deem representative. We measure the statistical risk according to the number of hyperparameters and the stability of the predictions. The technological risk is assessed through the transparency of the algorithm and the latency of the ML training method, while the market conduct risk is quantified by the time it takes to run a post hoc technique (SHapley Additive exPlanations) to interpret the output.
To address the high-dimensionality issue and improve accuracy in credit risk assessment, a high-dimensionality-trait-driven learning paradigm is proposed for feature extraction and classifier selection. The proposed paradigm consists of three main stages: categorization of high-dimensional data, high-dimensionality-trait-driven feature extraction, and high-dimensionality-trait-driven classifier selection. In the first stage, according to the definition of high dimensionality and the relationship between sample size and feature dimensions, the high-dimensionality traits of a credit dataset are categorized into two types: 100 < feature dimensions < sample size, and feature dimensions ≥ sample size. In the second stage, some typical feature extraction methods are tested on the two categories of high dimensionality. In the final stage, four types of classifiers are applied to evaluate credit risk under the different high-dimensionality traits. For illustration and verification, credit classification experiments are performed on two publicly available credit risk datasets. The results show that the proposed paradigm is effective in handling high-dimensional credit classification issues and improves credit classification accuracy relative to the benchmark models listed in this study.
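The first-stage categorization can be written down directly from the two conditions stated in the abstract. The sketch below is only an illustration: the function name and the returned labels are hypothetical, and the fallback label for datasets that meet neither condition is an assumption, since the abstract does not say how such datasets are handled.

```python
def dimensionality_trait(n_samples, n_features):
    """Classify a dataset by the two high-dimensionality traits in the abstract.

    Labels are hypothetical; the third label is an assumed fallback for
    datasets that satisfy neither stated condition.
    """
    if n_features >= n_samples:
        return "features >= samples"
    if n_features > 100:
        return "100 < features < samples"
    return "not high-dimensional"
```

For example, a dataset with 1,000 samples and 250 features falls in the first category, while one with 150 samples and 300 features falls in the second; the later stages would then pick feature extraction methods and classifiers according to that label.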
The credit bank system is an important management mode in current higher vocational education and an important means by which the Ministry of Education promotes China's modern education reform. Against the background of higher vocational enrollment expansion, this paper expounds the practical significance of the credit bank system in training the students admitted under the expansion, and explores the methods and rules for authenticating and converting their learning achievements under the system. It offers a useful exploration aimed at bringing different types of learning results together and smoothing the channels for talent growth.
Credit card fraud is a wide-ranging issue for financial institutions, involving theft and fraud committed using a payment card. In this paper, we explore the application of linear and nonlinear statistical modeling and machine learning models on real credit card transaction data. The models built are supervised fraud models that attempt to identify which transactions are most likely fraudulent. We discuss the processes of data exploration, data cleaning, variable creation, feature selection, model algorithms, and results. Five different supervised models are explored and compared including logistic regression, neural networks, random forest, boosted tree and support vector machines. The boosted tree model shows the best fraud detection result (FDR = 49.83%) for this particular data set. The resulting model can be utilized in a credit card fraud detection system. A similar model development process can be performed in related business domains such as insurance and telecommunications, to avoid or detect fraudulent activity.
Self-regulation is crucial to learners’ learning outcomes in a blended education context. This paper first discusses its definitions and importance, then explores factors affecting self-regulation, and finally puts forward several ways to improve learners’ self-regulation.
Objective: Self-directed training represents a challenge in simulation-based training as low cognitive effort can occur when learners overrate their own level of performance. This study aims to explore the mechanisms underlying the positive effects of a structured self-assessment intervention during simulation-based training of mastoidectomy. Methods: A prospective, educational cohort study of a novice training program consisting of directed, self-regulated learning with distributed practice (5x3 procedures) in a virtual reality temporal bone simulator. The intervention consisted of structured self-assessment after each procedure using a rating form supported by small videos. Semi-structured telephone interviews upon completion of training were conducted with 13 out of 15 participants. Interviews were analysed using directed content analysis and triangulated with quantitative data on secondary task reaction time for cognitive load estimation and participants’ self-assessment scores. Results: Six major themes were identified in the interviews: goal-directed behaviour, use of learning supports for scaffolding of the training, cognitive engagement, motivation from self-assessment, self-assessment bias, and feedback on self-assessment (validation). Participants seemed to self-regulate their learning by forming individual sub-goals and strategies within the overall goal of the procedure. They scaffolded their learning through the available learning supports. Finally, structured self-assessment was reported to increase the participants’ cognitive engagement, which was further supported by a quantitative increase in cognitive load. Conclusions: Structured self-assessment in simulation-based surgical training of mastoidectomy seems to promote cognitive engagement and motivation in the learning task and to facilitate self-regulated learning.
With the rapid development of the internet of things (IoT), electricity consumption data can be captured and recorded in the IoT cloud center. This provides a credible data source for enterprise credit scoring, which is one of the most vital elements of the financial decision-making process. Accordingly, this paper proposes to use deep learning to train an enterprise credit scoring model that takes electricity consumption data as input. Instead of predicting a credit rating, our method generates an absolute credit score with a novel deep ranking model, the ranking extreme gradient boosting net (rankXGB). To boost performance, the rankXGB model combines several weak ranking models into a strong model. Because of the high computational cost and the vast amounts of data, we design an edge computing framework to reduce the latency of enterprise credit evaluation. Specifically, we design a two-stage deep learning task architecture, consisting of a cloud-based weak credit ranking and an edge-based credit score calculation. In the first stage, we send the electricity consumption data of the evaluated enterprise to the cloud computing server, where multiple weak-ranking networks are executed in parallel to produce multiple weak-ranking results. In the second stage, the edge device fuses the ranking results generated in the cloud server to produce a more reliable ranking result, which is used to calculate an absolute credit score by score normalization. The experiments demonstrate that our method can achieve accurate enterprise credit evaluation quickly.
According to the Food and Agriculture Organization of the United Nations (FAO), there are about 500 million smallholder farmers in the world, and in developing countries such farmers produce about 80% of the food consumed there; their farming activities are therefore critical to the economies of their countries and to global food security. However, these farmers have limited access to credit. Many of them farm on unregistered land that cannot be offered as collateral to lending institutions. Even those on registered land are often deterred from applying for farm credit by the fear of losing the land should they default on loan payments, and those who do apply are still disadvantaged by low credit scores (a measure of creditworthiness). As a result, they are often unable to use optimal farm inputs such as fertilizer and good seeds. This depresses their yields and in turn harms food security in their communities and in the world, making it difficult for the UN to achieve its Sustainable Development Goal No. 2 (Zero Hunger). This study aimed to demonstrate how geospatial technology can be used to leverage farm credit scoring for the benefit of smallholder farmers. A survey was conducted within the study area to identify the smallholder farms and farmers. A sample of the surveyed farmers was then subjected to credit scoring by machine learning. In the first instance, the traditional financial data approach was used, and the results showed that over 40% of the farmers could not qualify for credit. When non-financial geospatial data, namely the Normalized Difference Vegetation Index (NDVI), was introduced into the scoring model, the share of farmers not qualifying for credit fell significantly, to 24%. It is concluded that introducing the NDVI variable into the traditional scoring model could significantly improve smallholder farmers’ chances of accessing credit, enabling a farmer to be evaluated for credit on the basis of the health of their crop rather than on a traditional form of collateral.
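For readers unfamiliar with the geospatial variable used here, NDVI is computed per pixel from near-infrared and red reflectance as (NIR - Red) / (NIR + Red), giving values between -1 and 1, with higher values indicating denser, healthier vegetation. The helper below is a small illustration of that standard formula only; the function name is hypothetical, returning 0.0 when both bands are zero is an assumed convention, and the study's actual processing chain is not described in the abstract.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from two reflectance values.

    Returns 0.0 when both bands are zero; this fallback is an assumption
    made to avoid division by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total
```

A per-farm feature for a scoring model could then be, for instance, the mean NDVI over the pixels inside the farm boundary, although the abstract does not state which aggregation the study used.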
Funding: Supported by the Key Research and Development Program of China (No. 2022YFC3005401); the Key Research and Development Program of Yunnan Province, China (Nos. 202203AA080009, 202202AF080003); the Science and Technology Achievement Transformation Program of Jiangsu Province, China (BA2021002); and the Fundamental Research Funds for the Central Universities (Nos. B220203006, B210203024).
Funding: Funded by the State Grid Jiangsu Electric Power Company (Grant No. JS2020112) and the National Natural Science Foundation of China (Grant No. 62272236).
文摘Federated learning has been used extensively in business inno-vation scenarios in various industries.This research adopts the federated learning approach for the first time to address the issue of bank-enterprise information asymmetry in the credit assessment scenario.First,this research designs a credit risk assessment model based on federated learning and feature selection for micro and small enterprises(MSEs)using multi-dimensional enterprise data and multi-perspective enterprise information.The proposed model includes four main processes:namely encrypted entity alignment,hybrid feature selection,secure multi-party computation,and global model updating.Secondly,a two-step feature selection algorithm based on wrapper and filter is designed to construct the optimal feature set in multi-source heterogeneous data,which can provide excellent accuracy and interpretability.In addition,a local update screening strategy is proposed to select trustworthy model parameters for aggregation each time to ensure the quality of the global model.The results of the study show that the model error rate is reduced by 6.22%and the recall rate is improved by 11.03%compared to the algorithms commonly used in credit risk research,significantly improving the ability to identify defaulters.Finally,the business operations of commercial banks are used to confirm the potential of the proposed model for real-world implementation.
文摘Credit card fraud remains a significant challenge, with financial losses and consumer protection at stake. This study addresses the need for practical, real-time fraud detection methodologies. Using a Kaggle credit card dataset, I tackle class imbalance using the Synthetic Minority Oversampling Technique (SMOTE) to enhance modeling efficiency. I compare several machine learning algorithms, including Logistic Regression, Linear Discriminant Analysis, K-nearest Neighbors, Classification and Regression Tree, Naive Bayes, Support Vector, Random Forest, XGBoost, and Light Gradient-Boosting Machine to classify transactions as fraud or genuine. Rigorous evaluation metrics, such as AUC, PRAUC, F1, KS, Recall, and Precision, identify the Random Forest as the best performer in detecting fraudulent activities. The Random Forest model successfully identifies approximately 92% of transactions scoring 90 and above as fraudulent, equating to a detection rate of over 70% for all fraudulent transactions in the test dataset. Moreover, the model captures more than half of the fraud in each bin of the test dataset. SHAP values provide model explainability, with the SHAP summary plot highlighting the global importance of individual features, such as “V12” and “V14”. SHAP force plots offer local interpretability, revealing the impact of specific features on individual predictions. This study demonstrates the potential of machine learning, particularly the Random Forest model, for real-time credit card fraud detection, offering a promising approach to mitigate financial losses and protect consumers.
文摘The purpose of this paper is to argue the effectiveness of self-regulated learning in English education in Chinese college classroom instruction. A study is given to show whether the introduction of self-regulated learning can help improve Chinese college students' English learning, and help them perform better in the National English test-CET-4 (College English Test Level-4,).
Abstract: The proliferation of digital payment methods facilitated by various online platforms and applications has led to a surge in financial fraud, particularly in credit card transactions. Advanced technologies such as machine learning have been widely employed to enhance the early detection and prevention of losses arising from potentially fraudulent activities. However, a prevalent approach in existing literature involves the use of extensive data sampling and feature selection algorithms as a precursor to subsequent investigations. While sampling techniques can significantly reduce computational time, the resulting dataset relies on generated data and the accuracy of the pre-processing machine learning models employed. Such datasets often lack true representativeness of real-world data, potentially introducing secondary issues that affect the precision of the results. For instance, under-sampling may result in the loss of critical information, while over-sampling can lead to overfitting of machine learning models. In this paper, we propose a classification study of credit card fraud using fundamental machine learning models, without applying any sampling techniques, on all the features present in the original dataset. The results indicate that Support Vector Machine (SVM) consistently achieves classification performance exceeding 90% across various evaluation metrics. This discovery serves as a valuable reference for future research, encouraging comparative studies on the original dataset without reliance on sampling techniques. Furthermore, we explore hybrid machine learning techniques, such as ensemble learning constructed based on SVM, K-Nearest Neighbor (KNN), and decision tree, highlighting their potential advancements in the field. The study demonstrates that the proposed machine learning models yield promising results, suggesting that pre-processing the dataset with a sampling algorithm or an additional machine learning technique may not always be necessary. This research contributes to the field of credit card fraud detection by emphasizing the potential of employing machine learning models directly on original datasets, thereby simplifying the workflow and potentially improving the accuracy and efficiency of fraud detection systems.
Abstract: This study investigates the need for credit supervision as conducted by on-site banking supervisors. It builds on a real bank on-site credit examination to compare the performance of a hypothetical self-supervision approach, in which banks themselves assess their loan portfolios without external intervention, with the on-site banking supervision approach of the Central Bank of Brazil. The experiment develops two machine learning classification models: the first model is based on good and bad ratings reported by banks, and the second model is based on past on-site credit portfolio examinations conducted by banking supervision. The findings show that the overall performance of the on-site supervision approach is consistently higher than the performance of the self-supervision approach, justifying the need for on-site credit portfolio examination as conducted by the Central Bank.
Abstract: Personal credit risk assessment is an important part of the development of financial enterprises. Big data credit investigation is an inevitable trend in personal credit risk assessment, but some data are missing and the amount of data is small, which makes training difficult. At the same time, different financial platforms require different models trained according to the characteristics of their current samples, which is time-consuming. In view of these two problems, this paper uses the idea of transfer learning to build a transferable personal credit risk model based on Instance-based Transfer Learning (Instance-based TL). The model balances the weights of the samples in the source domain, migrates the existing large dataset samples to the small-sample target domain, and finds the commonality between them. We also ran extensive experiments on the selection of base learners, including traditional machine learning algorithms and ensemble learning algorithms such as decision tree, logistic regression, and XGBoost. The datasets come from a P2P platform and a bank. The results show that the AUC value of Instance-based TL is 24% higher than that of the traditional machine learning model, which indicates that the proposed model has good application value. The model is evaluated using AUC, precision, recall, and F1, and these criteria support its application value from several angles. At present, we are trying to apply this model to more fields to improve its robustness and applicability; we are also pursuing more in-depth research on domain adaptation to enrich the model.
Funding: Supported by JSPS KAKENHI Grant Number 19J10715.
Abstract: This study investigates the learning curve of commercial banks regarding the efficiency of credit and value creation. However, current empirical methods for assessing the learning curve in organizations are not suitable for use in financial institutions. Considering bank-specific characteristics, we introduce a dynamic learning curve using a cost function adjusted to capture learning-by-doing in banks. Using the model, we test several hypotheses on the impact of bank intermediary experience (learning) on the efficiency of credit and value creation in Japanese commercial banks. The findings show that bank intermediary learning significantly improves the cost efficiency gain in the gross value created, total credit created, and investment. However, bank intermediary experience has no significant effect on the efficiency of the economic value created for all the banks analyzed. These findings have practical implications for evaluating cost dynamics in bank credit and value creation, risk management, lending to the real sector, and shareholder value creation.
Abstract: Implementing new machine learning (ML) algorithms for credit default prediction is associated with better predictive performance; however, it also generates new model risks, particularly concerning the supervisory validation process. Recent industry surveys often mention that uncertainty about how supervisors might assess these risks could be a barrier to innovation. In this study, we propose a new framework to quantify model risk adjustments to compare the performance of several ML methods. To address this challenge, we first harness the internal ratings-based approach to identify up to 13 risk components that we classify into 3 main categories: statistics, technology, and market conduct. Second, to evaluate the importance of each risk category, we collect a series of regulatory documents related to three potential use cases (regulatory capital, credit scoring, or provisioning), and we compute the weight of each category according to the intensity of their mentions, using natural language processing and a risk terminology based on expert knowledge. Finally, we test our framework using popular ML models in credit risk, and a publicly available database, to quantify some proxies of a subset of risk factors that we deem representative. We measure the statistical risk according to the number of hyperparameters and the stability of the predictions. The technological risk is assessed through the transparency of the algorithm and the latency of the ML training method, while the market conduct risk is quantified by the time it takes to run a post hoc technique (SHapley Additive exPlanations) to interpret the output.
Funding: This work is partially supported by grants from the Key Program of National Natural Science Foundation of China (NSFC Nos. 71631005 and 71731009) and the Major Program of the National Social Science Foundation of China (No. 19ZDA103).
Abstract: To solve the high-dimensionality issue and improve accuracy in credit risk assessment, a high-dimensionality-trait-driven learning paradigm is proposed for feature extraction and classifier selection. The proposed paradigm consists of three main stages: categorization of high-dimensional data, high-dimensionality-trait-driven feature extraction, and high-dimensionality-trait-driven classifier selection. In the first stage, according to the definition of high dimensionality and the relationship between sample size and feature dimensions, the high-dimensionality traits of a credit dataset are categorized into two types: 100 < feature dimensions < sample size, and feature dimensions ≥ sample size. In the second stage, some typical feature extraction methods are tested on the two categories of high dimensionality. In the final stage, four types of classifiers are applied to evaluate credit risk under the different high-dimensionality traits. For illustration and verification, credit classification experiments are performed on two publicly available credit risk datasets. The results show that the proposed paradigm is effective in handling high-dimensional credit classification issues and improves credit classification accuracy relative to the benchmark models listed in this study.
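The first-stage categorization in the abstract above can be sketched as a small rule. This is a minimal illustration that uses only the two thresholds stated in the abstract; the function name `high_dim_trait` and the returned label strings are hypothetical and do not come from the paper.

```python
def high_dim_trait(n_samples: int, n_features: int) -> str:
    """Label a dataset with one of the two high-dimensionality traits.

    The thresholds follow the abstract; the label strings are
    illustrative assumptions, not the paper's terminology.
    """
    if n_features >= n_samples:
        # Second type: feature dimensions >= sample size.
        return "features >= samples"
    if n_features > 100:
        # First type: 100 < feature dimensions < sample size.
        return "100 < features < samples"
    return "not high-dimensional"
```

A dataset with 1,000 samples and 150 features falls into the first type, while one with 50 samples and 200 features falls into the second.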
Abstract: The credit bank system is an important management mode in current higher vocational education, and it is also an important means for the Ministry of Education to promote China's modern education reform. Against the background of higher vocational enrollment expansion, this paper expounds the practical significance of the credit bank system in training students admitted under the expansion, and explores methods and rules for certifying and converting their learning achievements under the system. The exploration aims to connect different types of learning outcomes and smooth the channels for talent development.
Abstract: Credit card fraud is a wide-ranging issue for financial institutions, involving theft and fraud committed using a payment card. In this paper, we explore the application of linear and nonlinear statistical modeling and machine learning models on real credit card transaction data. The models built are supervised fraud models that attempt to identify which transactions are most likely fraudulent. We discuss the processes of data exploration, data cleaning, variable creation, feature selection, model algorithms, and results. Five different supervised models are explored and compared: logistic regression, neural networks, random forest, boosted tree, and support vector machines. The boosted tree model shows the best fraud detection result (FDR = 49.83%) for this particular data set. The resulting model can be utilized in a credit card fraud detection system. A similar model development process can be performed in related business domains, such as insurance and telecommunications, to avoid or detect fraudulent activity.
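The abstract above reports an FDR figure without defining it. One common reading in fraud scoring, assumed here, is the share of all fraudulent records that appear among the highest-scoring fraction of transactions. The function name, the `top_fraction` parameter, and its 3% default are assumptions for this sketch and are not taken from the paper.

```python
def fraud_detection_rate(scores, labels, top_fraction=0.03):
    """Share of all frauds found in the top-scoring fraction of records.

    scores: model scores, higher means more suspicious.
    labels: 1 for fraud, 0 for legitimate, aligned with scores.
    top_fraction: assumed review cutoff (not given in the abstract).
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_fraud = sum(labels)
    if total_fraud == 0:
        return 0.0
    # Rank records from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, int(len(ranked) * top_fraction))
    caught = sum(label for _, label in ranked[:cutoff])
    return caught / total_fraud
```

Under this reading, a higher value means more of the fraud is concentrated in the small set of transactions an analyst would review first.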
Abstract: Self-regulation is crucial to learners' learning outcomes in a blended education context. This paper first discusses its definitions and importance, then explores factors affecting self-regulation, and finally puts forward several ways to improve learners' self-regulation.
Abstract: Objective: Self-directed training represents a challenge in simulation-based training, as low cognitive effort can occur when learners overrate their own level of performance. This study aims to explore the mechanisms underlying the positive effects of a structured self-assessment intervention during simulation-based training of mastoidectomy. Methods: A prospective, educational cohort study of a novice training program consisting of directed, self-regulated learning with distributed practice (5x3 procedures) in a virtual reality temporal bone simulator. The intervention consisted of structured self-assessment after each procedure using a rating form supported by small videos. Semi-structured telephone interviews upon completion of training were conducted with 13 out of 15 participants. Interviews were analysed using directed content analysis and triangulated with quantitative data on secondary task reaction time for cognitive load estimation and participants' self-assessment scores. Results: Six major themes were identified in the interviews: goal-directed behaviour, use of learning supports for scaffolding of the training, cognitive engagement, motivation from self-assessment, self-assessment bias, and feedback on self-assessment (validation). Participants seemed to self-regulate their learning by forming individual sub-goals and strategies within the overall goal of the procedure. They scaffolded their learning through the available learning supports. Finally, structured self-assessment was reported to increase the participants' cognitive engagement, which was further supported by a quantitative increase in cognitive load. Conclusions: Structured self-assessment in simulation-based surgical training of mastoidectomy seems to promote cognitive engagement and motivation in the learning task and to facilitate self-regulated learning.
Funding: This research was funded by the National Natural Science Foundation of China (61906036) and the Science and Technology Project of State Grid Jiangsu Power Supply Company (No. J2021034).
Abstract: With the rapid development of the internet of things (IoT), electricity consumption data can be captured and recorded in the IoT cloud center. This provides a credible data source for enterprise credit scoring, which is one of the most vital elements of the financial decision-making process. Accordingly, this paper proposes to use deep learning to train an enterprise credit scoring model that takes electricity consumption data as input. Instead of predicting the credit rating, our method generates an absolute credit score with a novel deep ranking model, the ranking extreme gradient boosting net (rankXGB). To boost performance, the rankXGB model combines several weak ranking models into a strong model. Because of the high computational cost and the vast amounts of data, we design an edge computing framework to reduce the latency of enterprise credit evaluation. Specifically, we design a two-stage deep learning task architecture, including a cloud-based weak credit ranking and an edge-based credit score calculation. In the first stage, we send the electricity consumption data of the evaluated enterprise to the computing cloud server, where multiple weak-ranking networks are executed in parallel to produce multiple weak-ranking results. In the second stage, the edge device fuses the ranking results generated in the cloud server to produce a more reliable ranking result, which is used to calculate an absolute credit score by score normalization. The experiments demonstrate that our method can achieve accurate enterprise credit evaluation quickly.
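The second stage described above (fusing several weak rankings, then normalizing to an absolute score) can be illustrated with a simplified stand-in. The abstract does not specify the fusion rule or the normalization, so this sketch assumes mean-rank fusion and min-max scaling to a 0 to 100 range; the function names `fuse_ranks` and `normalize_scores` are hypothetical and this is not the rankXGB implementation.

```python
def fuse_ranks(weak_rankings):
    """Average the rank each weak ranker gives to every enterprise.

    weak_rankings: list of dicts mapping enterprise id -> rank (1 = best).
    Mean-rank fusion is an assumption made for this sketch.
    """
    n_rankers = len(weak_rankings)
    ids = weak_rankings[0].keys()
    return {i: sum(r[i] for r in weak_rankings) / n_rankers for i in ids}


def normalize_scores(fused, low=0.0, high=100.0):
    """Min-max map fused ranks to [low, high]; the best rank maps to high."""
    best, worst = min(fused.values()), max(fused.values())
    if best == worst:
        # All enterprises tie, so give every one the top score.
        return {i: high for i in fused}
    span = worst - best
    return {i: low + (high - low) * (worst - r) / span for i, r in fused.items()}
```

For example, three weak rankers that mostly place enterprise "A" first yield the lowest fused rank for "A" and therefore the highest normalized score.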
Abstract: According to the Food and Agriculture Organization of the United Nations (FAO), there are about 500 million smallholder farmers in the world, and in developing countries such farmers produce about 80% of the food consumed there; their farming activities are therefore critical to the economies of their countries and to global food security. However, these farmers face limited access to credit. Many of them farm on unregistered land that cannot be offered as collateral to lending institutions. Even when they are on registered land, the fear of losing that land should they default on loan payments often prevents them from applying for farm credit. And even if they apply, they are still disadvantaged by low credit scores (a measure of creditworthiness). The result is that they are often unable to use optimal farm inputs such as fertilizer and good seeds. This depresses their yields and, in turn, has negative implications for food security in their communities and in the world, making it difficult for the UN to achieve its Sustainable Development Goal 2 (Zero Hunger). This study aimed to demonstrate how geospatial technology can be used to leverage farm credit scoring for the benefit of smallholder farmers. A survey was conducted within the study area to identify the smallholder farms and farmers. A sample of surveyed farmers was then subjected to credit scoring by machine learning. In the first instance, the traditional financial data approach was used, and the results showed that over 40% of the farmers could not qualify for credit. When non-financial geospatial data, i.e., the Normalized Difference Vegetation Index (NDVI), was introduced into the scoring model, the share of farmers not qualifying for credit fell significantly to 24%. It is concluded that introducing the NDVI variable into the traditional scoring model could significantly improve smallholder farmers' chances of accessing credit, enabling a farmer to be evaluated on the basis of the health of their crop rather than on a traditional form of collateral.
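For readers unfamiliar with the NDVI variable mentioned above, the index is computed from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The sketch below shows that standard formula only. The function name `ndvi` and the zero-denominator convention are assumptions for illustration, and how the study aggregated NDVI per farm is not stated in the abstract.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from two reflectance values.

    Returns a value in [-1, 1] for non-negative reflectances; denser,
    healthier vegetation gives values closer to 1.
    """
    total = nir + red
    if total == 0:
        # No signal in either band; returning 0.0 is a convention
        # chosen for this sketch, not part of the NDVI definition.
        return 0.0
    return (nir - red) / total
```

A pixel with near-infrared reflectance 0.5 and red reflectance 0.1 gives an NDVI of about 0.67, which is typical of vigorous vegetation.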