Funding: Supported by the NSFC Project of International Cooperation and Exchanges under Grant No. 72010107004; the National Natural Science Foundation of China (72101176); and Beijing Fantaike Technology Co., Ltd.
Abstract: Fraud in loan application assessment causes significant losses for finance companies worldwide, and much research has focused on machine learning methods to improve fraud detection in some financial domains. However, the diverse ways in which individuals falsify information remain one of the most challenging problems in loan applications. To this end, we conducted an empirical study to explore the relationships between various fraud types and analyzed the factors influencing information fabrication. We find that weak relationships exist among different falsification types, and that some essential factors play the same role across fraud types, whereas others have varying or opposing effects. Based on these findings, we propose a novel hierarchical multi-task learning approach to refine fraud-detection systems. Specifically, we first develop a hierarchical fraud categorization method that breaks the problem down into several subtasks according to the types of information falsified by customers, which reduces the difficulty of fraud identification. Second, because the relationships among applicants' information are sophisticated, we address the representation learning problem with a heterogeneous network, using a meta-path-based random walk and a heterogeneous skip-gram model. Third, the final subtasks are predicted using a multi-task learning approach with two prediction layers: the first layer provides the probabilities of general fraud categories as auxiliary information for the second layer, which predicts the specific subtasks. Finally, we conducted extensive experiments on a real-world dataset to demonstrate the effectiveness of the proposed approach.
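The abstract names two mechanisms that are easier to picture with a small sketch: a meta-path-based random walk over a heterogeneous network, and a two-layer prediction in which general-category probabilities feed the specific subtasks. The Python below is only an illustration under stated assumptions, not the authors' implementation. The toy graph, the node types ("A" applicant, "P" phone, "C" company), the function names, and the plain logistic layers are all hypothetical. The skip-gram training step that would turn the walks into embeddings is omitted.

```python
import math
import random

# Hypothetical toy heterogeneous graph (all node names and types are made up).
# Types: "A" = applicant, "P" = phone number, "C" = company.
NODE_TYPE = {"a1": "A", "a2": "A", "a3": "A", "p1": "P", "p2": "P", "c1": "C"}
EDGES = {
    "a1": ["p1", "c1"],
    "a2": ["p1", "p2"],
    "a3": ["p2", "c1"],
    "p1": ["a1", "a2"],
    "p2": ["a2", "a3"],
    "c1": ["a1", "a3"],
}


def metapath_walk(start, metapath, length, rng):
    """Random walk that only steps to neighbors matching the meta-path.

    `metapath` is a cyclic type pattern, e.g. ["A", "P"] yields walks of the
    form applicant, phone, applicant, phone, ...
    """
    if NODE_TYPE[start] != metapath[0]:
        raise ValueError("start node type must match metapath[0]")
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = [n for n in EDGES[walk[-1]] if NODE_TYPE[n] == wanted]
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, features):
    """One dense layer: each weight row produces one output score."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def hierarchical_predict(embedding, w_general, b_general, w_specific, b_specific):
    """Two prediction layers (assumed here to be simple logistic layers).

    Layer 1 scores the general fraud categories. Layer 2 scores the specific
    subtasks from the embedding concatenated with the layer-1 probabilities,
    which act as auxiliary information.
    """
    general = [sigmoid(z) for z in linear(w_general, b_general, embedding)]
    augmented = list(embedding) + general
    specific = [sigmoid(z) for z in linear(w_specific, b_specific, augmented)]
    return general, specific
```

In a fuller pipeline, many such walks would be generated per applicant node and passed to a skip-gram model to learn the embeddings consumed by `hierarchical_predict`. Each weight row in `w_specific` must have length `len(embedding) + len(general)`, since the second layer sees both inputs.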