Funding: Supported by the Beijing Municipal Social Science Foundation (SZ202210005004) and the Beijing Natural Science Foundation (9242004).
Abstract: Motivated by the Bagging Partial Least Squares (Bagging PLS) and Principal Component Analysis (PCA) algorithms, this paper introduces a novel approach called Principal Model Analysis (PMA), which combines PCA with Bagging PLS. In this method, multiple PLS models are trained on sub-training sets drawn from the training set by random sampling with replacement. The regression coefficients of all the sub-PLS models are fused into a joint regression coefficient matrix, and the final projection direction is then estimated by performing PCA on this matrix. The proposed PMA method is compared with other traditional dimension reduction methods: PLS, Bagging PLS, linear discriminant analysis (LDA), and PLS-LDA. Experimental results on six public datasets show that the proposed method consistently outperforms the other approaches in classification performance and exhibits greater stability. PMA is also applied to financial statement fraud identification, a problem of concern to the academic community: PMA and five other algorithms are applied to this task, and the results indicate that PMA's classification performance surpasses that of the other methods.
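The procedure described in this abstract (bootstrap sub-training sets, one PLS model per set, a joint coefficient matrix, then PCA) can be sketched as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the function names `pls1_coefficients` and `pma_direction`, the single-response NIPALS variant of PLS, the number of sub-models and components, and the use of an uncentered PCA (an SVD of the joint matrix, so that the leading direction reflects what the sub-models share) are all choices made here for illustration. The abstract does not specify any of them.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Coefficient vector of a single-response PLS model (NIPALS-style PLS1)."""
    Xk = X - X.mean(axis=0)          # centered predictors, deflated in the loop
    yk = y - y.mean()                # centered response, deflated in the loop
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                # weight vector: covariance of X with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:             # nothing left to explain, stop early
            break
        w = w / norm
        t = Xk @ w                   # scores
        tt = t @ t
        p = Xk.T @ t / tt            # X loadings
        qk = yk @ t / tt             # y loading
        Xk = Xk - np.outer(t, p)     # deflate X
        yk = yk - qk * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(qk)
    if not W:                        # degenerate sample (e.g. constant y)
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def pma_direction(X, y, n_models=50, n_components=2, seed=0):
    """Hypothetical PMA sketch: bagged PLS coefficients followed by PCA."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    coefs = []
    for _ in range(n_models):
        # Sub-training set: random sampling with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        coefs.append(pls1_coefficients(X[idx], y[idx], n_components))
    # Joint regression coefficient matrix, shape (n_features, n_models).
    B = np.column_stack(coefs)
    # Assumption: uncentered PCA, i.e. the leading left singular vector of B.
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    direction = U[:, 0]
    # Singular vectors have arbitrary sign; align with the mean coefficient vector.
    if direction @ B.mean(axis=1) < 0:
        direction = -direction
    return direction, B


# Synthetic two-class data, for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=200) > 0).astype(float)

direction, B = pma_direction(X, y)
scores = (X - X.mean(axis=0)) @ direction  # one-dimensional projection
```

The resulting `scores` would then be passed to a classifier of choice. A centered PCA over the sub-model coefficients would instead capture how the sub-models differ from each other, which is why the uncentered form is assumed in this sketch.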
Funding: The research leading to these results has received funding from the Centre for Visual Analytics Science and Technology (CVAST), funded by the Austrian Federal Ministry of Science, Research, and Economy in the exceptional Laura Bassi Centres of Excellence initiative (#822746).
Abstract: The detection of anomalous events in huge amounts of data is sought in many domains. For instance, in the context of financial data, the detection of suspicious events is a prerequisite for identifying and preventing attempts to defraud. Hence, various financial fraud detection approaches have started to exploit Visual Analytics techniques. However, no study is available that gives a systematic outline of the different approaches in this field and clarifies their common strategies and differences. Thus, we present a survey of existing approaches to visual fraud detection in order to classify different tasks and solutions and to identify and propose further research opportunities. In this work, fraud detection solutions are explored across five main domains: banks, the stock market, telecommunication companies, insurance companies, and internal fraud. These domains were chosen because they share similar time-oriented and multivariate data characteristics. In this survey, we (1) analyze the current state of the art in this field; (2) define a categorization scheme covering the application domains, visualization methods, interaction techniques, and analytical methods used in the context of fraud detection; (3) describe and discuss each approach according to the proposed scheme; and (4) identify challenges and future research topics.