Funding: Supported by the Institutional Fund Projects (IFPIP-1481-611-1443), the Key Projects of Natural Science Research in Anhui Higher Education Institutions (2022AH051909), the Provincial Quality Project of Colleges and Universities in Anhui Province (2022sdxx020, 2022xqhz044), and the Bengbu University 2021 High-Level Scientific Research and Cultivation Project (2021pyxm04).
Abstract: The dandelion algorithm (DA) is a recently developed intelligent optimization algorithm for function optimization problems. Many of its parameters must be set by experience, and such settings may not be appropriate for all optimization problems. This work proposes a self-adapting and efficient dandelion algorithm that reduces the number of DA's parameters and simplifies its structure. Only the normal sowing operator is retained; the other operators are discarded. An adaptive seeding radius strategy is designed for the core dandelion. The results show that the proposed algorithm achieves better performance on the standard test functions, with less time consumption, than its competitive peers. In addition, the proposed algorithm is applied to feature selection for credit card fraud detection (CCFD), and the results indicate that it obtains higher classification and detection performance than state-of-the-art methods.
Funding: Supported by the National Key R&D Program of China (Nos. 2022YFB3104103 and 2019QY1406) and the National Natural Science Foundation of China (Nos. 61732022, 61732004, 61672020, and 62072131).
Abstract: Credit Card Fraud Detection (CCFD) is an essential technology for banking institutions to control fraud risks and safeguard their reputation. Class imbalance and insufficient representation of feature data relating to credit card transactions are two prevalent issues in current CCFD research, and both significantly degrade the performance of classification models. To address these issues, this research proposes a novel CCFD model based on Multifeature Fusion and Generative Adversarial Networks (MFGAN). The MFGAN model consists of two modules: a multi-feature fusion module that integrates static and dynamic behavior data of cardholders into a unified high-dimensional feature space, and a balance module based on the generative adversarial network that decreases the class imbalance ratio. The effectiveness of the MFGAN model is validated on two real credit card datasets. The impacts of different class balance ratios on the performance of the four resampling models are analyzed, and the contribution of each of the two modules to the performance of the MFGAN model is investigated via ablation experiments. Experimental results demonstrate that the proposed model outperforms state-of-the-art models in terms of recall, F1, and Area Under the Curve (AUC) metrics, which means that the MFGAN model can help banks find more fraudulent transactions and reduce fraud losses.
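To make the notion of "decreasing the class imbalance ratio" concrete, the following minimal Python sketch computes how many synthetic minority-class samples a generator would need to contribute to reach a chosen ratio. It is not the MFGAN balance module: the function names, the definition of the ratio as majority count divided by minority count, and the transaction counts are all assumptions made for illustration.

```python
import math


def imbalance_ratio(n_majority: int, n_minority: int) -> float:
    """Majority-to-minority ratio (assumed definition); 1.0 means balanced."""
    if n_minority <= 0:
        raise ValueError("need at least one minority sample")
    return n_majority / n_minority


def synthetic_needed(n_majority: int, n_minority: int, target_ratio: float) -> int:
    """Number of synthetic minority samples required to reach target_ratio."""
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    # Minority count that would give exactly the target ratio, rounded up.
    required_minority = math.ceil(n_majority / target_ratio)
    # Never return a negative number if the data is already balanced enough.
    return max(0, required_minority - n_minority)


# Hypothetical counts, not taken from either dataset in the paper.
n_legit, n_fraud = 99_000, 1_000
before = imbalance_ratio(n_legit, n_fraud)
extra = synthetic_needed(n_legit, n_fraud, target_ratio=10.0)
after = imbalance_ratio(n_legit, n_fraud + extra)
print(f"ratio before: {before:.1f}, synthetic samples: {extra}, ratio after: {after:.1f}")
```

With these hypothetical counts the ratio drops from 99 to 10 once 8,900 generated fraud samples are added; where those samples come from (a GAN or another resampling method) is outside the scope of the sketch.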
Abstract:
BACKGROUND: Spinal muscular atrophy (SMA) is a degenerative disease of the nervous system. Four types are recognized clinically; types Ⅰ, Ⅱ and Ⅲ are common, and research on these three types is relatively mature. Type Ⅳ is adult spinal muscular atrophy (ASMA), which has a low incidence and is often misdiagnosed as amyotrophic lateral sclerosis, muscular dystrophy, cervical syndrome, or other conditions.
OBJECTIVE: To observe the clinical features of 46 ASMA patients and analyze the relationship between disease course and activities of daily living.
DESIGN: Case analysis.
SETTING: Departments of Neurology of the 81 Hospital of Chinese PLA, the Second Affiliated Hospital of Nanjing Medical College, and the General Hospital of Nanjing Military Area Command of Chinese PLA.
PARTICIPANTS: A total of 46 ASMA patients were selected from the three neurology departments listed above between April 1998 and January 2002. All patients gave consent. Of the 46 cases, 37 were male and 9 were female, with a mean age of 42 years. Disease course ranged from 6 months to 23 years: 37 cases had a course of 5 years or less, and 9 cases had a course of 6 years or more.
METHODS: ① All 46 ASMA patients underwent, as early as possible, tests of blood sedimentation, anti-O, serum creatinine, creatine, and blood creatine phosphokinase (CPK), together with muscle biopsy. ② Plain X-ray films of the cervical vertebrae and the craniocervical junction were taken in the 25 cases with onset at the proximal upper limb, and plain films of the abdominal vertebrae in the 17 cases with onset at the proximal lower limb. ③ Cerebrospinal fluid obtained by lumbar puncture was examined in 42 cases (routine, biochemical, and immunoglobulin examinations). Electromyography (EMG) was also performed in 42 cases. ④ The Barthel index was used to evaluate activities of daily living (ADL) in patients with different disease courses. The index ranged from 1 to 100; the higher a patient's index, the greater the patient's independence. ⑤ The Barthel indexes of patients with courses ≤ 5 years and those with courses ≥ 6 years were compared by univariate analysis of variance.
MAIN OUTCOME MEASURES: ① Site of initial onset in all patients; ② relevant blood and blood biochemistry values; ③ results of muscle biopsy; ④ results of EMG and the relevant plain X-ray films of 42 cases; ⑤ results of cerebrospinal fluid examination of 42 cases; ⑥ comparison of the Barthel index between patients with different disease courses.
RESULTS: All 46 ASMA patients were included in the final analysis. ① Site of initial onset: the proximal upper limb in 25 patients, the proximal lower limb in 17, and all four limbs in 4. ② Serum CPK was slightly elevated in one quarter of the patients (3.034-9.735 μkat/L; normal range: 0.400-3.001 μkat/L); the other blood and blood biochemical values were normal. ③ Muscle biopsy in all patients mostly showed small-group muscular atrophy; same-type muscle grouping and compensatory hypertrophy of muscle fibres were also observed with ATP enzyme staining. ④ EMG of 42 cases suggested mild to moderate neurogenic injury in 37 patients and mild myogenic injury in 3. All plain X-ray films in this study were normal. ⑤ Routine, biochemical, and immunoglobulin examinations of the cerebrospinal fluid in 42 cases were all normal. ⑥ The difference in Barthel index between patients with courses ≤ 5 years and those with courses ≥ 6 years was not significant [(64.73±20.38) vs (68.89±21.76) points, P > 0.05].
CONCLUSION: ① In ASMA patients, muscle weakness occurs mainly at the proximal ends of the four limbs. Small-group muscular atrophy is the main pathological change, and the disease progresses slowly. ② Most patients show mild to moderate neurogenic injury on EMG examination. ③ The duration of the disease has no obvious effect on a patient's ADL ability.
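The reported Barthel index comparison can be checked roughly from the summary statistics alone. The Python sketch below computes a one-way ANOVA F statistic from group sizes, means, and spreads. It assumes that the "±" values are standard deviations and that the group sizes are the 37 and 9 patients given for the two course categories; the abstract states neither explicitly, and the function name is illustrative.

```python
def one_way_anova_from_summary(groups):
    """One-way ANOVA from summary statistics.

    groups: list of (n, mean, sd) tuples, one per group.
    Returns (F statistic, between-group df, within-group df).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group sum of squares, recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Assumed mapping: course <= 5 years (n=37) and course >= 6 years (n=9).
f_stat, df_b, df_w = one_way_anova_from_summary(
    [(37, 64.73, 20.38), (9, 68.89, 21.76)]
)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

Under these assumptions F is about 0.29 on (1, 44) degrees of freedom, far below the 5% critical value of roughly 4.06, which agrees with the reported P > 0.05.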
Funding: This work was supported by ZATCA. The author is grateful for the help provided by the risk and intelligence department, as well as for the continued support of the governor in advancing the field of AI and machine learning in government entities.
Abstract: Tax fraud is one of the substantial issues affecting governments around the world. It is defined as the intentional alteration of information provided on a tax return to reduce someone's tax liability, done by either reducing sales or increasing purchases. According to recent studies, governments lose over $500 billion annually due to tax fraud. A loss of this magnitude motivates tax authorities worldwide to implement efficient fraud detection strategies. Most machine learning work on tax fraud is centered on supervised models. A significant drawback of this approach is that it requires tax returns that have been previously audited, which constitute only a small percentage of the data. Other strategies use unsupervised models, which utilize the whole data when they search for patterns but ignore whether the tax returns are fraudulent or not. Unsupervised models are therefore of limited use if applied independently to detect tax fraud. This paper addresses these limitations by proposing a fraud detection framework that combines supervised and unsupervised models to exploit the entire set of tax returns. The framework consists of four modules: a supervised module, which utilizes a tree-based model to extract knowledge from the data; an unsupervised module, which calculates anomaly scores; a behavioral module, which assigns a compliance score to each taxpayer; and a prediction module, which uses the output of the previous modules to output a probability of fraud for each tax return. We demonstrate the effectiveness of our framework by testing it on existing tax returns provided by the Saudi tax authority.
Abstract: Spam is a universal problem with which everyone is familiar. A number of approaches are used for spam filtering. The most common technique is content-based filtering, which uses the actual text of a message to determine whether it is spam or not. The content is very dynamic, and it is very challenging to represent all of its information in a mathematical classification model. For instance, in content-based spam filtering, the characteristics used by the filter to identify spam messages change constantly over time. The Naïve Bayes method represents the changing nature of messages using probability theory, while the support vector machine (SVM) represents it using different features. Both classification methods are efficient in different domains, but the case of Nepali SMS or text classification has not yet been considered, so it is of interest to assess the performance of both methods on Nepali text classification. In this paper, Naïve Bayes and SVM-based classification techniques are implemented to classify Nepali SMS as spam or non-spam. An empirical analysis of various text cases has been carried out to evaluate the accuracy of the classification methodologies used in this study. The accuracy is found to be 87.15% for SVM and 92.74% for Naïve Bayes.
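As an illustration of how a Naïve Bayes text classifier turns word probabilities into a spam decision, here is a minimal multinomial Naïve Bayes sketch in plain Python with add-one (Laplace) smoothing. It is not the implementation evaluated in the paper: the function names, the whitespace tokenization, and the four toy English messages are assumptions standing in for the Nepali SMS data, which is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # skip words never seen during training
            # Add-one smoothing keeps unseen class/word pairs above zero.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set (hypothetical messages, not from the study's corpus).
toy_messages = [
    ("win a free prize now", "spam"),
    ("free recharge offer claim now", "spam"),
    ("are you coming home for dinner", "ham"),
    ("meeting moved to tomorrow morning", "ham"),
]
model = train_nb(toy_messages)
print(predict_nb(model, "claim your free prize"))
print(predict_nb(model, "coming home tomorrow"))
```

Because the word statistics are re-estimated from whatever training set is supplied, retraining on newer messages is how such a model follows the changing characteristics of spam mentioned in the abstract.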