Funding: Supported by the National Social Science Fund of China (23BGL272).
Abstract: Fraudulent website images are a vital information carrier for telecom fraud, and recognizing them efficiently and precisely is critical to combating fraudulent websites. Current research on image recognition for fraudulent websites focuses mainly on image feature extraction and similarity analysis, and suffers from difficulty in obtaining image data, insufficient image analysis, and a single identification type. To address these disadvantages, this study develops a model based on the entropy method for image leader decision and Inception-v3 transfer learning. The data processing part of the model uses a breadth-first search crawler to capture image data. The information in the images is then evaluated with the entropy method, image weights are assigned, and the image leader is selected. For model training and prediction, transfer learning of the Inception-v3 model is introduced into image recognition of fraudulent websites. With the model trained on the selected image leaders, multiple types of fraudulent websites are identified with high accuracy. Experiments show that this model recognizes images on fraudulent websites more accurately than other current models.
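The entropy-based weighting step can be illustrated with the standard entropy weight method. The sketch below is not the authors' implementation: the function names `entropy_weights` and `pick_leader`, the choice of indicator columns, and the assumption that indicators are non-negative and already scaled to a comparable range are all hypothetical.

```python
import math


def entropy_weights(rows):
    """Entropy weight method over a table of indicator values.

    rows: list of samples (e.g. images), each a list of non-negative
    indicator values. Assumes at least two samples, a positive sum in
    every column, and at least one column that is not uniform.
    """
    n = len(rows)
    m = len(rows[0])
    k = 1.0 / math.log(n)  # normalizes entropy into [0, 1]
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        # Share of each sample in indicator j; 0 * ln(0) is treated as 0.
        entropy = -k * sum((x / total) * math.log(x / total) for x in column if x > 0)
        # Low entropy means the indicator discriminates well between samples.
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    return [d / norm for d in divergences]


def pick_leader(rows):
    """Return (index of the highest weighted score, indicator weights)."""
    weights = entropy_weights(rows)
    scores = [sum(w * x for w, x in zip(weights, row)) for row in rows]
    leader = max(range(len(rows)), key=scores.__getitem__)
    return leader, weights
```

An indicator that takes the same value for every image carries no information, so its weight falls to roughly zero and the leader is decided by the indicators that actually vary.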
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LGF20G030001), the Ministry of Public Security Science and Technology Plan Project (2022LL16), and the Key Scientific Research Projects of Agricultural and Social Development in Hangzhou in 2020 (202004A06).
Abstract: Feature analysis of fraudulent websites is of great significance to combating, preventing, and controlling telecom fraud crimes. To address the shortcomings of existing analytical approaches, namely their single dimension and vulnerability to anti-reconnaissance, this paper adopts Stacking, an ensemble learning algorithm, combines multiple modalities such as text, image, and URL, and proposes a multimodal fraudulent website identification method that ensembles heterogeneous models. Cross-validation is first used to train multiple base classifiers that differ substantially from one another and are strong learners, such as a BERT model, a residual neural network (ResNet), and a logistic regression model. The text, image, and URL features are then classified by their respective base classifiers. The outputs of the base classifiers are taken as the input of the meta-classifier, whose output is used as the final identification. The study indicates that the fusion method identifies fraudulent websites more effectively than single-modal methods, increasing recall by at least 1%. In addition, deployment of the algorithm in a real Internet environment shows an improvement in identification accuracy of at least 1.9% compared with other fusion methods.
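The stacking scheme described above (cross-validated base classifiers whose outputs feed a meta-classifier) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's pipeline: each modality is reduced to a single hypothetical numeric feature, a nearest-centroid scorer stands in for BERT, ResNet, and the URL model, and all function names are invented for the example.

```python
import math


def fit_centroid(xs, ys):
    """Toy 1-D base learner (stand-in for a real model): per-class feature means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)


def predict_centroid(model, x):
    """Score in (0, 1); above 0.5 when x is closer to the positive-class mean."""
    mu_pos, mu_neg = model
    return 1.0 / (1.0 + math.exp(abs(x - mu_pos) - abs(x - mu_neg)))


def fit_logistic(rows, ys, lr=0.5, epochs=500):
    """Meta-classifier: logistic regression fitted by batch gradient descent."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for row, y in zip(rows, ys):
            z = sum(wi * xi for wi, xi in zip(w, row)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(row):
                grad_w[j] += (p - y) * xj
            grad_b += p - y
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def fit_stacking(samples, ys, k=3):
    """samples[i] holds one feature per modality (hypothetically text, image, URL).

    Assumes every training fold contains both classes.
    """
    n, m = len(samples), len(samples[0])
    # Out-of-fold scores: each sample is scored by base models that never saw it.
    oof = [[0.0] * m for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        for j in range(m):
            model = fit_centroid([samples[i][j] for i in train], [ys[i] for i in train])
            for i in held:
                oof[i][j] = predict_centroid(model, samples[i][j])
    # Base models are refitted on all data for use at prediction time.
    bases = [fit_centroid([s[j] for s in samples], ys) for j in range(m)]
    meta = fit_logistic(oof, ys)
    return bases, meta


def predict_stacking(model, sample):
    """Return 1 (fraudulent) or 0 (benign) from the meta-classifier's decision."""
    bases, (w, b) = model
    scores = [predict_centroid(base, x) for base, x in zip(bases, sample)]
    z = sum(wi * si for wi, si in zip(w, scores)) + b
    return 1 if z > 0 else 0
```

Training the meta-classifier on out-of-fold scores rather than in-sample scores is the point of the cross-validation step: it keeps the meta-classifier from learning to trust base models that have simply memorized their training data.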
Funding: Funded by the National Natural Science Foundation of China under Grant No. 62002103, Henan Province Science Foundation for Youths No. 222300420058, Henan Province Science and Technology Research Project No. 232102321064, and Teacher Education Curriculum Reform Research Priority Project No. 2023-JSJYZD-011.
Abstract: Telecom fraud is currently expanding from the traditional telephone network to the Internet, and identifying fraudulent IPs is of great significance for reducing Internet telecom fraud and protecting consumer rights. Existing telecom fraud identification methods based on blacklists, reputation, content, and behavioral characteristics perform well in the telephone network, but they are difficult to apply to the Internet, where IP (Internet Protocol) addresses change dynamically. To address this issue, we propose a fraudulent IP identification method based on homology detection and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering (DC-FIPD). First, we analyze the geographic aggregation of fraudulent IPs and the homology of IP addresses. Next, the collected fraudulent IPs are clustered geographically to obtain their regional distribution. Then, we construct the fraudulent IP feature set, use a genetic optimization algorithm to determine the weights of the fraudulent IP features, and design a method for calculating the IP risk value, from which the risk value threshold for fraudulent IPs is derived. Finally, the risk value of the target IP is calculated and the IP is identified based on the risk value threshold. Experimental results on a real-world telecom fraud detection dataset show that the DC-FIPD method achieves an average identification accuracy of 86.64% for fraudulent IPs. Additionally, the method records a precision of 86.08%, a recall of 45.24%, and an F1-score of 59.31%, offering a comprehensive evaluation of its performance in fraud detection. These results highlight the DC-FIPD method’s effectiveness in addressing the challenges of fraudulent IP identification.
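The reported metrics can be checked against each other, since the F1-score is the harmonic mean of precision and recall. The short sketch below only reproduces that arithmetic from the figures quoted in the abstract; the function name `f1_score` is chosen for the example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for DC-FIPD in the abstract.
f1 = f1_score(0.8608, 0.4524)
print(f"{f1:.2%}")  # prints 59.31%
```

The result agrees with the reported F1-score of 59.31%. Because the harmonic mean is pulled toward the smaller value, the score is limited mainly by the 45.24% recall, which by definition means that a little over half of the truly fraudulent IPs in the test data were not flagged.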
Funding: Supported by the National Key R&D Program of China (No. 2016QY04W0805), NSFC (U1536106, 61728209), the National Top-notch Youth Talents Program of China, the Youth Innovation Promotion Association CAS, and the Beijing Nova Program.
Abstract: Telecommunication fraud has been causing severe financial losses to telecommunication customers in China for several years. Traditional approaches to detecting telecommunication fraud usually rely on constructing a blacklist of fraudulent telephone numbers. However, attackers can evade such detection simply by changing their numbers, which is very easy to do through VoIP (Voice over IP). To solve this problem, we detect telecommunication fraud from the contents of a call rather than only from the caller’s telephone number. Specifically, we collect descriptions of telecommunication fraud from news reports and social media. We use machine learning algorithms to analyze the collected data and select the high-quality descriptions from it to construct datasets. We then leverage natural language processing to extract features from the textual data. After that, we build rules to identify similar contents within the same call for further telecommunication fraud detection. To achieve online detection of telecommunication fraud, we develop an Android application that can be installed on a customer’s smartphone. When an incoming fraud call is answered, the application dynamically analyzes the contents of the call in order to identify fraud. Our results show that we can protect customers effectively.