Funding: Supported by the National Social Science Fund of China (23BGL272).
Abstract: Images on fraudulent websites are a vital information carrier for telecom fraud, and recognizing them efficiently and precisely is critical to combating such websites. Current research on image recognition for fraudulent websites focuses mainly on image feature extraction and similarity analysis, and suffers from three disadvantages: image data are difficult to obtain, image analysis is insufficient, and identification is limited to a single type of website. To address these disadvantages, this study develops a model that combines entropy-method image leader selection with Inception-v3 transfer learning. In the data processing stage, a breadth-first search crawler captures the image data; the information in each image is then evaluated with the entropy method, image weights are assigned, and the image leader is selected. In the training and prediction stage, transfer learning with the Inception-v3 model is applied to image recognition of fraudulent websites. Trained on the selected image leaders, the model identifies multiple types of fraudulent websites with high accuracy. Experiments show that the model recognizes images on fraudulent websites more accurately than other current models.
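The entropy-based leader selection described in the abstract can be illustrated with a minimal sketch. The abstract does not say which indicators are measured per image or how the final score is formed, so the three indicator columns, the min-max scaling, the weighted-sum score, and the function names below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def min_max_columns(rows):
    """Rescale every indicator column to [0, 1]; constant columns become 1.0."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_cols.append([(v - lo) / span if span > 0 else 1.0 for v in col])
    return [list(r) for r in zip(*scaled_cols)]


def entropy_weights(scaled):
    """Entropy weight method on the output of min_max_columns.

    Lower-entropy (less uniform) columns receive larger weights.
    Requires at least two rows.
    """
    n, m = len(scaled), len(scaled[0])
    k = 1.0 / math.log(n)  # normalising constant so entropy lies in [0, 1]
    divergence = []
    for j in range(m):
        total = sum(row[j] for row in scaled)
        probs = [row[j] / total for row in scaled]
        entropy = -k * sum(p * math.log(p) for p in probs if p > 0)
        divergence.append(max(0.0, 1.0 - entropy))
    norm = sum(divergence)
    if norm == 0:  # every column is constant: fall back to equal weights
        return [1.0 / m] * m
    return [d / norm for d in divergence]


def select_leader(images):
    """Return (leader name, indicator weights, per-image weighted scores)."""
    names = list(images)
    scaled = min_max_columns([images[name] for name in names])
    weights = entropy_weights(scaled)
    scores = {
        name: sum(w * v for w, v in zip(weights, row))
        for name, row in zip(names, scaled)
    }
    return max(scores, key=scores.get), weights, scores


# Hypothetical indicators per image: [text coverage, colour richness, size ratio]
images = {
    "banner.png": [0.9, 0.8, 0.5],
    "logo.png": [0.2, 0.1, 0.5],
    "qr.png": [0.5, 0.9, 0.5],
}
leader, weights, scores = select_leader(images)
print(leader, [round(w, 3) for w in weights])
```

In this toy data the third indicator is identical for every image, so it carries no information and its weight is effectively zero; the leader is decided by the two indicators that actually vary.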
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LGF20G030001), the Ministry of Public Security Science and Technology Plan Project (2022LL16), and the Key Scientific Research Projects of Agricultural and Social Development in Hangzhou in 2020 (202004A06).
Abstract: Feature analysis of fraudulent websites is of great significance to combating, preventing, and controlling telecom fraud crimes. To address the shortcomings of existing analytical approaches, namely their single dimension and vulnerability to anti-reconnaissance, this paper adopts Stacking, an ensemble learning algorithm, to combine multiple modalities (text, image, and URL) and proposes a multimodal fraudulent website identification method that ensembles heterogeneous models. Cross-validation is first used to train multiple strong, largely different base classifiers, such as a BERT model, a residual neural network (ResNet), and a logistic regression model, which classify the text, image, and URL features respectively. The outputs of the base classifiers are then taken as the input of the meta-classifier, whose output is used as the final identification. The study indicates that the fusion method identifies fraudulent websites more effectively than single-modal methods, increasing recall by at least 1%. In addition, deploying the algorithm in a real Internet environment shows an improvement in identification accuracy of at least 1.9% compared with other fusion methods.
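The stacking procedure in the abstract (cross-validated base classifiers whose outputs feed a meta-classifier) can be sketched in plain Python. This is only an illustration under loud assumptions: each modality is reduced to one hypothetical score per website, `MidpointModel` is a toy stand-in for the BERT, ResNet, and logistic regression base classifiers, and the meta-classifier is assumed to be a small logistic regression, which the abstract does not specify.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MidpointModel:
    """Toy base learner for one modality (stand-in for BERT / ResNet / LR):
    thresholds a single score at the midpoint between the two class means."""

    def fit(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.mid = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
        return self

    def predict_proba(self, xs):
        return [sigmoid(4.0 * (x - self.mid)) for x in xs]


def out_of_fold(xs, ys, k):
    """k-fold cross-validation: each sample is scored by a model that never saw it."""
    oof = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held = [i for i in range(len(xs)) if i % k == fold]
        model = MidpointModel().fit([xs[i] for i in train], [ys[i] for i in train])
        for i, p in zip(held, model.predict_proba([xs[i] for i in held])):
            oof[i] = p
    return oof


def fit_logistic(rows, ys, lr=0.5, epochs=2000):
    """Meta-classifier: logistic regression trained by batch gradient descent."""
    w, b, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, y in zip(rows, ys):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, row)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fit_stacking(modalities, ys, k=3):
    """Train the meta-classifier on out-of-fold base outputs, then refit bases on all data."""
    names = sorted(modalities)
    meta_rows = [list(r) for r in zip(*(out_of_fold(modalities[m], ys, k) for m in names))]
    w, b = fit_logistic(meta_rows, ys)
    bases = {m: MidpointModel().fit(modalities[m], ys) for m in names}
    return names, bases, w, b


def predict_stacking(model, sample):
    """Probability that one website (a dict of modality scores) is fraudulent."""
    names, bases, w, b = model
    feats = [bases[m].predict_proba([sample[m]])[0] for m in names]
    return sigmoid(b + sum(wi * fi for wi, fi in zip(w, feats)))


# Hypothetical per-modality scores for 12 websites; label 1 = fraudulent.
ys = [1, 0] * 6
modalities = {
    "text": [0.9, 0.2, 0.8, 0.3, 0.7, 0.1, 0.6, 0.4, 0.9, 0.2, 0.8, 0.3],
    "image": [0.7, 0.4, 0.6, 0.5, 0.8, 0.3, 0.4, 0.6, 0.9, 0.2, 0.7, 0.3],
    "url": [0.8, 0.1, 0.9, 0.2, 0.3, 0.4, 0.7, 0.2, 0.8, 0.3, 0.9, 0.1],
}
model = fit_stacking(modalities, ys)
print(predict_stacking(model, {"text": 0.9, "image": 0.8, "url": 0.9}))
```

Training the meta-classifier on out-of-fold predictions rather than on in-sample base outputs is the usual reason cross-validation appears in stacking: it keeps the meta-classifier from learning how well each base model memorized its own training data.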