Funding: Supported by the National Social Science Fund of China (23BGL272).
Abstract: Images on fraudulent websites are a vital information carrier for telecom fraud, and recognizing them efficiently and precisely is critical to combating such websites. Current research on image recognition of fraudulent websites focuses mainly on image feature extraction and similarity analysis, and it suffers from difficulty in obtaining image data, insufficient image analysis, and a single identification type. To address these disadvantages, this study develops a model based on the entropy method for image leader decision and Inception-v3 transfer learning. In the data processing part of the model, a breadth-first search crawler captures the image data. The information in the images is then evaluated with the entropy method, image weights are assigned, and the image leader is selected. For model training and prediction, transfer learning of the Inception-v3 model is introduced into image recognition of fraudulent websites. Trained on the selected image leaders, the model identifies multiple types of fraudulent websites with high accuracy. Experiments show that the model recognizes images on fraudulent websites more accurately than other current models.
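The entropy weighting step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three indicator columns, the toy scores, and the rule of picking the image with the highest weighted score as the "image leader" are all hypothetical, since the abstract does not specify them.

```python
import math

def entropy_weights(matrix):
    """Return one weight per column (indicator) using the entropy weight method.

    matrix: rows are images, columns are non-negative indicator scores.
    Assumes every column has a positive sum and at least two rows.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    k = 1.0 / math.log(n_rows)  # scales entropy into the range [0, 1]
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Share of each image in indicator j; 0 * log(0) is treated as 0.
        shares = [value / total for value in column]
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        # Low entropy means the indicator discriminates well between images.
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]

def select_image_leader(matrix):
    """Hypothetical selection rule: the image with the highest weighted score."""
    weights = entropy_weights(matrix)
    scores = [sum(w * v for w, v in zip(weights, row)) for row in matrix]
    leader = max(range(len(scores)), key=scores.__getitem__)
    return leader, weights, scores

# Toy data (assumed): one row per image, one column per indicator.
images = [
    [0.9, 0.2, 0.5],
    [0.4, 0.3, 0.5],
    [0.1, 0.1, 0.5],
]
leader, weights, scores = select_image_leader(images)
print("leader index:", leader)
print("weights:", [round(w, 3) for w in weights])
```

The third indicator is identical for every image, so it carries no information and its weight is effectively zero; the indicator that varies most receives the largest weight.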
Funding: Supported by the Key JCJQ Program of China (2020-JCJQ-ZD-021-00 and 2020-JCJQ-ZD-024-12).
Abstract: Website fingerprinting (WF) is a traffic analysis attack that enables local eavesdroppers to infer a user's browsing destination, even when the user is on the Tor anonymity network. Although advanced attacks based on deep neural networks (DNNs) can perform feature engineering and attain accuracy above 98%, research has demonstrated that DNNs are vulnerable to adversarial samples. As a result, many researchers have explored adversarial samples as a defense against DNN-based WF attacks and have achieved considerable success. However, these methods suffer from high bandwidth overhead or require access to the target model, which is unrealistic. This paper proposes CMAES-WFD, a black-box WF defense based on adversarial samples. Generating adversarial examples is cast as a constrained optimization problem and solved with the Covariance Matrix Adaptation Evolution Strategy (CMAES) optimization algorithm. Perturbations are injected only into local parts of the original traffic to control bandwidth overhead. In the experiments, CMAES-WFD reduced the accuracy of Deep Fingerprinting (DF) and VarCnn to below 8.3% while keeping the bandwidth overhead to at most 14.6% and 20.5%, respectively. In particular, for Automated Website Fingerprinting (AWF), which has a simple structure, CMAES-WFD reduced the classification accuracy to only 6.7% with a bandwidth overhead of less than 7.4%. Moreover, CMAES-WFD was shown to be robust against adversarial training to a certain extent.
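To make the bandwidth overhead figures concrete, the sketch below shows one plausible way to measure it when dummy cells are injected into a local window of a trace. It does not implement CMAES-WFD or the CMAES search itself. The trace encoding (+1 outgoing, -1 incoming), the function names, and the injection position are assumptions made for illustration.

```python
def inject_dummies(trace, start, dummies):
    """Insert dummy cells at position `start`, leaving the rest of the trace intact."""
    return trace[:start] + dummies + trace[start:]

def bandwidth_overhead(original, defended):
    """Extra cells sent, as a fraction of the original trace length."""
    return (len(defended) - len(original)) / len(original)

# Toy trace of 20 cells (assumed encoding: +1 = outgoing, -1 = incoming).
original = [1, -1, -1, -1, 1, -1, -1, 1, -1, -1] * 2
defended = inject_dummies(original, start=3, dummies=[1, 1, -1])
print(f"overhead = {bandwidth_overhead(original, defended):.1%}")
```

Restricting the perturbation to a short window bounds the number of added cells, which is how a local injection strategy keeps the overhead small.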
Abstract: E-business is developing fast in China, and developing an e-business strategy is crucial, because the right strategy points an e-business in the correct direction and leads to better development. In this paper, combined with the SWOT parameters module, we examine the strategies used by Vip.com from different angles. This research gives a good understanding of the strategies used by the VIP website and helps in making better decisions in the e-business world.
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LGF20G030001), the Ministry of Public Security Science and Technology Plan Project (2022LL16), and the Key Scientific Research Projects of Agricultural and Social Development in Hangzhou in 2020 (202004A06).
Abstract: The feature analysis of fraudulent websites is of great significance to combating, preventing, and controlling telecom fraud crimes. To address the shortcomings of existing analytical approaches, namely reliance on a single dimension and vulnerability to anti-reconnaissance, this paper adopts the Stacking ensemble learning algorithm, combines multiple modalities (text, image, and URL), and proposes a multimodal fraudulent website identification method that ensembles heterogeneous models. Cross-validation is first used to train several strong and largely different base classifiers, such as the BERT model, a residual neural network (ResNet), and a logistic regression model, which classify the text, image, and URL features respectively. The results of the base classifiers are taken as the input of the meta-classifier, whose output is used as the final identification. The study indicates that the fusion method identifies fraudulent websites more effectively than single-modal methods, increasing recall by at least 1%. In addition, deploying the algorithm in a real Internet environment shows that identification accuracy improves by at least 1.9% compared with other fusion methods.
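The data flow of Stacking described above (cross-validated base classifiers whose out-of-fold predictions feed a meta-classifier) can be sketched in a few lines. This is a toy illustration only: the threshold rule stands in for BERT, ResNet, and logistic regression, the accuracy-weighted vote stands in for the paper's meta-classifier, and the per-modality scores are made-up numbers.

```python
def fit_threshold(xs, ys):
    """Toy base learner on one score: cut at the midpoint of the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0

def out_of_fold(xs, ys, k):
    """Predict every sample with a model trained on the other k-1 folds."""
    preds = [None] * len(xs)
    for fold in range(k):
        held_out = [i for i in range(len(xs)) if i % k == fold]
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = model(xs[i])
    return preds

def fit_meta(meta_rows, ys):
    """Toy meta-classifier: vote weighted by each base learner's out-of-fold accuracy."""
    n_models = len(meta_rows[0])
    weights = [
        sum(1 for row, y in zip(meta_rows, ys) if row[m] == y) / len(ys)
        for m in range(n_models)
    ]
    half = sum(weights) / 2
    return lambda row: 1 if sum(w * p for w, p in zip(weights, row)) > half else 0

# Hypothetical per-site scores for each modality; label 1 = fraudulent.
text_scores = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3, 0.85, 0.15]
image_scores = [0.8, 0.4, 0.3, 0.2, 0.9, 0.6, 0.7, 0.1]
url_scores = [0.7, 0.9, 0.1, 0.3, 0.6, 0.2, 0.4, 0.2]
labels = [1, 1, 0, 0, 1, 0, 1, 0]

# Each modality contributes one column of out-of-fold predictions.
columns = [out_of_fold(s, labels, k=4) for s in (text_scores, image_scores, url_scores)]
meta_rows = list(zip(*columns))
meta = fit_meta(meta_rows, labels)
final = [meta(row) for row in meta_rows]
print("fused predictions:", final)
```

Using out-of-fold predictions matters because the meta-classifier then learns from predictions the base models made on data they had not seen, which avoids rewarding a base model for memorizing its training set.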
Abstract: Phishing websites present a severe cybersecurity risk, since they can lead to financial losses, data breaches, and violations of user privacy. This study applies machine learning approaches to the problem of phishing website detection, aiming to provide efficient techniques for locating and thwarting these dangerous websites. A thorough literature analysis was performed to investigate the models and methods commonly used in phishing website identification. Logistic Regression, K-Nearest Neighbors, Decision Trees, Random Forests, Support Vector Classifiers, Linear Support Vector Classifiers, and Naive Bayes were all used in the inquiry. The research covers the benefits and drawbacks of these machine learning approaches and shows how well suited each is to the difficulties of locating and countering phishing websites. The insights gained from the literature review guide the selection and implementation of appropriate models and methods in future research and real-world applications related to phishing detection. The study evaluates and compares the accuracy, precision, and recall of several machine learning models in detecting phishing website URLs.
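The three metrics compared in this study follow directly from the confusion counts. The sketch below computes them for a binary phishing/legitimate labeling; the label vectors are toy values, not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary task (positive = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Toy labels (assumed): 1 = phishing URL, 0 = legitimate URL.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 1, 1]
print(binary_metrics(y_true, y_pred))
```

Precision penalizes legitimate sites flagged as phishing, while recall penalizes phishing sites that slip through, so the two can rank models differently even when accuracy is similar.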
Abstract: Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information, a problem that persists despite user awareness. This study addresses phishing attacks on websites and assesses the performance of three prominent Machine Learning (ML) models, namely Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), using authentic datasets sourced from the Kaggle and Mendeley repositories. Extensive experimentation and analysis reveal that the CNN model achieves the best accuracy, 98%, whereas LSTM shows the lowest, 96%. These findings underscore the potential of ML techniques for enhancing phishing detection systems and strengthening cybersecurity measures against evolving phishing tactics, offering a promising avenue for safeguarding sensitive information and online security.
Abstract: To improve the accuracy and completeness of mining data records from the web, the concepts of isomorphic page and directory page are introduced and three algorithms are proposed. Isomorphic web pages are a set of web pages that share a uniform structure and differ only in their main information. A web page that contains many links to isomorphic web pages is called a directory page. Algorithm 1 finds directory pages in a website by analyzing the similarity of adjacent links. It first sorts the links and then counts the links in each directory. If the count exceeds a given threshold, it finds the similar sub-page links in the directory and returns the results. A function for judging whether web pages are isomorphic is also proposed. Algorithm 2 mines data records from an isomorphic page using a noise information filter. It relies on the fact that the noise information is the same in two isomorphic pages and only the main information differs. Algorithm 3 mines data records from an entire website using spider (crawler) technology. The experiment shows that the proposed algorithms mine data records more completely than existing algorithms. Mining data records from isomorphic pages is an efficient method.
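The outline of Algorithm 1 (sort the links, count the links in each directory, keep directories whose count exceeds a threshold) can be sketched as follows. This is a simplified reading of the abstract, not the paper's algorithm: treating "links in the same directory" as links that share a host and parent path is an assumption, and the similarity analysis of sub-page links is omitted. The example.com URLs are placeholders.

```python
from collections import defaultdict
from posixpath import dirname
from urllib.parse import urlparse

def find_directory_groups(links, threshold):
    """Group links by (host, parent directory) and keep groups larger than threshold."""
    groups = defaultdict(list)
    for link in sorted(links):  # sorting places links of one directory next to each other
        parsed = urlparse(link)
        groups[(parsed.netloc, dirname(parsed.path))].append(link)
    return {key: group for key, group in groups.items() if len(group) > threshold}

# Placeholder links as they might be extracted from one candidate directory page.
links = [
    "http://example.com/news/a2.html",
    "http://example.com/about.html",
    "http://example.com/news/a1.html",
    "http://example.com/news/a3.html",
    "http://example.com/contact.html",
]
for key, group in find_directory_groups(links, threshold=2).items():
    print(key, group)
```

A page whose links mostly fall into one surviving group is a candidate directory page, and the pages in that group are candidates for the isomorphism check.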
Funding: Supported by the Shandong Provincial Natural Science Foundation (ZR2011DM008).
Abstract: Agricultural product trading websites are both an important means of realizing agricultural informatization and its main manifestation. Based on a preliminary understanding of the content and characteristics of China's agricultural product trading websites, the paper builds a scientific evaluation indicator system and objectively evaluates 50 typical agricultural product trading websites using a classification and grading method. The results show that the overall construction level of China's agricultural product trading websites is only average and that there are obvious differences between regions; the lack of commercial functions and the lag in informatization are the main factors restricting the development of agricultural product trading websites.
Funding: Supported by the Foundation for Key Program of Ministry of Education, China under Grant No. 311007; the National Science Foundation Project of China under Grants No. 61202079, No. 61170225, and No. 61271199; the Fundamental Research Funds for the Central Universities under Grant No. FRF-TP-09-015A; and the Fundamental Research Funds in Beijing Jiaotong University under Grant No. W11JB00630.
Abstract: An increasing number of web applications require identification registration, yet the behavior of website registration has never been thoroughly studied. We use the database provided by the Chinese Software Develop Net (CSDN) to give a complete perspective on this research point. We concentrate on three aspects: complexity, correlation, and preference. From these analyses, we draw the following conclusions. First, a considerable number of users have not realized the importance of identification and use very simple identifications that can be attacked very easily. Second, there is a strong complexity correlation among the three parts of identification. Third, the top three passwords that users choose are 123456789, 12345678, and 11111111, and the top three email providers that they prefer are NETEASE, qq, and sina. Finally, we provide some suggestions for improving the quality of user passwords.
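The complexity and preference analyses described here amount to classifying each password by the character classes it uses and counting the most frequent values. The sketch below shows one way to do that; the character-class scheme and the eight-entry sample are assumptions for illustration and are not drawn from the CSDN data.

```python
from collections import Counter

def char_classes(password):
    """Return the set of character classes that appear in the password."""
    classes = set()
    for ch in password:
        if ch.isdigit():
            classes.add("digit")
        elif ch.islower():
            classes.add("lower")
        elif ch.isupper():
            classes.add("upper")
        else:
            classes.add("symbol")
    return classes

def summarize(passwords, top_n):
    """Most common passwords and the share that consists of digits only."""
    top = Counter(passwords).most_common(top_n)
    digits_only = sum(1 for p in passwords if char_classes(p) == {"digit"})
    return top, digits_only / len(passwords)

# Toy sample (assumed), not real user data.
sample = [
    "123456789", "12345678", "123456789", "abc123",
    "11111111", "Passw0rd!", "123456789", "12345678",
]
top, digit_share = summarize(sample, top_n=2)
print("top passwords:", top)
print(f"digits-only share: {digit_share:.0%}")
```

A password drawn from a single character class, especially digits only, is the kind of "very simple identification" the abstract warns about.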
Funding: Supported by the Key Discipline Project from the Science and Technology Committee of Shanghai (No. 04JC14009) and the Research Fund of Donghua University (No. 108 10 0044934).
Abstract: A high-quality website is crucial to a company's success in e-business. Technical maintainers constantly face the problem of locating the prime factors that affect website quality. In view of the complexity and fuzziness of BtoC websites, a quality diagnosis method is proposed that is based on a multi-attribute, multi-layer fuzzy comprehensive evaluation model covering all the quality factors. A simple example of diagnosing a famous domestic BtoC website shows the specific steps of the method and demonstrates its validity. The process of the quality evaluation and diagnosis system is illustrated, and the computer program for diagnosis is given.
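A multi-layer fuzzy comprehensive evaluation of the kind described can be sketched as repeated application of B = W · R, where W is a weight vector and R a membership matrix over quality grades. The sketch below is an assumption-laden illustration: the weighted-average composition operator, the factor groups, the weights, and the membership values are all hypothetical, and the abstract does not say which operator or factors the authors used.

```python
def fuzzy_evaluate(weights, membership):
    """Compute B = W . R using the weighted-average composition operator."""
    n_grades = len(membership[0])
    return [
        sum(w * row[g] for w, row in zip(weights, membership))
        for g in range(n_grades)
    ]

GRADES = ["good", "fair", "poor"]

# Layer 1 (hypothetical): each factor group has sub-factor weights and one
# membership row per sub-factor, giving its degree of membership in each grade.
groups = {
    "content": ([0.6, 0.4], [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]),
    "service": ([0.5, 0.5], [[0.2, 0.3, 0.5], [0.4, 0.4, 0.2]]),
}
group_weights = [0.5, 0.5]  # layer 2 weights, also assumed

# The layer 1 results become the rows of the layer 2 membership matrix.
layer1 = {name: fuzzy_evaluate(w, r) for name, (w, r) in groups.items()}
overall = fuzzy_evaluate(group_weights, list(layer1.values()))

best_grade = GRADES[max(range(len(GRADES)), key=overall.__getitem__)]
# Diagnosis: the group with the largest membership in the worst grade.
weakest = max(layer1, key=lambda name: layer1[name][-1])
print("overall:", [round(v, 3) for v in overall], "->", best_grade)
print("weakest factor group:", weakest)
```

Keeping the layer 1 vectors around is what turns evaluation into diagnosis: the overall grade says how good the site is, while the per-group vectors point to the factor group that drags it down.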
Funding: This research was funded by the Scientific Research Deanship at the University of Ha'il, Saudi Arabia, through Project Number RG-20023.
Abstract: Phishing attacks are security attacks that affect not only individuals' or organizations' websites but may also affect Internet of Things (IoT) devices and networks. The IoT environment is exposed to such attacks. Attackers may use thingbot software to spread hidden junk emails that users do not notice. Machine learning, deep learning, and other methods have been used to design detection methods for these attacks, but detection accuracy still needs to be improved. This study proposes an optimized ensemble classification method for phishing website (PW) detection. A Genetic Algorithm (GA) was used to optimize the proposed method by tuning the parameters of several ensemble Machine Learning (ML) methods, including Random Forest (RF), AdaBoost (AB), XGBoost (XGB), Bagging (BA), GradientBoost (GB), and LightGBM (LGBM). The optimized classifiers were then ranked to pick the best ones as the base for the proposed method. A PW dataset made up of 4898 PWs and 6157 legitimate websites (LWs) was used for the experiments. As a result, detection accuracy was enhanced and reached 97.16 percent.