Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information, a practice known as phishing. This study evaluates how well several machine learning algorithms detect phishing attacks when paired with three feature extraction methods: Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and Bidirectional Encoder Representations from Transformers (BERT). The classifiers assessed are Logistic Regression, Decision Tree, Random Forest, and Multilayer Perceptron. With TF-IDF features, the best-performing classifier was the Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). With Word2Vec features, the best results also came from the Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). The highest performance overall was achieved with the BERT model, with Precision, Recall, F1-score, and Accuracy all reaching 0.99. This study highlights how advanced pre-trained models, such as BERT, can significantly enhance the accuracy and reliability of fraud detection systems.
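The four metrics reported above (Precision, Recall, F1-score, and Accuracy) can be illustrated with a minimal sketch. It assumes a binary setting in which label 1 means "phishing" and label 0 means "legitimate"; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study's code or data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1-score, and accuracy for binary labels.

    Hypothetical helper for illustration; `positive` marks the phishing class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # phishing caught
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false alarms
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # phishing missed
    tn = len(pairs) - tp - fp - fn                                    # legitimate passed

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy labels, invented for illustration only (not the study's data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On these toy labels there are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, so precision, recall, and F1-score are each 0.75 and accuracy is 0.8. In a multi-class or class-imbalanced setting the averaging scheme (macro, micro, or weighted) would also need to be stated; the abstract does not say which was used.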