This study was conducted to enable prompt classification of increasingly sophisticated malware. To do this, we analyzed the important features of malware and the relative importance of the selected features according to a learning model, to assess how those important features were identified. First, the analysis features were extracted using Cuckoo Sandbox, an open-source malware analysis tool, and the features were divided into five categories using the extracted information. The 804 extracted features were reduced by 70% by selecting only those most suitable for malware classification, using a learning-model-based feature selection method called recursive feature elimination. Next, these important features were analyzed, and the contribution of each one was assessed with a Random Forest classifier. The results showed that system call features made up the largest share of the important features. Finally, it was possible to accurately identify the malware type using only 36 to 76 features for each of the four malware types with the most analysis samples available: Trojan, Adware, Downloader, and Backdoor.
With the development of information technology, malware threats to industrial systems have become an emerging issue, since various industrial infrastructures have been deeply integrated into modern work and life. To identify and classify new malware variants, different types of deep learning models have been widely explored recently. Sufficient data is generally required to train a deep learning classifier with satisfactory generalization ability. However, in current practical applications, an ample supply of data is absent in most specific industrial malware detection scenarios. Transfer learning is an effective approach for alleviating the small-sample-size problem. In addition, it can reuse the knowledge from pretrained models, which helps meet the real-time requirement in industrial malware detection. In this paper, we investigate the transferable features learned by a 1D-convolutional network and evaluate our proposed methods on 6 transfer learning tasks. The experimental results show that the 1D-convolutional architecture is effective at learning transferable features for malware classification, and indicate that transferring the first 2 layers of our proposed 1D-convolutional network is the most efficient way to reuse the learned features.
The impact of Coronavirus Disease 2019 (COVID-19) has made telecommuting and remote learning the norm. The growing number of Internet-connected devices provides cyber attackers with more attack vectors. Malware developed by criminals also incorporates a number of sophisticated obfuscation techniques, making it difficult to classify and detect malware using conventional approaches. Therefore, this paper proposes a novel visualization-based malware classification system using transfer and ensemble learning (VMCTE). VMCTE has a strong anti-interference ability. Even if malware uses obfuscation, fuzzing, encryption, and other techniques to evade detection, it can be accurately classified into its corresponding malware family. Unlike traditional dynamic and static analysis techniques, VMCTE requires neither reverse engineering nor domain expert knowledge. The proposed classification system combines three strong deep convolutional neural networks (ResNet50, MobileNetV1, and MobileNetV2) as feature extractors, reduces the dimensionality of the extracted features using principal component analysis, and employs a support vector machine to establish the classification model. The semantic representations of malware images can be extracted using various convolutional neural network (CNN) architectures, obtaining higher-quality features than traditional methods. Integrating fine-tuned and non-fine-tuned classification models based on transfer learning can greatly enhance the capacity to classify various families of malware. The experimental findings on the Malimg dataset demonstrate that VMCTE can attain 99.64%, 99.64%, 99.66%, and 99.64% accuracy, F1-score, precision, and recall, respectively.
The explosive growth of malware variants poses a major threat to information security. Traditional signature-based anti-virus systems fail to classify unknown malware into their corresponding families and to detect new kinds of malware programs. Therefore, we propose a machine-learning-based malware analysis system composed of three modules: data processing, decision making, and new malware detection. The data processing module deals with gray-scale images, opcode n-grams, and import functions, which are employed to extract the features of the malware. The decision-making module uses the features to classify the malware and to identify suspicious malware. Finally, the detection module uses the shared nearest neighbor (SNN) clustering algorithm to discover new malware families. Our approach is evaluated on more than 20 000 malware instances, which were collected by Kingsoft, ESET NOD32, and Anubis. The results show that our system can effectively classify the unknown malware with a best accuracy of 98.9%, and successfully detects 86.7% of the new malware.
Although Android has become a leading operating system in the market, Android users suffer from security threats due to malware. To protect users from these threats, solutions that detect and identify malware variants are essential. However, modern malware evades existing solutions by applying code obfuscation and native code. To resolve this problem, we introduce an ensemble-based malware classification algorithm using malware family grouping. The proposed family grouping algorithm finds the optimal combination of families belonging to the same group while the total number of families is fixed to the optimal total number. It also adopts a unified feature extraction technique for handling both bytecode and native code seamlessly. We propose a unique feature selection algorithm that improves classification performance and time simultaneously. 2-gram-based features are generated from the instructions and segments, and then filtered with multiple filters to choose the most effective features. Through extensive simulation with many obfuscated and native-code malware applications, we confirm that it can classify malware with high accuracy and short processing time. Most existing approaches fail to achieve classification speed and detection time simultaneously. Therefore, the approach can help Android users keep themselves safe from various and evolving cyber-attacks very effectively.
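The 2-gram feature step mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which filters are combined, so a simple "appears in at least `min_samples` samples" filter stands in for them, and the opcode sequences and function names are hypothetical.

```python
from collections import Counter


def two_grams(instructions):
    """Count adjacent instruction pairs (2-grams) in one sample."""
    return Counter(zip(instructions, instructions[1:]))


def select_by_document_frequency(samples, min_samples=2):
    """Keep 2-grams that occur in at least `min_samples` samples.

    Assumption: this frequency filter is a stand-in; the abstract does
    not specify which filters the authors actually combine.
    """
    document_frequency = Counter()
    for instructions in samples:
        # set(...) so each 2-gram counts once per sample
        document_frequency.update(set(two_grams(instructions)))
    return sorted(g for g, n in document_frequency.items() if n >= min_samples)


def to_vector(instructions, vocabulary):
    """Turn one sample into a count vector over the selected 2-grams."""
    counts = two_grams(instructions)
    return [counts[g] for g in vocabulary]


# Hypothetical opcode sequences for three samples.
samples = [
    ["const", "invoke", "move", "invoke", "move"],
    ["const", "invoke", "move", "return"],
    ["load", "add", "store"],
]
vocabulary = select_by_document_frequency(samples)
vectors = [to_vector(s, vocabulary) for s in samples]
```

Here only the pairs shared by the first two samples survive the filter, so the third sample maps to an all-zero vector. A real pipeline would feed such vectors to a classifier.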
Antivirus vendors and the research community employ Machine Learning (ML) or Deep Learning (DL)-based static analysis techniques for efficient identification of new threats, given the continual emergence of novel malware variants. On the other hand, numerous researchers have reported that Adversarial Examples (AEs), generated by manipulating previously detected malware, can successfully evade ML/DL-based classifiers. Commercial antivirus systems, in particular, have been identified as vulnerable to such AEs. This paper first focuses on conducting black-box attacks to circumvent ML/DL-based malware classifiers. Our attack method utilizes seven different perturbations, including Overlay Append, Section Append, and Break Checksum, capitalizing on the ambiguities present in the PE format, as previously employed in evasion attack research. By directly applying the perturbation techniques to PE binaries, our attack method eliminates the need to grapple with the problem-feature space dilemma, a persistent challenge in many evasion attack studies. Being a black-box attack, our method can generate AEs that successfully evade both DL-based and ML-based classifiers. Also, AEs generated by the attack method retain their executability and malicious behavior, eliminating the need for functionality verification. Through thorough evaluations, we confirmed that the attack method achieves an evasion rate of 65.6% against well-known ML-based malware detectors and can reach a remarkable 99% evasion rate against well-known DL-based malware detectors. Furthermore, our AEs demonstrated the capability to bypass detection by 17% of the 64 vendors on VirusTotal (VT). In addition, we propose a defensive approach that utilizes Trend Locality Sensitive Hashing (TLSH) to construct a similarity-based defense model. Through several experiments on the approach, we verified that our defense model can effectively counter AEs generated by the perturbation techniques. In conclusion, our defense model alleviates the limitation of the most promising defense method, adversarial training, which is only effective against the AEs that are included when training the classifiers.
New malware emerges every day. To evade detection, many malware obfuscation techniques have emerged. Dynamic malware detection methods based on data flow graphs have attracted much attention since they can deal with the obfuscation problem to a certain extent. Many malware classification methods based on data flow graphs have been proposed. Some of them are based on user-defined features or graph similarity of data flow graphs. Graph neural networks have also recently been used to implement malware classification. This paper provides an overview of current data flow graph-based malware classification methods and summarizes their respective advantages and disadvantages. In addition, the future trend of data flow graph-based malware classification is analyzed, which is of great significance for promoting the development of malware detection technology.
There has been an increase in attacks on mobile devices, such as smartphones and tablets, due to their growing popularity. Mobile malware is one of the most dangerous threats, causing both security breaches and financial losses. Mobile malware is likely to continue to evolve and proliferate to carry out a variety of cybercrimes on mobile devices. Mobile malware specifically targets the Android operating system as it has grown in popularity. The rapid proliferation of Android malware apps poses a significant security risk to users, making static and manual analysis of malicious files difficult. Therefore, efficient identification and classification of Android malicious files is crucial. Several Convolutional Neural Network (CNN)-based methods have been proposed in this regard; however, there is still room for performance improvement. In this work, we propose a transfer learning and stacking approach to efficiently detect Android malware files by utilizing two well-known machine learning models, ResNet-50 and Support Vector Machine (SVM). The proposed model is trained on the DREBIN dataset by transforming malicious APK files into grayscale images. Our model yields higher performance measures than state-of-the-art works on the DREBIN dataset, with reported accuracy, recall, precision, and F1 measures of 97.8%, 95.8%, 95.7%, and 95.7%, respectively.
A ransomware attack that interrupted the operation of Colonial Pipeline (a large U.S. oil pipeline company) showed that security threats by malware have become serious enough to affect industries and social infrastructure rather than individuals alone. To respond to such attacks, the agents and characteristics of the attacks should be identified, and appropriate strategies should be established accordingly. For this purpose, the first task that must be performed is malware classification. Malware creators are well aware of this and apply various concealment and avoidance techniques, making it difficult to classify malware. This study focuses on new features and classification techniques to overcome these difficulties. We propose a behavioral performance visualization method using utilization patterns of system resources, such as the central processing unit, memory, and input/output, that are commonly used in performance analysis or tuning of programs. We extracted the usage patterns of the system resources for ransomware to perform behavioral performance visualization. The results of the classification performance evaluation using the visualization results indicate an accuracy of at least 98.94% with a 3.69% loss rate. Furthermore, we designed and implemented a framework that performs the entire process, from data extraction to behavioral performance visualization and classification performance measurement, and that is expected to contribute to related studies in the future.
In computer security, the number of malware threats is increasing and causing damage to the systems of individuals and organizations, necessitating a new detection technique capable of detecting new malware variants more efficiently than traditional anti-malware methods. Traditional anti-malware software cannot detect new malware variants, and conventional techniques such as static analysis, dynamic analysis, and hybrid analysis are time-consuming and rely on domain experts. Visualization-based malware detection has recently gained popularity due to its accuracy, independence from domain experts, and faster detection time. Visualization-based malware detection uses the image representation of the malware binary and applies image processing techniques to the image. This paper aims to provide readers with a comprehensive understanding of malware detection and focuses on visualization-based malware detection.
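The "image representation of the malware binary" mentioned in the abstract above is commonly built by treating each byte as one gray-level pixel. The following Python sketch illustrates that idea under stated assumptions: the fixed row width of 16, the zero padding, the PGM output format, and the function names are all illustrative choices, not details taken from the surveyed work.

```python
def bytes_to_grayscale(data, width=16):
    """Map each byte (0-255) to one pixel and wrap into rows of `width`.

    The last row is zero-padded so every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = list(data)
    padding = (-len(pixels)) % width
    pixels.extend([0] * padding)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def to_pgm(rows):
    """Serialize non-empty pixel rows as a binary PGM (P5) image."""
    height, width = len(rows), len(rows[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(p for row in rows for p in row)


# Hypothetical input: in practice `data` would be the raw bytes of a file.
data = bytes(range(40))
image = bytes_to_grayscale(data, width=16)
pgm = to_pgm(image)
```

The resulting pixel grid (or the PGM bytes written to disk) can then be passed to ordinary image-processing or CNN tooling, which is what makes the approach independent of reverse engineering.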
As the smartphone market leader, Android has been a prominent target for malware attacks. The number of malicious applications (apps) identified for it has increased continually over the past decade, creating an immense challenge for all parties involved. For market holders and researchers in particular, the large number of samples has made manual malware detection infeasible, leading to an influx of research that investigates Machine Learning (ML) approaches to automate this process. However, while some of the proposed approaches achieve high performance, rapidly evolving Android malware has made them unable to maintain their accuracy over time. This has created a need in the community to conduct further research and build more flexible ML pipelines. Doing so, however, is currently hindered by the lack of a systematic overview of the existing literature from which to learn and improve upon existing solutions. Existing survey papers often focus on only parts of the ML process (e.g., data collection or model deployment), while omitting other important stages, such as model evaluation and explanation. In this paper, we address this problem with a review of 42 highly cited papers spanning a decade of research (from 2011 to 2021). We introduce a novel procedural taxonomy of the published literature, covering how the papers have used ML algorithms, what features they have engineered, which dimensionality reduction techniques they have employed, what datasets they have used for training, and what their evaluation and explanation strategies are. Drawing from this taxonomy, we also identify gaps in knowledge and provide ideas for improvement and future work.
Although using machine learning techniques to solve computer security challenges is not a new idea, the rapidly emerging Deep Learning technology has recently triggered substantial interest in the computer security community. This paper seeks to provide a dedicated review of very recent research on using Deep Learning techniques to solve computer security challenges. In particular, the review covers eight computer security problems being solved by applications of Deep Learning: security-oriented program analysis, defending against return-oriented programming (ROP) attacks, achieving control-flow integrity (CFI), defending against network attacks, malware classification, system-event-based anomaly detection, memory forensics, and fuzzing for software security.
Funding (Cuckoo Sandbox feature-importance study): Supported by the Research Program through the National Research Foundation of Korea, NRF-2018R1D1A1B07050864.
Funding (industrial malware transfer-learning study): The National Natural Science Foundation of China under Grants U1836106 and 81961138010; the Beijing Natural Science Foundation under Grants 19L2029 and M21032; the Scientific and Technological Innovation Foundation of Foshan under Grants BK20BF010 and BK21BF001; the Scientific and Technological Innovation Foundation of Shunde Graduate School, USTB, under Grant BK19BF006; and the Fundamental Research Funds for the University of Science and Technology Beijing under Grant FRF-BD-19-012A.
Funding (VMCTE study): This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 62102190 and 62272236, and in part by the Natural Science Foundation of Jiangsu Province under Grant Nos. BK20201136 and BK20191401.
Funding (machine-learning-based malware analysis system study): Project supported by the National Natural Science Foundation of China (No. 61303264) and the National Basic Research Program (973) of China (Nos. 2012CB315906 and 0800065111001).
Funding (Android ensemble classification study): This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2019R1F1A1062320).
Funding (adversarial example attack and TLSH defense study): Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government, Ministry of Science and ICT (MSIT) (No. 2017-0-00168, Automatic Deep Malware Analysis Technology for Cyber Threat Intelligence).
Funding (ransomware behavioral performance visualization study): This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) (Project No. 2019-0-00426, 10%); the ICT R&D Program of MSIT/IITP (Project No. 2021-0-01816, A Research on Core Technology of Autonomous Twins for Metaverse, 10%); and a National Research Foundation of Korea (NRF) grant funded by the Korean government (Project No. NRF-2020R1A2C4002737, 80%).
Abstract: A ransomware attack that interrupted the operation of Colonial Pipeline (a large U.S. oil pipeline company) showed that security threats from malware have become serious enough to affect industries and social infrastructure rather than individuals alone. To respond to such attacks, the agents and characteristics of attacks should be identified and appropriate strategies established accordingly. For this purpose, the first task that must be performed is malware classification. Malware creators are well aware of this and apply various concealment and avoidance techniques, making it difficult to classify malware. This study focuses on new features and classification techniques to overcome these difficulties. We propose a behavioral performance visualization method using utilization patterns of system resources, such as the central processing unit, memory, and input/output, that are commonly used in performance analysis or tuning of programs. We extracted the usage patterns of the system resources for ransomware to perform behavioral performance visualization. The results of the classification performance evaluation using the visualization results indicate an accuracy of at least 98.94% with a 3.69% loss rate. Furthermore, we designed and implemented a framework that performs the entire process, from data extraction to behavioral performance visualization and classification performance measurement, and that is expected to contribute to related studies in the future.
Abstract: In computer security, the number of malware threats is increasing and causing damage to the systems of individuals and organizations, necessitating a new detection technique capable of detecting new malware variants more efficiently than traditional anti-malware methods. Traditional anti-malware software cannot detect new malware variants, and conventional techniques such as static analysis, dynamic analysis, and hybrid analysis are time-consuming and rely on domain experts. Visualization-based malware detection has recently gained popularity due to its accuracy, independence from domain experts, and faster detection time. Visualization-based malware detection uses the image representation of the malware binary and applies image processing techniques to the image. This paper aims to provide readers with a comprehensive understanding of malware detection and focuses on visualization-based malware detection.
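The image representation mentioned in this abstract can be illustrated with a minimal sketch, assuming the common convention of mapping each byte of a binary to one grayscale pixel and wrapping rows at a fixed width. The function name `bytes_to_grayscale`, the default width, and the zero-padding of the last row are illustrative assumptions, not details taken from the paper.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Map each byte (0-255) to one pixel intensity and wrap rows at `width`.

    Hypothetical helper: the row width and zero-padding are assumptions.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad the tail so the last row is complete.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Stand-in for the raw bytes of a binary; a real sample would be read from disk.
sample = bytes(range(40))
image = bytes_to_grayscale(sample, width=16)
```

The resulting 2D list of intensities could then be passed to any image-processing or image-classification step; which step is used depends on the method under review.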
Abstract: As the smartphone market leader, Android has been a prominent target for malware attacks. The number of malicious applications (apps) identified for it has increased continually over the past decade, creating an immense challenge for all parties involved. For market holders and researchers in particular, the large number of samples has made manual malware detection unfeasible, leading to an influx of research that investigates Machine Learning (ML) approaches to automate this process. However, while some of the proposed approaches achieve high performance, rapidly evolving Android malware has made them unable to maintain their accuracy over time. This has created a need in the community to conduct further research and build more flexible ML pipelines. Doing so, however, is currently hindered by the lack of a systematic overview of the existing literature from which to learn and improve upon existing solutions. Existing survey papers often focus only on parts of the ML process (e.g., data collection or model deployment), while omitting other important stages, such as model evaluation and explanation. In this paper, we address this problem with a review of 42 highly-cited papers, spanning a decade of research (from 2011 to 2021). We introduce a novel procedural taxonomy of the published literature, covering how the papers have used ML algorithms, what features they have engineered, which dimensionality reduction techniques they have employed, what datasets they have used for training, and what their evaluation and explanation strategies are. Drawing from this taxonomy, we also identify gaps in knowledge and provide ideas for improvement and future work.
Funding: This work was supported by ARO W911NF-13-1-0421 (MURI), NSF CNS-1814679, and ARO W911NF-15-1-0576.
Abstract: Although using machine learning techniques to solve computer security challenges is not a new idea, the rapidly emerging Deep Learning technology has recently triggered substantial interest in the computer security community. This paper seeks to provide a dedicated review of very recent research on using Deep Learning techniques to solve computer security challenges. In particular, the review covers eight computer security problems being solved by applications of Deep Learning: security-oriented program analysis, defending against return-oriented programming (ROP) attacks, achieving control-flow integrity (CFI), defending against network attacks, malware classification, system-event-based anomaly detection, memory forensics, and fuzzing for software security.