The detection of software vulnerabilities written in C and C++languages takes a lot of attention and interest today.This paper proposes a new framework called DrCSE to improve software vulnerability detection.It uses ...The detection of software vulnerabilities written in C and C++languages takes a lot of attention and interest today.This paper proposes a new framework called DrCSE to improve software vulnerability detection.It uses an intelligent computation technique based on the combination of two methods:Rebalancing data and representation learning to analyze and evaluate the code property graph(CPG)of the source code for detecting abnormal behavior of software vulnerabilities.To do that,DrCSE performs a combination of 3 main processing techniques:(i)building the source code feature profiles,(ii)rebalancing data,and(iii)contrastive learning.In which,the method(i)extracts the source code’s features based on the vertices and edges of the CPG.The method of rebalancing data has the function of supporting the training process by balancing the experimental dataset.Finally,contrastive learning techniques learn the important features of the source code by finding and pulling similar ones together while pushing the outliers away.The experiment part of this paper demonstrates the superiority of the DrCSE Framework for detecting source code security vulnerabilities using the Verum dataset.As a result,the method proposed in the article has brought a pretty good performance in all metrics,especially the Precision and Recall scores of 39.35%and 69.07%,respectively,proving the efficiency of the DrCSE Framework.It performs better than other approaches,with a 5%boost in Precision and a 5%boost in Recall.Overall,this is considered the best research result for the software vulnerability detection problem using the Verum dataset according to our survey to date.展开更多
Prior studies have demonstrated that deep learning-based approaches can enhance the performance of source code vulnerability detection by training neural networks to learn vulnerability patterns in code representation...Prior studies have demonstrated that deep learning-based approaches can enhance the performance of source code vulnerability detection by training neural networks to learn vulnerability patterns in code representations.However,due to limitations in code representation and neural network design,the validity and practicality of the model still need to be improved.Additionally,due to differences in programming languages,most methods lack cross-language detection generality.To address these issues,in this paper,we analyze the shortcomings of previous code representations and neural networks.We propose a novel hierarchical code representation that combines Concrete Syntax Trees(CST)with Program Dependence Graphs(PDG).Furthermore,we introduce a Tree-Graph-Gated-Attention(TGGA)network based on gated recurrent units and attention mechanisms to build a Hierarchical Code Representation learning-based Vulnerability Detection(HCRVD)system.This system enables cross-language vulnerability detection at the function-level.The experiments show that HCRVD surpasses many competitors in vulnerability detection capabilities.It benefits from the hierarchical code representation learning method,and outperforms baseline in cross-language vulnerability detection by 9.772%and 11.819%in the C/C++and Java datasets,respectively.Moreover,HCRVD has certain ability to detect vulnerabilities in unknown programming languages and is useful in real open-source projects.HCRVD shows good validity,generality and practicality.展开更多
The widespread adoption of blockchain technology has led to the exploration of its numerous applications in various fields.Cryptographic algorithms and smart contracts are critical components of blockchain security.De...The widespread adoption of blockchain technology has led to the exploration of its numerous applications in various fields.Cryptographic algorithms and smart contracts are critical components of blockchain security.Despite the benefits of virtual currency,vulnerabilities in smart contracts have resulted in substantial losses to users.While researchers have identified these vulnerabilities and developed tools for detecting them,the accuracy of these tools is still far from satisfactory,with high false positive and false negative rates.In this paper,we propose a new method for detecting vulnerabilities in smart contracts using the BERT pre-training model,which can quickly and effectively process and detect smart contracts.More specifically,we preprocess and make symbol substitution in the contract,which can make the pre-training model better obtain contract features.We evaluate our method on four datasets and compare its performance with other deep learning models and vulnerability detection tools,demonstrating its superior accuracy.展开更多
In recent years,the number of smart contracts deployed on blockchain has exploded.However,the issue of vulnerability has caused incalculable losses.Due to the irreversible and immutability of smart contracts,vulnerabi...In recent years,the number of smart contracts deployed on blockchain has exploded.However,the issue of vulnerability has caused incalculable losses.Due to the irreversible and immutability of smart contracts,vulnerability detection has become particularly important.With the popular use of neural network model,there has been a growing utilization of deep learning-based methods and tools for the identification of vulnerabilities within smart contracts.This paper commences by providing a succinct overview of prevalent categories of vulnerabilities found in smart contracts.Subsequently,it categorizes and presents an overview of contemporary deep learning-based tools developed for smart contract detection.These tools are categorized based on their open-source status,the data format and the type of feature extraction they employ.Then we conduct a comprehensive comparative analysis of these tools,selecting representative tools for experimental validation and comparing them with traditional tools in terms of detection coverage and accuracy.Finally,Based on the insights gained from the experimental results and the current state of research in the field of smart contract vulnerability detection tools,we suppose to provide a reference standard for developers of contract vulnerability detection tools.Meanwhile,forward-looking research directions are also proposed for deep learning-based smart contract vulnerability detection.展开更多
Smart contracts have led to more efficient development in finance and healthcare,but vulnerabilities in contracts pose high risks to their future applications.The current vulnerability detection methods for contracts ...Smart contracts have led to more efficient development in finance and healthcare,but vulnerabilities in contracts pose high risks to their future applications.The current vulnerability detection methods for contracts are either based on fixed expert rules,which are inefficient,or rely on simplistic deep learning techniques that do not fully leverage contract semantic information.Therefore,there is ample room for improvement in terms of detection precision.To solve these problems,this paper proposes a vulnerability detector based on deep learning techniques,graph representation,and Transformer,called GRATDet.The method first performs swapping,insertion,and symbolization operations for contract functions,increasing the amount of small sample data.Each line of code is then treated as a basic semantic element,and information such as control and data relationships is extracted to construct a new representation in the form of a Line Graph(LG),which shows more structural features that differ from the serialized presentation of the contract.Finally,the node information and edge information of the graph are jointly learned using an improved Transformer-GP model to extract information globally and locally,and the fused features are used for vulnerability detection.The effectiveness of the method in reentrancy vulnerability detection is verified in experiments,where the F1 score reaches 95.16%,exceeding stateof-the-art methods.展开更多
With the development of the 5th generation of mobile communi-cation(5G)networks and artificial intelligence(AI)technologies,the use of the Internet of Things(IoT)has expanded throughout industry.Although IoT networks ...With the development of the 5th generation of mobile communi-cation(5G)networks and artificial intelligence(AI)technologies,the use of the Internet of Things(IoT)has expanded throughout industry.Although IoT networks have improved industrial productivity and convenience,they are highly dependent on nonstandard protocol stacks and open-source-based,poorly validated software,resulting in several security vulnerabilities.How-ever,conventional AI-based software vulnerability discovery technologies cannot be applied to IoT because they require excessive memory and com-puting power.This study developed a technique for optimizing training data size to detect software vulnerabilities rapidly while maintaining learning accuracy.Experimental results using a software vulnerability classification dataset showed that different optimal data sizes did not affect the learning performance of the learning models.Moreover,the minimal data size required to train a model without performance degradation could be determined in advance.For example,the random forest model saved 85.18%of memory and improved latency by 97.82%while maintaining a learning accuracy similar to that achieved when using 100%of data,despite using only 1%.展开更多
By the analysis of vulnerabilities of Android native system services,we find that some vulnerabilities are caused by inconsistent data transmission and inconsistent data processing logic between client and server.The ...By the analysis of vulnerabilities of Android native system services,we find that some vulnerabilities are caused by inconsistent data transmission and inconsistent data processing logic between client and server.The existing research cannot find the above two types of vulnerabilities and the test cases of them face the problem of low coverage.In this paper,we propose an extraction method of test cases based on the native system services of the client and design a case construction method that supports multi-parameter mutation based on genetic algorithm and priority strategy.Based on the above method,we implement a detection tool-BArcherFuzzer to detect vulnerabilities of Android native system services.The experiment results show that BArcherFuzzer found four vulnerabilities of hundreds of exception messages,all of them were confirmed by Google and one was assigned a Common Vulnerabilities and Exposures(CVE)number(CVE-2020-0363).展开更多
Software an important way to vulnerability mining is detect whether there are some loopholes existing in the software, and also is an important way to ensure the secu- rity of information systems. With the rapid devel...Software an important way to vulnerability mining is detect whether there are some loopholes existing in the software, and also is an important way to ensure the secu- rity of information systems. With the rapid development of information technology and software industry, most of the software has not been rigorously tested before being put in use, so that the hidden vulnerabilities in software will be exploited by the attackers. Therefore, it is of great significance for us to actively de- tect the software vulnerabilities in the security maintenance of information systems. In this paper, we firstly studied some of the common- ly used vulnerability detection methods and detection tools, and analyzed the advantages and disadvantages of each method in different scenarios. Secondly, we designed a set of eval- uation criteria for different mining methods in the loopholes evaluation. Thirdly, we also proposed and designed an integration testing framework, on which we can test the typical static analysis methods and dynamic mining methods as well as make the comparison, so that we can obtain an intuitive comparative analysis for the experimental results. Final- ly, we reported the experimental analysis to verify the feasibility and effectiveness of the proposed evaluation method and the testingframework, with the results showing that the final test results will serve as a form of guid- ance to aid the selection of the most appropri- ate and effective method or tools in vulnera- bility detection activity.展开更多
The most resource-intensive and laborious part of debugging is finding the exact location of the fault from the more significant number of code snippets.Plenty of machine intelligence models has offered the effective ...The most resource-intensive and laborious part of debugging is finding the exact location of the fault from the more significant number of code snippets.Plenty of machine intelligence models has offered the effective localization of defects.Some models can precisely locate the faulty with more than 95%accuracy,resulting in demand for trustworthy models in fault localization.Confidence and trustworthiness within machine intelligencebased software models can only be achieved via explainable artificial intelligence in Fault Localization(XFL).The current study presents a model for generating counterfactual interpretations for the fault localization model’s decisions.Neural system approximations and disseminated presentation of input information may be achieved by building a nonlinear neural network model.That demonstrates a high level of proficiency in transfer learning,even with minimal training data.The proposed XFL would make the decisionmaking transparent simultaneously without impacting the model’s performance.The proposed XFL ranks the software program statements based on the possible vulnerability score approximated from the training data.The model’s performance is further evaluated using various metrics like the number of assessed statements,confidence level of fault localization,and TopN evaluation strategies.展开更多
SQL injection poses a major threat to the application level security of the database and there is no systematic solution to these attacks.Different from traditional run time security strategies such as IDS and fire-wa...SQL injection poses a major threat to the application level security of the database and there is no systematic solution to these attacks.Different from traditional run time security strategies such as IDS and fire-wall,this paper focuses on the solution at the outset;it presents a method to find vulnerabilities by analyzing the source codes.The concept of validated tree is developed to track variables referenced by database operations in scripts.By checking whether these variables are influenced by outside inputs,the database operations are proved to be secure or not.This method has advantages of high accuracy and efficiency as well as low costs,and it is universal to any type of web application platforms.It is implemented by the software code vulnerabilities of SQL injection detector(CVSID).The validity and efficiency are demonstrated with an example.展开更多
Software vulnerabilities are the root cause of various information security incidents while dynamic taint analysis is an emerging program analysis technique. In this paper, to maximize the use of the technique to dete...Software vulnerabilities are the root cause of various information security incidents while dynamic taint analysis is an emerging program analysis technique. In this paper, to maximize the use of the technique to detect software vulnerabilities, we present SwordDTA, a tool that can perform dynamic taint analysis for binaries. This tool is flexible and extensible that it can work with commodity software and hardware. It can be used to detect software vulnerabilities with vulnerability modeling and taint check. We evaluate it with a number of commonly used real-world applications. The experimental results show that SwordDTA is capable of detecting at least four kinds of softavare vulnerabilities including buffer overflow, integer overflow, division by zero and use-after-free, and is applicable for a wide range of software.展开更多
It is difficult to formalize the causes of vulnerability, and there is no effective model to reveal the causes and characteristics of vulnerability. In this paper, a vulnerability model construction method is proposed...It is difficult to formalize the causes of vulnerability, and there is no effective model to reveal the causes and characteristics of vulnerability. In this paper, a vulnerability model construction method is proposed to realize the description of vulnerability attribute and the construction of a vulnerability model. A vulnerability model based on chemical abstract machine(CHAM) is constructed to realize the CHAM description of vulnerability model, and the framework of vulnerability model is also discussed. Case study is carried out to verify the feasibility and effectiveness of the proposed model. In addition, a prototype system is also designed and implemented based on the proposed vulnerability model. Experimental results show that the proposed model is more effective than other methods in the detection of software vulnerabilities.展开更多
Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties.Being the most prominent platform that supports smart contracts,E...Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties.Being the most prominent platform that supports smart contracts,Ethereum has been targeted by many attacks and plagued by security incidents.Consequently,many smart contract vulnerabilities have been discovered in the past decade.To detect and prevent such vulnerabilities,different security analysis tools,including static and dynamic analysis tools,have been created,but their performance decreases drastically when codes to be analyzed are constantly being rewritten.In this paper,we propose Eth2Vec,a machine-learning-based static analysis tool that detects smart contract vulnerabilities.Eth2Vec maintains its robustness against code rewrites;i.e.,it can detect vulnerabilities even in rewritten codes.Other machine-learning-based static analysis tools require features,which analysts create manually,as inputs.In contrast,Eth2Vec uses a neural network for language processing to automatically learn the features of vulnerable contracts.In doing so,Eth2Vec can detect vulnerabilities in smart contracts by comparing the similarities between the codes of a target contract and those of the learned contracts.We performed experiments with existing open databases,such as Etherscan,and Eth2Vec was able to outperform a recent model based on support vector machine in terms of well-known metrics,i.e.,precision,recall,and F1-score.展开更多
Smart contracts hold billions of dollars in digital currency,and their security vulnerabilities have drawn a lot of attention in recent years.Traditional methods for detecting smart contract vulnerabilities rely prima...Smart contracts hold billions of dollars in digital currency,and their security vulnerabilities have drawn a lot of attention in recent years.Traditional methods for detecting smart contract vulnerabilities rely primarily on symbol execution,which makes them time-consuming with high false positive rates.Recently,deep learning approaches have alleviated these issues but still face several major limitations,such as lack of interpretability and susceptibility to evasion techniques.In this paper,we propose a feature selection method for uplifting modeling.The fundamental concept of this method is a feature selection algorithm,utilizing interpretation outcomes to select critical features,thereby reducing the scales of features.The learning process could be accelerated significantly because of the reduction of the feature size.The experiment shows that our proposed model performs well in six types of vulnerability detection.The accuracy of each type is higher than 93%and the average detection time of each smart contract is less than 1 ms.Notably,through our proposed feature selection algorithm,the training time of each type of vulnerability is reduced by nearly 80%compared with that of its original.展开更多
Mass monitor logs are produced during the process of component security testing. In order to mine the explicit and implicit security exception information of the tested component, the log should be searched for keywor...Mass monitor logs are produced during the process of component security testing. In order to mine the explicit and implicit security exception information of the tested component, the log should be searched for keyword strings. However, existing string-searching algorithms are not very efficient or appropriate for the operation of searching monitor logs during component security testing. For mining abnormal information effectively in monitor logs, an improved string-searching algorithm is proposed. The main idea of this algorithm is to search for the first occurrence of a character in the main string. The character should be different and farther from the last character in the pattern string. With this algorithm, the backward moving distance of the pattern string will be increased and the matching time will be optimized. In the end, we conduct an experimental study based on our approach, the results of which show that the proposed algorithm finds strings in monitor logs 11.5% more efficiently than existing approaches.展开更多
SOHO(small office/home office)routers provide services for end devices to connect to the Internet,playing an important role in cyberspace.Unfortunately,security vulnerabilities pervasively exist in these routers,espec...SOHO(small office/home office)routers provide services for end devices to connect to the Internet,playing an important role in cyberspace.Unfortunately,security vulnerabilities pervasively exist in these routers,especially in the web server modules,greatly endangering end users.To discover these vulnerabilities,fuzzing web server modules of SOHO routers is the most popular solution.However,its effectiveness is limited due to the lack of input specification,lack of routers’internal running states,and lack of testing environment recovery mechanisms.Moreover,existing works for device fuzzing are more likely to detect memory corruption vulnerabilities.In this paper,we propose a solution ESRFuzzer to address these issues.It is a fully automated fuzzing framework for testing physical SOHO devices.It continuously and effectively generates test cases by leveraging two input semantic models,i.e.,KEY-VALUE data model and CONF-READ communication model,and automatically recovers the testing environment with power management.It also coordinates diversified mutation rules with multiple monitoring mechanisms to trigger multi-type vulnerabilities.With the guidance of the two semantic models,ESRFuzzer can work in two ways:general mode fuzzing and D-CONF mode fuzzing.General mode fuzzing can discover both issues which occur in the CONF and READ operation,while D-CONF mode fuzzing focus on the READ-op issues especially missed by general mode fuzzing.We ran ESRFuzzer on 10 popular routers across five vendors.In total,it discovered 136 unique issues,120 of which have been confirmed as 0-day vulnerabilities we found.As an improvement of SRFuzzer,ESRFuzzer have discovered 35 previous undiscovered READ-op issues that belong to three vulnerability types,and 23 of them have been confirmed as 0-day vulnerabilities by vendors.The experimental results show that ESRFuzzer outperforms state-of-the-art solutions in terms of types and number of vulnerabilities found.展开更多
Allocation,dereferencing,and freeing of memory data in kernels are coherently linked.There widely exist real cases where the correctness of memory is compromised.This incorrectness in kernel memory brings about signif...Allocation,dereferencing,and freeing of memory data in kernels are coherently linked.There widely exist real cases where the correctness of memory is compromised.This incorrectness in kernel memory brings about significant security issues,e.g.,information leaking.Though memory allocation,dereferencing,and freeing are closely related,previous work failed to realize they are closely related.In this paper,we study the life-cycle of kernel memory,which consists of allocation,dereferencing,and freeing.Errors in them are called memory life-cycle(MLC)bugs.We propose an in-depth study of MLC bugs and implement a memory life-cycle bug sanitizer(MEBS)for MLC bug detection.Utilizing an interprocedural global call graph and novel identification approaches,MEBS can reveal memory allocation,dereferencing,and freeing sites in kernels.By constructing a modified define-use chain and examining the errors in the life-cycle,MLC bugs can be identified.Moreover,the experimental results on the latest kernels demonstrate that MEBS can effectively detect MLC bugs,and MEBS can be scaled to different kernels.More than 100 new bugs are exposed in Linux and FreeBSD,and 12 common vulnerabilities and exposures(CVE)are assigned.展开更多
Fuzzing is known to be one of the most effective techniques to uncover security vulnerabilities of large-scale software systems.During fuzzing,it is crucial to distribute the fuzzing resource appropriately so as to ac...Fuzzing is known to be one of the most effective techniques to uncover security vulnerabilities of large-scale software systems.During fuzzing,it is crucial to distribute the fuzzing resource appropriately so as to achieve the best fuzzing performance under a limited budget.Existing distribution strategies of American Fuzzy Lop(AFL)based greybox fuzzing focus on increasing coverage blindly without considering the metrics of code regions,thus lacking the insight regarding which region is more likely to be vulnerable and deserves more fuzzing resources.We tackle the above drawback by proposing a vulnerable region-aware greybox fuzzing approach.Specifically,we distribute more fuzzing resources towards regions that are more likely to be vulnerable based on four kinds of code metrics.We implemented the approach as an extension to AFL named RegionFuzz.Large-scale experimental evaluations validate the effectiveness and efficiency of RegionFuzz-11 new bugs including three new CVEs are successfully uncovered by RegionFuzz.展开更多
文摘The detection of software vulnerabilities written in C and C++languages takes a lot of attention and interest today.This paper proposes a new framework called DrCSE to improve software vulnerability detection.It uses an intelligent computation technique based on the combination of two methods:Rebalancing data and representation learning to analyze and evaluate the code property graph(CPG)of the source code for detecting abnormal behavior of software vulnerabilities.To do that,DrCSE performs a combination of 3 main processing techniques:(i)building the source code feature profiles,(ii)rebalancing data,and(iii)contrastive learning.In which,the method(i)extracts the source code’s features based on the vertices and edges of the CPG.The method of rebalancing data has the function of supporting the training process by balancing the experimental dataset.Finally,contrastive learning techniques learn the important features of the source code by finding and pulling similar ones together while pushing the outliers away.The experiment part of this paper demonstrates the superiority of the DrCSE Framework for detecting source code security vulnerabilities using the Verum dataset.As a result,the method proposed in the article has brought a pretty good performance in all metrics,especially the Precision and Recall scores of 39.35%and 69.07%,respectively,proving the efficiency of the DrCSE Framework.It performs better than other approaches,with a 5%boost in Precision and a 5%boost in Recall.Overall,this is considered the best research result for the software vulnerability detection problem using the Verum dataset according to our survey to date.
基金funded by the Major Science and Technology Projects in Henan Province,China,Grant No.221100210600.
文摘Prior studies have demonstrated that deep learning-based approaches can enhance the performance of source code vulnerability detection by training neural networks to learn vulnerability patterns in code representations.However,due to limitations in code representation and neural network design,the validity and practicality of the model still need to be improved.Additionally,due to differences in programming languages,most methods lack cross-language detection generality.To address these issues,in this paper,we analyze the shortcomings of previous code representations and neural networks.We propose a novel hierarchical code representation that combines Concrete Syntax Trees(CST)with Program Dependence Graphs(PDG).Furthermore,we introduce a Tree-Graph-Gated-Attention(TGGA)network based on gated recurrent units and attention mechanisms to build a Hierarchical Code Representation learning-based Vulnerability Detection(HCRVD)system.This system enables cross-language vulnerability detection at the function-level.The experiments show that HCRVD surpasses many competitors in vulnerability detection capabilities.It benefits from the hierarchical code representation learning method,and outperforms baseline in cross-language vulnerability detection by 9.772%and 11.819%in the C/C++and Java datasets,respectively.Moreover,HCRVD has certain ability to detect vulnerabilities in unknown programming languages and is useful in real open-source projects.HCRVD shows good validity,generality and practicality.
基金supported by the National Key Research and Development Plan in China(Grant No.2020YFB1005500)。
文摘The widespread adoption of blockchain technology has led to the exploration of its numerous applications in various fields.Cryptographic algorithms and smart contracts are critical components of blockchain security.Despite the benefits of virtual currency,vulnerabilities in smart contracts have resulted in substantial losses to users.While researchers have identified these vulnerabilities and developed tools for detecting them,the accuracy of these tools is still far from satisfactory,with high false positive and false negative rates.In this paper,we propose a new method for detecting vulnerabilities in smart contracts using the BERT pre-training model,which can quickly and effectively process and detect smart contracts.More specifically,we preprocess and make symbol substitution in the contract,which can make the pre-training model better obtain contract features.We evaluate our method on four datasets and compare its performance with other deep learning models and vulnerability detection tools,demonstrating its superior accuracy.
基金funded by the Major PublicWelfare Special Fund of Henan Province(No.201300210200)the Major Science and Technology Research Special Fund of Henan Province(No.221100210400).
文摘In recent years,the number of smart contracts deployed on blockchain has exploded.However,the issue of vulnerability has caused incalculable losses.Due to the irreversible and immutability of smart contracts,vulnerability detection has become particularly important.With the popular use of neural network model,there has been a growing utilization of deep learning-based methods and tools for the identification of vulnerabilities within smart contracts.This paper commences by providing a succinct overview of prevalent categories of vulnerabilities found in smart contracts.Subsequently,it categorizes and presents an overview of contemporary deep learning-based tools developed for smart contract detection.These tools are categorized based on their open-source status,the data format and the type of feature extraction they employ.Then we conduct a comprehensive comparative analysis of these tools,selecting representative tools for experimental validation and comparing them with traditional tools in terms of detection coverage and accuracy.Finally,Based on the insights gained from the experimental results and the current state of research in the field of smart contract vulnerability detection tools,we suppose to provide a reference standard for developers of contract vulnerability detection tools.Meanwhile,forward-looking research directions are also proposed for deep learning-based smart contract vulnerability detection.
基金supported by the Science and Technology Program Project(No.2020A02001-1)of Xinjiang Autonomous Region,China.
文摘Smart contracts have led to more efficient development in finance and healthcare,but vulnerabilities in contracts pose high risks to their future applications.The current vulnerability detection methods for contracts are either based on fixed expert rules,which are inefficient,or rely on simplistic deep learning techniques that do not fully leverage contract semantic information.Therefore,there is ample room for improvement in terms of detection precision.To solve these problems,this paper proposes a vulnerability detector based on deep learning techniques,graph representation,and Transformer,called GRATDet.The method first performs swapping,insertion,and symbolization operations for contract functions,increasing the amount of small sample data.Each line of code is then treated as a basic semantic element,and information such as control and data relationships is extracted to construct a new representation in the form of a Line Graph(LG),which shows more structural features that differ from the serialized presentation of the contract.Finally,the node information and edge information of the graph are jointly learned using an improved Transformer-GP model to extract information globally and locally,and the fused features are used for vulnerability detection.The effectiveness of the method in reentrancy vulnerability detection is verified in experiments,where the F1 score reaches 95.16%,exceeding stateof-the-art methods.
基金supported by a National Research Foundation of Korea (NRF)grant funded by the Ministry of Science and ICT (MSIT) (No.2020R1F1A1061107)the Korea Institute for Advancement of Technology (KIAT)grant funded by the Korean Government (MOTIE) (P0008703,The Competency Development Program for Industry Specialists)the MSIT under the ICAN (ICT Challenge and Advanced Network of HRD)program (No.IITP-2022-RS-2022-00156310)supervised by the Institute of Information&Communication Technology Planning and Evaluation (IITP).
文摘With the development of the 5th generation of mobile communi-cation(5G)networks and artificial intelligence(AI)technologies,the use of the Internet of Things(IoT)has expanded throughout industry.Although IoT networks have improved industrial productivity and convenience,they are highly dependent on nonstandard protocol stacks and open-source-based,poorly validated software,resulting in several security vulnerabilities.How-ever,conventional AI-based software vulnerability discovery technologies cannot be applied to IoT because they require excessive memory and com-puting power.This study developed a technique for optimizing training data size to detect software vulnerabilities rapidly while maintaining learning accuracy.Experimental results using a software vulnerability classification dataset showed that different optimal data sizes did not affect the learning performance of the learning models.Moreover,the minimal data size required to train a model without performance degradation could be determined in advance.For example,the random forest model saved 85.18%of memory and improved latency by 97.82%while maintaining a learning accuracy similar to that achieved when using 100%of data,despite using only 1%.
基金This work was supported by the National Key R&D Program of China(2023YFB3106800)the National Natural Science Foundation of China(Grant No.62072051).We are overwhelmed in all humbleness and gratefulness to acknowledge my depth to all those who have helped me to put these ideas.
文摘By the analysis of vulnerabilities of Android native system services,we find that some vulnerabilities are caused by inconsistent data transmission and inconsistent data processing logic between client and server.The existing research cannot find the above two types of vulnerabilities and the test cases of them face the problem of low coverage.In this paper,we propose an extraction method of test cases based on the native system services of the client and design a case construction method that supports multi-parameter mutation based on genetic algorithm and priority strategy.Based on the above method,we implement a detection tool-BArcherFuzzer to detect vulnerabilities of Android native system services.The experiment results show that BArcherFuzzer found four vulnerabilities of hundreds of exception messages,all of them were confirmed by Google and one was assigned a Common Vulnerabilities and Exposures(CVE)number(CVE-2020-0363).
基金partly supported by National Natural Science Foundation of China (NSFC grant numbers: 61202110 and 61502205)the project of Jiangsu provincial Six Talent Peaks (Grant numbers: XYDXXJS-016)
文摘Software an important way to vulnerability mining is detect whether there are some loopholes existing in the software, and also is an important way to ensure the secu- rity of information systems. With the rapid development of information technology and software industry, most of the software has not been rigorously tested before being put in use, so that the hidden vulnerabilities in software will be exploited by the attackers. Therefore, it is of great significance for us to actively de- tect the software vulnerabilities in the security maintenance of information systems. In this paper, we firstly studied some of the common- ly used vulnerability detection methods and detection tools, and analyzed the advantages and disadvantages of each method in different scenarios. Secondly, we designed a set of eval- uation criteria for different mining methods in the loopholes evaluation. Thirdly, we also proposed and designed an integration testing framework, on which we can test the typical static analysis methods and dynamic mining methods as well as make the comparison, so that we can obtain an intuitive comparative analysis for the experimental results. Final- ly, we reported the experimental analysis to verify the feasibility and effectiveness of the proposed evaluation method and the testingframework, with the results showing that the final test results will serve as a form of guid- ance to aid the selection of the most appropri- ate and effective method or tools in vulnera- bility detection activity.
文摘The most resource-intensive and laborious part of debugging is finding the exact location of the fault from the more significant number of code snippets.Plenty of machine intelligence models has offered the effective localization of defects.Some models can precisely locate the faulty with more than 95%accuracy,resulting in demand for trustworthy models in fault localization.Confidence and trustworthiness within machine intelligencebased software models can only be achieved via explainable artificial intelligence in Fault Localization(XFL).The current study presents a model for generating counterfactual interpretations for the fault localization model’s decisions.Neural system approximations and disseminated presentation of input information may be achieved by building a nonlinear neural network model.That demonstrates a high level of proficiency in transfer learning,even with minimal training data.The proposed XFL would make the decisionmaking transparent simultaneously without impacting the model’s performance.The proposed XFL ranks the software program statements based on the possible vulnerability score approximated from the training data.The model’s performance is further evaluated using various metrics like the number of assessed statements,confidence level of fault localization,and TopN evaluation strategies.
基金supported by the National Natural Science Foundation of China (Grant No.60574087)the Hi-Tech Research and Development Program of China (Nos.2007AA01Z475,2007AA01Z480,2007AA01Z464)the 111 International Collaboration Program of China.
文摘SQL injection poses a major threat to the application level security of the database and there is no systematic solution to these attacks.Different from traditional run time security strategies such as IDS and fire-wall,this paper focuses on the solution at the outset;it presents a method to find vulnerabilities by analyzing the source codes.The concept of validated tree is developed to track variables referenced by database operations in scripts.By checking whether these variables are influenced by outside inputs,the database operations are proved to be secure or not.This method has advantages of high accuracy and efficiency as well as low costs,and it is universal to any type of web application platforms.It is implemented by the software code vulnerabilities of SQL injection detector(CVSID).The validity and efficiency are demonstrated with an example.
基金Supported by the National High Technology Research and Development Program of China(863 Program)(2012AA012902)the“HGJ”National Major Technological Projects(2013ZX01045-004)
文摘Software vulnerabilities are the root cause of various information security incidents while dynamic taint analysis is an emerging program analysis technique. In this paper, to maximize the use of the technique to detect software vulnerabilities, we present SwordDTA, a tool that can perform dynamic taint analysis for binaries. This tool is flexible and extensible that it can work with commodity software and hardware. It can be used to detect software vulnerabilities with vulnerability modeling and taint check. We evaluate it with a number of commonly used real-world applications. The experimental results show that SwordDTA is capable of detecting at least four kinds of softavare vulnerabilities including buffer overflow, integer overflow, division by zero and use-after-free, and is applicable for a wide range of software.
基金Supported by the National Natural Science Foundation of China(61202110 and 61502205)the Project of Jiangsu Provincial Six Talent Peaks(XYDXXJS-016)
文摘It is difficult to formalize the causes of vulnerability, and there is no effective model to reveal the causes and characteristics of vulnerability. In this paper, a vulnerability model construction method is proposed to realize the description of vulnerability attribute and the construction of a vulnerability model. A vulnerability model based on chemical abstract machine(CHAM) is constructed to realize the CHAM description of vulnerability model, and the framework of vulnerability model is also discussed. Case study is carried out to verify the feasibility and effectiveness of the proposed model. In addition, a prototype system is also designed and implemented based on the proposed vulnerability model. Experimental results show that the proposed model is more effective than other methods in the detection of software vulnerabilities.
基金This research was supported in part by the Japan Society for the Promotion of Science KAKENHI Number 22H03591the MEXT"Innovation Platform for Society 5.0"Program Grant Number JPMXP0518071489.
文摘Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties.Being the most prominent platform that supports smart contracts,Ethereum has been targeted by many attacks and plagued by security incidents.Consequently,many smart contract vulnerabilities have been discovered in the past decade.To detect and prevent such vulnerabilities,different security analysis tools,including static and dynamic analysis tools,have been created,but their performance decreases drastically when codes to be analyzed are constantly being rewritten.In this paper,we propose Eth2Vec,a machine-learning-based static analysis tool that detects smart contract vulnerabilities.Eth2Vec maintains its robustness against code rewrites;i.e.,it can detect vulnerabilities even in rewritten codes.Other machine-learning-based static analysis tools require features,which analysts create manually,as inputs.In contrast,Eth2Vec uses a neural network for language processing to automatically learn the features of vulnerable contracts.In doing so,Eth2Vec can detect vulnerabilities in smart contracts by comparing the similarities between the codes of a target contract and those of the learned contracts.We performed experiments with existing open databases,such as Etherscan,and Eth2Vec was able to outperform a recent model based on support vector machine in terms of well-known metrics,i.e.,precision,recall,and F1-score.
基金funded by the National Natural Science Foundation of China No.61902157 and No.62002139.
文摘Smart contracts hold billions of dollars in digital currency,and their security vulnerabilities have drawn a lot of attention in recent years.Traditional methods for detecting smart contract vulnerabilities rely primarily on symbol execution,which makes them time-consuming with high false positive rates.Recently,deep learning approaches have alleviated these issues but still face several major limitations,such as lack of interpretability and susceptibility to evasion techniques.In this paper,we propose a feature selection method for uplifting modeling.The fundamental concept of this method is a feature selection algorithm,utilizing interpretation outcomes to select critical features,thereby reducing the scales of features.The learning process could be accelerated significantly because of the reduction of the feature size.The experiment shows that our proposed model performs well in six types of vulnerability detection.The accuracy of each type is higher than 93%and the average detection time of each smart contract is less than 1 ms.Notably,through our proposed feature selection algorithm,the training time of each type of vulnerability is reduced by nearly 80%compared with that of its original.
基金supported by the National Natural Science Foundation of China (Nos.61202110 and 61502205)the Postdoctoral Science Foundation of China (Nos.2015M571687 and 2015M581739)the Graduate Research Innovation Project of Jiangsu Province (No.KYLX15 1079)
文摘Mass monitor logs are produced during the process of component security testing. In order to mine the explicit and implicit security exception information of the tested component, the log should be searched for keyword strings. However, existing string-searching algorithms are not very efficient or appropriate for the operation of searching monitor logs during component security testing. For mining abnormal information effectively in monitor logs, an improved string-searching algorithm is proposed. The main idea of this algorithm is to search for the first occurrence of a character in the main string. The character should be different and farther from the last character in the pattern string. With this algorithm, the backward moving distance of the pattern string will be increased and the matching time will be optimized. In the end, we conduct an experimental study based on our approach, the results of which show that the proposed algorithm finds strings in monitor logs 11.5% more efficiently than existing approaches.
基金Chinese National Natural Science Foundation(61802394,U1836209,62032010)National Key Research and Development Program of China(2016QY071405)+2 种基金Strategic Priority Research Program of the CAS(XDC02040100,XDC02030200,XDC02020200)Program No.2017-JCJQ-ZD-043-01BNRist Network and Software Security Research Program(BNR2019TD01004,BNR2019RC01-009).
文摘SOHO(small office/home office)routers provide services for end devices to connect to the Internet,playing an important role in cyberspace.Unfortunately,security vulnerabilities pervasively exist in these routers,especially in the web server modules,greatly endangering end users.To discover these vulnerabilities,fuzzing web server modules of SOHO routers is the most popular solution.However,its effectiveness is limited due to the lack of input specification,lack of routers’internal running states,and lack of testing environment recovery mechanisms.Moreover,existing works for device fuzzing are more likely to detect memory corruption vulnerabilities.In this paper,we propose a solution ESRFuzzer to address these issues.It is a fully automated fuzzing framework for testing physical SOHO devices.It continuously and effectively generates test cases by leveraging two input semantic models,i.e.,KEY-VALUE data model and CONF-READ communication model,and automatically recovers the testing environment with power management.It also coordinates diversified mutation rules with multiple monitoring mechanisms to trigger multi-type vulnerabilities.With the guidance of the two semantic models,ESRFuzzer can work in two ways:general mode fuzzing and D-CONF mode fuzzing.General mode fuzzing can discover both issues which occur in the CONF and READ operation,while D-CONF mode fuzzing focus on the READ-op issues especially missed by general mode fuzzing.We ran ESRFuzzer on 10 popular routers across five vendors.In total,it discovered 136 unique issues,120 of which have been confirmed as 0-day vulnerabilities we found.As an improvement of SRFuzzer,ESRFuzzer have discovered 35 previous undiscovered READ-op issues that belong to three vulnerability types,and 23 of them have been confirmed as 0-day vulnerabilities by vendors.The experimental results show that ESRFuzzer outperforms state-of-the-art solutions in terms of types and number of vulnerabilities found.
基金supported by the National High-Level Personnel for Defense Technology Program of China under Grant No.2017-JCJQ-ZQ-013the National Natural Science Foundation of China under Grant Nos.61902405 and 61902412+2 种基金the Natural Science Foundation of Hunan Province of China under Grant No.2021JJ40692the Parallel and Distributed Processing Research Foundation under Grant No.6142110190404the Research Project of National University of Defense Technology under Grant Nos.ZK20-09 and ZK20-17.
文摘Allocation,dereferencing,and freeing of memory data in kernels are coherently linked.There widely exist real cases where the correctness of memory is compromised.This incorrectness in kernel memory brings about significant security issues,e.g.,information leaking.Though memory allocation,dereferencing,and freeing are closely related,previous work failed to realize they are closely related.In this paper,we study the life-cycle of kernel memory,which consists of allocation,dereferencing,and freeing.Errors in them are called memory life-cycle(MLC)bugs.We propose an in-depth study of MLC bugs and implement a memory life-cycle bug sanitizer(MEBS)for MLC bug detection.Utilizing an interprocedural global call graph and novel identification approaches,MEBS can reveal memory allocation,dereferencing,and freeing sites in kernels.By constructing a modified define-use chain and examining the errors in the life-cycle,MLC bugs can be identified.Moreover,the experimental results on the latest kernels demonstrate that MEBS can effectively detect MLC bugs,and MEBS can be scaled to different kernels.More than 100 new bugs are exposed in Linux and FreeBSD,and 12 common vulnerabilities and exposures(CVE)are assigned.
基金(partially)supported by the National Key Research and Development Program of China under Grant No.2017YFA0700604the National Natural Science Foundation of China under Grant Nos.62032010 and 61802168+1 种基金the Leading-Edge Technology Program of Jiangsu Natural Science Foundation under Grant No.BK20202001the 2021 Double Entrepreneurship Big Data and Theoretical Research Project of Nanjing University.
文摘Fuzzing is known to be one of the most effective techniques to uncover security vulnerabilities of large-scale software systems.During fuzzing,it is crucial to distribute the fuzzing resource appropriately so as to achieve the best fuzzing performance under a limited budget.Existing distribution strategies of American Fuzzy Lop(AFL)based greybox fuzzing focus on increasing coverage blindly without considering the metrics of code regions,thus lacking the insight regarding which region is more likely to be vulnerable and deserves more fuzzing resources.We tackle the above drawback by proposing a vulnerable region-aware greybox fuzzing approach.Specifically,we distribute more fuzzing resources towards regions that are more likely to be vulnerable based on four kinds of code metrics.We implemented the approach as an extension to AFL named RegionFuzz.Large-scale experimental evaluations validate the effectiveness and efficiency of RegionFuzz-11 new bugs including three new CVEs are successfully uncovered by RegionFuzz.