Proprietary(or semi-proprietary)protocols are widely adopted in industrial control systems(ICSs).Inferring protocol format by reverse engineering is important for many network security applications,e.g.,program tests ...Proprietary(or semi-proprietary)protocols are widely adopted in industrial control systems(ICSs).Inferring protocol format by reverse engineering is important for many network security applications,e.g.,program tests and intrusion detection.Conventional protocol reverse engineering methods have been proposed which are considered time-consuming,tedious,and error-prone.Recently,automatical protocol reverse engineering methods have been proposed which are,however,neither effective in handling binary-based ICS protocols based on network traffic analysis nor accurate in extracting protocol fields from protocol implementations.In this paper,we present a framework called the industrial control system protocol reverse engineering framework(ICSPRF)that aims to extract ICS protocol fields with high accuracy.ICSPRF is based on the key insight that an individual field in a message is typically handled in the same execution context,e.g.,basic block(BBL)group.As a result,by monitoring program execution,we can collect the tainted data information processed in every BBL group in the execution trace and cluster it to derive the protocol format.We evaluate our approach with six open-source ICS protocol implementations.The results show that ICSPRF can identify individual protocol fields with high accuracy(on average a 94.3%match ratio).ICSPRF also has a low coarse-grained and overly fine-grained match ratio.For the same metric,ICSPRF is more accurate than AutoFormat(88.5%for all evaluated protocols and 80.0%for binary-based protocols).展开更多
With the rapid development of the Internet,a large number of private protocols emerge on the network.However,some of them are constructed by attackers to avoid being analyzed,posing a threat to computer network securi...With the rapid development of the Internet,a large number of private protocols emerge on the network.However,some of them are constructed by attackers to avoid being analyzed,posing a threat to computer network security.The blockchain uses the P2P protocol to implement various functions across the network.Furthermore,the P2P protocol format of blockchain may differ from the standard format specification,which leads to sniffing tools such as Wireshark and Fiddler not being able to recognize them.Therefore,the ability to distinguish different types of unknown network protocols is vital for network security.In this paper,we propose an unsupervised clustering algorithm based on maximum frequent sequences for binary protocols,which can distinguish various unknown protocols to provide support for analyzing unknown protocol formats.We mine the maximum frequent sequences of protocolmessage sets in bytes.Andwe calculate the fuzzymembership of the protocolmessage to each maximum frequent sequence,which is based on fuzzy set theory.Then we construct the fuzzy membership vector for each protocol message.Finally,we adopt K-means++to split different types of protocol messages into several clusters and evaluate the performance by calculating homogeneity,integrity,and Fowlkes and Mallows Index(FMI).Besides,the clustering algorithms based onNeedleman–Wunsch and the fixed-length prefix are compared with the algorithm presented in this paper.Compared with these traditional clustering methods,we demonstrate a certain improvement in the clustering performance of our work.展开更多
Internet communication protocols define the behavior rules of network components when they communicate with each other.With the continuous development of network technologies,many private or unknown network protocols ...Internet communication protocols define the behavior rules of network components when they communicate with each other.With the continuous development of network technologies,many private or unknown network protocols are emerging in endlessly various network environments.Herein,relevant protocol specifications become difficult or unavailable to translate in many situations such as network security management and intrusion detection.Although protocol reverse engineering is being investigated in recent years to perform reverse analysis on the specifications of unknown protocols,most existing methods have proven to be time-consuming with limited efficiency,especially when applied on unknown protocol state machines.This paper proposes a state merging algorithm based on EDSM(Evidence-Driven State Merging)to infer the transition rules of unknown protocols in form of state machines with high efficiency.Compared with another classical state machine inferring method based on Exbar algorithm,the experiment results demonstrate that our proposed method could run faster,especially when dealing with massive training data sets.In addition,this method can also make the state machines have higher similarities with the reference state machines constructed from public specifications.展开更多
基金supported by the National Natural Science Foundation of China(No.61833015)。
文摘Proprietary(or semi-proprietary)protocols are widely adopted in industrial control systems(ICSs).Inferring protocol format by reverse engineering is important for many network security applications,e.g.,program tests and intrusion detection.Conventional protocol reverse engineering methods have been proposed which are considered time-consuming,tedious,and error-prone.Recently,automatical protocol reverse engineering methods have been proposed which are,however,neither effective in handling binary-based ICS protocols based on network traffic analysis nor accurate in extracting protocol fields from protocol implementations.In this paper,we present a framework called the industrial control system protocol reverse engineering framework(ICSPRF)that aims to extract ICS protocol fields with high accuracy.ICSPRF is based on the key insight that an individual field in a message is typically handled in the same execution context,e.g.,basic block(BBL)group.As a result,by monitoring program execution,we can collect the tainted data information processed in every BBL group in the execution trace and cluster it to derive the protocol format.We evaluate our approach with six open-source ICS protocol implementations.The results show that ICSPRF can identify individual protocol fields with high accuracy(on average a 94.3%match ratio).ICSPRF also has a low coarse-grained and overly fine-grained match ratio.For the same metric,ICSPRF is more accurate than AutoFormat(88.5%for all evaluated protocols and 80.0%for binary-based protocols).
基金National Natural Science Foundation of China under Grant No.61872111Sichuan Science and Technology Program(No.2019YFSY0049)the“Project for the Development and Application of Safety Testing and Verification Platform for Industrial Robots”of the Ministry of Industry and Information Technology.
文摘With the rapid development of the Internet,a large number of private protocols emerge on the network.However,some of them are constructed by attackers to avoid being analyzed,posing a threat to computer network security.The blockchain uses the P2P protocol to implement various functions across the network.Furthermore,the P2P protocol format of blockchain may differ from the standard format specification,which leads to sniffing tools such as Wireshark and Fiddler not being able to recognize them.Therefore,the ability to distinguish different types of unknown network protocols is vital for network security.In this paper,we propose an unsupervised clustering algorithm based on maximum frequent sequences for binary protocols,which can distinguish various unknown protocols to provide support for analyzing unknown protocol formats.We mine the maximum frequent sequences of protocolmessage sets in bytes.Andwe calculate the fuzzymembership of the protocolmessage to each maximum frequent sequence,which is based on fuzzy set theory.Then we construct the fuzzy membership vector for each protocol message.Finally,we adopt K-means++to split different types of protocol messages into several clusters and evaluate the performance by calculating homogeneity,integrity,and Fowlkes and Mallows Index(FMI).Besides,the clustering algorithms based onNeedleman–Wunsch and the fixed-length prefix are compared with the algorithm presented in this paper.Compared with these traditional clustering methods,we demonstrate a certain improvement in the clustering performance of our work.
基金This work is supported by the National Natural Science Foundation of China(Grant Number:61471141,61361166006,61301099)Basic Research Project of Shenzhen,China(Grant Number:JCYJ20150513151706561)National Defense Basic Scientific Research Program of China(Grant Number:JCKY2018603B006).
文摘Internet communication protocols define the behavior rules of network components when they communicate with each other.With the continuous development of network technologies,many private or unknown network protocols are emerging in endlessly various network environments.Herein,relevant protocol specifications become difficult or unavailable to translate in many situations such as network security management and intrusion detection.Although protocol reverse engineering is being investigated in recent years to perform reverse analysis on the specifications of unknown protocols,most existing methods have proven to be time-consuming with limited efficiency,especially when applied on unknown protocol state machines.This paper proposes a state merging algorithm based on EDSM(Evidence-Driven State Merging)to infer the transition rules of unknown protocols in form of state machines with high efficiency.Compared with another classical state machine inferring method based on Exbar algorithm,the experiment results demonstrate that our proposed method could run faster,especially when dealing with massive training data sets.In addition,this method can also make the state machines have higher similarities with the reference state machines constructed from public specifications.