[Journal article][Full-length article]


R-gram: Inferring message formats of service protocols with relative positional n-grams

Authors:
Jiaojiao Jiang; Jean-Guy Schneider; Steve Versteeg; Jun Han; MD Arafat Hossain; Chengfei Liu

Year: 2021

Pages: 103247
Publisher: Elsevier BV


Abstract:

Automatically discovering message formats of unknown service or system protocols from network traces has become important for a variety of applications, such as emulating the behavior of an unknown protocol in service virtualization, or enabling deep packet inspection in network security. Among existing schemes, keyword-extraction-based approaches have been shown to be effective. Inspired by the template structure of protocol messages, recent works leverage the positions of keywords to extract message keywords more accurately. However, these methods are deficient for messages with large variations in length. To address this problem, we propose R-gram, which exploits the relative positions of keywords in messages, allowing the keywords to be robustly detected in variable-length messages. It first extracts the common template of the messages in a given message trace with a fast sampling technique, and segments each message into blocks according to the relative positions of the common keywords in the template. It then identifies message keywords in each block by using a new concept and technique, the relative positional n-gram (r-gram for short). Finally, the message keywords are used to separate all the messages into type-specific clusters and consequently derive the message format for each cluster. We have implemented and evaluated R-gram on real-world service traces containing either textual or binary protocol messages. Our experimental results show that R-gram is more accurate and robust than existing state-of-the-art tools in protocol message format extraction. Furthermore, R-gram is efficient for processing large-scale message traces.
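
The idea of indexing n-grams relative to template keywords, rather than by absolute byte offset, can be illustrated with a small sketch. This is not the authors' implementation: the function names (split_blocks, relative_ngrams, frequent_relative_ngrams), the min_support threshold, and the sample messages are all hypothetical, and the template keywords (here called anchors) are assumed to be given rather than extracted by sampling as the abstract describes.

```python
from collections import Counter


def split_blocks(message, anchors):
    """Split a message into blocks at each anchor keyword, taken in order.

    The anchors stand in for the common template keywords; they are
    assumed to be known in this sketch.
    """
    blocks, start = [], 0
    for anchor in anchors:
        idx = message.find(anchor, start)
        if idx == -1:
            break  # anchor missing: treat the rest as one final block
        blocks.append(message[start:idx])
        start = idx + len(anchor)
    blocks.append(message[start:])
    return blocks


def relative_ngrams(message, anchors, n):
    """Yield (block_index, offset_in_block, ngram) triples for one message."""
    for b, block in enumerate(split_blocks(message, anchors)):
        for i in range(len(block) - n + 1):
            yield (b, i, block[i:i + n])


def frequent_relative_ngrams(messages, anchors, n, min_support):
    """Keep the relative positional n-grams seen in enough messages.

    min_support is a hypothetical fraction of messages (0.0 to 1.0).
    """
    counts = Counter()
    for msg in messages:
        # set() so that each message contributes at most once per key
        counts.update(set(relative_ngrams(msg, anchors, n)))
    threshold = min_support * len(messages)
    return {key for key, count in counts.items() if count >= threshold}


# Toy trace: the request path varies in length, so "HTTP" and "Host"
# sit at a different absolute offset in every message.
messages = [
    "GET /a HTTP/1.1\r\nHost: x.org\r\n",
    "GET /long/path/index.html HTTP/1.1\r\nHost: example.com\r\n",
    "GET /b?q=1 HTTP/1.1\r\nHost: y.net\r\n",
]
anchors = [" ", " ", "\r\n"]  # assumed template keywords
keywords = frequent_relative_ngrams(messages, anchors, n=3, min_support=1.0)
```

In this toy trace the absolute offset of "HTTP" differs in every message, so an n-gram keyed by absolute position would not recur across messages, while the key (2, 0, "HTT"), meaning block 2 at offset 0, is shared by all three. A real trace would also need binary-safe handling (bytes rather than str) and a way to derive the anchors, both of which are left out here.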



Keywords:

Protocol messages; Format extraction; Relative positional keywords; R-gram


Journal:
Journal of Network and Computer Applications
ISSN: 1084-8045
Source: Elsevier BV