Deterministic Finite Automatons (DFAs) are widely used to perform
regular expression based pattern matching for network security. Most of DFA
compression algorithms consider only one type of redundant information in
DFA. After a comprehensive analysis of different redundancies in DFA
structure, we summarize four types of redundancies with large compression
potentials. An improved DFA method, Tag-DFA, is proposed to exploit more
than one kind of redundancy to compress DFA transitions and a reference
table is added to accelerate DFA inspection. Compared with a well-known
algorithm D2FA, Tag-DFA achieves more than 90% compression ratio and it
requries at most two states traversal for each character, rather than
multiple default transitions in D2FA. Despite a larger contruction
complexity than D2FA, Tag-DFA guarantees a higher throughput with less
memory cost.
|