Security Notes: RegEx
Even RegEx can be vulnerable
updated almost 5 years ago
Denial-of-Service Regex Vulnerability
One of the more surprising, and yet hard-to-spot, vulnerabilities I’ve found is related to regular expressions that are either poorly written or poorly implemented.
Memory/CPU can be exhausted with large or specially crafted user input.
Warning Signs
- The expression has multiple capture groups
- The expression uses global matching
- The expression is run against unchecked user input
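To make the CPU exhaustion concrete, here is a minimal sketch in Python. The pattern `^(a+)+$` and the helper name `seconds_to_reject` are my own illustrative assumptions, not taken from a real codebase; the pattern is a textbook case of a repeated group wrapped around a repeated token. On a backtracking engine such as Python's `re`, each extra character in a non-matching input should roughly double the work.

```python
import re
import time

# Hypothetical risky pattern: a repeated group wrapped around a repeated token.
RISKY = re.compile(r"^(a+)+$")


def seconds_to_reject(n: int) -> float:
    """Time how long RISKY takes to reject n 'a' characters followed by one 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = RISKY.match(text)
    elapsed = time.perf_counter() - start
    # The trailing 'b' means the input can never match, so the engine
    # has to try every way of splitting the run of 'a's before giving up.
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Sizes are kept small on purpose; much larger values can hang the process.
    for n in (8, 12, 16, 20):
        print(n, f"{seconds_to_reject(n):.4f}s")
```

The exact timings depend on the machine and the regex engine, so treat the numbers as a trend rather than a benchmark.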
Mitigation / Resolution
- RegEx is hard
- For example, here is how the really smart folks at OWASP recommend handling IP validation:
^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
- That’s longer than a tweet, for a 4-byte IP Address!!!
- Make sure user input isn’t unduly long. When I know input data is reliably less than 40 chars, I’ll reject anything over 64; otherwise, an attacker could overwhelm my system with a flood of 4 KB requests.
- This affects almost every language and platform: .NET, Node, Python, Perl, Java
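The length cap and the IP pattern above can be combined into one small check. This is a sketch in Python under a few assumptions: the names `MAX_INPUT_LEN`, `OCTET`, `IPV4_RE` and `is_valid_ipv4` are hypothetical, the 64-character limit is the one mentioned above, and I build the four-octet pattern from a shared piece to keep it readable. It also uses `fullmatch` rather than relying on `$`, because in Python `$` also matches just before a trailing newline.

```python
import re

# Assumed cap, taken from the "anything over 64" rule of thumb above.
MAX_INPUT_LEN = 64

# One octet of the IP pattern quoted above (0-255, as that pattern defines it).
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Four octets separated by literal dots, equivalent to the full pattern above.
IPV4_RE = re.compile(r"^" + r"\.".join([OCTET] * 4) + r"$")


def is_valid_ipv4(text: str) -> bool:
    """Return True only for short inputs that match the IPv4 pattern exactly."""
    # Check the length first, so the regex never sees oversized input.
    if len(text) > MAX_INPUT_LEN:
        return False
    # fullmatch rejects a trailing newline that `$` alone would tolerate.
    return IPV4_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("192.168.1.1", "256.1.1.1", "1.2.3.4\n", "1" * 100):
        print(repr(candidate[:20]), is_valid_ipv4(candidate))
```

The ordering is the point of the sketch: the cheap length check runs before the expensive pattern match. For IP addresses specifically, Python's standard `ipaddress` module is an alternative that avoids the regex altogether.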