Security Notes: RegEx
Can RegEx be vulnerable?
RegEx Denial-of-Service: ReDoS
One of the more surprising, and hard-to-spot, vulnerabilities I’ve found involves regular expressions that are either poorly written or poorly implemented.
Memory/CPU can be exhausted with large or specially crafted user input.
This is less a classic security flaw than a performance issue, but it is one an attacker can trigger on demand.
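
To make the CPU exhaustion concrete, here is a minimal sketch in Python. The pattern `^(a+)+$` is a common textbook illustration and the helper `time_match` is a hypothetical name; neither comes from a real incident. The input sizes are kept small on purpose, since the time to reject the input grows roughly exponentially with its length on a backtracking engine such as Python's `re`.

```python
import re
import time

# Textbook illustration: a quantified group inside another quantifier.
EVIL = re.compile(r"^(a+)+$")


def time_match(pattern, text):
    """Return (matched, seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


# The trailing "!" forces the match to fail, so the engine backtracks
# through the many ways of splitting the run of a's between the two quantifiers.
for n in (8, 12, 16, 20):
    matched, seconds = time_match(EVIL, "a" * n + "!")
    print(f"n={n:2d} matched={matched} seconds={seconds:.4f}")
```

Each extra character roughly doubles the work, so even a short hostile string can tie up a worker.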
Warning Signs
- Multiple capture groups, especially groups that are themselves repeated
- Global matching
- The expression is run against unchecked user input
Mitigation / Resolution
- RegEx is hard.
- For example, here is how the really smart folks at [OWASP recommend handling IP validation][owasp]:
^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
- That’s longer than an (old school) tweet, for a 4-byte IP Address!!!
- Validate user input (length and format) before it reaches the expression.
- This affects almost every language and platform: .NET, Node, Python, Perl, Java.
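
One way to apply the input-checking advice, sketched in Python: reject oversized input before the regex engine ever sees it, then run the OWASP pattern quoted above. The names `is_valid_ipv4` and `MAX_IPV4_LENGTH` are hypothetical, and the 15-character cap is an assumption based on `255.255.255.255` being the longest dotted-quad string.

```python
import re

# Pattern copied verbatim from the OWASP example above, split across
# adjacent string literals only for readability.
IPV4_PATTERN = re.compile(
    r"^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\."
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\."
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\."
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)

# Assumption: "255.255.255.255" (15 characters) is the longest valid input.
MAX_IPV4_LENGTH = 15


def is_valid_ipv4(text):
    """Return True if text is a dotted-quad IPv4 address."""
    # Cheap checks first: wrong type or oversized input never reaches the regex.
    if not isinstance(text, str) or len(text) > MAX_IPV4_LENGTH:
        return False
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return IPV4_PATTERN.fullmatch(text) is not None


print(is_valid_ipv4("192.168.0.1"))
print(is_valid_ipv4("256.1.1.1"))
print(is_valid_ipv4("1" * 1000))
```

The length cap costs one comparison and bounds the work the engine can be asked to do, whatever the pattern looks like.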