Skip to content

Catastrophic backtracking on '.fr' domains #156

@colemujadzic

Description

@colemujadzic

Hello!

While I understand this project may no longer be maintained (based on the latest commit being over six years old, etc.), because of the potentiaI for this issue to negatively affect production applications, I figured I'd create this to bring attention to / warn others it might impact

Description:

At some point in the course of parsing a WHOIS record (provided by the AFNIC WHOIS server) associated with a french domain (using the '.fr' TLD), it appears the library attempts to match the entire record string against this regular expression:

/nic-hdl:\s*(?P<handle>.+)\ntype:\s*(?P<type>.+)\ncontact:\s*(?P<name>.+)\n(?:.+\n)*?(?:address:\s*(?P<street1>.+)\n)?(?:address:\s*(?P<street2>.+)\n)?(?:address:\s*(?P<street3>.+)\n)?(?:phone:\s*(?P<phone>.+)\n)?(?:fax-no:\s*(?P<fax>.+)\n)?(?:.+\n)*?(?:e-mail:\s*(?P<email>.+)\n)?(?:.+\n)*?changed:\s*(?P<changedate>[0-9]{2}\/[0-9]{2}\/[0-9]{4}).*/

I think it's this one:

"nic-hdl:\s*(?P<handle>.+)\ntype:\s*(?P<type>.+)\ncontact:\s*(?P<name>.+)\n(?:.+\n)*?(?:address:\s*(?P<street1>.+)\n)?(?:address:\s*(?P<street2>.+)\n)?(?:address:\s*(?P<street3>.+)\n)?(?:phone:\s*(?P<phone>.+)\n)?(?:fax-no:\s*(?P<fax>.+)\n)?(?:.+\n)*?(?:e-mail:\s*(?P<email>.+)\n)?(?:.+\n)*?changed:\s*(?P<changedate>[0-9]{2}\/[0-9]{2}\/[0-9]{4}).*\n", # AFNIC madness any country -at all-

This evaluation results in catastrophic backtracking and never recovers, causing the application to hang and CPU usage to increase dramatically. Off hand -- records provided by AFNIC seem to have multiple repeated fields like 'ADDRESS' and 'TROUBLE', so it's possible that's where the evaluation is getting tripped up.

Reproduction Steps:

  1. Clone the repository, or install the package via pip. I used virtualenv and created a sample environment to test this in. I'm also using python 2.7.10.

  2. Use the included pwhois script (or a provided method like pythonwhois.get_whois(domain)) to run the lookup against a .fr domain like 'afnic.fr', e.g. pwhois afnic.fr. The process should hang and CPU usage should rapidly increase. I can provide a proof of concept via an online regular expression evaluator if that would be helpful!

If this description is at all unclear or if you would like me to provide additional information, just let me know!

Thanks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions