
APS captcha support / option to discard cookies #5

@kassu

For some time now, APS (American Physical Society) has been asking for a captcha when downloading more than 2 PDFs (the captcha asks you to select a picture of Albert Einstein).

I looked at the source code to see whether handling this captcha would be easy to support. It probably is not, but I noticed from the logs that there is a very easy way to bypass it: simply discard the session cookie (aps_session_id).

Although this is a hack (I guess over time they will also check the IP), would it make sense to add an option/rule to discard cookies before starting a new download command?

Note: Locally, I have solved the problem for myself by resetting the cookies each time.
Added to DownloadHttpSession.java:

// Drop all stored cookies by replacing the cookie manager with a fresh one.
public void clearCookies() {
    console.output("Clear all cookies", false);
    cm = new CookieManager();
}

Added to Download.java, at the beginning of downloadFromPageFile:

    session.clearCookies();
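
A narrower variant may also be worth considering. The following is only a minimal sketch, assuming that `cm` is a standard `java.net.CookieManager` (the snippet above suggests this, but it has not been checked against the project). The class name `CookieReset` and both method names are hypothetical and do not exist in the code base. It shows how the store could be emptied in place, or how only the `aps_session_id` cookie could be removed while other cookies are kept:

```java
import java.net.CookieManager;
import java.net.CookieStore;
import java.net.HttpCookie;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the project: two ways to drop cookies
// held by a java.net.CookieManager.
final class CookieReset {

    // Empty the manager's store in place, so any code that already holds a
    // reference to this manager sees the cleared store.
    static void clearAll(CookieManager manager) {
        manager.getCookieStore().removeAll();
    }

    // Remove only the cookies with the given name; returns how many were removed.
    static int removeByName(CookieManager manager, String name) {
        CookieStore store = manager.getCookieStore();
        // getCookies() returns an unmodifiable list, so iterate over a copy.
        List<HttpCookie> snapshot = new ArrayList<>(store.getCookies());
        int removed = 0;
        for (HttpCookie cookie : snapshot) {
            // With the default in-memory store the URI argument may be null.
            if (cookie.getName().equals(name) && store.remove(null, cookie)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        CookieManager manager = new CookieManager();
        // Illustrative URI and cookie values only.
        URI uri = URI.create("https://journals.aps.org/");
        manager.getCookieStore().add(uri, new HttpCookie("aps_session_id", "abc"));
        manager.getCookieStore().add(uri, new HttpCookie("other", "xyz"));

        int removed = removeByName(manager, "aps_session_id");
        System.out.println("Removed: " + removed);
        System.out.println("Remaining: " + manager.getCookieStore().getCookies().size());
    }
}
```

Under that assumption, `clearCookies()` could call something like `removeByName(cm, "aps_session_id")` instead of creating a new `CookieManager`, which would leave unrelated cookies (for example, a publisher login) untouched. Whether that matters here depends on how the session object is used elsewhere.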
