Open
Description
For some time now, APS (American Physical Society) has been asking for a captcha when downloading more than 2 PDFs (the captcha asks you to select a picture of Albert Einstein).
I looked at the source code to see whether it would be easy to support this captcha. It probably is not, but I noticed from the logs that there is a very easy way to bypass it: simply discard the session cookie (aps_session_id).
Although this is a hack (I guess that over time they will also check the IP), would it make sense to add an option/rule to discard cookies before starting a new download command?
Note: Locally, I have solved the problem for myself by resetting the cookies each time.
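Added to DownloadHttpSession.java:
public void clearCookies() {
    console.output("Clear all cookies", false);
    // Replace the cookie manager with a fresh, empty one so that no
    // previously stored cookie (e.g. aps_session_id) is sent again.
    cm = new CookieManager();
}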
Added in Download.java, at the beginning of downloadFromPageFile:
session.clearCookies();
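To illustrate what the requested option could look like, here is a minimal, self-contained sketch. Everything in it except the `clearCookies()` idea and the `cm` field name is an assumption made for illustration: the class `CookieResetSketch`, the nested `SketchSession`, the `resetCookies` flag, `beginDownload()`, and the `example.org` URL are hypothetical and are not part of the project's code.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;

public class CookieResetSketch {

    // Hypothetical stand-in for the project's session class.
    public static class SketchSession {
        private CookieManager cm = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        private final boolean resetCookies; // hypothetical option/rule

        public SketchSession(boolean resetCookies) {
            this.resetCookies = resetCookies;
        }

        // Same idea as in the issue: swap in a fresh, empty cookie manager.
        public void clearCookies() {
            cm = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        }

        // Would be called at the start of each download command.
        public void beginDownload() {
            if (resetCookies) {
                clearCookies();
            }
        }

        // Simulates the server setting a cookie during a download.
        public void storeCookie(URI uri, HttpCookie cookie) {
            cm.getCookieStore().add(uri, cookie);
        }

        public int cookieCount() {
            return cm.getCookieStore().getCookies().size();
        }
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://example.org/"); // placeholder URL
        SketchSession session = new SketchSession(true);

        session.beginDownload();
        session.storeCookie(uri, new HttpCookie("aps_session_id", "dummy"));
        System.out.println("Cookies after first download: " + session.cookieCount());

        session.beginDownload(); // the option discards the stored cookie here
        System.out.println("Cookies at start of second download: " + session.cookieCount());
    }
}
```

With the flag set to `false`, the cookie would survive between downloads, which corresponds to the current behaviour described above.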