We are encountering a large number of files in the wild with junk appended at the end (usually HTML from buggy download pages).
The current open() function in Basic/PDF/File.pm gives up if it has not found the trailer within the last 1 kB of the file.
The change below keeps searching all the way back to the beginning of the file (in a horribly inefficient way, with a 1 kB sliding window), but it seems to work:
    # Old code: only looks at the last 16..1024 bytes of the file.
    #foreach my $offset (1..64) {
    #    $fh->seek($end - 16 * $offset, 0);
    #    $fh->read($buffer, 16 * $offset);
    #    last if $buffer =~ m/startxref($cr|\s*)\d+($cr|\s*)\%\%eof.*?/i;
    #}

    # New code: grow the window to 1024 bytes, then slide it back 16 bytes at a time.
    my $scan_length = 16;
    my $scan_start = $end - $scan_length;
    for (;;) {
        $fh->seek($scan_start, 0);
        $fh->read($buffer, $scan_length);
        last if $buffer =~ m/startxref($cr|\s*)\d+($cr|\s*)\%\%eof.*?/i;
        last if $scan_start < 16;    # reached the beginning of the file
        $scan_start -= 16;
        if ($scan_length < 1024) { $scan_length += 16; }
    }
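For comparison, here is a rough sketch of a less wasteful variant of the same idea: read backwards in larger chunks, with a small overlap so that a trailer split across two reads still matches. This is only an illustration, not a tested patch against Basic/PDF/File.pm. The helper name `find_trailer_offset`, the 4096-byte chunk size and the 64-byte overlap are all made up for this example, and plain `\s*` stands in for the module's `$cr` pattern.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Basic/PDF/File.pm.
# Returns the byte offset of the last "startxref <n> %%EOF" trailer in the
# file, or -1 if none is found.
sub find_trailer_offset {
    my ($fh, $chunk, $overlap) = @_;
    $chunk   ||= 4096;    # bytes consumed per step (assumed value)
    $overlap ||= 64;      # extra bytes read past each step (assumed value)

    seek($fh, 0, 2) or return -1;    # jump to the end of the file
    my $pos = tell($fh);

    while ($pos > 0) {
        my $start = $pos > $chunk ? $pos - $chunk : 0;
        seek($fh, $start, 0) or return -1;

        # Read this chunk plus the overlap, so that a trailer starting in
        # this chunk and ending in the next one is still seen whole.
        my $got = read($fh, my $buffer, $pos - $start + $overlap);
        return -1 unless defined $got;

        # Keep the last match in the buffer: junk follows the real trailer.
        my $found = -1;
        while ($buffer =~ m/startxref\s*\d+\s*\%\%eof/gi) {
            $found = $start + $-[0];
        }
        return $found if $found >= 0;

        $pos = $start;    # nothing here, step back one chunk
    }
    return -1;
}
```

Each byte is read roughly once instead of 64 times, so this should behave much better on files with a lot of trailing junk. A caller would presumably seek to the returned offset and read the trailer from there, but how that fits into open() is left open here.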