I don't want to take the time right now to submit performance enhancements, but perhaps @moyix or some other person reading this note would like to do the work.
I find that a tremendous amount of time is spent on file reads, string concatenations, and substring operations. There are two ways to speed things up that I have seen, and both would be simple to implement:
- In the `StreamFile` class, cache the stream pages so you only have to read them once from the file. Or better, if the platform supports `mmap`, just mmap the entire PDB file, create a buffer for it, and take a slice of the buffer for a stream page whenever you need it. In the non-mmap case, you could add a method to clear the cache, to be called, for instance, after parsing the entire stream.
- In `StreamFile._read`, see how many pages are spanned by the request. Use the above cache / mmap to get slices of individual pages. Return the slice, or a concatenation of two slices, or use `cStringIO` to assemble more than two slices. Using `_read_pages` is inefficient because then you have to take a slice of the result.
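The two ideas above could be sketched roughly as follows. This is a minimal, self-contained illustration, not pdbparse's actual `StreamFile` API: the class name `PagedFile`, its method names, and the default page size are all hypothetical stand-ins.

```python
import io
import mmap


class PagedFile:
    """Hypothetical sketch: mmap the whole file and serve page slices,
    assembling multi-page reads with a single buffer."""

    def __init__(self, path, page_size=0x1000):
        self.page_size = page_size
        self._f = open(path, "rb")
        # Map the whole file read-only; page slices are then cheap views.
        self._map = mmap.mmap(self._f.fileno(), 0, access=mmap.ACCESS_READ)

    def get_page(self, page_num):
        off = page_num * self.page_size
        return self._map[off:off + self.page_size]

    def read(self, pages, offset, size):
        """Read `size` bytes at stream `offset`, where `pages` lists the
        file page numbers backing the stream in order."""
        first = offset // self.page_size
        last = (offset + size - 1) // self.page_size
        if first == last:
            # Common case: the request fits in one page, so one slice suffices.
            pg = self.get_page(pages[first])
            start = offset % self.page_size
            return pg[start:start + size]
        # Multi-page case: concatenate the spanned pages once, then slice.
        buf = io.BytesIO()
        for i in range(first, last + 1):
            buf.write(self.get_page(pages[i]))
        data = buf.getvalue()
        start = offset - first * self.page_size
        return data[start:start + size]

    def close(self):
        self._map.close()
        self._f.close()
```

(The original suggestion was written for Python 2's `cStringIO`; `io.BytesIO` is the equivalent on Python 3.)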
I think this would eliminate most of the time spent in parsing a PDB as a whole. You could try profiling pdbparse with a large file, such as ntoskrnl.pdb.
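One way to do that profiling with the standard library's `cProfile`: the helper below wraps any callable and prints the top cumulative-time entries. The helper name `profile_call` is made up for this sketch; you would pass it pdbparse's parse entry point and the path to a large PDB.

```python
import cProfile
import io
import pstats


def profile_call(func, *args):
    """Run func(*args) under cProfile and print the hottest callees
    by cumulative time, returning func's result."""
    pr = cProfile.Profile()
    pr.enable()
    result = func(*args)
    pr.disable()
    out = io.StringIO()
    pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(10)
    print(out.getvalue())
    return result


# Example (assumes pdbparse is installed and ntoskrnl.pdb is on disk):
#   import pdbparse
#   profile_call(pdbparse.parse, "ntoskrnl.pdb")
```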