Thanks! For the hashing, you may want to have a look at some code I wrote here (ff4de3e) instead of using the ID. I'll test the code this weekend.
I did see that, actually! It's a great approach for epub files, which are Zip-based, but unfortunately it won't work with most other ebook formats, such as mobi and pdf, which are not. That was part of the rationale behind my filename/mtime/size approach: it works with any ebook format. The other part is speed: calling stat() on each file during the index refresh is much faster than opening and reading each file. It brought the initial (uncached) indexing time on my own library down from minutes to seconds.
    booklist booklist.BookList
    mu       sync.Mutex
    indMu    sync.Mutex
    seen     *SeenCache
This should be an interface to allow for different cache implementations.
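One possible shape for such an interface is sketched below. The method names (`Seen`, `Add`) and the `memCache` and `indexer` types are assumptions made for illustration; they are not the `SeenCache` API from this PR.

```go
package main

import "sync"

// Cache is a hypothetical interface for the indexer's "seen" lookups.
// The method names are assumptions, not the SeenCache API from this PR.
type Cache interface {
	Seen(key string) bool // reports whether key was recorded earlier
	Add(key string)       // records key
}

// memCache is one possible implementation: an in-memory set behind a mutex.
type memCache struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

func newMemCache() *memCache {
	return &memCache{keys: make(map[string]struct{})}
}

func (c *memCache) Seen(key string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.keys[key]
	return ok
}

func (c *memCache) Add(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.keys[key] = struct{}{}
}

// indexer is a stand-in for the real Indexer; it depends only on the
// interface, so any Cache implementation can be plugged in.
type indexer struct {
	seen Cache
}

// needsIndexing reports whether key is new, and records it so that the
// next call with the same key returns false.
func (i *indexer) needsIndexing(key string) bool {
	if i.seen.Seen(key) {
		return false
	}
	i.seen.Add(key)
	return true
}
```

With the field typed as an interface, a file-backed or no-op cache could replace the in-memory one without changes to the indexer.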
    }

-   return &Indexer{paths: paths, coverpath: cp, exts: exts}, nil
+   return &Indexer{paths: paths, coverpath: cp, exts: exts, seen: NewSeenCache()}, nil
I think the cache should be passed as an argument to New.
    return &Indexer{paths: paths, coverpath: cp, exts: exts, seen: NewSeenCache()}, nil
    }

    func (i *Indexer) Load() error {
Some of this code may fit better as part of the cache. Ideally, the cache would handle its own loading, and the indexer would query the cache during indexing.
    debug.FreeOSMemory()

    err = s.Indexer.Save()
It should be possible to disable this with a command-line flag.
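A rough sketch of what such a switch could look like, using the standard library `flag` package. The flag name `no-save-index` and the helper `parseSaveIndex` are hypothetical, and the project may use a different flag library.

```go
package main

import "flag"

// parseSaveIndex parses args and reports whether the index should be
// written to disk after indexing. The flag name "no-save-index" is a
// hypothetical choice for this sketch.
func parseSaveIndex(args []string) (bool, error) {
	fs := flag.NewFlagSet("bookbrowser", flag.ContinueOnError)
	noSave := fs.Bool("no-save-index", false, "do not write the index to disk after indexing")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return !*noSave, nil
}
```

The caller would then wrap the `Save()` call in a check on the returned value, so saving stays on by default and is skipped only when the flag is given.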
Added support for saving the index to a simple JSON file after each indexing job, and loading it (followed by a quick refresh) at the next startup. This allows the book library to be populated immediately at startup, rather than waiting a potentially long time for the initial indexing job to complete.
The index file is saved as index.json in the existing cover path.
There's a lot of room for improvement here; this was a quick implementation to eliminate the 15-minute-plus delay at each startup while BookBrowser indexed my tens of thousands of ebooks stored on a remote filesystem.