Description
The Lucene FSTOrdPostingsFormat (Solr schema postingsFormat="FSTOrd50") is like FSTPostingsFormat but also has term ordinals. Most postings formats don't support ordinals, but this one does. In TermPrefixCursor.java I left a comment that it could be more efficient if we could use ordinals, and I think this might be true. Instead of eagerly reading & caching the postings (the list of docIDs), we could just capture the ordinal (an int). This would replace some of the "IntsRef" usage with that integer ordinal, and TPC wouldn't need docIdsCache either. We'd do the actual work of reading the postings only later, when the ordinal is resolved in getDocIds(), and that work is perhaps not expensive. Sometimes we're never asked to resolve it at all, which saves some time: the tag may have been eliminated due to overlapping, or it may have effectively been cached at a higher level (TaggerRequestHandler transforms to the uniqueKey values and then caches that).
I'm not sure how much benefit this would bring; it could even be a net loss. It's hard to be sure without measuring.
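To make the idea concrete, here is a minimal, Lucene-free sketch of deferring the postings read until getDocIds() is called. Everything in it is hypothetical and for illustration only: `CountingPostings` stands in for whatever would look up postings by ordinal, and `LazyTag` stands in for a tag that holds just the ordinal. It is not how TermPrefixCursor is written today. In real Lucene I'd expect the ordinal to come from `TermsEnum.ord()` and to be resolved with `TermsEnum.seekExact(long)`, but that part is an assumption about how it would be wired up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

public class LazyOrdinalSketch {

    /** Hypothetical stand-in for a postings lookup keyed by term ordinal. */
    static final class CountingPostings implements LongFunction<int[]> {
        private final Map<Long, int[]> postingsByOrd = new HashMap<>();
        int reads = 0; // how many times postings were actually read

        void put(long ord, int... docIds) {
            postingsByOrd.put(ord, docIds);
        }

        @Override
        public int[] apply(long ord) {
            reads++;
            int[] docIds = postingsByOrd.get(ord);
            return docIds == null ? new int[0] : docIds.clone();
        }
    }

    /** Hypothetical tag that captures only the ordinal and resolves it on demand. */
    static final class LazyTag {
        private final long termOrd;
        private final LongFunction<int[]> postings;
        private int[] docIds; // null until first resolved

        LazyTag(long termOrd, LongFunction<int[]> postings) {
            this.termOrd = termOrd;
            this.postings = postings;
        }

        /** Reads the postings on the first call only; later calls reuse the result. */
        int[] getDocIds() {
            if (docIds == null) {
                docIds = postings.apply(termOrd);
            }
            return docIds;
        }
    }

    public static void main(String[] args) {
        CountingPostings postings = new CountingPostings();
        postings.put(7L, 1, 4, 9);
        postings.put(8L, 2, 3);

        LazyTag kept = new LazyTag(7L, postings);
        LazyTag dropped = new LazyTag(8L, postings); // e.g. eliminated as overlapping
        System.out.println("reads before resolving: " + postings.reads);

        kept.getDocIds();
        kept.getDocIds(); // second call does not read again
        System.out.println("reads after resolving one tag: " + postings.reads);
    }
}
```

The tag that is dropped never triggers a read, which is where any savings would come from. Whether that outweighs the cost of seeking by ordinal later is exactly the open question.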
The downside is that we'd basically be limited to this one PostingsFormat. At least the PostingsWriterBase aspect of it is pluggable (kinda), should we want some future improvement to allow a totally in-memory option. To ameliorate this downside, we could support any PF by grabbing the "TermState" instead; presumably the TermState of FSTOrdPostingsFormat is effectively the ordinal.