Update to apache lucene 9.X #9207

Draft
matthiasblaesing wants to merge 4 commits into apache:master from matthiasblaesing:update-lucene3

Conversation

@matthiasblaesing
Contributor

  • Additional package lucene-analysis-common required (KeywordAnalyzer, WhitespaceAnalyzer, PerFieldAnalyzerWrapper, CharTokenizer, LimitTokenCountAnalyzer)
  • FieldTypes were introduced to carry the field behavior (tokenization state, indexing options, storage settings)
  • BooleanQuery construction moved to builder
  • Collector interface was modified
  • TermEnum was replaced by TermsEnum
  • Terms are stored per field, so queries have to be specified with the target field
  • QuerySelectors were replaced by String sets
  • RAMDirectory was removed and is replaced by ByteBuffersDirectory
  • Lock handling was reworked
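
For orientation, several of the API changes listed above can be sketched against the Lucene 9 API. This is a minimal illustration and not code from this PR: the class name, field names and sample values are made up for the example, and it assumes lucene-core and lucene-analysis-common 9.x are on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexOptions;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.MultiTerms;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

// Hypothetical class; all names below are assumptions made for illustration.
class Lucene9MigrationSketch {

    static final String FIELD_PKG = "package";
    static final String FIELD_KIND = "kind";

    // A FieldType carries the stored/tokenized/index-options settings of a field.
    static final FieldType STORED_KEYWORD = new FieldType();
    static {
        STORED_KEYWORD.setStored(true);
        STORED_KEYWORD.setTokenized(false);
        STORED_KEYWORD.setIndexOptions(IndexOptions.DOCS);
        STORED_KEYWORD.freeze();
    }

    static Document doc(String pkg, String kind) {
        Document d = new Document();
        d.add(new Field(FIELD_PKG, pkg, STORED_KEYWORD));
        d.add(new Field(FIELD_KIND, kind, STORED_KEYWORD));
        return d;
    }

    // In-memory index: ByteBuffersDirectory takes the place of RAMDirectory.
    static Directory buildIndex() {
        Directory dir = new ByteBuffersDirectory();
        try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new KeywordAnalyzer()))) {
            w.addDocument(doc("java.util", "class"));
            w.addDocument(doc("java.util", "interface"));
            w.addDocument(doc("java.io", "class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return dir;
    }

    // BooleanQuery is assembled through its Builder; each term names its field.
    static int countMatches(Directory dir, String pkg, String kind) {
        BooleanQuery query = new BooleanQuery.Builder()
                .add(new TermQuery(new Term(FIELD_PKG, pkg)), BooleanClause.Occur.MUST)
                .add(new TermQuery(new Term(FIELD_KIND, kind)), BooleanClause.Occur.MUST)
                .build();
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            return new IndexSearcher(reader).count(query);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Terms are enumerated per field through a TermsEnum.
    static List<String> listTerms(Directory dir, String field) {
        List<String> result = new ArrayList<>();
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            Terms terms = MultiTerms.getTerms(reader, field);
            if (terms == null) {
                return result; // field does not exist in this index
            }
            TermsEnum it = terms.iterator();
            for (BytesRef term = it.next(); term != null; term = it.next()) {
                result.add(term.utf8ToString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Directory dir = buildIndex();
        System.out.println(countMatches(dir, "java.util", "class"));
        System.out.println(listTerms(dir, FIELD_PKG));
    }
}
```

The sketch only touches the FieldType, builder, per-field terms and directory points; the Collector and lock handling changes are not covered here.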

matthiasblaesing added this to the NB30 milestone Feb 14, 2026
matthiasblaesing added the ci:all-tests ([ci] enable all tests) and ci:dev-build ([ci] produce a dev-build zip artifact; 7 days expiration, see link on workflow summary page) labels Feb 14, 2026
matthiasblaesing marked this pull request as draft February 15, 2026 15:16
@lkishalmi
Contributor

@matthiasblaesing This looks great! Thank you!

As far as I've tested, this generally works; however, there is a bug in the MemoryIndex that sometimes tries to search on a closed index.

@lkishalmi
Contributor

Also, is it possible to upgrade the base JDK version to Java 21 for the parsing.lucene module?

@matthiasblaesing
Contributor Author

> Also, is it possible to upgrade the base JDK version to Java 21 for the parsing.lucene module?

I had an initial branch that did that (and bumped to Lucene 10), but that will cause issues in the enterprise cluster, which runs its tests with JDK 17. @mbien gave the hint that Lucene 9 allows us to stay on the older Java level.

> As far as I've tested, this generally works; however, there is a bug in the MemoryIndex that sometimes tries to search on a closed index.

I'm seeing issues with LuceneIndex. There is a strange construction in LuceneIndex#doClear, which knocks away all locks on invocation. I think there is a race somewhere.

@lkishalmi
Contributor

I think I can solve the issue of Java 21 on the Enterprise tests. From what I see, the majority of the failures come from the Micronaut integration. I had some clashes with that lately.

I also see that my PR broke your paperwork, as the sig file needs to be regenerated for parsing.lucene.

@matthiasblaesing
Contributor Author

The micronaut problems are real. Real as in, I manually tested the case (started IDE with example project) and found completion broken. I'll look a bit further.

@lkishalmi
Contributor

@matthiasblaesing please rebase on master, @mbien fixed Micronaut tests with Java 21.

matthiasblaesing and others added 3 commits February 17, 2026 18:48
- Additional package lucene-analysis-common required
  (KeywordAnalyzer, WhitespaceAnalyzer, PerFieldAnalyzerWrapper, CharTokenizer,
  LimitTokenCountAnalyzer)
- FieldTypes were introduced to carry the field behavior (tokenization
  state, indexing options, storage settings)
- BooleanQuery construction moved to builder
- Collector interface was modified
- TermEnum was replaced by TermsEnum
- Terms are stored per field, so queries have to be specified with the
  target field
- QuerySelectors were replaced by String sets
- RAMDirectory was removed and is replaced by ByteBuffersDirectory
- Lock handling was reworked
@matthiasblaesing
Contributor Author

The commits are rebased, but that does not seem to be a problem. I can reproduce the micronaut problem locally. The problem is not deterministic. I tested it by:

  • removing the index folder used by the test: rm -rf /tmp/nbcache/index/
  • running the Micronaut tests
  • checking the diffs from the test failures

In the diffs I noticed packages that were no longer listed. I then located the index folders for the affected jars, and indeed the indices were at least partially broken.

That matches output in the unit tests reporting: WARNUNG: Locked index folder: /tmp/nbcache/index/s8/java/15/refs. I'll stare a bit more at LuceneIndex and its embedded DirCache. The locking scheme feels off, but I can't point to the core problem yet.

@matthiasblaesing
Contributor Author

I found at least one problem with the updated code: the locks moved from per-factory scope to global scope, which explains the seemingly random behavior.
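
To make the scope change concrete, here is a toy model in plain Java. It is only a sketch of how lock scope can change behavior, not a diagnosis of LuceneIndex and not Lucene's implementation: ToyLockFactory, its methods and the "refs" lock name are all hypothetical. With per-factory bookkeeping, a "clear everything" call on one factory cannot touch locks handed out by another; with a single shared registry it can.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy model; none of these names exist in Lucene or NetBeans.
class LockScopeSketch {

    static final class ToyLockFactory {
        // One registry shared by every factory created with global scope.
        private static final Set<String> GLOBAL_HELD = new HashSet<>();

        // Either the shared registry or a private one, depending on the scope.
        private final Set<String> held;

        ToyLockFactory(boolean globalScope) {
            this.held = globalScope ? GLOBAL_HELD : new HashSet<>();
        }

        // Returns true if the lock was obtained, false if it is already held in this scope.
        boolean tryObtain(String name) {
            synchronized (held) {
                return held.add(name);
            }
        }

        boolean isHeld(String name) {
            synchronized (held) {
                return held.contains(name);
            }
        }

        // Models a cleanup step that drops every lock visible in this scope.
        void clearAll() {
            synchronized (held) {
                held.clear();
            }
        }
    }

    public static void main(String[] args) {
        // Per-factory scope: clearing through b leaves a's lock alone.
        ToyLockFactory a = new ToyLockFactory(false);
        ToyLockFactory b = new ToyLockFactory(false);
        a.tryObtain("refs");
        b.clearAll();
        System.out.println("per-factory, still held: " + a.isHeld("refs"));

        // Global scope: clearing through gb also removes the lock obtained via ga.
        ToyLockFactory ga = new ToyLockFactory(true);
        ToyLockFactory gb = new ToyLockFactory(true);
        ga.tryObtain("refs");
        gb.clearAll();
        System.out.println("global, still held: " + ga.isHeld("refs"));
    }
}
```

In the global case, the holder of "refs" has no way to notice that its lock is gone, so a second writer could obtain the same lock while the first is still working, which is the kind of timing-dependent failure that looks random from the outside.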

Labels

ci:all-tests [ci] enable all tests ci:dev-build [ci] produce a dev-build zip artifact (7 days expiration, see link on workflow summary page) Editor Upgrade Library Library (Dependency) Upgrade

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • Update Lucene 3.x editor dependency
  • Apache Lucene 3.6.2 critical vulnerability issue - CVE-2017-12629

3 participants
