fix(indexing): Fix indexing full text terms to support exact match; fix isbn search term processor #888



Purpose
Fix indexing full text terms to support exact match; fix isbn search term processor
Approach
Changes Checklist
Related Issues
MSEARCH-1011
Learning and Resources (if applicable)
Problem with the `047144250X` term is that it contains numbers + letters, so it gets indexed into the terms `047144250` and `x` and is unsearchable by a `term` query for the value `047144250x`.
Problem with asterisks is that normalization in the processor might remove them.
`catenate_all`, as opposed to `catenate_words`, will increase index size, since it also catenates numbers and numbers+words, and it requires full reindexing. However, this should also fix other searches where exact match is performed on a full-text index.
We might consider using a different type for indexing isbn if it doesn't break any other requirements.
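To make the tokenization issue concrete, here is a rough Python model of a word-delimiter style filter. It is a simplified sketch, not Elasticsearch's actual implementation: the function name `index_terms` and the splitting rule (lowercase, then split into runs of digits or letters) are assumptions made for illustration only.

```python
import re
from itertools import groupby


def index_terms(value, catenate_words=False, catenate_all=False):
    """Simplified model: which terms end up in the index for one value."""
    # Lowercase, then split into runs of digits or runs of letters.
    # Anything else (hyphens, spaces, ...) acts as a delimiter.
    parts = re.findall(r"[0-9]+|[a-z]+", value.lower())
    terms = list(parts)

    if catenate_words:
        # Join only adjacent runs of letters, e.g. "wi-fi" -> "wifi".
        for is_alpha, run in groupby(parts, key=str.isalpha):
            run = list(run)
            if is_alpha and len(run) > 1:
                terms.append("".join(run))

    if catenate_all and len(parts) > 1:
        # Join every part, digits and letters alike.
        terms.append("".join(parts))

    return terms


with_words = index_terms("047144250X", catenate_words=True)
with_all = index_terms("047144250X", catenate_all=True)

# An exact (term-style) lookup only succeeds if the whole value is a term.
print("047144250x" in with_words)  # False
print("047144250x" in with_all)    # True
```

In this model `catenate_words` never joins a digit run with a letter run, so the full ISBN is missing from the index, while `catenate_all` adds it at the cost of one extra term per multi-part value.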
Adding just `catenate_all` fixes all cases except `401,"isbn = ""{value}""",9781609383657*`, when we have the full term and an asterisk at the end.
Adding just a fix to the search processor without `catenate_all` will have these cases still failing:
`397,"isbn = ""{value}""",047144250X*`
`398,"isbn == ""{value}""",047144250X`
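The processor-side half of the fix can be sketched as a normalization step that keeps wildcard asterisks. The processor code itself is not shown in this description, so `normalize_isbn_term` and its rules (drop everything except letters, digits and `*`, then lowercase) are hypothetical and only illustrate the idea.

```python
def normalize_isbn_term(term):
    """Hypothetical normalizer: keep letters, digits and '*' only."""
    # Hyphens and spaces are dropped; the wildcard asterisk is preserved
    # so that a trailing '*' still reaches the query as a wildcard.
    kept = [ch.lower() for ch in term.strip() if ch.isalnum() or ch == "*"]
    return "".join(kept)


print(normalize_isbn_term("047144250X*"))
print(normalize_isbn_term("978-1-60938-365-7*"))
```

A normalizer that removed every non-alphanumeric character would turn `9781609383657*` into `9781609383657` and lose the wildcard, which is the failure mode described for case 401.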