|
Is this normalizer necessary? The API is already doing the lowercase transformation for The code looks good 😄 |
|
Good point, I was more thinking about trying to prevent bugs by ensuring the tokens were normalized. It's not a big deal and I'd be fine with not merging this if it comes with a performance hit. |
|
I found this old test case super confusing because it's asserting that the keyword field is case-sensitive even though it should never be the case 🤷‍♂️ [admission of guilt] it was written by me 😝 |
|
Ha ha ha, it's ok, 4 years ago, the statute of limitations has passed :p |
cherry-picked from #412 and based on #414.
This PR adds a `normalizer`, which is the nearest thing to an `analyzer` for `keyword` fields.
More info here: elastic/elasticsearch#18064
This allows us to perform some basic normalization on fields such as `layer`, `source` and `category`, forcing them to be lowercased and applying some ICU normalization.
One notable change here is that those fields were previously case-sensitive and will now be case-insensitive, which I think is preferable despite there being a test which covered the previous behaviour.
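As a rough illustration of the kind of configuration described above, here is a minimal sketch that builds the index settings and field mappings as plain Python dictionaries. The normalizer name `keyword_normalizer` is hypothetical, and the use of the `icu_folding` filter (from the analysis-icu plugin) is an assumption; the PR only says "some ICU normalization" and the actual filter list may differ.

```python
import json

# Hypothetical normalizer name; the PR text does not state the one it uses.
NORMALIZER = "keyword_normalizer"

# Index settings: a custom normalizer is declared under analysis.normalizer.
settings = {
    "analysis": {
        "normalizer": {
            NORMALIZER: {
                "type": "custom",
                # "lowercase" is a built-in filter. "icu_folding" is an
                # assumption standing in for "some ICU normalization".
                "filter": ["lowercase", "icu_folding"],
            }
        }
    }
}

# Field mappings: only the fields named in the PR get the normalizer.
properties = {
    field: {"type": "keyword", "normalizer": NORMALIZER}
    for field in ("layer", "source", "category")
}

print(json.dumps({"settings": settings, "properties": properties}, indent=2))
```

Keeping the normalizer name in one constant means the settings and the mappings cannot drift apart if it is renamed.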
Note that not all `keyword` fields should have a normalizer specified. For instance, verbatim fields such as `bounding_box` and `addendum` are probably best left with the default `null` normalizer.
Normalizers are applied both at index-time and at query-time.
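To show why applying the normalizer on both sides matters, here is a toy simulation in plain Python. It is not Elasticsearch code: the `KeywordField` class and the `normalize` function are hypothetical stand-ins, and NFKC plus lowercasing is only an approximation of what the real filters do.

```python
import unicodedata


def normalize(value: str) -> str:
    """Rough stand-in for a lowercasing, Unicode-normalizing normalizer."""
    return unicodedata.normalize("NFKC", value).lower()


class KeywordField:
    """Toy exact-match index for a single keyword field."""

    def __init__(self, normalizer=None):
        self.normalizer = normalizer  # None mimics the default null normalizer
        self.terms = set()

    def _apply(self, value: str) -> str:
        return self.normalizer(value) if self.normalizer else value

    def index(self, value: str) -> None:
        # Index-time: the stored term is the normalized value.
        self.terms.add(self._apply(value))

    def term_query(self, value: str) -> bool:
        # Query-time: the query input goes through the same normalizer.
        return self._apply(value) in self.terms


source = KeywordField(normalizer=normalize)
source.index("OpenStreetMap")

addendum = KeywordField()  # verbatim field, no normalizer
addendum.index("OpenStreetMap")

source_hit = source.term_query("openstreetmap")
addendum_hit = addendum.term_query("openstreetmap")
addendum_exact = addendum.term_query("OpenStreetMap")
```

In this sketch the normalized field matches regardless of case, while the verbatim field only matches the exact original string, which is the behaviour change the description calls out.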
I would like to add some additional filters such as `trim` and `unique`, but they are not available until version `6.4` of elasticsearch and so will come in a subsequent PR which can be merged independently of this.