Closed
Since #453, it looks like any Pelias users who wish to define custom multi-token synonyms are out of luck.
Here's a small test script to demonstrate the change in behavior:
git checkout v5.6.0 # after major synonyms upgrade (https://github.com/pelias/schema/pull/453)
# clear pelias index for clean slate
node scripts/drop_index.js --force-yes &> /dev/null
# set up a custom multi token synonym
echo "aaaa bbbb cccc dddd, abcd" >> synonyms/custom_street.txt
node scripts/create_index.js &> /dev/null
curl -s "localhost:9200/pelias/_analyze" -H 'Content-Type: application/json' \
-d '{ "text": "aaaa bbbb cccc dddd", "analyzer": "peliasStreet" }' | jq '.tokens[] | {token}'
git checkout v5.5.1 # before major synonyms upgrade (https://github.com/pelias/schema/pull/453)
# clear pelias index for clean slate
node scripts/drop_index.js --force-yes &> /dev/null
# set up a custom multi token synonym
echo "aaaa bbbb cccc dddd, abcd" >> synonyms/custom_street.txt
node scripts/create_index.js &> /dev/null
curl -s "localhost:9200/pelias/_analyze" -H 'Content-Type: application/json' \
-d '{ "text": "aaaa bbbb cccc dddd", "analyzer": "peliasStreet" }' | jq '.tokens[] | {token}'
On my machine, this prints only the 4 input tokens on the latest version of schema:
{
"token": "aaaa"
}
{
"token": "bbbb"
}
{
"token": "cccc"
}
{
"token": "dddd"
}
But on the version prior to the synonyms upgrade, it prints 5 tokens, including the extra synonym term:
{
"token": "aaaa"
}
{
"token": "abcd"
}
{
"token": "bbbb"
}
{
"token": "cccc"
}
{
"token": "dddd"
}
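To make the comparison above a pass/fail check rather than something to eyeball, here is a minimal sketch. It assumes the tokens are printed one per line (for example with `jq -r '.tokens[].token'` instead of the `{token}` form used above); `has_token` is a hypothetical helper name, not part of pelias/schema.

```shell
#!/usr/bin/env bash
# has_token (hypothetical helper): reads tokens from stdin, one per line,
# and succeeds only if the expected token appears as a whole line.
has_token() {
  # -F: fixed string, -x: match the whole line, -q: no output, exit status only
  grep -Fxq -- "$1"
}

# Example against a running Elasticsearch, using the same request as the
# script above but with jq -r so that bare token strings are printed:
#   curl -s "localhost:9200/pelias/_analyze" -H 'Content-Type: application/json' \
#     -d '{ "text": "aaaa bbbb cccc dddd", "analyzer": "peliasStreet" }' \
#     | jq -r '.tokens[].token' | has_token abcd \
#     && echo "synonym applied" || echo "synonym missing"
```

Run after each `git checkout`, this should report "synonym applied" on v5.5.1 and "synonym missing" on v5.6.0 if the behavior described above reproduces.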