Closed
12 changes: 9 additions & 3 deletions settings.js
@@ -72,7 +72,9 @@ function generate(){
 "icu_folding",
 "trim",
 "custom_name",
-"street_suffix",
+"street_synonyms_en",
+"street_synonyms_usps",
Member

How about using a wildcard pattern such as street_synonyms_* that picks up all matching files?
This may help with #393, or if we need to add custom street_synonyms per language.
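
A minimal sketch of the wildcard idea, assuming each synonyms file is named after its token filter plus a `.txt` extension (as the files in this PR are). The helper name `synonymFilterNames` and the example file listing are hypothetical and not part of pelias/schema; only the `street_synonyms_` prefix comes from this PR.

```javascript
// Hypothetical helper: derive token filter names from synonym file names.
// Assumes files are named <filter_name>.txt.
function synonymFilterNames(fileNames, prefix) {
  return fileNames
    .filter(name => name.startsWith(prefix) && name.endsWith('.txt'))
    .map(name => name.slice(0, -'.txt'.length))
    .sort(); // sorted only to make the result deterministic
}

// Example listing; in practice this could come from fs.readdirSync('synonyms').
const files = [
  'custom_name.txt',
  'street_synonyms_usps.txt',
  'street_synonyms_de.txt',
  'street_synonyms_en.txt',
  'linter.js'
];

const streetFilters = synonymFilterNames(files, 'street_synonyms_');
console.log(streetFilters);
```

Sorting changes the order relative to the explicit list in this PR (en, usps, de). If filter order turns out to matter for analysis, the order would need to be configured rather than derived from file names.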

Member Author

Yeah, good point: we should rethink these a bit and make them easier to customise.

Right now we have fairly limited synonyms coverage (i.e. 'just the basics').
I'd like to explore what would happen if we added significantly more synonyms. I suspect the impact on search quality would be very positive, and hopefully it wouldn't affect performance much.

@Joxit could you please tell me a bit about how you're adding custom synonyms for your builds?

I'm guessing you're adding a bunch of French-specific synonyms, plus some others, which you include in the custom_street.txt file?

Pending successful testing of this, I would next like to expand on French, German and Spanish synonyms.

Member

This is a bit annoying, but better since #375 and #407.

I have a fork, joxit/pelias-schema. When I start a new build, I fetch upstream and add this commit.

For the record, I added custom_street to peliasPhrase and peliasQuery in build-2019-09 (before the pelias/parser release). peliasQuery is used for autocomplete, where street names can appear, and I added it to peliasPhrase because street_suffix is also present there.

I add my synonyms in synonyms/custom_street.txt

bd, bld, blvd, boul, bvd, boulevard # only bld is missing in this PR
ave, av, avenue # OK with this PR
saint, st, street # missing
sainte, ste, suite # missing

This list came from the cases we commonly hit when testing Pelias. I have always been afraid of slowing down queries with too many synonyms 😅.

Member Author

Do you possibly know of an official source which publishes a French equivalent of this?
https://pe.usps.com/text/pub28/28apc_002.htm

Member Author

I've opened #449 to hopefully clean up your build process a little.

Ideally you shouldn't have to do any synonyms work in peliasQuery. I think you can remove that, and you'll get a small perf benefit?

Member

I found this page on abbreviations: https://blog.bureaudeposte.net/rediger-une-adresse-aux-normes-postales/

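| Name | Abbreviation |
| --- | --- |
| Allée | ALL |
| Avenue | AV |
| Boulevard | BD |
| Carrefour | CARR |
| Chaussée | CHS |
| Chemin | CHEM |
| Cours | CRS |
| Faubourg | FG |
| Immeuble | IMM |
| Impasse | IMP |
| Lieu-dit | LD |
| Lotissement | LOT |
| Montée | MTE |
| Passage | PAS |
| Place | PL |
| Résidence | RES |
| Route | RTE |
| Ruelle | RLE |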
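
As a rough sketch, pairs like these could be turned into lines in the same comma-separated style as synonyms/custom_street.txt above. The function name `toSynonymLine` is hypothetical, only the first three rows are shown, and lowercasing is an assumption based on the linter's letter-casing check.

```javascript
// A few (name, abbreviation) rows copied from the table above.
const pairs = [
  ['Allée', 'ALL'],
  ['Avenue', 'AV'],
  ['Boulevard', 'BD']
];

// Hypothetical helper: format one row as a lowercase synonyms line.
function toSynonymLine([name, abbreviation]) {
  return `${name.toLowerCase()}, ${abbreviation.toLowerCase()}`;
}

const lines = pairs.map(toSynonymLine);
console.log(lines.join('\n'));
```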

Member

Next time I will check the differences between the synonyms in peliasQuery and #449.

Oh, and saint/sainte aren't for street suffix/prefix but a workaround for street names... 😅

+"street_synonyms_de",
 "directionals",
 "ampersand",
 "remove_ordinals",
@@ -107,7 +109,9 @@ function generate(){
 "remove_duplicate_spaces",
 "ampersand",
 "custom_name",
-"street_suffix",
+"street_synonyms_en",
+"street_synonyms_usps",
+"street_synonyms_de",
 "directionals",
 "icu_folding",
 "remove_ordinals",
@@ -154,7 +158,9 @@ function generate(){
 "trim",
 "remove_duplicate_spaces",
 "custom_street",
-"street_suffix",
+"street_synonyms_en",
+"street_synonyms_usps",
+"street_synonyms_de",
 "directionals",
 "icu_folding",
 "remove_ordinals",
16 changes: 7 additions & 9 deletions synonyms/custom_name.txt
@@ -66,7 +66,6 @@ orchard,orch
 paradise,pde,pdse
 port,pt,prt
 park,pk,prk
-rear of,r / o,r o
 river,riv,rvr,rivr
 slope,slpe,slp
 springs,spgs,sprngs
@@ -87,9 +86,9 @@ colline,coli
 collines,colis
 enceinte,en
 fleuve,fl
-grand,gd,gr,g
+grand,gd,gr
 mont,mt,mnt
-petite,p,pt
+petite,pt
 porche,pch
 rivière,riviere,riv
 village,vge
@@ -108,7 +107,6 @@ kleines,kl
 kogel,kg
 niedere,nd
 rhein,rh
-see,s
 spitze,sp
 vordere,vd,vord
 wiese,ws
@@ -131,8 +129,8 @@ cerro,crro
 corral,crral
 corralillo,crrlo
 diseminado,disem
-enero,en,eno,ene,en o
-diciembre,dic,dicbre,dice,dbre,10bre,10 bre,xbre,x bre
+enero,en,eno,ene
+diciembre,dic,dicbre,dice,dbre,10bre,xbre
 febrero,febo,febro,febr,feb
 gobierno,gob,gobno
 grande,gr
@@ -154,8 +152,8 @@ militar,milr
 monte,mt,mte,mnte
 montes,mts,mtes,mntes,mnts
 nacional,nal,nacl
-noviembre,nbre,nvre,nove,novre,novbre,9bre,9 bre
-octubre,oct,octbre,octe,8bre,8 bre
+noviembre,nbre,nvre,nove,novre,novbre,9bre
+octubre,oct,octbre,octe,8bre
 portillo,ptilo,ptllo
 prado,prdo
 primeros,pros
@@ -167,7 +165,7 @@ republica,rep
 revolucion,rev
 ribera,ribr
 río,rio
-septiembre,setbre,sepe,sepbre,7bre,7 re,7re,7 bre,sep,set
+septiembre,setbre,sepe,sepbre,7bre,7re,sep,set
 sierra,srra
 valle,vlle
 volcan,vlcn
15 changes: 12 additions & 3 deletions synonyms/linter.js
@@ -40,7 +40,8 @@ function linter(synonyms) {

       letterCasing(line, logprefix, tokens);
       tokensSanityCheck(line, logprefix, tokens);
-      // multiWordCheck(line, logprefix, tokens);
+      multiWordCheck(line, logprefix, tokens);
+      // tokenLengthCheck(line, logprefix, tokens);
     })
   })
 }
@@ -65,10 +66,18 @@ function tokensSanityCheck(line, logprefix, tokens) {
   }
 }

-function multiWordCheck(line, tokens) {
+function multiWordCheck(line, logprefix, tokens) {
   _.each(tokens, token => {
     if (/\s/.test(token)){
-      logger.warn(`multi word synonyms may cause issues with phrase queries:`, token);
+      logger.warn(`${logprefix} multi word synonyms may cause issues with phrase queries:`, token);
     }
   });
 }
+
+function tokenLengthCheck(line, logprefix, tokens) {
+  _.each(tokens, token => {
+    if (token.length <= 1) {
+      logger.warn(`${logprefix} short token:`, token);
+    }
+  });
+}
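
To illustrate what the two checks above flag, here is a standalone sketch that applies the same conditions without lodash or the logger and returns the warnings instead of printing them. The function name `lintTokens` and the shortened warning text are hypothetical; the sample tokens are taken from lines this PR removes or trims in synonyms/custom_name.txt.

```javascript
// Standalone sketch of multiWordCheck and tokenLengthCheck combined.
// Returns warnings as strings rather than calling logger.warn.
function lintTokens(tokens) {
  const warnings = [];
  tokens.forEach(token => {
    // same condition as multiWordCheck: any whitespace inside a token
    if (/\s/.test(token)) {
      warnings.push(`multi word: ${token}`);
    }
    // same condition as tokenLengthCheck: empty or single-character tokens
    if (token.length <= 1) {
      warnings.push(`short token: ${token}`);
    }
  });
  return warnings;
}

const rearOf = lintTokens(['rear of', 'r / o', 'r o']);
const see = lintTokens(['see', 's']);
console.log(rearOf, see);
```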
129 changes: 0 additions & 129 deletions synonyms/street_suffix.txt

This file was deleted.

7 changes: 7 additions & 0 deletions synonyms/street_synonyms_de.txt
@@ -0,0 +1,7 @@
+straße => strasse, str
+strasse, str
+brücke => bruecke, brucke, br
+bruecke, brucke, br
+bahnhof, bhf, bf
+chaussee, ch
+platz, pl
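
The file above mixes two line styles from the Solr synonym format: `a => b, c` is an explicit mapping (the left side is replaced by the right side), while `a, b, c` lists equivalent terms. The sketch below only illustrates that distinction; `parseSynonymLine` is a hypothetical helper, not part of pelias/schema or Elasticsearch, and it ignores comments and escaping.

```javascript
// Hypothetical parser for a single synonyms line (no comment handling).
function parseSynonymLine(line) {
  const tokenize = text =>
    text.split(',').map(token => token.trim()).filter(token => token.length > 0);

  const parts = line.split('=>');
  if (parts.length === 2) {
    // explicit mapping: tokens on the left are rewritten to those on the right
    return { type: 'explicit', from: tokenize(parts[0]), to: tokenize(parts[1]) };
  }
  // equivalent synonyms: every token can stand in for the others
  return { type: 'equivalent', tokens: tokenize(line) };
}

console.log(parseSynonymLine('straße => strasse, str'));
console.log(parseSynonymLine('platz, pl'));
```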