@tacsipacsi
Contributor

The latest update for Budapest [0] complained that data/wikidata/Q44377.json was missing. This is probably the result of the following:

  1. It discovered https://www.openstreetmap.org/relation/19901264.
  2. Based on its name:etymology:wikidata tag, it downloaded Q850862. Since we’re not interested in what the namesakes are named after, it didn’t look at P138 (named after).
  3. Later it also discovered https://www.openstreetmap.org/way/156974362.
  4. It saw that Q850862 has already been downloaded, so it skipped downloading it – and it also skipped downloading its namesake, Q44377.
  5. Subsequently GeoJSONCommand couldn’t find Q44377.json.

To fix this, always read the entity back and try to extract P138 from it, even if its file has already been downloaded. This also allows moving the file_exists() check into save(), reducing code duplication.
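
The ordering problem and the fix can be sketched as follows. This is only an illustration in Python, not the project's code: `Downloader`, `process()`, `FAKE_WIKIDATA` and the `follow_named_after` flag are hypothetical names, the HTTP call is replaced by a dictionary lookup, and the sketch assumes the way references Q850862 as the street's own item (so its P138 should be followed), while the relation references it as a namesake.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for Wikidata: entity ID -> IDs listed under P138 (named after).
FAKE_WIKIDATA = {
    "Q850862": ["Q44377"],
    "Q44377": [],
}


class Downloader:
    def __init__(self, directory, skip_existing_early):
        self.directory = Path(directory)
        # True mimics the behaviour before the fix, False the behaviour after it.
        self.skip_existing_early = skip_existing_early
        self.http_requests = 0

    def path(self, qid):
        return self.directory / f"{qid}.json"

    def save(self, qid):
        """Fetch and store the entity unless its file already exists."""
        if self.path(qid).exists():
            return
        self.http_requests += 1  # one simulated HTTP request
        entity = {"id": qid, "P138": FAKE_WIKIDATA[qid]}
        self.path(qid).write_text(json.dumps(entity))

    def process(self, qid, follow_named_after):
        if self.skip_existing_early and self.path(qid).exists():
            return  # before the fix: P138 is never looked at
        self.save(qid)
        if follow_named_after:
            # Read the entity back from disk and fetch what it is named after.
            entity = json.loads(self.path(qid).read_text())
            for namesake in entity["P138"]:
                self.save(namesake)


def run(skip_existing_early):
    with tempfile.TemporaryDirectory() as directory:
        downloader = Downloader(directory, skip_existing_early)
        # Relation: Q850862 is a namesake, so P138 is not followed.
        downloader.process("Q850862", follow_named_after=False)
        # Way: Q850862 is the street's own item, so P138 should be followed.
        downloader.process("Q850862", follow_named_after=True)
        return downloader.path("Q44377").exists(), downloader.http_requests


if __name__ == "__main__":
    print(run(skip_existing_early=True))   # (False, 1): Q44377.json is missing
    print(run(skip_existing_early=False))  # (True, 2): Q44377.json is present
```

In this sketch the only extra simulated HTTP request is the one for Q44377 that was previously skipped; the second visit to Q850862 costs a local file read, not a request.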

This might slightly degrade performance: there is more local disk I/O if a street with its own Wikidata item consists of multiple ways not bundled in a relation. The number of HTTP requests does not increase, apart from the one or two requests per city that used to be erroneously skipped. Correctness is worth that small cost.

I confirmed by running on Budapest locally that the warning is printed without the fix and not printed with it. I also confirmed that with the fix, gender.csv correctly includes Q44377 for Budavári alagút.

Please note that although the current example doesn’t involve people, and so may seem borderline out of scope, similar cases can easily involve people. For example, https://www.openstreetmap.org/way/335395340 (Flórián tér overpass) could have name:etymology:wikidata=Q65215958 (Flórián tér); I’m not 100% sure this would be right, but arguably yes. Depending on the processing order, that could break the yellow color of https://www.openstreetmap.org/way/908080555.

[0] https://github.com/EqualStreetNames/equalstreetnames-budapest/actions/runs/19877172828/job/56967142726
