Geo-tagging uses substring matching, causing articles to be placed in wrong map regions #324

@princelevant

Variant

worldmonitor.app (Full / Geopolitical)

Affected area

Map / Globe, News Feed / RSS

Bug description

When zooming into Syria on the map, unrelated articles about French/European internal politics appear as news dots at Syria's coordinates. The root cause is that the geo-tagging system in src/services/geo-hub-index.ts uses substring matching (String.includes()) instead of word-boundary matching for keywords 5+ characters long.

Example: The keyword "assad" (Bashar al-Assad) is correctly associated with Syria. But because matching is done via titleLower.includes("assad"), it also matches inside the word "ambassador", so any article that mentions an ambassador is tagged to Syria's coordinates (lat 34.8, lon 39.0).
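The difference can be sketched as follows. This is a minimal illustration, not the project's actual code: `escapeRegExp` and `matchesKeyword` are hypothetical helper names, and the sample headlines are made up.

```typescript
// Hypothetical helper: escape regex metacharacters so a keyword is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: true only when `keyword` appears as a whole word in `title`.
function matchesKeyword(title: string, keyword: string): boolean {
  const pattern = new RegExp(`\\b${escapeRegExp(keyword)}\\b`, "i");
  return pattern.test(title);
}

const falsePositiveTitle = "French ambassador recalled after budget dispute";
const genuineTitle = "Bashar al-Assad addresses parliament";

// Substring matching: "ambassador" contains "assad".
const substringHit = falsePositiveTitle.toLowerCase().includes("assad"); // true

// Word-boundary matching: only the standalone word counts.
const boundaryMiss = matchesKeyword(falsePositiveTitle, "assad"); // false
const boundaryHit = matchesKeyword(genuineTitle, "assad"); // true

console.log({ substringHit, boundaryMiss, boundaryHit });
```

One caveat with this approach: JavaScript's `\b` is defined in terms of ASCII word characters, so keywords that start or end with non-ASCII letters or punctuation may need a different boundary check.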

A second instance exists in src/config/geo.ts where the Damascus hotspot has the keyword "hts" (Hay'at Tahrir al-Sham). The hotspot escalation code in DeckGLMap.ts:3386 and Map.ts:2780 uses plain .includes() with no length or boundary checks, so "hts" matches inside "rights", "fights", "flights", "insights", and similar words, which appear in virtually any political article.
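One possible shape for the hotspot check is to compile each hotspot's keyword list into a single word-boundary regex once, instead of calling .includes() per keyword. The sketch below assumes a simplified `Hotspot` shape and a hypothetical `compileKeywords` helper; the real hotspot objects in src/config/geo.ts may look different.

```typescript
// Assumed, simplified hotspot shape for illustration only.
interface Hotspot {
  name: string;
  keywords: string[];
}

// Hypothetical helper: build one case-insensitive, whole-word regex from a keyword list.
function compileKeywords(keywords: string[]): RegExp {
  // An empty alternation would match at any word boundary, so return a never-matching regex.
  if (keywords.length === 0) return /(?!)/;
  const alternation = keywords
    .map((k) => k.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("|");
  return new RegExp(`\\b(?:${alternation})\\b`, "i");
}

const headline = "Human rights groups criticise cancelled flights";

// Current behaviour: plain substring check on the 3-char keyword.
const naiveHit = headline.toLowerCase().includes("hts"); // true ("rights", "flights")

// Even the short keyword stops misfiring once word boundaries are enforced.
const shortKeywordPattern = compileKeywords(["hts"]);
const boundaryOnShort = shortKeywordPattern.test(headline); // false

// Longer, unambiguous keywords as suggested in this issue.
const damascus: Hotspot = {
  name: "Damascus",
  keywords: ["damascus", "hayat tahrir", "tahrir al-sham"],
};
const damascusPattern = compileKeywords(damascus.keywords);
const unrelatedHit = damascusPattern.test(headline); // false
const relatedHit = damascusPattern.test("Tahrir al-Sham forces enter the city"); // true

console.log({ naiveHit, boundaryOnShort, unrelatedHit, relatedHit });
```

Compiling once per hotspot also avoids rebuilding a regex for every headline, which may matter if the escalation code runs on each feed refresh.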

Additionally, the RSS parser (src/services/rss.ts:205-237) only extracts title, link, and pubDate from feed items. It ignores <category> / <dc:subject> tags that many feeds provide. Using these structured tags as a primary geo-signal (with keyword fallback) would significantly reduce false positives.
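As a rough sketch of what extracting those tags could look like: the function below pulls `<category>` and `<dc:subject>` values out of a single item's XML. `extractCategories` is a hypothetical name, and the regex approach is used only to keep the example dependency-free; the real change would more likely hook into whatever XML parsing src/services/rss.ts already does.

```typescript
// Hypothetical helper: collect <category> and <dc:subject> text from one RSS <item>.
// Regex-based for illustration; XML entities such as &amp; are not decoded here.
function extractCategories(itemXml: string): string[] {
  const tagPattern = /<(category|dc:subject)\b[^>]*>([\s\S]*?)<\/\1>/gi;
  const categories: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(itemXml)) !== null) {
    const inner = match[2] ?? "";
    // Unwrap an optional CDATA section around the value.
    const text = inner
      .replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, "$1")
      .trim();
    if (text.length > 0) categories.push(text);
  }
  return categories;
}

// Made-up feed item for demonstration.
const sampleItem = `
<item>
  <title>French ambassador recalled after budget dispute</title>
  <link>https://example.com/article</link>
  <category>France</category>
  <category domain="topics"><![CDATA[European Politics]]></category>
  <dc:subject>Diplomacy</dc:subject>
</item>`;

const sampleCategories = extractCategories(sampleItem);
console.log(sampleCategories);
```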

Affected files

| File | Line(s) | Issue |
| --- | --- | --- |
| src/services/geo-hub-index.ts | 119-125 | Keywords >= 5 chars use includes() instead of \b regex |
| src/components/DeckGLMap.ts | 3386, 3415 | Hotspot keyword matching: no length filter, no word boundaries |
| src/components/Map.ts | 2741, 2780 | Same as DeckGLMap (mobile map) |
| src/services/entity-index.ts | 124-141 | Entity keyword matching uses includes() |
| src/services/country-instability.ts | 201, 461 | Country keyword matching uses includes() |
| src/services/story-data.ts | 91, 108 | Country keyword matching uses includes() |
| src/services/related-assets.ts | 38, 47 | Asset/hotspot keyword matching uses includes() |
| src/config/geo.ts | Damascus entry | "hts" keyword is a 3-char substring of many English words |
| src/services/rss.ts | 205-237 | RSS <category> tags ignored; geo relies entirely on title keywords |

Steps to reproduce

  1. Open worldmonitor.app (Full variant)
  2. Wait for news feeds to load
  3. Zoom into Syria on the map
  4. Observe the news dots: some articles are about French/European politics, not Syria
  5. Check the titles: any article with "ambassador" in the title is geo-tagged to Syria because it contains the substring "assad"

Expected behavior

  • Keyword matching should use word-boundary regex (\b...\b) for all keyword lengths, not just < 5 chars
  • The "hts" keyword in geo.ts should be replaced with a longer, unambiguous form (e.g., "hayat tahrir", "tahrir al-sham")
  • RSS <category> tags should be parsed and used as a primary geo-signal when available
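Taken together, the expected behavior could look roughly like the sketch below: feed categories are consulted first, and whole-word title keywords are used only as a fallback. The `GeoRegion` shape, the `resolveRegion` and `hasWord` names, the keyword lists, and the France coordinates are all assumptions made for illustration; only the Syria coordinates come from this report.

```typescript
// Assumed, simplified region shape for illustration only.
interface GeoRegion {
  name: string;
  lat: number;
  lon: number;
  keywords: string[];
}

// Hypothetical helper: whole-word, case-insensitive match.
function hasWord(text: string, word: string): boolean {
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`\\b${escaped}\\b`, "i").test(text);
}

// Hypothetical resolver: structured categories first, title keywords as fallback.
function resolveRegion(
  title: string,
  categories: string[],
  regions: GeoRegion[],
): GeoRegion | undefined {
  const normalized = categories.map((c) => c.trim().toLowerCase());
  const byCategory = regions.find((r) =>
    normalized.some((c) => c === r.name.toLowerCase()),
  );
  if (byCategory !== undefined) return byCategory;
  return regions.find((r) => r.keywords.some((k) => hasWord(title, k)));
}

const regions: GeoRegion[] = [
  { name: "Syria", lat: 34.8, lon: 39.0, keywords: ["syria", "assad", "damascus"] },
  // Approximate, illustrative coordinates and keywords.
  { name: "France", lat: 46.2, lon: 2.2, keywords: ["france", "paris", "macron"] },
];

// Category wins even though the title contains "ambassador".
const tagged = resolveRegion("French ambassador criticises budget", ["France"], regions);
// No categories: falls back to a whole-word keyword in the title.
const fallback = resolveRegion("Assad addresses parliament", [], regions);
// No categories and no whole-word keyword: left untagged rather than sent to Syria.
const untagged = resolveRegion("Ambassador summoned over remarks", [], regions);

console.log(tagged?.name, fallback?.name, untagged);
```

Leaving an article untagged when neither signal matches seems preferable to guessing, since a missing dot is less misleading than one placed in the wrong region.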

Screenshots / Console errors

N/A — logic bug, no console errors

Browser & OS

All platforms (logic is shared across web and desktop)
