Support multiple GTFS-RT feeds from different agencies #448

@aaronbrethorst

Summary

Maglev currently supports a single GTFS-RT feed configuration (one set of trip updates, vehicle positions, and alerts URLs). The existing Java OBA server supports multiple GTFS-RT feeds from different agencies against a single static GTFS bundle. Maglev should support this too.

Problem

Many transit regions have a single consolidated static GTFS bundle but multiple agencies each publishing their own GTFS-RT feeds. For example, a regional deployment might have:

  • Agency A publishing trip updates + vehicle positions at https://a.example.com/*.pb
  • Agency B publishing trip updates + vehicle positions + alerts at https://b.example.com/*.pb
  • Agency C publishing only vehicle positions at https://c.example.com/*.pb

Today, Maglev can only poll one set of GTFS-RT URLs, so in a multi-agency region it can serve real-time data for at most one of those feeds.

Background: How the Java version works

The Java implementation (GtfsRealtimeSource.java, ~1,375 lines) creates one poller instance per feed. Each instance:

  1. Polls its configured feed URLs (up to three: trip updates, vehicle positions, alerts) on a configurable interval (default 30s)
  2. Parses the protobuf responses
  3. Groups trip updates + vehicle positions by trip
  4. Pushes VehicleLocationRecords to a downstream listener
  5. Manages alerts (create/update/delete)
  6. Expires stale vehicles not seen for 15 minutes

Multiple feeds are supported by instantiating multiple poller beans via Spring XML config — each with its own URLs, agency IDs, refresh interval, and HTTP headers.

Per-feed configuration options in Java

| Property | Description |
| --- | --- |
| tripUpdatesUrl | HTTP endpoint for trip updates |
| vehiclePositionsUrl | HTTP endpoint for vehicle positions |
| alertsUrl | HTTP endpoint for service alerts |
| agencyId / agencyIds | Which transit agency this feed belongs to |
| refreshInterval | Polling period in seconds (default: 30) |
| headersMap | Custom HTTP headers (e.g. API keys) |
| enabled | On/off switch |

Proposed solution

Config changes

Extend gtfs-rt-feeds in the JSON config to support per-feed configuration. Each feed entry gets its own agency ID(s), refresh interval, HTTP headers, and enabled flag:

{
  "gtfs-rt-feeds": [
    {
      "id": "puget-sound",
      "agency-ids": ["1"],
      "trip-updates-url": "https://api.example.com/trip-updates.pb",
      "vehicle-positions-url": "https://api.example.com/vehicle-positions.pb",
      "alerts-url": "https://api.example.com/alerts.pb",
      "refresh-interval": 30,
      "headers": {
        "Authorization": "Bearer my-token"
      },
      "enabled": true
    },
    {
      "id": "sound-transit",
      "agency-ids": ["40"],
      "trip-updates-url": "https://other.example.com/trip-updates.pb",
      "vehicle-positions-url": "https://other.example.com/vehicle-positions.pb",
      "alerts-url": "https://other.example.com/alerts.pb",
      "refresh-interval": 30,
      "enabled": true
    }
  ]
}
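
As a rough illustration of how the config above might be loaded, here is a minimal Go sketch. The type names (`FeedConfig`, `Config`), the `ParseFeeds` helper, the `feed-N` fallback ID, and the choice to treat a missing `enabled` as true are all assumptions made for this example, not existing Maglev code. The 30-second default simply mirrors the Java default mentioned earlier.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// FeedConfig mirrors one entry of "gtfs-rt-feeds" as proposed above.
// The Go type and field names are illustrative, not Maglev's actual types.
type FeedConfig struct {
	ID                  string            `json:"id"`
	AgencyIDs           []string          `json:"agency-ids"`
	TripUpdatesURL      string            `json:"trip-updates-url"`
	VehiclePositionsURL string            `json:"vehicle-positions-url"`
	AlertsURL           string            `json:"alerts-url"`
	RefreshInterval     int               `json:"refresh-interval"` // seconds
	Headers             map[string]string `json:"headers"`
	Enabled             *bool             `json:"enabled"` // pointer so "absent" differs from false
}

// Config holds only the part of the JSON config relevant to this sketch.
type Config struct {
	Feeds []FeedConfig `json:"gtfs-rt-feeds"`
}

const defaultRefreshSeconds = 30 // assumed default, matching the Java version

// ParseFeeds decodes the config, fills in defaults, and rejects duplicate feed IDs.
func ParseFeeds(raw []byte) ([]FeedConfig, error) {
	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parse gtfs-rt-feeds: %w", err)
	}
	seen := make(map[string]bool)
	for i := range cfg.Feeds {
		f := &cfg.Feeds[i]
		if f.ID == "" {
			f.ID = fmt.Sprintf("feed-%d", i) // assumed fallback when no id is given
		}
		if seen[f.ID] {
			return nil, fmt.Errorf("duplicate feed id %q", f.ID)
		}
		seen[f.ID] = true
		if f.RefreshInterval <= 0 {
			f.RefreshInterval = defaultRefreshSeconds
		}
		if f.Enabled == nil {
			enabled := true // assumption: feeds are on unless explicitly disabled
			f.Enabled = &enabled
		}
	}
	return cfg.Feeds, nil
}

// Interval returns the polling period as a time.Duration.
func (f FeedConfig) Interval() time.Duration {
	return time.Duration(f.RefreshInterval) * time.Second
}
```

Rejecting duplicate IDs at load time matters if the feed ID later becomes the key that partitions real-time data, since two feeds sharing an ID would overwrite each other.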

Architecture changes

  1. One poller goroutine per feed: Each feed entry gets its own goroutine with its own ticker and HTTP client. Feeds poll independently and don't block each other.

  2. Merged real-time data in GTFS Manager: The GTFS Manager's in-memory real-time stores (realTimeTrips, realTimeVehicles, realTimeAlerts) need to aggregate data from multiple feed pollers. Data from each feed is keyed/tagged by feed ID so that a refresh from feed A doesn't clobber data from feed B.

  3. Per-feed HTTP headers: Each feed can specify custom headers (for API keys, auth tokens, etc.) that are sent with every request to that feed's URLs.

  4. Per-feed refresh intervals: Each feed polls on its own schedule.

  5. Graceful error isolation: If one feed's HTTP request fails, it doesn't affect other feeds. Log the error and retry on the next tick.

  6. Stale vehicle expiry: Vehicles not seen for 15 minutes are expired, tracked per-feed.

  7. Clean shutdown: Cancelling the application context stops all feed pollers.

Key implementation considerations

  • Thread safety: The GTFS Manager already uses realTimeMutex for concurrent access. Multiple feed goroutines writing to the shared stores will need to coordinate through this mutex.
  • Data partitioning: When a feed refreshes, it should only replace its own data, not clear data from other feeds. This likely means tagging stored entries by feed ID or using per-feed sub-maps.
  • Backward compatibility: The existing single-feed config format should continue to work. If a feed entry doesn't specify an id or agency-ids, fall back to current behavior.

Out of scope for v1

These features exist in the Java version but can be deferred:

  • SFTP feed support (HTTP is universal now)
  • Fuzzy trip ID matching
  • Route ID remapping
  • Stop ID modification strategies
  • Bundle swap lifecycle (pausing polling during static GTFS reload)
  • Vehicle occupancy/crowding data
  • OBA-specific protobuf extensions (GtfsRealtimeOneBusAway, GtfsRealtimeNYCT)
  • Alert source prefix/merging from multiple sources
  • routeIdsToCancel (cancellation workaround)
  • scheduleAdherenceFromLocation (GPS-based delay computation)
  • maxDeltaLocationMeters (GPS sanity checking)
  • filterUnassigned (dropping unassigned trip updates)
  • useLabelAsId (using vehicle label as ID)

These can be added incrementally as agency-specific needs arise.

Acceptance criteria

  • Config supports multiple gtfs-rt-feeds entries, each with its own URLs, agency IDs, refresh interval, headers, and enabled flag
  • Each feed polls independently in its own goroutine
  • Real-time data from multiple feeds is merged correctly in the GTFS Manager (feed A's refresh doesn't clobber feed B's data)
  • Per-feed HTTP headers are sent with requests
  • Feed errors are isolated — one feed failing doesn't affect others
  • Stale vehicles are expired per-feed after 15 minutes of inactivity
  • Existing single-feed configs continue to work without changes
  • Config schema (config.schema.json) is updated
  • Clean shutdown stops all feed pollers
  • Tests cover multi-feed scenarios (two feeds with different data merging correctly)
