How Should We Serve Realtime Feeds?

March 20, 2026 — Experiment: rt-publish-latency

The Question

Our pipeline produces updated GTFS-RT feeds every ~20 seconds today, but modern CAD/AVL systems can push updates as fast as every 3-5 seconds — and we want an architecture that's ready for that future. How do we get those files to consumers like Google Maps and Transit App as fast as possible, without running a custom API server? We proposed static files on Google Cloud Storage, but we needed to know: is it fast enough for sub-5-second update cycles? What's the right caching setup? How much will it cost?

What We Tried

  • Direct GCS — consumers fetch straight from a storage bucket URL
  • Load Balancer, no caching — Google's global HTTP load balancer in front of the bucket
  • Load Balancer + CDN with 9 different caching configurations — varying cache duration from 0 to 15 seconds, different cache modes, and cache invalidation
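In every configuration we measured the same thing: publish a new version of the file, then poll the public URL until the body changes. A minimal sketch of that probe, with a local `http.server` standing in for the bucket/CDN endpoint (file names, timings, and the atomic-swap publish are illustrative, not our production harness):

```python
import functools
import http.server
import os
import tempfile
import threading
import time
import urllib.request

def time_until_new_file(url, old_body, timeout=10.0, interval=0.05):
    """Poll `url` until the response body differs from `old_body`;
    return seconds elapsed. This is the shape of our latency probe."""
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        with urllib.request.urlopen(url) as resp:
            if resp.read() != old_body:
                return time.monotonic() - start
        time.sleep(interval)
    raise TimeoutError("new feed never became visible")

# --- demo: a local static server stands in for the bucket/CDN endpoint ---
feed_dir = tempfile.mkdtemp()
feed_path = os.path.join(feed_dir, "vehicle_positions.pb")
with open(feed_path, "wb") as f:
    f.write(b"old-feed")

class QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, *args):   # keep the demo output clean
        pass

server = http.server.ThreadingHTTPServer(
    ("127.0.0.1", 0), functools.partial(QuietHandler, directory=feed_dir))
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/vehicle_positions.pb"

def publish():
    # "publish" a new feed version 200ms from now
    time.sleep(0.2)
    tmp_path = feed_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(b"new-feed")
    os.replace(tmp_path, feed_path)   # atomic swap, like a GCS overwrite

threading.Thread(target=publish, daemon=True).start()

latency = time_until_new_file(url, b"old-feed")
print(f"saw new file after {latency * 1000:.0f} ms")
server.shutdown()
```

The production probe pointed the same loop at the bucket URL, the bare load balancer, and each CDN configuration in turn.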

What We Found

  1. The load balancer is actually faster than direct storage access — 115ms vs 415ms to see a new file. Google's LB keeps persistent connections to the storage backend, so it skips the handshake overhead on every request.

  2. 1-second caching is the sweet spot for realtime feeds — zero observable staleness in our tests, and it absorbs traffic spikes (thousands of consumers all hitting at once collapse into ~1 read per second from storage).

  3. 15-second caching is dangerous — consumers can miss entire feed updates. We saw stale data served for up to 11.7 seconds.

  4. Cache invalidation is too slow — the API takes ~1.4 seconds per call. For feeds updating every 20 seconds, short cache lifetimes are more practical than manually clearing the cache.
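The interplay in findings 2 and 3 — request collapsing at a 1-second TTL, multi-second staleness at 15 — falls out of a toy model: one shared edge cache in front of the bucket, consumers polling about once per second at random phases. All parameters here are illustrative; this is a sketch of the mechanism, not our test harness:

```python
import random

def simulate_edge_cache(ttl_s, publish_period_s, consumers, duration_s, seed=1):
    """Toy model: one shared edge cache in front of the storage bucket.
    Returns (origin reads per second, worst staleness in seconds --
    how long after a publish anyone was still served the old feed)."""
    rng = random.Random(seed)
    polls = sorted(s + rng.random()
                   for _ in range(consumers) for s in range(duration_s))
    cached_at = -1e9      # when the edge last read from storage
    origin_reads = 0
    worst_stale = 0.0
    for t in polls:
        if t - cached_at >= ttl_s:             # cache expired: hit storage
            cached_at = t
            origin_reads += 1
        served_pub = (cached_at // publish_period_s) * publish_period_s
        latest_pub = (t // publish_period_s) * publish_period_s
        if served_pub < latest_pub:            # old feed after a new publish
            worst_stale = max(worst_stale, t - latest_pub)
    return origin_reads / duration_s, worst_stale

rps1, stale1 = simulate_edge_cache(1, 20, consumers=1000, duration_s=120)
rps15, stale15 = simulate_edge_cache(15, 20, consumers=1000, duration_s=120)
print(f"ttl=1s : {rps1:.2f} origin reads/s, worst staleness {stale1:.2f}s")
print(f"ttl=15s: {rps15:.2f} origin reads/s, worst staleness {stale15:.2f}s")
```

With a 1-second TTL, a thousand pollers per second collapse to about one storage read per second, and no one can be served a feed more than one second after a newer one was published; with a 15-second TTL, staleness can stretch toward the full TTL.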

What It Looks Like

Here's the flow when our pipeline publishes an updated Vehicle Positions feed:

flowchart TD
    A[Pipeline publishes new file to GCS] -->|~0ms, internal network| B[GCS Bucket\nrt-feeds/vehicle_positions.pb]
    B --> C[Load Balancer + CDN\n1-second cache]
    C -->|~112ms from publish| D[Google Maps]
    C -->|~112ms from publish| E[Transit App]
    C -->|~112ms from publish| F[OneBusAway]

For comparison, here's what 15-second caching looks like — consumers could be seeing data that's almost 12 seconds old:

Timeline: ──0s────5s────10s────15s────20s──
Pipeline:  publish                    publish
Consumer:  ✓ fresh  ...stale...stale...  ✓ fresh
                    ↑ up to 11.7s behind

The Decision

Deploy with CDN enabled from day one, 1-second cache lifetime. Best latency, lowest cost, zero staleness risk. We can dial up to 5-second caching later if we need to reduce costs further — no infrastructure change needed, just update the cache header on the files.
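Because the TTL lives in each object's Cache-Control metadata, "update the cache header" really is a one-line change at publish time. A small sketch — the helper name is ours, and the commented-out client calls assume the standard google-cloud-storage library rather than anything from the experiment:

```python
def cache_control_for(ttl_seconds: int) -> str:
    """Build the Cache-Control metadata attached to each feed object.
    Dialing 1s up to 5s later is just a different max-age here."""
    if ttl_seconds <= 0:
        return "no-store"              # opt out of caching entirely
    return f"public, max-age={ttl_seconds}"

# With the google-cloud-storage client, the value is set at upload time,
# roughly (sketch -- names assumed, not taken from the experiment):
#   blob = bucket.blob("rt-feeds/vehicle_positions.pb")
#   blob.cache_control = cache_control_for(1)
#   blob.upload_from_filename(local_path,
#                             content_type="application/x-protobuf")

print(cache_control_for(1))
```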

What This Means

  • Sound Transit gets sub-200ms feed delivery to consumers worldwide
  • We get DDoS protection, rate limiting, and per-consumer URL tracking for free via the load balancer
  • Cost is ~$26-33/month at expected traffic volumes
  • No custom API server to maintain — it's just files on a bucket behind Google's infrastructure
  • Future-proof for sub-5s updates — at 112ms publish-to-consumer latency and 1s cache TTL, this architecture comfortably supports 3-5 second update cycles as modern CAD/AVL systems come online. We deliberately avoided approaches (like long cache TTLs or invalidation-based strategies) that would cap our update frequency.
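The reason the cost bullet stays flat as consumers grow is visible in a back-of-envelope model. Every rate below is an illustrative placeholder, not a quoted GCP price — the shape is the point: cache-hit egress and request charges scale with consumers, while cache-fill cost is pinned at roughly one origin read per TTL window no matter the fan-in.

```python
def monthly_cost(consumers, polls_per_min, feed_kb, cache_ttl_s=1,
                 egress_per_gb=0.08, requests_per_10k=0.0075,
                 fill_per_gb=0.01):
    """Back-of-envelope CDN cost model. All rates are PLACEHOLDERS --
    check current Cloud CDN pricing before trusting any dollar figure."""
    secs = 30 * 24 * 3600                        # one month
    requests = consumers * polls_per_min / 60 * secs
    egress_gb = requests * feed_kb / 1024**2     # served from cache
    origin_reads = secs / cache_ttl_s            # ~1 read per TTL (upper bound)
    fill_gb = origin_reads * feed_kb / 1024**2
    return (egress_gb * egress_per_gb
            + requests / 10_000 * requests_per_10k
            + fill_gb * fill_per_gb)

# doubling consumers doubles egress, but cache-fill cost is unchanged,
# so total cost grows sub-linearly
print(f"${monthly_cost(5, 12, 100):.2f} vs ${monthly_cost(10, 12, 100):.2f}")
```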

Open Questions

  • How do we handle the schedule feed differently? Schedule updates are infrequent and large. A longer cache TTL (minutes, not seconds) would reduce costs and is safe since consumers poll less often. Should the schedule feed use a different CDN configuration than the RT feeds?
  • When do we need signed URLs? Sound Transit mentioned wanting to track individual consumers. Cloud Armor rules may be sufficient, but if they need per-consumer authentication or usage billing, signed URLs add complexity to the pipeline.