Dublin Bus Real-Time Pipeline
A complete data pipeline that collects, processes, and visualizes real-time bus data from Transport for Ireland, tracking 600+ vehicles across Dublin.
The Project
Transport for Ireland provides a GTFS-Realtime API with live vehicle positions and delay data for all public transit in Ireland. This project collects, stores, and analyzes that data end to end.
Goals:
- ✓ Ingest real-time vehicle positions from the GTFS-RT API
- ✓ Store data efficiently for historical analysis
- ✓ Create interactive visualizations (maps, charts)
- ✓ Analyze delay patterns across routes
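As a rough illustration of the ingestion goal, the sketch below flattens a decoded vehicle-positions feed into one row per bus. It assumes the feed has already been fetched and parsed into a dict that follows the GTFS-Realtime FeedMessage layout with snake_case keys. The function name `flatten_vehicles` and the sample payload are hypothetical and not taken from the project's code.

```python
from typing import Any


def flatten_vehicles(feed: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a decoded GTFS-RT feed (as a dict) into one flat row per vehicle."""
    rows = []
    for entity in feed.get("entity", []):
        vehicle = entity.get("vehicle")
        if vehicle is None:
            continue  # entity carries a trip update or alert, not a position
        position = vehicle.get("position", {})
        trip = vehicle.get("trip", {})
        rows.append({
            "vehicle_id": vehicle.get("vehicle", {}).get("id"),
            "route_id": trip.get("route_id"),
            "trip_id": trip.get("trip_id"),
            "direction_id": trip.get("direction_id"),
            "latitude": position.get("latitude"),
            "longitude": position.get("longitude"),
            # Assumption: timestamps arrive as strings of Unix epoch seconds.
            "timestamp": int(vehicle["timestamp"]) if "timestamp" in vehicle else None,
        })
    return rows


# Made-up sample payload shaped like a GTFS-RT FeedMessage.
sample_feed = {
    "header": {"gtfs_realtime_version": "2.0", "timestamp": "1700000000"},
    "entity": [
        {
            "id": "V1",
            "vehicle": {
                "trip": {"trip_id": "T-001", "route_id": "R-46A", "direction_id": 0},
                "position": {"latitude": 53.3498, "longitude": -6.2603},
                "timestamp": "1700000000",
                "vehicle": {"id": "V100"},
            },
        },
        {"id": "U1", "trip_update": {"trip": {"trip_id": "T-002"}}},
    ],
}

rows = flatten_vehicles(sample_feed)
```

Keeping the flattening step separate from the HTTP call makes it easy to test against saved payloads without touching the live API.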
Architecture
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     TFI API     │────▶│ Data Collector  │────▶│    SQLite DB    │
│    (GTFS-RT)    │     │    (Python)     │     │                 │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
Endpoints:                                               ▼
• /Vehicles                                     ┌─────────────────┐
• /TripUpdates                                  │     Jupyter     │
                                                │    Analysis     │
                                                └────────┬────────┘
                                                         │
                                 ┌───────────────────────┼───────────────────────┐
                                 ▼                       ▼                       ▼
                            ┌──────────┐            ┌──────────┐            ┌──────────┐
                            │ Live Map │            │  Delay   │            │  Route   │
                            │ (Folium) │            │ Analysis │            │  Stats   │
                            └──────────┘            └──────────┘            └──────────┘
Data Model
Vehicle Positions
GPS coordinates, vehicle ID, route ID, trip ID, timestamp, direction
Trip Updates
Arrival/departure delays, stop ID, trip ID, timestamp
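One way the two record types above could map onto SQLite tables is sketched below. The table names, column names, and primary keys are assumptions chosen for illustration, and the project's actual schema may differ.

```python
import sqlite3

# Hypothetical schema: one table per record type listed in the data model.
SCHEMA = """
CREATE TABLE IF NOT EXISTS vehicle_positions (
    vehicle_id   TEXT NOT NULL,
    route_id     TEXT,
    trip_id      TEXT,
    direction_id INTEGER,
    latitude     REAL NOT NULL,
    longitude    REAL NOT NULL,
    timestamp    INTEGER NOT NULL,   -- Unix epoch seconds
    PRIMARY KEY (vehicle_id, timestamp)
);
CREATE TABLE IF NOT EXISTS trip_updates (
    trip_id         TEXT NOT NULL,
    stop_id         TEXT NOT NULL,
    arrival_delay   INTEGER,         -- seconds; negative means early
    departure_delay INTEGER,
    timestamp       INTEGER NOT NULL,
    PRIMARY KEY (trip_id, stop_id, timestamp)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up reading; inserting it twice shows the primary key dropping the repeat.
reading = ("V100", "R-46A", "T-001", 0, 53.3498, -6.2603, 1700000000)
for _ in range(2):
    conn.execute(
        "INSERT OR IGNORE INTO vehicle_positions VALUES (?, ?, ?, ?, ?, ?, ?)",
        reading,
    )
conn.commit()

position_count = conn.execute("SELECT COUNT(*) FROM vehicle_positions").fetchone()[0]
```

The composite primary keys are a design choice for this sketch: a feed polled faster than vehicles report will repeat readings, and `INSERT OR IGNORE` discards those repeats at write time.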
Live Dashboard Preview
Dublin Bus Analytics
Real-time transit insights
Visualization Gallery
Live Bus Map
Interactive Folium map with 708 buses plotted in real time. Dark mode with marker clustering for dense areas.
Density Heatmap
Heatmap showing bus concentration. Dublin City Centre (O'Connell Street) clearly visible as the hotspot.
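The counting that sits behind a density heatmap can be sketched in plain Python by binning positions into a latitude/longitude grid. The cell size, the function name `density_grid`, and the sample coordinates are assumptions for illustration. A mapping library would normally handle the rendering.

```python
import math
from collections import Counter


def density_grid(points, cell_deg=0.01):
    """Count points per grid cell; each cell spans cell_deg degrees on both axes."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up positions: three close together near the city centre, one further out.
points = [
    (53.3498, -6.2603),
    (53.3495, -6.2607),
    (53.3491, -6.2609),
    (53.2945, -6.1336),
]

grid = density_grid(points)
hotspot_cell, hotspot_count = grid.most_common(1)[0]
```

Here `hotspot_cell` identifies the busiest grid square by its integer cell indices, which can be multiplied by `cell_deg` to recover the square's south-west corner.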
Delay Distribution
Histogram showing 71.2% on-time performance, with only 2% experiencing severe delays (>15 min).
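Figures like these could be derived from a list of delays with a small helper, sketched below. The severe threshold of 15 minutes comes from the description above. The on-time window of five minutes either side of schedule is an assumption, since the page does not define it, and `summarize_delays` and the sample values are hypothetical.

```python
def summarize_delays(delays_sec, on_time_sec=300, severe_sec=900):
    """Return on-time and severe-delay percentages for delays given in seconds."""
    total = len(delays_sec)
    if total == 0:
        return {"on_time_pct": 0.0, "severe_pct": 0.0}
    # Assumption: "on time" means within on_time_sec of schedule, early or late.
    on_time = sum(1 for d in delays_sec if abs(d) <= on_time_sec)
    # Severe delays are strictly more than severe_sec late (15 minutes by default).
    severe = sum(1 for d in delays_sec if d > severe_sec)
    return {
        "on_time_pct": round(100 * on_time / total, 1),
        "severe_pct": round(100 * severe / total, 1),
    }


# Made-up delays in seconds; negative values are early arrivals.
sample_delays = [0, 60, -30, 240, 400, 650, 1000, 120, 90, 20]
summary = summarize_delays(sample_delays)
```

For this sample, seven of the ten delays fall inside the on-time window and one exceeds 15 minutes.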
Performance Gauge
Real-time gauge showing on-time performance against an industry benchmark of 80%.
Technical Highlights
Efficient API Integration
Handles GTFS-RT protocol with proper error handling and rate limiting
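A common pattern for this kind of error handling is retrying with exponential backoff, which also eases pressure on a rate-limited API. The sketch below is a minimal version and not the project's actual collector code. `fetch_with_retry`, its parameters, and the simulated `flaky_fetch` are all hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fetch(), retrying on network errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except OSError:
            # urllib.error.URLError, ConnectionError, and timeouts all subclass OSError.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # wait 1s, 2s, 4s, ...
    raise AssertionError("unreachable")


# Simulated fetch that fails twice before succeeding.
attempts = []
waits = []


def flaky_fetch() -> bytes:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated network failure")
    return b"feed-bytes"


# Passing waits.append as the sleep function records delays without pausing.
payload = fetch_with_retry(flaky_fetch, sleep=waits.append)
```

Injecting the `sleep` function keeps the retry logic testable without real delays.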
Indexed SQLite Storage
Optimized schema with indexes for fast time-series queries
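To illustrate how an index supports time-series queries, the sketch below creates a composite index and asks SQLite for its query plan. The table layout and the index name `idx_positions_route_time` are assumptions, not the project's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicle_positions ("
    "vehicle_id TEXT, route_id TEXT, latitude REAL, longitude REAL, timestamp INTEGER)"
)
# Composite index: equality on route_id first, then a range scan on timestamp.
conn.execute(
    "CREATE INDEX idx_positions_route_time ON vehicle_positions (route_id, timestamp)"
)
conn.executemany(
    "INSERT INTO vehicle_positions VALUES (?, ?, ?, ?, ?)",
    [
        ("V100", "R-46A", 53.3498, -6.2603, 1700000000),
        ("V100", "R-46A", 53.3510, -6.2610, 1700000030),
        ("V200", "R-15", 53.3300, -6.2500, 1700000030),
    ],
)

query = (
    "SELECT vehicle_id, timestamp FROM vehicle_positions "
    "WHERE route_id = ? AND timestamp BETWEEN ? AND ? ORDER BY timestamp"
)
params = ("R-46A", 1700000000, 1700000060)

rows = conn.execute(query, params).fetchall()
# The fourth column of each EXPLAIN QUERY PLAN row holds the readable plan detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(row[3] for row in plan)
```

If the index is being used, `plan_text` names it. A plan that reports a full table scan instead suggests the index columns do not match the query's filters.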
Incremental Collection
Configurable polling intervals with state management
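One simple form of state management is to remember the timestamp of the last stored feed and skip any poll that returns nothing newer. The sketch below assumes that approach. The `collector_state` table and the helpers `record_if_new` and `collect` are hypothetical names.

```python
import sqlite3
import time


def init_state(conn):
    """Create a small key-value table holding collector state."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collector_state (key TEXT PRIMARY KEY, value TEXT)"
    )


def record_if_new(conn, feed_timestamp):
    """Return True and update state only if this feed is newer than the last one."""
    row = conn.execute(
        "SELECT value FROM collector_state WHERE key = 'last_feed_timestamp'"
    ).fetchone()
    last_seen = int(row[0]) if row else 0
    if feed_timestamp <= last_seen:
        return False  # same snapshot as before, nothing to store
    conn.execute(
        "INSERT OR REPLACE INTO collector_state (key, value) "
        "VALUES ('last_feed_timestamp', ?)",
        (str(feed_timestamp),),
    )
    return True


def collect(conn, poll, interval_sec=30.0, max_polls=3, sleep=time.sleep):
    """Poll max_polls times, waiting interval_sec between polls; count new feeds."""
    stored = 0
    for i in range(max_polls):
        if record_if_new(conn, poll()):
            stored += 1
        if i < max_polls - 1:
            sleep(interval_sec)
    return stored


conn = sqlite3.connect(":memory:")
init_state(conn)

# Simulated feed timestamps: the second poll returns an unchanged snapshot.
timestamps = iter([1700000000, 1700000000, 1700000030])
stored = collect(conn, poll=lambda: next(timestamps), sleep=lambda seconds: None)
```

Because the state lives in the database, a restarted collector picks up where it left off instead of re-storing the snapshot it already has.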
Interactive Visualizations
Folium maps with marker clustering, Plotly dashboards
Technologies Used
View the Code
The full source code, including the data collector, analysis notebooks, and documentation, is available on GitHub.
View on GitHub
Want to see more data projects?
Check out my other case studies showcasing enterprise-scale data engineering work.