Personal Project

Dublin Bus Real-Time Pipeline

A complete data pipeline that collects, processes, and visualizes real-time bus data from Transport for Ireland, tracking 600+ vehicles across Dublin.

  • Vehicles Tracked: 708
  • Delay Records: 73K+
  • On-Time Rate: 71.2%
  • Routes Covered: 198

The Project

Transport for Ireland provides a rich GTFS-Realtime API with live positions and delay data for all public transit in Ireland. This project builds a complete data pipeline to collect, store, and analyze this data.

Goals:

  • Ingest real-time vehicle positions from the GTFS-RT API
  • Store data efficiently for historical analysis
  • Create interactive visualizations (maps, charts)
  • Analyze delay patterns across routes

Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   TFI API       │────▶│  Data Collector │────▶│   SQLite DB     │
│  (GTFS-RT)      │     │   (Python)      │     │                 │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                        Endpoints:                       ▼
                        • /Vehicles                ┌─────────────────┐
                        • /TripUpdates             │  Jupyter        │
                                                   │  Analysis       │
                                                   └────────┬────────┘
                                                            │
                                    ┌───────────────────────┼───────────────────┐
                                    ▼                       ▼                   ▼
                              ┌──────────┐           ┌──────────┐       ┌──────────┐
                              │ Live Map │           │  Delay   │       │  Route   │
                              │ (Folium) │           │ Analysis │       │  Stats   │
                              └──────────┘           └──────────┘       └──────────┘

Data Model

Vehicle Positions

GPS coordinates, vehicle ID, route ID, trip ID, timestamp, direction

Trip Updates

Arrival/departure delays, stop ID, trip ID, timestamp
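
To illustrate how a vehicle-position entity might be flattened into the fields listed above, here is a minimal sketch. It assumes the feed has already been decoded into a Python dict that follows the GTFS-Realtime message layout with snake_case field names (`entity`, `vehicle`, `trip`, `position`). The helper name `flatten_vehicle_positions` and the sample values are hypothetical and are not taken from the project's code.

```python
from typing import Any


def flatten_vehicle_positions(feed: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a decoded GTFS-RT vehicle feed into flat rows (hypothetical helper)."""
    rows = []
    for entity in feed.get("entity", []):
        vehicle = entity.get("vehicle")
        if not vehicle:
            continue  # entity carries no vehicle position (e.g. a trip update)
        trip = vehicle.get("trip", {})
        position = vehicle.get("position", {})
        rows.append({
            "vehicle_id": vehicle.get("vehicle", {}).get("id"),
            "route_id": trip.get("route_id"),
            "trip_id": trip.get("trip_id"),
            "direction": trip.get("direction_id"),
            "latitude": position.get("latitude"),
            "longitude": position.get("longitude"),
            # Timestamps may arrive as strings in JSON form; store them as integers.
            "timestamp": int(vehicle["timestamp"]) if "timestamp" in vehicle else None,
        })
    return rows


# Made-up sample shaped like a GTFS-RT FeedMessage; not real TFI data.
sample_feed = {
    "header": {"timestamp": "1700000000"},
    "entity": [
        {"id": "V1", "vehicle": {
            "trip": {"trip_id": "T1", "route_id": "R1", "direction_id": 0},
            "position": {"latitude": 53.3498, "longitude": -6.2603},
            "timestamp": "1700000000",
            "vehicle": {"id": "1001"},
        }},
        {"id": "U1", "trip_update": {"trip": {"trip_id": "T2"}}},
    ],
}

rows = flatten_vehicle_positions(sample_feed)
```

Keeping the rows flat means each one maps directly onto a single database record, so the same dicts can feed both the SQLite insert and a Pandas DataFrame.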

Live Dashboard Preview

Dublin Bus Analytics

Real-time transit insights

  • Active Vehicles: 708
  • Routes Covered: 198
  • On-Time Rate: 71.2%
  • Data Points: 73K

Data source: Transport for Ireland GTFS-RT API

Visualization Gallery

🗺️

Live Bus Map

Interactive Folium map with 708 buses plotted in real time. Dark mode with marker clustering for dense areas.

🔥

Density Heatmap

Heatmap showing bus concentration. Dublin City Centre (O'Connell Street) clearly visible as the hotspot.

📊

Delay Distribution

Histogram showing 71.2% on-time performance, with only 2% experiencing severe delays (>15 min).
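
The percentages behind a histogram like this could be computed along the following lines. This is a sketch under stated assumptions: the 15-minute severe threshold comes from the text above, but the on-time window (within 5 minutes of schedule) is an assumed definition, and `summarize_delays` and the sample delays are hypothetical.

```python
from typing import Iterable

ON_TIME_WINDOW_S = 5 * 60   # ASSUMPTION: within 5 minutes of schedule counts as on time
SEVERE_DELAY_S = 15 * 60    # from the text: severe means more than 15 minutes late


def summarize_delays(delays_s: Iterable[int]) -> dict[str, float]:
    """Return the on-time and severe-delay shares as percentages."""
    delays = list(delays_s)
    if not delays:
        return {"on_time_pct": 0.0, "severe_pct": 0.0}
    on_time = sum(1 for d in delays if abs(d) <= ON_TIME_WINDOW_S)
    severe = sum(1 for d in delays if d > SEVERE_DELAY_S)
    return {
        "on_time_pct": round(100 * on_time / len(delays), 1),
        "severe_pct": round(100 * severe / len(delays), 1),
    }


# Made-up delays in seconds (negative = early); not the project's data.
sample = [0, 60, -120, 240, 400, 1000, 30, -30, 90, 200]
summary = summarize_delays(sample)
```

With this toy sample, eight of the ten delays fall inside the assumed window and one exceeds 15 minutes. Changing `ON_TIME_WINDOW_S` shifts the headline figure, so the definition is worth stating next to any published rate.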

🎯

Performance Gauge

Real-time gauge showing on-time performance against an industry benchmark of 80%.

Technical Highlights

Efficient API Integration

Handles the GTFS-RT protocol with proper error handling and rate limiting
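
One common way to handle transient API failures is retry with exponential backoff, sketched below. This is not the project's collector code: `fetch_with_retry` is a hypothetical helper, and the fetch step is passed in as a callable so the retry logic can be shown without a network call or an API key.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying failures with exponential backoff (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait 1x, 2x, 4x ... base_delay so a struggling API is not hammered.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Demonstration with a fake fetcher that fails twice and then succeeds.
calls = {"count": 0}
waits: list[float] = []


def flaky_fetch() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "feed-bytes"


# Record the waits instead of actually sleeping, to keep the demo instant.
result = fetch_with_retry(flaky_fetch, sleep=waits.append)
```

In a real collector, `fetch` would likely wrap a `requests.get(...)` call with a timeout followed by `raise_for_status()`, and the broad `except Exception` would be narrowed to network and HTTP errors.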

Indexed SQLite Storage

Optimized schema with indexes for fast time-series queries
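
A schema along these lines would support fast time-window queries per route or trip. The table, column, and index names are assumptions derived from the fields in the Data Model section, not the project's actual DDL, and the database is in-memory purely for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS vehicle_positions (
    id         INTEGER PRIMARY KEY,
    vehicle_id TEXT NOT NULL,
    route_id   TEXT,
    trip_id    TEXT,
    direction  INTEGER,
    latitude   REAL NOT NULL,
    longitude  REAL NOT NULL,
    timestamp  INTEGER NOT NULL  -- Unix epoch seconds
);
CREATE TABLE IF NOT EXISTS trip_updates (
    id              INTEGER PRIMARY KEY,
    trip_id         TEXT NOT NULL,
    stop_id         TEXT,
    arrival_delay   INTEGER,  -- seconds; negative means early
    departure_delay INTEGER,
    timestamp       INTEGER NOT NULL
);
-- Composite indexes so "one route/trip over a time window" avoids a full scan.
CREATE INDEX IF NOT EXISTS idx_positions_route_time
    ON vehicle_positions (route_id, timestamp);
CREATE INDEX IF NOT EXISTS idx_updates_trip_time
    ON trip_updates (trip_id, timestamp);
"""

conn = sqlite3.connect(":memory:")  # in-memory DB for illustration only
conn.executescript(SCHEMA)

conn.execute(
    "INSERT INTO vehicle_positions "
    "(vehicle_id, route_id, trip_id, direction, latitude, longitude, timestamp) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("1001", "R1", "T1", 0, 53.3498, -6.2603, 1700000000),
)

# Ask SQLite how it would run a typical time-window query for one route.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT latitude, longitude FROM vehicle_positions "
    "WHERE route_id = ? AND timestamp BETWEEN ? AND ?",
    ("R1", 1699999000, 1700001000),
).fetchall()
plan_text = " ".join(row[3] for row in plan)  # column 3 holds the plan detail
```

Putting `route_id` before `timestamp` in the index lets SQLite match the route by equality and then range-scan the time window. `EXPLAIN QUERY PLAN` is a quick way to confirm the index is actually being used.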

Incremental Collection

Configurable polling intervals with state management
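
One possible form of state management is to remember the timestamp of the last stored feed snapshot and skip any poll that returns the same or an older one. The sketch below assumes that approach. The `collector_state` table, the `last_feed_ts` key, and the function names are all hypothetical.

```python
import sqlite3
from typing import Optional


def init_state(conn: sqlite3.Connection) -> None:
    """Create the small key/value table that holds collector state."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collector_state (key TEXT PRIMARY KEY, value INTEGER)"
    )


def last_feed_timestamp(conn: sqlite3.Connection) -> Optional[int]:
    """Return the timestamp of the last stored snapshot, or None on first run."""
    row = conn.execute(
        "SELECT value FROM collector_state WHERE key = 'last_feed_ts'"
    ).fetchone()
    return row[0] if row else None


def should_store(conn: sqlite3.Connection, feed_timestamp: int) -> bool:
    """Return True and advance the state only if this snapshot is newer."""
    last = last_feed_timestamp(conn)
    if last is not None and feed_timestamp <= last:
        return False  # already collected this snapshot; skip it
    conn.execute(
        "INSERT OR REPLACE INTO collector_state (key, value) VALUES ('last_feed_ts', ?)",
        (feed_timestamp,),
    )
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")  # in-memory DB for illustration only
init_state(conn)
# Three polls, where the second returns the same snapshot as the first.
decisions = [should_store(conn, ts) for ts in (1700000000, 1700000000, 1700000060)]
```

A polling loop would call `should_store` once per interval and sleep for the configured number of seconds between polls. Because the state lives in the database, a restarted collector resumes without inserting duplicates.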

Interactive Visualizations

Folium maps with marker clustering, Plotly dashboards

Technologies Used

Python, Pandas, SQLite, GTFS-Realtime, Folium, Plotly, Jupyter, Requests

View the Code

The full source code, including the data collector, analysis notebooks, and documentation, is available on GitHub.

