My Projects

A collection of projects I've built to solve real-world problems and explore new technologies.

15+ Projects Built
5+ Technologies
3+ Years Experience
100% Passion

Featured Projects

Here are some of my most impactful projects that showcase my skills and passion for innovation.

⚡

Real-time Data Pipeline

Apache Kafka + Spark + PostgreSQL

Built a scalable real-time data processing pipeline that handles 1M+ events per day, featuring automatic scaling, monitoring, and data quality checks.

Apache Kafka, Apache Spark, PostgreSQL, Docker, Kubernetes, Python
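
As a rough illustration of what a data quality check in a pipeline like this might look like, here is a minimal sketch in plain Python. The field names (`event_id`, `user_id`, `timestamp`) and the `check_events` helper are hypothetical and are not taken from the actual pipeline; a production version would more likely run inside Spark or a dedicated validation tool.

```python
from dataclasses import dataclass, field

# Hypothetical event schema: these field names are illustrative assumptions.
REQUIRED_FIELDS = ("event_id", "user_id", "timestamp")


@dataclass
class QualityReport:
    valid: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (event, reason) pairs


def check_events(events):
    """Split events into valid and rejected, recording why each was rejected."""
    seen_ids = set()
    report = QualityReport()
    for event in events:
        # A required field counts as missing if it is absent or None.
        missing = [name for name in REQUIRED_FIELDS if event.get(name) is None]
        if missing:
            report.rejected.append((event, f"missing: {', '.join(missing)}"))
        elif event["event_id"] in seen_ids:
            report.rejected.append((event, "duplicate event_id"))
        else:
            seen_ids.add(event["event_id"])
            report.valid.append(event)
    return report
```

Keeping the rejection reason next to each rejected event makes it straightforward to export counts per reason to a monitoring system.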
🤖

ML Model Deployment Platform

FastAPI + TensorFlow + AWS

Developed an end-to-end ML platform for model training, deployment, and monitoring with automated A/B testing and model versioning capabilities.

Python, FastAPI, TensorFlow, AWS SageMaker, Docker, React
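
One common way to split traffic between two model versions in an automated A/B test is deterministic hashing, sketched below. This is an assumption about how such routing could work, not a description of the platform's actual code; `assign_variant` and its parameters are hypothetical names.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically route a user to 'control' or 'treatment'."""
    if not 0.0 <= treatment_share <= 1.0:
        raise ValueError("treatment_share must be between 0 and 1")
    # Hash experiment and user together so each experiment gets its own split.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits (32 bits) to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"
```

Because the assignment depends only on the user and experiment names, the same user always sees the same model version without any stored state.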

Interactive Data Pipeline

The stages below show how data flows through my ETL process.

🗄️
PostgreSQL
🧹
Data Cleaning
⚡
Apache Spark
🔧
Feature Engineering
📊
Data Warehouse

Extract data from PostgreSQL database

5 Pipeline Stages
99.9% Uptime
10TB+ Data Processed
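
The five stages listed above (PostgreSQL, Data Cleaning, Apache Spark, Feature Engineering, Data Warehouse) can be sketched as a chain of small functions. Everything here is a simplified stand-in under stated assumptions: the sample rows, column names, and function names are hypothetical, and plain Python lists take the place of the real database, Spark job, and warehouse.

```python
def extract():
    """Stand-in for reading rows from PostgreSQL (hard-coded sample rows)."""
    return [
        {"customer_id": 1, "age": 34, "purchase_amount": 1200.0},
        {"customer_id": 2, "age": None, "purchase_amount": 80.0},
        {"customer_id": 3, "age": 61, "purchase_amount": 450.0},
    ]


def clean(rows):
    """Data cleaning: drop rows with a missing or non-positive age."""
    return [r for r in rows if r["age"] is not None and r["age"] > 0]


def transform(rows):
    """Stand-in for the Spark step: convert amounts to integer cents."""
    return [{**r, "purchase_cents": round(r["purchase_amount"] * 100)} for r in rows]


def engineer_features(rows):
    """Feature engineering: flag customers whose purchases exceed 1000."""
    return [{**r, "is_premium": r["purchase_amount"] > 1000} for r in rows]


def load(rows, warehouse):
    """Stand-in for the warehouse write: append to a list, return the row count."""
    warehouse.extend(rows)
    return len(rows)


def run_pipeline(warehouse):
    """Run all five stages in order and return how many rows were loaded."""
    rows = extract()
    for stage in (clean, transform, engineer_features):
        rows = stage(rows)
    return load(rows, warehouse)
```

Writing each stage as a function that takes rows and returns rows keeps the stages independently testable and easy to reorder.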

Live Code Editor

A real data engineering code example, shown below.

Apache Spark ETL Pipeline - PYTHON
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, when

# Initialize Spark session
spark = SparkSession.builder \
    .appName("CustomerDataETL") \
    .getOrCreate()

# Read data from source
df = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:postgresql://localhost:5432/customers") \
    .option("dbtable", "customer_data") \
    .load()

# Data cleaning and transformation
cleaned_df = df \
    .filter(col("age").isNotNull()) \
    .filter(col("age") > 0) \
    .withColumn("age_group", 
        when(col("age") < 25, "Young")
        .when(col("age") < 50, "Middle")
        .otherwise("Senior")) \
    .withColumn("is_premium", col("purchase_amount") > 1000)

# Write to data warehouse
cleaned_df.write \
    .format("parquet") \
    .mode("overwrite") \
    .save("s3://data-lake/customers/cleaned/")

print(f"Processed {cleaned_df.count()} records")

Apache Spark ETL Pipeline

Extract, transform, and load data using PySpark
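
The age bucketing and premium flag from the PySpark example can be mirrored in plain Python, which makes the business rules easy to unit test without starting a Spark session. The helper names `age_group` and `is_premium` are hypothetical; the thresholds are copied from the `when` chain above.

```python
def age_group(age):
    """Mirror of the PySpark when-chain: Young (<25), Middle (<50), else Senior.

    Returns None for ages the pipeline filters out (missing or non-positive).
    """
    if age is None or age <= 0:
        return None
    if age < 25:
        return "Young"
    if age < 50:
        return "Middle"
    return "Senior"


def is_premium(purchase_amount):
    """True when the purchase amount exceeds 1000; None stays None, as in Spark."""
    if purchase_amount is None:
        return None
    return purchase_amount > 1000
```

Boundary values such as 25, 50, and exactly 1000 are the cases worth covering first, since that is where an off-by-one in the comparison would show up.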

🐍
Python & PySpark
Real data processing code
⚡
Live Execution
Run code and see results
📊
Real Examples
Production-ready patterns

Data Visualization

Interactive data visualizations and analytics dashboards I've created.

Skills & Technologies

Technologies and tools I use to build amazing projects and solve complex problems.

🐍
Python
⚛️
React
▲
Next.js
📘
TypeScript
🐘
PostgreSQL
🐳
Docker
☁️
AWS
⚡
Kafka
🔥
Spark
☸️
Kubernetes
📚
Git
🐧
Linux

My Projects

Explore my portfolio of data engineering, web development, and ML projects

Showing 8 of 8 projects

Featured, completed
advanced

Real-time ETL Pipeline

Built a scalable ETL pipeline using Apache Spark, Kafka, and PostgreSQL for processing 10TB+ of customer data daily.

Apache Spark, Kafka, PostgreSQL, +2 more
Featured, completed
advanced

ML Recommendation Engine

Developed a collaborative filtering recommendation system using PySpark MLlib and deployed with MLflow.

PySpark, MLlib, MLflow, +2 more
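
To illustrate the idea behind collaborative filtering, here is a small item-based sketch using cosine similarity in plain Python. Note that this is a swapped-in technique chosen for readability: it is not PySpark MLlib and not the project's actual model. The `recommend` function and the shape of the `ratings` dictionary are assumptions made for the example.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {key: value} dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings.

    ratings: {user: {item: rating}}
    """
    # Invert the ratings into item -> {user: rating} vectors.
    by_item = defaultdict(dict)
    for rater, items in ratings.items():
        for item, rating in items.items():
            by_item[item][rater] = rating

    seen = ratings.get(user, {})
    scores = {}
    for candidate, vector in by_item.items():
        if candidate in seen:
            continue
        # Weight each of the user's ratings by how similar that item is to the candidate.
        score = sum(cosine(vector, by_item[item]) * rating for item, rating in seen.items())
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]
```

Items that share no raters with anything the user has rated get a score of zero and are left out, so a user with unusual tastes may receive an empty list.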
Featured, completed
intermediate

Interactive Data Dashboard

Created a real-time analytics dashboard using React, D3.js, and FastAPI for visualizing business metrics.

React, D3.js, FastAPI, +2 more
completed
advanced

Cloud Data Warehouse

Designed and implemented a cloud-based data warehouse using Snowflake and dbt for modern analytics.

Snowflake, dbt, Airflow, +2 more
in progress
intermediate

Data Pipeline Monitoring

Building a monitoring system for data pipelines using Grafana, Prometheus, and custom alerting.

Grafana, Prometheus, Python, +2 more
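
Custom alerting for a data pipeline often comes down to a few simple rules, such as freshness and volume. The sketch below shows one possible shape for such a check; the `evaluate_pipeline` function, its thresholds, and the alert names are hypothetical and do not reflect the project's actual Grafana or Prometheus configuration.

```python
from datetime import datetime, timedelta, timezone


def evaluate_pipeline(last_success, rows_loaded, now,
                      max_staleness=timedelta(hours=1), min_rows=1):
    """Return the list of alert names that should fire for one pipeline run."""
    alerts = []
    # Freshness: the last successful run is older than the allowed window.
    if now - last_success > max_staleness:
        alerts.append("stale")
    # Volume: the run loaded fewer rows than expected.
    if rows_loaded < min_rows:
        alerts.append("low_volume")
    return alerts


# Example call with an explicit clock, which keeps the check deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(evaluate_pipeline(now - timedelta(hours=2), rows_loaded=0, now=now))
```

Passing `now` in as an argument, rather than reading the clock inside the function, makes the rule easy to test.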
completed
intermediate

A/B Testing Platform

Developed a statistical analysis platform for A/B testing with automated experiment evaluation.

Python, Pandas, SciPy, +2 more
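
A typical building block for automated experiment evaluation is a two-proportion z-test on conversion rates. The version below is a minimal sketch using only the standard library; it is one reasonable choice of test, not necessarily the one this platform uses, and the function name and argument names are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    Returns (z, p_value), where z > 0 means variant B converted better than A.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

The normal approximation behind this test is only reasonable for fairly large samples; for small counts an exact test would be a safer choice.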
planned
advanced

Real-time Stream Processing

Planned: real-time data processing using Apache Flink and Apache Pulsar for an event-driven architecture.

Apache Flink, Apache Pulsar, Java, +2 more
in progress
advanced

ML Feature Store

Building a centralized feature store using Feast for managing ML features across multiple models.

Feast, Redis, PostgreSQL, +2 more