
ML Projects for Experienced Professionals

Production-grade project ideas with architecture, datasets, evaluation, and business impact — chosen to stretch senior ML engineers.

How to use this page

These aren't toy notebooks. Each project is scoped like a real engineering initiative with an end-to-end deliverable.

What "experienced-level" means here

End-to-end ownership: data → training → serving → monitoring.
Production constraints: latency, cost, drift, reproducibility.
Business framing: pick a metric the business actually cares about.
Failure modes: explicit error analysis, not just leaderboard scores.

Project catalogue

#01 · MLOps · Expert

Real-time Fraud Detection Pipeline

Problem

Detect fraudulent credit-card transactions in under 50 ms per transaction, despite extreme class imbalance (~0.17% positive) and concept drift over time.

Stack

Kafka, Feast, LightGBM, FastAPI, MLflow, Evidently, Docker, Kubernetes

Business impact

Each 1-percentage-point improvement in recall at a fixed false-positive rate often saves $1M+/year for a mid-size issuer.
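Since the business case is stated as recall at a fixed false-positive rate, it helps to make that metric concrete. The following is a minimal sketch in plain Python, not the project's actual evaluation code; the function name `recall_at_fpr` and the toy labels and scores are assumptions made for illustration.

```python
def recall_at_fpr(y_true, scores, max_fpr):
    """Return (recall, threshold) for the best threshold with FPR <= max_fpr.

    y_true holds 0/1 labels (1 = fraud); scores are model outputs where a
    higher score means "more likely fraud". A transaction is flagged when
    its score is >= threshold.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sweep candidate thresholds from the highest score downwards.
    ranked = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)
    tp = fp = 0
    best_recall, best_threshold = 0.0, float("inf")
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all examples tied at this score before evaluating it.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # FPR only grows as the threshold drops
        best_recall, best_threshold = tp / n_pos, threshold
    return best_recall, best_threshold


if __name__ == "__main__":
    labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    print(recall_at_fpr(labels, scores, max_fpr=0.15))
```

With positives this rare, accuracy and even ROC AUC can look excellent while recall in the operating region stays poor, which is why the operating point is fixed first and recall is compared there.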


#02 · GenAI · Expert

Domain-Specific RAG Assistant for Internal Docs

Problem

Build a Retrieval-Augmented Generation system that answers questions over a private corpus (10k+ PDFs) with citations and zero hallucination tolerance.

Stack

LangGraph, Qdrant / pgvector, BGE embeddings, vLLM, Ragas, LangSmith

Business impact

Replaces tier-1 internal support; typically deflects ~40% of repetitive queries.
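One way to read "citations and zero hallucination tolerance" is that the system only answers when retrieval finds sufficiently relevant chunks, and abstains otherwise. The sketch below illustrates that control flow with a toy bag-of-words similarity standing in for real embeddings. Everything here is an assumption for illustration: the function names, the `min_score` cutoff, the source IDs, and the placeholder answer string (a real system would call the LLM at that point).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=2, min_score=0.2):
    """chunks is a list of (source_id, text); returns [(source_id, score)]."""
    q = Counter(tokenize(question))
    scored = [(sid, cosine(q, Counter(tokenize(text)))) for sid, text in chunks]
    # Drop weak matches so the generator never sees irrelevant context.
    relevant = [(sid, s) for sid, s in scored if s >= min_score]
    return sorted(relevant, key=lambda p: p[1], reverse=True)[:top_k]


def answer_or_abstain(question, chunks):
    """Answer only when there is evidence to cite; otherwise abstain."""
    hits = retrieve(question, chunks)
    if not hits:
        return {"answer": None, "citations": []}
    # Placeholder: a real system would generate from the cited chunks here.
    return {
        "answer": "<generated from cited chunks>",
        "citations": [sid for sid, _ in hits],
    }


if __name__ == "__main__":
    corpus = [
        ("hr.pdf#p3", "Employees accrue vacation days monthly"),
        ("it.pdf#p1", "Reset your VPN password in the portal"),
    ]
    print(answer_or_abstain("How do I reset my VPN password?", corpus))
    print(answer_or_abstain("What is the cafeteria menu?", corpus))
```

The abstention branch is the part worth evaluating explicitly: a test set should include questions the corpus cannot answer, so that refusing correctly is measured alongside answer quality.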


#03 · Time Series · Advanced

Demand Forecasting at SKU x Store Granularity

Problem

Forecast 14-day demand for 50k SKU-store pairs with promotions, holidays, and intermittent demand.

Stack

LightGBM, statsforecast, mlforecast, Optuna, Prefect, DuckDB

Business impact

A 1-3% gain in forecast accuracy typically reduces inventory holding costs by 5-10%.
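Before any gradient-boosted model, a forecasting project like this usually needs a baseline and a metric that tolerates zeros. The sketch below pairs a seasonal-naive forecast with WAPE, which stays defined for intermittent series where per-row MAPE divides by zero. The weekly season length, the function names, and the toy series are assumptions chosen for illustration.

```python
def seasonal_naive(history, horizon=14, season=7):
    """Forecast by repeating the last `season` observations."""
    if len(history) < season:
        raise ValueError("history is shorter than one season")
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WAPE is undefined when all actuals are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


if __name__ == "__main__":
    # Two weeks of daily unit sales for one SKU-store pair (toy data).
    history = [0, 2, 0, 0, 5, 3, 1, 0, 1, 0, 0, 4, 3, 2]
    actual_next = [0, 2, 0, 0, 4, 3, 1, 0, 1, 0, 1, 4, 2, 2]
    forecast = seasonal_naive(history)
    print(forecast)
    print(wape(actual_next, forecast))
```

Any candidate model then has to beat this number on the same backtest windows; a model that cannot beat seasonal-naive at this granularity is not worth serving.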


#04 · Computer Vision · Advanced

Multi-Modal Visual Search for E-commerce

Problem

Given a user-uploaded photo, return visually + semantically similar products from a catalog of 5M items.

Stack

PyTorch, OpenCLIP, FAISS / Qdrant, ONNX Runtime, Triton

Business impact

Visual search converts at 2-4x the rate of text search in fashion verticals.
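"Visually + semantically similar" can be read as blending two similarity signals per catalog item. The sketch below assumes each item has an image embedding and a text (title) embedding in the same space as the query photo's embedding, as CLIP-style models provide, and ranks by a weighted sum of cosine similarities. The weight `alpha`, the function names, and the 2-dimensional toy vectors are illustrative assumptions; the brute-force scan is exact but would be replaced by an approximate nearest-neighbor index at 5M items.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def visual_search(query_vec, catalog, k=2, alpha=0.7):
    """Rank items by alpha * image similarity + (1 - alpha) * text similarity.

    catalog maps item_id -> (image_vec, text_vec).
    """
    q = normalize(query_vec)
    scored = (
        (
            alpha * dot(q, normalize(img)) + (1 - alpha) * dot(q, normalize(txt)),
            item_id,
        )
        for item_id, (img, txt) in catalog.items()
    )
    # nlargest keeps only k candidates in memory while scanning the catalog.
    return [(item_id, score) for score, item_id in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    toy_catalog = {
        "red_dress": ([1.0, 0.0], [0.9, 0.1]),
        "blue_jeans": ([0.0, 1.0], [0.1, 0.9]),
        "red_skirt": ([0.8, 0.2], [1.0, 0.0]),
    }
    for item_id, score in visual_search([1.0, 0.0], toy_catalog):
        print(item_id, round(score, 3))
```

In practice the blend weight is a tuning parameter evaluated against click or purchase data, and the two signals can also be combined at the re-ranking stage after a single-vector ANN lookup.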


#05 · Recommender · Expert

Two-Tower Recommender with Cold-Start Handling

Problem

Recommend from a catalog of millions of items to 10M+ users, while handling new users and items that appear every minute.

Stack

TensorFlow Recommenders / PyTorch, ScaNN, BigQuery, Vertex AI, Airflow

Business impact

Top-of-funnel recommendation quality drives 10-30% of platform GMV on most marketplaces.
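The cold-start part of this project comes down to one question: what vector does an item get before the model has seen any interactions for it? The sketch below shows a deliberately simple fallback, assuming learned item embeddings live in a dictionary and a new item borrows the mean embedding of its category. This is a stand-in for illustration only; in an actual two-tower model the item tower would compute the embedding from content features directly. All names and the toy vectors are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def item_embedding(item_id, category, learned, category_of):
    """Learned embedding if available, else the mean of same-category items."""
    if item_id in learned:
        return learned[item_id]
    peers = [vec for iid, vec in learned.items() if category_of.get(iid) == category]
    if not peers:
        peers = list(learned.values())  # unknown category: fall back to global mean
    return mean_vector(peers)


def recommend(user_vec, candidates, learned, category_of, k=2):
    """Score (item_id, category) candidates by dot product with the user vector."""
    scored = []
    for item_id, category in candidates:
        vec = item_embedding(item_id, category, learned, category_of)
        scored.append((sum(u * v for u, v in zip(user_vec, vec)), item_id))
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]


if __name__ == "__main__":
    learned = {"a": [1.0, 0.0], "b": [0.8, 0.2], "c": [0.0, 1.0]}
    category_of = {"a": "shoes", "b": "shoes", "c": "books"}
    candidates = [("a", "shoes"), ("c", "books"), ("new", "shoes")]
    print(recommend([1.0, 0.0], candidates, learned, category_of))
```

Whatever fallback is used, it is worth reporting metrics separately for warm and cold items, since an aggregate number is dominated by items with long interaction histories.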


#06 · MLOps · Expert

End-to-End MLOps Platform on Kubernetes

Problem

Build a self-serve platform where data scientists can train, register, deploy, and monitor models with full reproducibility and CI/CD.

Stack

Kubernetes, Argo CD, MLflow, Kubeflow, KServe, Prometheus, Terraform

Business impact

Reduces ML cycle time from weeks to hours; the highest-leverage investment for a data org of more than 5 people.
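"Full reproducibility" is easier to enforce when every training run has a deterministic identity derived from its inputs. The sketch below hashes the hyperparameters, a dataset checksum, and a code version into one fingerprint that could be stored as a tag next to the registered model. The function name, the field names, and the example values are assumptions for illustration, not part of any of the listed tools.

```python
import hashlib
import json


def run_fingerprint(params, data_sha256, code_version):
    """Deterministic SHA-256 fingerprint of a training run's inputs.

    params:       JSON-serializable hyperparameters
    data_sha256:  checksum of the training dataset snapshot
    code_version: e.g. a git commit hash
    """
    payload = json.dumps(
        {"params": params, "data": data_sha256, "code": code_version},
        sort_keys=True,  # key order must not change the fingerprint
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    fp = run_fingerprint(
        {"learning_rate": 0.05, "num_leaves": 63},
        data_sha256="example-dataset-checksum",
        code_version="example-commit",
    )
    print(fp)
```

Two runs with the same fingerprint should produce comparable models; if they do not, something outside the three tracked inputs (random seeds, library versions, hardware) is leaking into training and belongs in the payload.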


#07 · Causal Inference · Advanced

Causal Uplift Modeling for Marketing Campaigns

Problem

Identify which customers should receive a promotion to maximize *incremental* revenue, not just predicted purchase.

Stack

EconML, DoWhy, scikit-learn, CausalML, Pandas

Business impact

Switching from look-alike to uplift targeting commonly improves marketing ROI by 20-50%.
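The difference between predicted purchase and incremental effect shows up even in a segment-level calculation. The sketch below estimates uplift as the treated conversion rate minus the control conversion rate per segment. It assumes the promotion was assigned at random within each segment, and it uses conversion rather than revenue to keep the example short. The function name, the segment labels, and the toy rows are illustrative assumptions.

```python
from collections import defaultdict


def uplift_by_segment(rows):
    """Estimate uplift per segment from (segment, treated, converted) rows.

    Uplift = conversion rate of treated customers minus conversion rate of
    control customers. Only valid if treatment was randomized.
    """
    # For each segment and arm, store [conversions, customers].
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, converted in rows:
        cell = counts[segment][bool(treated)]
        cell[0] += int(converted)
        cell[1] += 1

    uplift = {}
    for segment, arms in counts.items():
        (conv_t, n_t), (conv_c, n_c) = arms[True], arms[False]
        if n_t == 0 or n_c == 0:
            continue  # cannot estimate uplift without both arms
        uplift[segment] = conv_t / n_t - conv_c / n_c
    return uplift


if __name__ == "__main__":
    rows = (
        [("loyal", True, True)] * 3 + [("loyal", True, False)]
        + [("loyal", False, True)] * 3 + [("loyal", False, False)]
        + [("lapsed", True, True)] * 2 + [("lapsed", True, False)] * 2
        + [("lapsed", False, False)] * 4
    )
    print(uplift_by_segment(rows))
```

In this toy data the "loyal" segment converts at 75% with or without the promotion, so a purchase-propensity model would target it first even though its uplift is zero; the "lapsed" segment converts less often overall but is where the promotion actually changes behavior.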


Portfolio packaging tips

One repo per project with a clean README: problem, results, architecture diagram, how to reproduce.
Show the metric curve, not just the final number, alongside the baseline you beat.
Write a 1-page case study in business language; engineers read code, hiring managers read prose.
Deploy something live — a Streamlit / Gradio demo or a small HTTP API beats a static notebook.
Talk about trade-offs: what you tried, what failed, what you'd do with more time/data.

Suggested 90-day plan

Weeks 1–2: scope, data acquisition, baseline model, eval harness.
Weeks 3–6: iterate on modeling, ablations, error analysis.
Weeks 7–9: serving, monitoring, drift, CI/CD.
Weeks 10–12: write-up, demo video, blog post, polish.