Capstone Project

End-to-End LLM System

Design, build, evaluate, and present a production-grade LLM application that integrates every major skill from this course

Project Overview

The capstone project is the culminating experience of this course. You will design, build, and present a complete LLM-powered system that demonstrates mastery across the full stack: data preparation, model training and adaptation, retrieval-augmented generation, agent orchestration, production deployment, evaluation, and business strategy.

Unlike the individual module labs, which each focus on a single technique, the capstone requires you to make architectural decisions that balance competing concerns: model quality versus latency, accuracy versus cost, flexibility versus reliability. Working through these tradeoffs is what distinguishes a production system from a course exercise.
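One way to make these tradeoffs explicit is to write them down as budgets and compare candidate architectures against them. The sketch below is illustrative only: the `Candidate` fields, the `pick` helper, the candidate names, and every number are hypothetical placeholders, not measurements or course requirements. In your own project the values would come from your evaluation suite and deployment metrics.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One candidate architecture with its measured tradeoff metrics."""
    name: str
    quality: float               # task accuracy in [0, 1] from your own evaluation
    p95_latency_ms: float        # 95th-percentile response latency
    cost_per_1k_requests: float  # serving cost in USD per 1,000 requests


def pick(candidates: Iterable[Candidate],
         max_latency_ms: float,
         max_cost: float) -> Optional[Candidate]:
    """Return the highest-quality candidate within both budgets, or None."""
    feasible = [
        c for c in candidates
        if c.p95_latency_ms <= max_latency_ms
        and c.cost_per_1k_requests <= max_cost
    ]
    return max(feasible, key=lambda c: c.quality, default=None)


# Hypothetical numbers, for illustration only.
candidates = [
    Candidate("large-hosted-model", 0.91, 2400.0, 12.0),
    Candidate("fine-tuned-small-model", 0.86, 600.0, 1.5),
    Candidate("small-model-plus-rag", 0.89, 950.0, 2.5),
]

choice = pick(candidates, max_latency_ms=1000.0, max_cost=5.0)
print(choice.name if choice else "no candidate fits the budgets")
```

Treating latency and cost as hard limits and quality as the objective is only one possible framing. A weighted score or a Pareto comparison may suit your use case better, and the technical report is a natural place to justify whichever framing you choose.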

You will work on this project over approximately 4 to 6 weeks. It culminates in six deliverables: a GitHub repository with working code, a fine-tuned model and dataset published on Hugging Face Hub, a written technical report, an interpretability analysis, a live demo, and a 15-minute presentation.

What Makes a Strong Capstone

Learning Objectives

Capstone Sections

Suggested Timeline (6 weeks)

Week 1 (Design): Select use case, define requirements, design architecture, identify datasets
Week 2 (Data + Model): Prepare synthetic dataset, begin fine-tuning or adapter training
Week 3 (RAG + Agent): Build RAG pipeline, implement agent with tools, integrate components
Week 4 (Deploy + Evaluate): Deploy to cloud, set up monitoring, run evaluation suite
Week 5 (Refine): Address evaluation findings, add safety guardrails, optimize performance
Week 6 (Report + Present): Write technical report, prepare presentation, publish artifacts

Deliverable Summary

  1. GitHub Repository with clean code, README, and deployment instructions
  2. Hugging Face Hub artifacts: fine-tuned model and curated dataset
  3. Technical Report (8 to 12 pages) with architecture, evaluation, and limitations
  4. Interpretability Analysis documenting attention patterns, token attributions, or probing results
  5. Live Demo (deployed or screencast) showing the system in action
  6. Presentation (15 minutes) covering motivation, architecture, results, and lessons learned