🚀 Lesson 9 — Scaling & Deployment (Docker, Gunicorn/Uvicorn Workers, CI/CD, Kubernetes, Cloud Deployment)
This is where your FastAPI project becomes production-grade — scalable, observable, secure, and cloud-ready.
Companies care A LOT about this knowledge.
By the end of this lesson, you will be able to deploy a FastAPI app to:
✔ Docker
✔ AWS / Azure / GCP
✔ Kubernetes
✔ Production servers (Gunicorn + Uvicorn workers)
✔ CI/CD pipelines
Let’s begin. 🔥
🎯 What You Will Learn Today
✔ Dockerizing FastAPI
✔ Production server (Gunicorn + Uvicorn workers)
✔ Environment variables
✔ Nginx reverse proxy
✔ CI/CD (GitHub Actions)
✔ Kubernetes Deployment
✔ Scaling horizontally
✔ Cloud deployment patterns
🧱 PART A — Dockerizing FastAPI
Docker is the #1 requirement for modern backend + AI engineers.
🐳 1. Create Dockerfile
Create a file named Dockerfile in your project:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "main:app", "--bind", "0.0.0.0:8000", "--workers", "4"]
This uses:
✔ Gunicorn (multi-worker server)
✔ Uvicorn workers (ASGI support)
✔ 4 worker processes → handles high load
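The Dockerfile copies a requirements.txt that we haven't shown yet. A minimal version might look like the fragment below (package names only; in a real project you would pin exact versions):

```text
fastapi
uvicorn[standard]
gunicorn
```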
📦 2. Build Docker Image
docker build -t fastapi-app .
▶️ 3. Run Container
docker run -p 8000:8000 fastapi-app
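For local development with a database, many teams pair the image with Docker Compose. A sketch, assuming a Postgres service named db that matches the DATABASE_URL used later in this lesson (user/pass/mydb are placeholder credentials):

```yaml
version: "3.9"
services:
  fastapi:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: mydb
```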
Your app is running inside Docker. 🚀
🔥 Why Gunicorn + Uvicorn worker?
Because:
- Uvicorn alone = a single process with no supervision
- Gunicorn = production process manager (restarts crashed workers)
- UvicornWorker = lets Gunicorn run async ASGI workers
- Multiple workers = uses multiple CPU cores
Example scaling up to 8 workers:
--workers 8
🌍 PART B — Environment Variables
Never hardcode secrets.
Use .env:
DATABASE_URL=postgresql://user:pass@db/mydb
SECRET_KEY=MYSECRET
Load with python-dotenv:
pip install python-dotenv
Inside Python:
from dotenv import load_dotenv
import os
load_dotenv()
DB_URL = os.getenv("DATABASE_URL")
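Building on the snippet above, it is good practice to fail fast when a required secret is missing and to fall back to safe defaults for the rest. A minimal sketch (get_settings and the sqlite default are illustrative choices, not FastAPI conventions):

```python
import os

def get_settings() -> dict:
    """Read config from the environment; crash early if a secret is absent."""
    secret = os.getenv("SECRET_KEY")
    if not secret:
        raise RuntimeError("SECRET_KEY is not set; refusing to start")
    return {
        # Default to a local sqlite file when no DATABASE_URL is provided
        "database_url": os.getenv("DATABASE_URL", "sqlite:///./dev.db"),
        "secret_key": secret,
    }
```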
🚦 PART C — Nginx Reverse Proxy (Production Pattern)
Nginx sits in front of FastAPI:
Client → Nginx → Gunicorn+Uvicorn → FastAPI
Why?
- SSL/TLS
- Caching
- Rate limiting
- Load balancing
- Static files
nginx.conf example:
server {
    listen 80;

    location / {
        proxy_pass http://fastapi:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
🔁 PART D — GitHub Actions CI/CD Pipeline
Create .github/workflows/deploy.yml:
name: Deploy FastAPI

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build Docker Image
        run: docker build -t my-fastapi .
      - name: Push to DockerHub
        run: |
          echo ${{ secrets.DOCKER_PASSWORD }} | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin
          docker tag my-fastapi myuser/my-fastapi:latest
          docker push myuser/my-fastapi:latest
On every push to main, this automatically:
- Builds the Docker image
- Pushes it to Docker Hub
A separate deploy step (for example, SSH-ing into your server or calling your cloud provider's CLI) can then pull and run the new image.
☸️ PART E — Kubernetes Deployment (Production at Scale)
Kubernetes = near-limitless horizontal scaling + self-healing + built-in load balancing.
1. Deployment File
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-deployment
spec:
  replicas: 4
  selector:
    matchLabels:
      app: fastapi
  template:
    metadata:
      labels:
        app: fastapi
    spec:
      containers:
        - name: fastapi
          image: myuser/my-fastapi:latest
          ports:
            - containerPort: 8000
This runs 4 instances of FastAPI.
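The self-healing mentioned earlier works best when Kubernetes can tell whether a pod is actually healthy. A hedged sketch, assuming your app exposes a /health endpoint (not shown in this lesson), added under the container entry in deployment.yaml:

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 5
```

If the liveness probe fails, Kubernetes restarts the container; if the readiness probe fails, the pod is temporarily removed from the load balancer.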
2. Load Balancer Service
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: fastapi-service
spec:
  type: LoadBalancer
  selector:
    app: fastapi
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
Now incoming traffic is load-balanced across your FastAPI pods.
🚀 Scaling Strategy (Used by Netflix, Uber, Amazon)
1. Horizontal scaling
Increase pods/workers:
kubectl scale deployment fastapi-deployment --replicas=10
2. Vertical scaling
Increase CPU/RAM of pods.
3. Auto-scaling (HPA)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fastapi-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fastapi-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
🧠 PART F — Observability (Logs, Metrics, APM)
Companies expect:
✔ Prometheus for metrics
✔ Grafana dashboards
✔ Loki for logs
✔ Jaeger for distributed tracing
🔮 Production Monitoring Example
Wire up each signal:
- A /metrics endpoint → scraped by Prometheus
- Container logs → shipped to Loki (e.g. via Promtail)
- Traces → exported to Jaeger (e.g. via OpenTelemetry)
This helps track:
- API latencies
- Error rates
- Worker overload
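Before wiring in Prometheus, it helps to see what a metric actually is. A minimal, standard-library-only sketch of tracking latency and error rate in-process (a real deployment would use a client library such as prometheus_client instead; the Metrics class here is purely illustrative):

```python
from dataclasses import dataclass

@dataclass
class Metrics:
    """Naive in-process counters: request count, errors, total latency."""
    requests: int = 0
    errors: int = 0
    total_latency: float = 0.0

    def observe(self, latency_s: float, ok: bool) -> None:
        # Record one completed request
        self.requests += 1
        self.total_latency += latency_s
        if not ok:
            self.errors += 1

    def snapshot(self) -> dict:
        # Derived metrics a dashboard would plot
        avg = self.total_latency / self.requests if self.requests else 0.0
        rate = self.errors / self.requests if self.requests else 0.0
        return {"requests": self.requests, "avg_latency_s": avg, "error_rate": rate}

m = Metrics()
m.observe(0.25, ok=True)
m.observe(0.75, ok=False)
print(m.snapshot())  # {'requests': 2, 'avg_latency_s': 0.5, 'error_rate': 0.5}
```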
- ML model performance
☁️ PART G — Cloud Deployment Patterns
1. AWS ECS Fargate
Run Docker containers without managing servers.
2. AWS Lambda + API Gateway
Serverless FastAPI (wrap the ASGI app with Mangum).
3. Azure App Service
Run containers directly.
4. Google Cloud Run
Auto-scales down to zero when idle.
🚀 Lesson 9 Summary
You learned:
✔ Dockerizing FastAPI
✔ Production server (Gunicorn + Uvicorn)
✔ Environment variables
✔ Nginx reverse proxy
✔ CI/CD (GitHub Actions)
✔ Kubernetes deployment
✔ Scaling (HPA)
✔ Cloud deployment patterns
✔ Monitoring & observability
This lesson turns you into a full-stack deployment expert.
🎉 Only 1 lesson left — Lesson 10: Industry Case Studies + Top Interview Questions
It will include:
✔ Real architectures from Uber, Netflix, Swiggy
✔ FastAPI microservices examples
✔ High-level diagrams
✔ 100+ top interview questions
✔ Final project blueprint
Shall I continue with Lesson 10?