FastAPI A to Z Course with ChatGPT

🚀 Lesson 9 — Scaling & Deployment (Docker, Gunicorn/Uvicorn Workers, CI/CD, Kubernetes, Cloud Deployment)

This is where your FastAPI project becomes production-grade — scalable, observable, secure, and cloud-ready.
Companies care A LOT about this knowledge.

By the end of this lesson, you will be able to deploy a FastAPI app to:

✔ Docker
✔ AWS / Azure / GCP
✔ Kubernetes
✔ Production servers (Gunicorn + Uvicorn workers)
✔ CI/CD pipelines

Let’s begin. 🔥


🎯 What You Will Learn Today

✔ Dockerizing FastAPI

✔ Production server (Gunicorn + Uvicorn workers)

✔ Environment variables

✔ Nginx reverse proxy

✔ CI/CD (GitHub Actions)

✔ Kubernetes Deployment

✔ Scaling horizontally

✔ Cloud deployment patterns


🧱 PART A — Dockerizing FastAPI

Docker is the #1 requirement for modern backend + AI engineers.


🐳 1. Create Dockerfile

Create a file named Dockerfile in your project:

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "main:app", "--bind", "0.0.0.0:8000", "--workers", "4"]

This uses:

✔ Gunicorn (multi-worker server)
✔ Uvicorn workers (ASGI support)
✔ 4 worker processes → handles high load
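The Dockerfile installs from requirements.txt, so that file must list the server packages too — not just FastAPI. A minimal sketch (unpinned here for brevity; in practice, pin the exact versions you test against):

```
fastapi
uvicorn
gunicorn
python-dotenv
```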


📦 2. Build Docker Image

docker build -t fastapi-app .

▶️ 3. Run Container

docker run -p 8000:8000 fastapi-app

Your app is running inside Docker. 🚀


🔥 Why Gunicorn + Uvicorn worker?

Because:

  • Uvicorn alone runs a single process (fine for dev, limited under load)
  • Gunicorn = production process manager (restarts crashed workers)
  • UvicornWorker = async (ASGI) worker class for Gunicorn
  • Multiple workers = uses multiple CPU cores

Example scaling up to 8 workers:

--workers 8

🌍 PART B — Environment Variables

Never hardcode secrets.

Use a .env file (and add it to .gitignore so it never reaches version control):

DATABASE_URL=postgresql://user:pass@db/mydb
SECRET_KEY=MYSECRET

Load with python-dotenv:

pip install python-dotenv

Inside Python:

from dotenv import load_dotenv
import os

load_dotenv()
DB_URL = os.getenv("DATABASE_URL")
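Note that os.getenv returns None when a variable is missing, which often surfaces as a confusing crash much later. A small pure-stdlib sketch that fails fast instead (the setdefault line is demo-only — in production the variable comes from the environment):

```python
import os

def require_env(name: str) -> str:
    """Return the variable's value, or fail loudly at startup if it is missing."""
    value = os.getenv(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Demo only: simulate the variable being set by the environment.
os.environ.setdefault("DATABASE_URL", "postgresql://user:pass@db/mydb")
print(require_env("DATABASE_URL"))
```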

🚦 PART C — Nginx Reverse Proxy (Production Pattern)

Nginx sits in front of FastAPI:

Client → Nginx → Gunicorn+Uvicorn → FastAPI

Why?

  • SSL/TLS
  • Caching
  • Rate limiting
  • Load balancing
  • Static files

nginx.conf example:

server {
    listen 80;

    location / {
        proxy_pass http://fastapi:8000;
    }
}
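In practice you also want Nginx to forward the original client details, otherwise FastAPI only ever sees Nginx's address. A sketch extending the config above (the `fastapi` upstream name is assumed, e.g. a Docker Compose service):

```
server {
    listen 80;

    location / {
        proxy_pass http://fastapi:8000;
        # Forward the real client details to the app
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```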

🔁 PART D — GitHub Actions CI/CD Pipeline

Create .github/workflows/deploy.yml:

name: Deploy FastAPI

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Build Docker Image
        run: docker build -t my-fastapi .

      - name: Push to DockerHub
        run: |
          echo ${{ secrets.DOCKER_PASSWORD }} | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin
          docker tag my-fastapi myuser/my-fastapi:latest
          docker push myuser/my-fastapi:latest

This automatically:

  • Builds the Docker image
  • Pushes it to Docker Hub

A separate deploy step (for example, SSH-ing into your server, or a cloud provider's deploy action) then pulls and runs the new image.
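One way to add that deploy step is a community SSH action. A hedged sketch — the `appleboy/ssh-action` version tag and the `SSH_HOST` / `SSH_USER` / `SSH_KEY` secrets are assumptions, not part of the workflow above:

```yaml
      - name: Deploy to server
        uses: appleboy/ssh-action@master
        with:
          host: ${{ secrets.SSH_HOST }}
          username: ${{ secrets.SSH_USER }}
          key: ${{ secrets.SSH_KEY }}
          script: |
            docker pull myuser/my-fastapi:latest
            docker stop fastapi || true
            docker rm fastapi || true
            docker run -d --name fastapi -p 8000:8000 myuser/my-fastapi:latest
```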

☸️ PART E — Kubernetes Deployment (Production at Scale)

Kubernetes = horizontal scaling + self-healing + built-in load balancing.


1. Deployment File

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-deployment

spec:
  replicas: 4
  selector:
    matchLabels:
      app: fastapi

  template:
    metadata:
      labels:
        app: fastapi

    spec:
      containers:
      - name: fastapi
        image: myuser/my-fastapi:latest
        ports:
        - containerPort: 8000

This runs 4 instances of FastAPI.


2. Load Balancer Service

service.yaml

apiVersion: v1
kind: Service
metadata:
  name: fastapi-service

spec:
  type: LoadBalancer
  selector:
    app: fastapi
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000

Now traffic is load-balanced across your pods (auto-scaling itself comes from the HPA, covered below).


🚀 Scaling Strategy (Used by Netflix, Uber, Amazon)

1. Horizontal scaling

Increase pods/workers:

kubectl scale deployment fastapi-deployment --replicas=10

2. Vertical scaling

Increase CPU/RAM of pods.

3. Auto-scaling (HPA)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fastapi-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fastapi-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60

🧠 PART F — Observability (Logs, Metrics, APM)

Companies expect:

✔ Prometheus for metrics
✔ Grafana dashboards
✔ Loki for logs
✔ Jaeger for distributed tracing


🔮 Production Monitoring Example

Wire up observability:

/metrics endpoint → scraped by Prometheus
container logs → shipped to Loki (e.g., via Promtail)
traces → sent to Jaeger (e.g., via the OpenTelemetry SDK)

This helps track:

  • API latencies
  • Error rates
  • Worker overload
  • ML model performance
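Under the hood, latency tracking can be as simple as timing each request in a middleware. A minimal pure-stdlib sketch of the idea, assuming a plain ASGI app (FastAPI apps are ASGI apps); a real setup would export these numbers with a Prometheus client library instead of keeping them in a list:

```python
import time

class TimingMiddleware:
    """Wraps any ASGI app and records per-request latency in seconds."""

    def __init__(self, app):
        self.app = app
        self.latencies: list[float] = []  # real setups export these, not store them

    async def __call__(self, scope, receive, send):
        start = time.perf_counter()
        await self.app(scope, receive, send)
        self.latencies.append(time.perf_counter() - start)
```

With FastAPI you would register it via `app.add_middleware(TimingMiddleware)`.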

☁️ PART G — Cloud Deployment Patterns

1. AWS ECS Fargate

Run Docker containers without managing servers yourself.

2. AWS Lambda + API Gateway

Serverless FastAPI
(using Mangum)

3. Azure App Service

Directly run containers.

4. Google Cloud Run

Auto-scale to zero.


🚀 Lesson 9 Summary

You learned:

✔ Dockerizing FastAPI
✔ Production server (Gunicorn + Uvicorn)
✔ Environment variables
✔ Nginx reverse proxy
✔ CI/CD (GitHub Actions)
✔ Kubernetes deployment
✔ Scaling (HPA)
✔ Cloud deployment patterns
✔ Monitoring & observability

This lesson turns you into a full-stack deployment expert.


🎉 Only 1 lesson left — Lesson 10: Industry Case Studies + Top Interview Questions

It will include:
✔ Real architectures from Uber, Netflix, Swiggy
✔ FastAPI microservices examples
✔ High-level diagrams
✔ 100+ top interview questions
✔ Final project blueprint

Shall I continue with Lesson 10?
