True A→Z → PhD-level Python Mastery Course Part2

Perfect.
This is PHASE 11 — Performance, Profiling & Optimization PLUS the 30-question Testing & Logging interview drill, delivered at senior/staff production depth.

We are now in the last hardcore foundations phase. Everything here directly affects real systems, SLAs, cost, and scalability.


⚡ PHASE 11 — PERFORMANCE, PROFILING & OPTIMIZATION (DEEP)

Performance is not about tricks.
It is about measurement, understanding bottlenecks, and choosing the right abstraction.


11.1 GOLDEN RULE OF PERFORMANCE (INTERVIEW MUST-SAY)

Never optimize without measuring.

Wrong order ❌

optimize → benchmark → guess

Correct order ✅

measure → identify bottleneck → optimize → re-measure

⏱️ TIME PERFORMANCE (CPU)


11.2 BIG-O STILL MATTERS (EVEN IN PYTHON)

Python is slower than C, but:

  • O(n²) vs O(n log n) still dominates
  • Algorithm choice beats micro-optimizations

📌 Interview line

Python performance problems are usually algorithmic, not syntactic.


11.3 timeit — MICRO BENCHMARKING (CORRECT WAY)

❌ Bad

import time
start = time.time()
...

✅ Good

import timeit
timeit.timeit("x + 1", setup="x = 1", number=1_000_000)  # setup defines x; without it the statement raises NameError

Why?

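  • Uses a high-resolution clock (time.perf_counter) instead of wall-clock time.time()
  • Runs the statement many times, so timer resolution and one-off hiccups average out
  • Disables garbage collection during timing by default, which makes runs comparable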
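A minimal sketch of a timeit-based comparison, assuming two illustrative statements (str.join vs. repeated concatenation) that are not from the course itself. The helper name best_time is hypothetical. Taking the minimum over several repeats is a common way to reduce scheduler noise.

```python
import timeit

# Hypothetical statements to compare; any two snippets would do.
SETUP = "words = ['a'] * 1000"
JOIN_STMT = "''.join(words)"
CONCAT_STMT = """
s = ''
for w in words:
    s += w
"""

def best_time(stmt, setup, number=200, repeat=5):
    """Return the fastest total time (seconds) over `repeat` runs of `number` executions."""
    return min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))

join_time = best_time(JOIN_STMT, SETUP)
concat_time = best_time(CONCAT_STMT, SETUP)
print(f"join:   {join_time:.4f}s")
print(f"concat: {concat_time:.4f}s")
```

The absolute numbers depend on the machine and Python version; only the relative difference on the same machine is meaningful.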

11.4 FUNCTION CALL COST (REALITY)

Function calls are relatively expensive in CPython: each call sets up a new frame and binds its arguments.

def f(x): return x + 1

Calling f(x) inside a loop is slower than writing x + 1 inline.

📌 Implication:

  • Avoid tiny functions in tight loops
  • Inline hot paths when needed
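The call overhead can be measured rather than assumed. The sketch below compares a loop that calls a tiny function with the same loop inlined; add_one, with_call, and inlined are hypothetical names chosen for illustration.

```python
import timeit

def add_one(x):
    return x + 1

def with_call(n=50_000):
    total = 0
    for i in range(n):
        total += add_one(i)   # one Python-level call per iteration
    return total

def inlined(n=50_000):
    total = 0
    for i in range(n):
        total += i + 1        # same arithmetic, no call
    return total

assert with_call() == inlined()  # both variants compute the same result

call_time = min(timeit.repeat(with_call, number=10, repeat=3))
inline_time = min(timeit.repeat(inlined, number=10, repeat=3))
print(f"with call: {call_time:.4f}s, inlined: {inline_time:.4f}s")
```

The size of the gap varies between CPython versions, so it is worth inlining only when a profile shows the call sits on a hot path.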

11.5 LOCAL VARIABLES ARE FASTER THAN GLOBALS

def slow():
    for _ in range(10**6):
        len([1,2,3])

def fast():
    l = len
    for _ in range(10**6):
        l([1,2,3])

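Why?

  • Locals are stored in a fixed-size array and accessed by index
  • Globals require a dict lookup; builtins such as len require two (module globals first, then builtins)

📌 This is CPython-specific but interview-relevant. Recent CPython versions (3.11+) cache global and builtin lookups, so the gain is often small; measure before relying on it.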
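The different lookup paths are visible in the bytecode. A small sketch using the standard dis module; uses_global, uses_local, and opnames are hypothetical helper names, and exact instruction names vary slightly between CPython versions.

```python
import dis

def uses_global(data):
    return len(data)            # `len` is resolved via globals, then builtins

def uses_local(data, length=len):
    return length(data)         # `length` is a local, bound once at definition time

def opnames(func):
    """Return the set of bytecode instruction names used by `func`."""
    return {ins.opname for ins in dis.get_instructions(func)}

global_ops = opnames(uses_global)
local_ops = opnames(uses_local)
print("LOAD_GLOBAL" in global_ops, "LOAD_GLOBAL" in local_ops)
```

The first function needs a LOAD_GLOBAL instruction for len, while the second reads everything through the LOAD_FAST family of instructions.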


🧠 MEMORY PERFORMANCE


11.6 MEMORY IS OFTEN THE REAL BOTTLENECK

Common causes:

  • Loading entire files
  • Holding large lists
  • Caching without eviction
  • Reference cycles

📌 Senior insight

Memory pressure causes slowdowns long before crashes.


11.7 sys.getsizeof() — LIMITED BUT USEFUL

import sys
sys.getsizeof([1,2,3])

⚠️ Only shallow size
Does NOT include referenced objects.
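A short sketch of what "shallow" means in practice, using a hypothetical two-element list of large strings. The "deep" figure here just adds the elements of this flat list; a general deep-size function would need recursion and tracking of shared objects.

```python
import sys

payload = ["x" * 10_000, "y" * 10_000]

shallow = sys.getsizeof(payload)   # the list object and its pointer array only
# Rough "deep" size for this flat list: add the size of each element.
deep = shallow + sum(sys.getsizeof(item) for item in payload)

print(shallow, deep)
```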


11.8 GENERATORS SAVE MEMORY (CRITICAL)

❌ List

[x*x for x in range(10**7)]

✅ Generator

(x*x for x in range(10**7))

Difference:

  • List → allocates all memory
  • Generator → computes on demand

📌 Interview line

Use generators for large or streaming data.
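The difference can be checked directly. A minimal sketch with a smaller N than the snippets above, so the list variant stays cheap to build; the variable names are illustrative.

```python
import sys

N = 100_000

squares_list = [x * x for x in range(N)]
squares_gen = (x * x for x in range(N))

list_size = sys.getsizeof(squares_list)  # grows with N (pointer array only)
gen_size = sys.getsizeof(squares_gen)    # small and independent of N

# A generator can be consumed only once; here sum() drains it.
total_from_gen = sum(squares_gen)
total_from_list = sum(squares_list)
print(list_size, gen_size, total_from_gen == total_from_list)
```

The trade-off is that a generator cannot be indexed, measured with len(), or iterated a second time.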


11.9 __slots__ — MEMORY OPTIMIZATION

class User:
    __slots__ = ("id", "name")

Effects:

  • No __dict__
  • Less memory
  • Faster attribute access

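❌ Trade-off:

  • No dynamic attributes (only names listed in __slots__ can be set)
  • Harder inheritance (a subclass without its own __slots__ gets a __dict__ again, and multiple inheritance from several slotted bases can conflict)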
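A small sketch of these effects, comparing a regular class with a slotted one. PlainUser and SlottedUser are hypothetical names used only for this illustration.

```python
class PlainUser:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class SlottedUser:
    __slots__ = ("id", "name")

    def __init__(self, id, name):
        self.id = id
        self.name = name

plain = PlainUser(1, "ada")
slotted = SlottedUser(1, "ada")

plain.email = "ada@example.com"          # fine: stored in plain.__dict__
has_dict = hasattr(slotted, "__dict__")  # False: slots replace the per-instance dict

try:
    slotted.email = "ada@example.com"    # not declared in __slots__
    dynamic_attr_allowed = True
except AttributeError:
    dynamic_attr_allowed = False

print(has_dict, dynamic_attr_allowed)
```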

11.10 AVOID ACCIDENTAL OBJECT CREATION

for i in range(n):
    x = i * 2

Better than:

for i in range(n):
    x = int(i * 2)

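📌 The int() call is redundant here: i * 2 is already an int, so the wrapper adds a call per iteration without changing the result. Redundant conversions and temporary objects add up in hot loops.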


🔍 PROFILING — FIND THE REAL BOTTLENECK


11.11 CPU PROFILING WITH cProfile

import cProfile

cProfile.run("main()", sort="cumulative")  # assumes main() is defined in the running script

Shows:

  • Function call count (ncalls)
  • Time spent inside the function itself (tottime)
  • Cumulative time including callees (cumtime)

📌 Always optimize hot functions only.
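For more control over the report, the profiler can be driven programmatically. A minimal sketch, assuming a hypothetical workload (main and slow_square_sum are illustrative names) and using pstats to sort by cumulative time.

```python
import cProfile
import io
import pstats

def slow_square_sum(n):
    return sum(i * i for i in range(n))

def main():
    # Hypothetical workload, used only to give the profiler something to measure.
    return [slow_square_sum(20_000) for _ in range(20)]

profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)   # top 10 entries by cumulative time
report = buffer.getvalue()
print(report)
```

The entries near the top of the cumulative listing are the candidates worth optimizing; everything else is usually noise.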


11.12 LINE-LEVEL PROFILING (MENTION-LEVEL)

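Tools:

  • line_profiler (per-line timings for functions you mark)
  • py-spy (sampling profiler that attaches to a running process without code changes)
  • perf (Linux system-level sampling profiler)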

Used when:

  • CPU-heavy loops
  • Numerical workloads

11.13 MEMORY PROFILING (REAL SYSTEMS)

Common tools:

  • tracemalloc
  • memory_profiler

import tracemalloc
tracemalloc.start()

Used to:

  • Detect leaks
  • Track allocation spikes
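A minimal sketch of tracking an allocation spike with tracemalloc snapshots. The list named retained is a hypothetical stand-in for whatever a real system would be holding on to.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Hypothetical allocation spike: a hundred thousand small strings kept alive.
retained = [str(i) * 10 for i in range(100_000)]

after = tracemalloc.take_snapshot()
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Compare snapshots and show the source lines responsible for the most growth.
top_growth = after.compare_to(before, "lineno")[:3]
for stat in top_growth:
    print(stat)
```

In a long-running service, taking snapshots at intervals and diffing them in this way is one approach to locating a leak.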

11.14 GARBAGE COLLECTION IMPACT

Python uses:

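  • Reference counting (frees an object immediately when its count reaches zero)
  • Cyclic GC (generational; runs when allocation thresholds are exceeded, to reclaim reference cycles)

import gc
gc.get_threshold()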

📌 Tuning GC can help latency-sensitive systems.
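The split between the two mechanisms can be observed with a reference cycle. A small sketch, assuming a hypothetical Node class; automatic collection is paused only so the demonstration is deterministic.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

gc.collect()      # start from a clean state
gc.disable()      # pause automatic cyclic collection for the demonstration
try:
    a, b = Node(), Node()
    a.other, b.other = b, a          # reference cycle: a -> b -> a
    probe = weakref.ref(a)

    del a, b                         # refcounts never reach zero because of the cycle
    alive_before_collect = probe() is not None

    gc.collect()                     # the cyclic collector finds and frees the pair
    alive_after_collect = probe() is not None
finally:
    gc.enable()

thresholds = gc.get_threshold()      # allocation thresholds that trigger collection
print(alive_before_collect, alive_after_collect, thresholds)
```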


⚙️ OPTIMIZATION STRATEGIES (PRODUCTION-GRADE)


11.15 CACHE SMARTLY

from functools import lru_cache

Benefits:

  • Eliminates repeated computation

Risks:

  • Unbounded memory growth (maxsize=None, or caching methods, which keeps each self alive)
  • Stale data

📌 Always consider cache invalidation.
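A minimal sketch of a bounded cache with explicit invalidation. The fib function is a stock illustration, not something from the course's own codebase.

```python
from functools import lru_cache

@lru_cache(maxsize=128)          # bounded: least recently used entries are evicted
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
info = fib.cache_info()          # hits, misses, maxsize, currsize
print(result, info)

fib.cache_clear()                # explicit invalidation, e.g. when inputs become stale
info_after_clear = fib.cache_info()
```

cache_info() is a cheap way to confirm that a cache is actually being hit before keeping it in production code.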


11.16 BATCHING IS A PERFORMANCE WEAPON

❌ Repeated calls

for x in items:
    save(x)

✅ Batch

save_many(items)

Used in:

  • DB writes
  • Network calls
  • APIs
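One way save_many could look for database writes, sketched with the standard sqlite3 module and an in-memory database. The table layout and the function signature are assumptions made for this example.

```python
import sqlite3

items = [(i, f"item-{i}") for i in range(1000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

def save_many(connection, rows):
    """Insert all rows with one executemany call inside a single transaction."""
    with connection:   # commits on success, rolls back on error
        connection.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)

save_many(conn, items)
row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
conn.close()
print(row_count)
```

The gain comes from paying the per-transaction and per-round-trip cost once instead of once per item.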

11.17 CONCURRENCY ≠ PERFORMANCE (IMPORTANT)

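  • Threads → I/O-bound work (the GIL prevents CPU parallelism in standard CPython)
  • Processes → CPU-bound work
  • Async → many concurrent I/O operations on a single thread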

📌 Don’t add concurrency blindly.
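A small sketch of the I/O case, where threads do help. fake_io is a hypothetical stand-in for a network or disk call; it only sleeps.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id, delay=0.05):
    """Stand-in for a network or disk call; sleeping releases the GIL."""
    time.sleep(delay)
    return task_id * 2

task_ids = list(range(8))

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fake_io, task_ids))   # results keep input order
elapsed = time.perf_counter() - start

print(results, f"{elapsed:.2f}s")
```

Run serially, the eight calls would take about eight times the delay. Replacing the sleep with pure-Python computation would remove most of the benefit under the GIL, which is the case for processes instead.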


11.18 NUMPY & VECTORIZATION (MENTION-LEVEL)

# Python loop ❌
for i in range(n):
    a[i] += b[i]

# NumPy ✅
a += b

Why faster:

  • C loops
  • SIMD
  • Reduced Python overhead

11.19 COMMON PERFORMANCE ANTI-PATTERNS

❌ Premature optimization
❌ Excessive abstraction
❌ Logging inside tight loops
❌ Raising exceptions for control flow in hot loops
❌ Over-threading


11.20 PERFORMANCE INTERVIEW KILLER ANSWER

“I profile first, optimize bottlenecks only, and validate gains with benchmarks.”


🧪 PART 2 — 30-QUESTION TESTING & LOGGING INTERVIEW DRILL

Answer fast & clean.


1️⃣ Why is pytest preferred over unittest?

→ Less boilerplate, better fixtures, parametrize, ecosystem.


2️⃣ What is a fixture?

→ Controlled setup/teardown logic.


3️⃣ Fixture scope types?

→ function, class, module, package, session.


4️⃣ What is parametrized testing?

→ Same test, multiple inputs.


5️⃣ How do you test exceptions?

→ pytest.raises, used as a context manager.


6️⃣ What should be mocked?

→ External dependencies, not core logic.


7️⃣ Mock vs Stub?

→ Mock verifies calls, stub returns data.
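The distinction can be sketched with unittest.mock, where the same Mock class plays either role depending on how the test uses it. notify_if_overdue and its collaborators are hypothetical names.

```python
from unittest.mock import Mock

def notify_if_overdue(invoice_repo, mailer, invoice_id):
    """Hypothetical unit under test: email the customer when an invoice is overdue."""
    invoice = invoice_repo.get(invoice_id)
    if invoice["overdue"]:
        mailer.send(invoice["email"], "Your invoice is overdue")
        return True
    return False

# Stub: only supplies canned data; the test never inspects how it was called.
repo_stub = Mock()
repo_stub.get.return_value = {"overdue": True, "email": "a@example.com"}

# Mock: the test asserts on the interaction itself.
mailer_mock = Mock()

sent = notify_if_overdue(repo_stub, mailer_mock, invoice_id=42)
mailer_mock.send.assert_called_once_with("a@example.com", "Your invoice is overdue")
```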


8️⃣ Where should patching be applied?

→ Where the object is used, not defined.


9️⃣ Why is patching often wrong?

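→ from x import y binds the name in the importing module at import time; patching the defining module afterwards does not change that binding.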
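A self-contained sketch of this gotcha. Two throwaway modules are built in memory (demo_services and demo_app are hypothetical names) so the example runs from a single file; in a real project they would be ordinary .py files.

```python
import sys
import types
from unittest.mock import patch

# Module that DEFINES fetch.
services = types.ModuleType("demo_services")
exec("def fetch():\n    return 'real'", services.__dict__)
sys.modules["demo_services"] = services

# Module that USES fetch through a from-import.
app = types.ModuleType("demo_app")
exec("from demo_services import fetch\n\ndef run():\n    return fetch()", app.__dict__)
sys.modules["demo_app"] = app

# Patching where fetch is defined: demo_app still holds its own reference.
with patch("demo_services.fetch", return_value="fake"):
    patched_at_definition = app.run()

# Patching where fetch is used: the name demo_app looks up is replaced.
with patch("demo_app.fetch", return_value="fake"):
    patched_at_use = app.run()

print(patched_at_definition, patched_at_use)
```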


🔟 What is an autouse fixture?

→ Automatically applied fixture.


1️⃣1️⃣ Why avoid autouse?

→ Hidden dependencies.


1️⃣2️⃣ Unit vs integration test?

→ Isolated logic vs real dependencies.


1️⃣3️⃣ What is the test pyramid?

→ Many unit, few integration, minimal E2E.


1️⃣4️⃣ Why not print debugging?

→ No levels, timestamps, or context; pollutes output and has to be removed by hand.


1️⃣5️⃣ How to debug production issues?

→ Logs, metrics, traces — not print.


1️⃣6️⃣ How to read a traceback?

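→ Bottom-up: the last line names the exception and message; the frames above it show the call chain, most recent call last.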


1️⃣7️⃣ What does logger.exception() do?

→ Logs the message at ERROR level plus the current traceback; call it inside an except block.
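A minimal sketch that captures the output in memory so the effect is visible. The logger name demo.payments and the message are hypothetical.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("demo.payments")   # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False                      # keep the demo output out of the root logger

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("charge failed for order %s", 42)

output = stream.getvalue()
print(output)
```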


1️⃣8️⃣ Why not log and re-raise everywhere?

→ Duplicate logs.


1️⃣9️⃣ Where should logging occur?

→ At system boundaries.


2️⃣0️⃣ Logging levels order?

→ DEBUG → INFO → WARNING → ERROR → CRITICAL


2️⃣1️⃣ Why use __name__ for loggers?

→ Hierarchical control.


2️⃣2️⃣ What is structured logging?

→ Logs as data, not strings.
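One way to get structured output from the standard logging module is a custom formatter that emits JSON. This is a sketch under assumptions: JsonFormatter and the field names order_id and user_id are hypothetical, and production setups often use a dedicated library instead.

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in ("order_id", "user_id"):    # hypothetical field names
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("demo.orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("order placed", extra={"order_id": 42, "user_id": 7})
event = json.loads(stream.getvalue())
print(event)
```

Because each line parses back into a dict, a log pipeline can filter on order_id directly instead of applying regular expressions to free text.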


2️⃣3️⃣ How to test logs in pytest?

→ The caplog fixture.


2️⃣4️⃣ What causes flaky tests?

→ Time, randomness, shared state.


2️⃣5️⃣ How to fix flaky tests?

→ Determinism, isolation, proper teardown.


2️⃣6️⃣ Why are sleeps bad in tests?

→ Slow, unreliable.


2️⃣7️⃣ How to mock time?

→ Patch time functions (e.g. time.time), or inject a clock the test controls.
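A minimal sketch using unittest.mock.patch to freeze the clock. is_expired is a hypothetical function under test; it looks up time.time at call time, which is why patching "time.time" takes effect.

```python
import time
from unittest.mock import patch

def is_expired(created_at, ttl_seconds):
    """Hypothetical unit under test: has the TTL elapsed since created_at?"""
    return time.time() - created_at > ttl_seconds

# Freeze "now" at a fixed value instead of sleeping in the test.
with patch("time.time", return_value=1_000.0):
    fresh = is_expired(created_at=990.0, ttl_seconds=60)     # 10 s old
    expired = is_expired(created_at=900.0, ttl_seconds=60)   # 100 s old

print(fresh, expired)
```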


2️⃣8️⃣ Logging vs metrics?

→ Logs = events, metrics = trends.


2️⃣9️⃣ When not to mock?

→ When behavior matters more than isolation.


3️⃣0️⃣ One-line testing philosophy?

“Tests should be fast, isolated, deterministic, and meaningful.”


🏁 PHASE 11 COMPLETE ✅

You now understand:
✔ Measuring performance
✔ Profiling CPU & memory
✔ Optimization strategies
✔ Real testing practices
✔ Debugging systematically
✔ Production-grade logging

This is senior/staff-level Python engineering.


🚀 FINAL PHASE OPTIONS (YOUR CHOICE)

Reply with ONLY ONE number:

1 → PHASE 12: Security, Packaging, Deployment & CI/CD (Python in production)
2 → Full end-to-end capstone project (CLI / ETL / service) using ALL phases
3 → Multiple mock interviews (Python, system design, debugging)

You’ve built true Python mastery.