Nice — below are all three: 10 practical industry-grade examples (with short code + why you’d use it), a set of tricky interview questions with clear answers/explanations, and a compact visual/ASCII “map” that explains nested comprehensions and where filters/conditions land.
✅ 10 real industry examples (code + why)
- ETL — normalize and filter rows from JSON logs
logs = [
    {"user": "alice", "dur": 120, "ok": True},
    {"user": "bob", "dur": 0, "ok": False},
]
rows = [
    {"user": entry["user"].lower(), "duration_sec": entry["dur"]}
    for entry in logs
    if entry["ok"] and entry["dur"] > 0
]
# Use: cleaning and filtering event stream before writing to parquet
- Feature engineering — vector of booleans for tokens
vocab = ["apple","banana","cherry"]
text = "I like apple and cherry"
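tokens = set(text.lower().split())  # whole-word match; a plain "word in text" test would also match substrings
features = [1 if word in tokens else 0 for word in vocab]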
# Use: quick bag-of-words features for a simple model
- Flatten nested JSON arrays (flattening matrix)
matrix = [[1,2,3],[4,5],[6]]
flat = [x for row in matrix for x in row]
# Use: flattening nested lists from nested API responses
- Dictionary lookup for fast join (small-dimension table join)
countries = [{"iso":"IN","name":"India"},{"iso":"US","name":"USA"}]
iso2name = {c["iso"]: c["name"] for c in countries}
# Use: broadcast lookup table in ETL to enrich rows cheaply
- CSV column cleanup and type casting
raw = [" 10 ", " 3.14", "None", ""]
clean = [int(x.strip()) if x.strip().isdigit() else None for x in raw]
# Use: sanitize integer CSV columns quickly; isdigit() rejects "3.14" and negative numbers, so those become None
- Generate SQL INSERT strings in batch
rows = [{"id":1,"n":"a"},{"id":2,"n":"b"}]
values = ", ".join(
    "({}, '{}')".format(r["id"], r["n"].replace("'", "''"))
    for r in rows
)
sql = f"INSERT INTO t (id,n) VALUES {values};"
# Use: small-batch ingestion or debugging only; hand-built SQL is open to injection, so prefer parameterized queries for untrusted input
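As a safer counterpart to building the INSERT string by hand, here is a minimal sketch using the standard-library sqlite3 module with named placeholders. The in-memory database and the second row's value ("O'Brien") are assumptions chosen for illustration; the table name t and the columns id and n follow the example above.

```python
import sqlite3

rows = [{"id": 1, "n": "a"}, {"id": 2, "n": "O'Brien"}]

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, n TEXT)")

# Named placeholders let the driver handle quoting; the generator
# expression feeds one parameter mapping per row.
conn.executemany(
    "INSERT INTO t (id, n) VALUES (:id, :n)",
    ({"id": r["id"], "n": r["n"]} for r in rows),
)
conn.commit()

stored = conn.execute("SELECT id, n FROM t ORDER BY id").fetchall()
print(stored)  # [(1, 'a'), (2, "O'Brien")]
```

The comprehension still does the row shaping, but no value is ever spliced into the SQL text.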
- Remove duplicates while preserving order
seq = ["a","b","a","c","b"]
seen = set()
unique = [x for x in seq if not (x in seen or seen.add(x))]
# Use: dedupe while keeping first-occurrence order
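If the side-effect inside the filter feels too clever, one common alternative is dict.fromkeys, which relies on dicts keeping insertion order (guaranteed since Python 3.7). This is a sketch that assumes the items are hashable, as they are in the example above.

```python
seq = ["a", "b", "a", "c", "b"]

# dict keys are unique and keep first-insertion order,
# so converting back to a list yields an ordered dedupe.
unique = list(dict.fromkeys(seq))
print(unique)  # ['a', 'b', 'c']
```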
- Create nested mapping (group by)
data = [("us","tx", 10),("us","ca",20),("in","dl",5)]
by_country = {
    c: [city for (_c, city, _v) in data if _c == c]
    for c in {d[0] for d in data}
}
# Use: fast in-memory grouping for small datasets
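The nested comprehension above rescans data once per country. For larger inputs, a single-pass grouping with collections.defaultdict is a common alternative; the sketch below reuses the same sample tuples and is not meant as the only way to do it.

```python
from collections import defaultdict

data = [("us", "tx", 10), ("us", "ca", 20), ("in", "dl", 5)]

# One pass over the data: each city is appended to its country's list.
by_country = defaultdict(list)
for country, city, _value in data:
    by_country[country].append(city)

print(dict(by_country))  # {'us': ['tx', 'ca'], 'in': ['dl']}
```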
- Conditional labeling (inline if-else)
vals = [0,5,10]
labels = ["zero" if v==0 else "small" if v<10 else "big" for v in vals]
# Use: lightweight bucketing of numeric features
- Pipeline: chain transforms succinctly
text_lines = [" a, 1 ", " b, 2 "]
records = [
    {"k": k.strip(), "v": int(v)}
    for line in text_lines
    for k, v in [line.split(",")]
    if k.strip()
]
# Use: quick parsing + cleaning in one expression (small inputs)
⚠️ Practical notes
- Comprehensions are readable and fast for small-to-medium transforms, but avoid heavy side-effects or very large data (use generator expressions or streaming).
- If readability suffers, prefer explicit loops.
🧠 12 trick interview questions (with answers & explanations)
- Q: What’s wrong with this to dedupe while preserving order?
[x for x in seq if x not in seen]
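A: seen is checked but never updated, so nothing is filtered out. Add each item the first time it appears, for example with the seen.add(x) pattern: [x for x in seq if not (x in seen or seen.add(x))]. This works because set.add returns None, which is falsy.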
Note: relying on side-effects inside comprehensions is clever but can reduce readability.
- Q: Why do lambdas in a list comprehension that capture the loop variable behave oddly?
funcs = [lambda: i for i in range(3)]
[f() for f in funcs] # => [2,2,2]
A: late binding — all lambdas capture the same i (final value). Fix with default arg: [lambda i=i: i for i in range(3)].
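A small side-by-side sketch of the late-binding behavior and the default-argument fix described in the answer; the names late and bound are just illustrative.

```python
# Each lambda looks up i when it is called, after the loop has finished.
late = [lambda: i for i in range(3)]

# A default argument is evaluated when the lambda is created,
# so each function keeps its own value of i.
bound = [lambda i=i: i for i in range(3)]

late_results = [f() for f in late]
bound_results = [f() for f in bound]
print(late_results)   # [2, 2, 2]
print(bound_results)  # [0, 1, 2]
```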
- Q: What is the difference between an if after the for and an inline if-else?
[ x if cond else y for x in arr ] # conditional expression evaluated per item
[ x for x in arr if cond ] # cond filters items; no else allowed
A: placement matters — one filters, the other chooses expressions.
- Q: Does a list comprehension leak its loop variable to outer scope?
[i for i in range(3)]
print(i) # In Python 3: NameError
A: In Python 3 the loop variable in a list comprehension does not leak to the surrounding scope (unlike Python 2).
- Q: Is there a shared-mutable-object pitfall when building a list of lists with a comprehension?
lst = [[] for _ in range(3)]
lst[0].append(1)
A: This is fine: each inner list is a distinct object. The problem arises with * replication such as [[]]*3, which creates three references to the same list.
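The difference is easy to see by appending to the first element in both variants. A minimal sketch, with distinct and shared as illustrative names:

```python
# Comprehension: the expression [] runs once per iteration,
# so there are three separate lists.
distinct = [[] for _ in range(3)]
distinct[0].append(1)

# Replication: one list object, referenced three times.
shared = [[]] * 3
shared[0].append(1)

print(distinct)  # [[1], [], []]
print(shared)    # [[1], [1], [1]]
```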
- Q: Can you use try/except inside an expression?
[x if safe(x) else default for x in arr]
A: No direct try/except inside a single expression — you must call a helper function with try/except inside it.
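One way to apply the helper-function approach is sketched below. The function name to_int_or_default and the sample inputs are hypothetical, chosen only to show the try/except living inside the helper while the comprehension stays a single expression.

```python
def to_int_or_default(value, default=None):
    """Hypothetical helper: return int(value), or default if conversion fails."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default

raw = [" 10 ", " 3.14", "None", "", None]
parsed = [to_int_or_default(x) for x in raw]
print(parsed)  # [10, None, None, None, None]
```

Catching only TypeError and ValueError keeps unrelated errors visible instead of silently turning them into the default.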
- Q: Are list comprehensions faster than the equivalent for loop with .append()?
A: Often yes in CPython for simple transforms, because the comprehension appends with a dedicated bytecode instruction and skips the repeated .append attribute lookup and method call. The gap varies, and for complex work readability and profiling matter more.
- Q: What happens with duplicate keys in dict comprehensions?
{ k: f(k) for k in [1,1,2] } # last wins
A: Later keys overwrite earlier values. Order of processing is left-to-right.
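A short sketch of the overwrite behavior, using made-up key/value pairs. Note that the key keeps the position of its first insertion even though the value comes from the last occurrence.

```python
pairs = [("a", 1), ("b", 2), ("a", 3)]

# The second ("a", ...) pair overwrites the value stored for "a".
merged = {k: v for k, v in pairs}

print(merged)        # {'a': 3, 'b': 2}
print(list(merged))  # ['a', 'b']
```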
- Q: Generator expression vs list comprehension memory
(g(x) for x in huge) # generator — lazy, low memory
[g(x) for x in huge] # list — materialises everything, big memory
A: Use generators for large streams.
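To make the memory difference concrete, here is a rough sketch using sys.getsizeof. It only measures the container object itself (not the integers it refers to), so treat the numbers as indicative; the size n is an arbitrary choice.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]   # all n results held at once
squares_gen = (x * x for x in range(n))    # produces results on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# A generator can be consumed only once; aggregate directly from it.
total = sum(squares_gen)
```

After sum() has run, squares_gen is exhausted, which is the main trade-off to keep in mind when swapping brackets for parentheses.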
- Q: Side-effects in comprehension — safe or not?
out = [db.insert(x) for x in items]
A: Works, but ugly — better to use a plain loop for side-effects. Comprehensions are intended for creating sequences.
- Q: Can you put multiple for and if clauses in any order?
A: The clause order must mirror the nested loops: [expr for a in A for b in B if cond] is equivalent to for a, then for b, then if. The first clause must be a for. You can chain multiple ifs; they act like and.
- Q: Are comprehensions atomic in terms of exception handling?
A: No. Errors inside comprehensions propagate as in regular code: they interrupt the comprehension and bubble up, and the partially built result is discarded. Use try inside called functions if needed.
🗺️ Nested comprehension diagrams (visual map)
Below are short ASCII maps showing how to read nested comprehensions and where if conditions apply.
A. Basic nested loops
Code:
res = [ f(x,y) for x in A for y in B ]
Diagram (reading order left → right):
for x in A:            # outer loop
    for y in B:        # inner loop
        emit f(x,y)
So comprehension order for x ... for y ... == nesting for x outside for y.
B. With filter after inner loop
Code:
res = [ f(x,y) for x in A for y in B if cond(y) ]
Diagram:
for x in A:
    for y in B:
        if cond(y):    # cond applies to y (and x if referenced)
            emit f(x,y)
Important: the if immediately following a for attaches to that for.
C. Multiple if filters
res = [ f(x,y) for x in A if cond1(x) for y in B if cond2(y) ]
Diagram:
for x in A:
    if cond1(x):
        for y in B:
            if cond2(y):
                emit f(x,y)
Each if filters the current loop’s iteration.
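The mapping in diagram C can be checked directly by writing both forms and comparing them. In this sketch A, B, cond1, cond2 and f are given arbitrary sample definitions; only the clause structure matters.

```python
A = [1, 2, 3, 4]
B = [10, 15, 20]

def cond1(x):
    return x % 2 == 0      # keep even x

def cond2(y):
    return y % 10 == 0     # keep multiples of 10

def f(x, y):
    return x * y

# Comprehension form: clauses read left to right.
res = [f(x, y) for x in A if cond1(x) for y in B if cond2(y)]

# Equivalent explicit nesting: each clause becomes one indentation level.
expected = []
for x in A:
    if cond1(x):
        for y in B:
            if cond2(y):
                expected.append(f(x, y))

print(res)  # [20, 40, 40, 80]
```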
D. Inline conditional expression vs filter
# inline expression
res = [ "ok" if p(x) else "no" for x in A ]
# filter
res2 = [ x for x in A if p(x) ]
Map:
- Inline if-else produces an output for every x.
- Filter if may skip items entirely.
E. Flattening 2D (visual)
flat = [ val for row in rows for val in row if val>0 ]
Flow:
for row in rows:
    for val in row:
        if val>0:
            emit val
Quick cheatsheet summary
- for order in the comprehension = nesting order in loops.
- An if placed after a for filters that loop's iterations.
- Inline expr if cond else other is part of the expression, not a filter.
- Use default args to fix lambda late-binding: lambda i=i: i.
- Avoid side-effects in comprehensions for readability & maintainability.
Want me to:
- convert the 10 examples into PySpark-friendly patterns?
- give a printable one-page “cheat-sheet” PDF?
- or create an interactive visual diagram (SVG/PNG) for nested comprehensions?
Tell me which and I’ll produce it right here.