Excellent.
DAY 7 is where Python strings stop being “just text” and become memory, performance, and Unicode engineering.
Many production slowdowns and interview traps come from misunderstanding strings.
🧠 DAY 7 — Strings: Immutability, Interning & Unicode Internals
🔑 CORE IDEA OF DAY 7
Python strings are immutable Unicode objects optimized for safety, hashing, and sharing — not mutation.
1️⃣ Strings Are Objects (Not Char Arrays)
In C:
char s[] = "hello"; // mutable
In Python:
s = "hello" # immutable object
You cannot change a string in place.
s[0] = "H" # TypeError
Why this design choice?
- Hash safety
- Thread safety
- Aggressive reuse (interning)
- Performance predictability
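Since in-place mutation is not available, a "change" to a string is usually expressed as building a new one. A minimal sketch (the variable names here are illustrative only):

```python
s = "hello"

# Item assignment is rejected because str does not support it.
try:
    s[0] = "H"
    mutated = True
except TypeError:
    mutated = False

# A "modified" string is a new object built from pieces of the old one.
capitalized = "H" + s[1:]
replaced = s.replace("l", "L")

print(mutated, s, capitalized, replaced)
```

The original s is untouched after both operations.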
2️⃣ What Happens in s += "world" (VERY IMPORTANT)
s = "hello"
s += "world"
❌ What people think
- String is extended
✅ What actually happens
- A new string "helloworld" is created
- The old "hello" remains unchanged
- s is rebound to the new object
This is why repeated string concatenation is slow.
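One way to observe the rebinding is to keep a second name pointing at the original object. This is a small sketch; alias is a name introduced only for the demonstration:

```python
s = "hello"
alias = s          # second name bound to the same object

s += "world"       # builds a new string and rebinds s

print(s)           # helloworld
print(alias)       # hello
print(s is alias)  # False
```

If += had mutated the string in place, alias would have changed too.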
3️⃣ CPython String Internals (Simplified)
Conceptually:
PyUnicodeObject
├── ob_refcnt
├── ob_type
├── length
├── hash
├── state (compact / ascii / kind)
└── data (code points)
Important optimizations:
- ASCII-only strings are stored compactly
- Unicode storage adapts (1, 2, or 4 bytes per char)
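The adaptive storage can be glimpsed with sys.getsizeof. The sketch below assumes CPython 3.3 or later (other interpreters may store strings differently), and bytes_per_char is a hypothetical helper written for this illustration:

```python
import sys

def bytes_per_char(ch: str) -> int:
    """Estimate per-character storage by growing a string by one character."""
    return sys.getsizeof(ch * 1001) - sys.getsizeof(ch * 1000)

ascii_width = bytes_per_char("a")            # ASCII range
bmp_width = bytes_per_char("\u0928")         # Devanagari letter, above U+00FF
astral_width = bytes_per_char("\U0001F600")  # code point above U+FFFF

print(ascii_width, bmp_width, astral_width)
```

On CPython this should report widths of 1, 2, and 4 bytes: the widest code point in a string decides the width used for all of its characters.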
4️⃣ Unicode: Why Python 3 Strings Are Powerful
Python 3 strings are Unicode by default.
s = "नमस्ते"
- Each element of a str is a Unicode code point (not a byte)
- Encoding to bytes happens only at boundaries such as file or network I/O
Encoding vs Decoding
text = "hello"
b = text.encode("utf-8")
text2 = b.decode("utf-8")
🧠 Strings ≠ bytes
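To make the difference concrete, here is a short sketch that compares code point count with UTF-8 byte count for the greeting used above (written with escapes so the exact code points are unambiguous):

```python
# "नमस्ते" spelled out as six Devanagari code points.
text = "\u0928\u092e\u0938\u094d\u0924\u0947"

encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

print(len(text))        # 6 code points
print(len(encoded))     # 18 bytes (3 bytes per code point in UTF-8 here)
print(decoded == text)  # True

# Mixing the two types is an error rather than an implicit conversion.
try:
    text + encoded
    mixed_ok = True
except TypeError:
    mixed_ok = False
```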
5️⃣ String Interning (INTERVIEW FAVORITE)
CPython may reuse identical immutable strings.
a = "hello"
b = "hello"
a is b # True (often)
But:
a = "".join(["he", "llo"])
b = "hello"
a is b # False
Why?
- Literal strings may be interned at compile time
- Runtime-created strings usually are not
⚠️ Never rely on is for strings.
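If identity comparison is genuinely needed (for example, as a micro-optimization on a known set of keys), sys.intern can make it explicit instead of leaving it to chance. A minimal sketch:

```python
import sys

a = "".join(["he", "llo"])   # built at runtime
b = "hello"                  # literal

print(a == b)                # True: same value

# sys.intern returns the canonical copy, so identity holds afterwards.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)  # True
```

For everyday comparisons, == remains the right tool.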
6️⃣ Why Strings Are Hashable (And Lists Aren’t)
Strings:
- Immutable
- Stable hash value
- Can be dict keys
Lists:
- Mutable
- Hash would change
- ❌ Cannot be dict keys
d = {"key": 1} # OK
d[[1,2]] = 3 # TypeError
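A common workaround, sketched here, is to convert the list to a tuple, which is hashable as long as its items are:

```python
d = {"key": 1}

try:
    d[[1, 2]] = 3
    list_key_ok = True
except TypeError as exc:
    list_key_ok = False
    error_message = str(exc)   # mentions that list is unhashable

# A tuple of hashable items is the usual immutable stand-in for a list key.
d[tuple([1, 2])] = 3
print(d[(1, 2)])  # 3
```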
7️⃣ Slicing Strings Is NOT Free
s = "abcdefghijklmnopqrstuvwxyz"
t = s[5:15]
- Creates a new string
- Copies characters
- O(k) time and memory for a slice of length k
CPython does not return views for str slices; every slice is an independent copy.
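When zero-copy slicing matters, one option is to work on bytes through memoryview, which does hand out views. Note that this applies to bytes-like objects, not to str. A short sketch:

```python
s = "abcdefghijklmnopqrstuvwxyz"
t = s[5:15]                     # new str object holding a copy of 10 characters

data = s.encode("ascii")
view = memoryview(data)[5:15]   # no copy: a window onto the same bytes

print(t)                        # fghijklmno
print(bytes(view))              # b'fghijklmno'
print(view.obj is data)         # True
```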
8️⃣ Common Performance Trap (VERY COMMON)
❌ Bad:
s = ""
for i in range(10000):
s += str(i)
Why bad?
- New string every iteration
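- O(n²) copying in the worst case (CPython can sometimes extend the string in place, but that is an implementation detail)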
✅ Good:
parts = []
for i in range(10000):
parts.append(str(i))
s = "".join(parts)
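Two other ways to get the same result are sketched below: feeding join a generator, and writing into an io.StringIO buffer. The function names are illustrative only:

```python
import io

def build_with_join(n: int) -> str:
    # The generator feeds join directly; no explicit list is needed.
    return "".join(str(i) for i in range(n))

def build_with_stringio(n: int) -> str:
    # io.StringIO acts as an in-memory text buffer.
    buf = io.StringIO()
    for i in range(n):
        buf.write(str(i))
    return buf.getvalue()

result = build_with_join(10000)
print(len(result))  # 38890
```

Either way the final string is assembled once, rather than rebuilt on every iteration.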
9️⃣ str() vs repr() (INTERVIEW CLASSIC)
s = "hello"
print(str(s)) # hello
print(repr(s)) # 'hello'
- str() → user-friendly
- repr() → developer/debug-friendly
Rule:
repr(obj) should ideally return a string that recreates the object, e.g. when passed to eval().
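For user-defined types, the two are controlled by __str__ and __repr__. The sketch below uses a hypothetical Point class, invented only to illustrate the rule:

```python
class Point:
    """Hypothetical class used only to illustrate __str__ vs __repr__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Looks like the constructor call that would rebuild the object.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Short, human-oriented form.
        return f"({self.x}, {self.y})"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

p = Point(1, 2)
print(str(p))              # (1, 2)
print(repr(p))             # Point(1, 2)
print(eval(repr(p)) == p)  # True
```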
🔟 Comparing Strings: == vs is
a = "py"
b = "py"
a is b # maybe True
c = "".join(["p", "y"])
a is c # False
✔ Use == for strings
❌ Never use is to compare string values
11️⃣ Strings & Memory Sharing
Because strings are immutable:
- Safe to share
- Safe to cache
- Safe as dict keys
- Safe across threads
This design enables:
- String interning
- Fast dictionary lookups
- Reduced memory usage
🔥 INTERVIEW TRAPS (DAY 7)
Q1
a = "hello"
b = "he" + "llo"
print(a is b)
✔ Often True (compile-time optimization)
Q2
a = "hello"
b = "he"
b += "llo"
print(a is b)
✔ False (runtime creation)
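The difference between Q1 and Q2 can be inspected by compiling each snippet and looking at its constants. This sketch relies on CPython's constant folding, which is an implementation detail and could differ on other interpreters or versions:

```python
# Q1: the concatenation of two literals is folded at compile time.
folded = compile('b = "he" + "llo"', "<q1>", "exec")
q1_has_hello = "hello" in folded.co_consts

# Q2: the += runs at runtime, so only the pieces appear as constants.
unfolded = compile('b = "he"\nb += "llo"', "<q2>", "exec")
q2_has_hello = "hello" in unfolded.co_consts

print(q1_has_hello, q2_has_hello)
```

On CPython, the first flag should be True and the second False, which matches the answers above.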
Q3
print("a" * 1000 == "".join(["a"] * 1000))
✔ True
But identity differs.
🧠 DAY 7 MENTAL CHECKLIST
Before string operations, ask:
- Is this in a loop?
- Am I creating many temporary strings?
- Do I need bytes instead?
- Is identity being tested incorrectly?
📝 DAY 7 ASSIGNMENT (IMPORTANT)
1️⃣ Predict output (no running):
a = "data"
b = "da" + "ta"
c = "".join(["da", "ta"])
print(a is b)
print(a is c)
2️⃣ Explain clearly:
- Why strings are immutable
- Why Python 3 made Unicode default
3️⃣ Design question:
How does string immutability help dictionary performance?
🔜 DAY 8 PREVIEW
DAY 8 — Lists (Dynamic Arrays, Over-allocation & Copy Traps)
You’ll learn:
- How lists grow
- Why append() is amortized O(1)
- Shallow vs deep copy disasters
- Hidden list performance traps
When ready, say 👉 “START DAY 8”