Excellent.
DAY 7 is where Python strings stop being “just text” and become memory, performance, and Unicode engineering.

Many production slowdowns and interview traps come from misunderstanding strings.


🧠 DAY 7 — Strings: Immutability, Interning & Unicode Internals


🔑 CORE IDEA OF DAY 7

Python strings are immutable Unicode objects optimized for safety, hashing, and sharing — not mutation.


1️⃣ Strings Are Objects (Not Char Arrays)

In C:

char s[] = "hello";   // mutable

In Python:

s = "hello"           # immutable object

You cannot change a string in place.

s[0] = "H"   # TypeError

Why this design choice?

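  • Hash safety: the hash can be computed once and cached, because the contents never change
  • Thread safety: no reader can ever observe a half-modified string
  • Aggressive reuse (interning): equal strings can safely share one object
  • Performance predictability: strings can be passed around by reference without defensive copies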
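
A minimal sketch of what immutability looks like in practice: item assignment fails, and "modifying" operations hand back a new object while the original stays untouched. The variable names here are illustrative only.

```python
s = "hello"

try:
    s[0] = "H"              # str does not support item assignment
except TypeError as exc:
    error = str(exc)

# The usual workaround: build a new string from pieces of the old one.
capitalized = "H" + s[1:]
upper = s.upper()           # string methods return new strings too

print(error)
print(s, capitalized, upper)
```

After this runs, s still refers to the original lowercase text; only the new names hold the changed versions.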

2️⃣ What Happens in s += "world" (VERY IMPORTANT)

s = "hello"
s += "world"

❌ What people think

  • String is extended

✅ What actually happens

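  1. A new string "helloworld" is created
  2. The old "hello" object is left unchanged
  3. s is rebound to the new object

Each concatenation copies both operands, which is why repeated concatenation in a loop can be slow. CPython sometimes resizes the left operand in place when nothing else references it, but that is an implementation detail, not a guarantee.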
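
The rebinding can be made visible by keeping a second name on the original object. This is a small sketch; alias is just an illustrative name.

```python
s = "hello"
alias = s            # second reference to the same "hello" object

s += "world"         # builds a new string and rebinds s

print(alias)         # the original object is unchanged
print(s)
print(alias is s)    # the two names now refer to different objects
```

If strings were mutated in place, alias would have changed along with s.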


3️⃣ CPython String Internals (Simplified)

Conceptually:

PyUnicodeObject
 ├── ob_refcnt
 ├── ob_type
 ├── length
 ├── hash
 ├── state (compact / ascii / kind)
 └── data (code points)

Important optimizations:

  • ASCII-only strings use the most compact layout (1 byte per character)
  • Otherwise the widest code point in the string decides the width used for every character: 1, 2, or 4 bytes (PEP 393)
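
The adaptive width can be observed, at least on CPython, with sys.getsizeof. The sketch below assumes CPython 3.3 or later; bytes_per_char is a hypothetical helper that compares two lengths of the same character so the fixed object header cancels out.

```python
import sys

def bytes_per_char(ch: str, n: int = 1000) -> float:
    """Estimate storage per character by differencing two string sizes."""
    longer = sys.getsizeof(ch * (2 * n))
    shorter = sys.getsizeof(ch * n)
    return (longer - shorter) / n

widths = {
    "a": bytes_per_char("a"),     # ASCII
    "न": bytes_per_char("न"),     # Basic Multilingual Plane, above Latin-1
    "😀": bytes_per_char("😀"),   # outside the Basic Multilingual Plane
}

for ch, width in widths.items():
    print(repr(ch), width)
```

Other Python implementations may store strings differently, so treat the numbers as a CPython observation rather than a language guarantee.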

4️⃣ Unicode: Why Python 3 Strings Are Powerful

Python 3 strings are Unicode by default.

s = "नमस्ते"
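  • Each element of a str is a Unicode code point (one user-perceived character may be made of several)
  • Encoding to bytes is needed only at the boundaries: files, sockets, and other I/O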

Encoding vs Decoding

text = "hello"
b = text.encode("utf-8")
text2 = b.decode("utf-8")

🧠 Strings ≠ bytes
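
A short sketch of why the distinction matters: the same text has one length as a str (code points) and another as bytes (encoded size), and decoding with the wrong codec fails. The variable names are illustrative.

```python
text = "नमस्ते"
data = text.encode("utf-8")

n_code_points = len(text)   # counted in code points
n_bytes = len(data)         # each Devanagari code point takes 3 bytes in UTF-8

round_trip = data.decode("utf-8")

try:
    data.decode("ascii")    # wrong codec for this data
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

print(n_code_points, n_bytes, round_trip == text, wrong_codec_failed)
```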


5️⃣ String Interning (INTERVIEW FAVORITE)

CPython may reuse identical immutable strings.

a = "hello"
b = "hello"
a is b   # True (often)

But:

a = "".join(["he", "llo"])
b = "hello"
a is b   # False

Why?

  • String literals are compile-time constants; CPython typically interns the identifier-like ones
  • Strings built at runtime (join, +=, slicing, I/O) are usually not interned

⚠️ Never rely on is for strings.
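
When sharing one object per distinct value is actually wanted (for example, many repeated keys), sys.intern makes it explicit instead of leaving it to chance. A minimal sketch:

```python
import sys

a = "".join(["he", "llo"])   # built at runtime
b = "hello"                  # literal

equal = (a == b)                                     # value comparison
same_after_intern = sys.intern(a) is sys.intern(b)   # identity after explicit interning

print(equal, same_after_intern)
```

Even with explicit interning, == remains the right operator for comparing values; interning is a memory and speed optimization, not a comparison technique.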


6️⃣ Why Strings Are Hashable (And Lists Aren’t)

Strings:

  • Immutable
  • Stable hash value
  • Can be dict keys

Lists:

  • Mutable
  • A content-based hash would change after mutation, so list defines no hash at all
  • ❌ Cannot be dict keys
d = {"key": 1}     # OK
d[[1, 2]] = 3      # TypeError: unhashable type: 'list'
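
When list-like data is needed as a key, the usual approach is to convert it to a tuple first. A small sketch, with grid as an illustrative name:

```python
grid = {}

try:
    grid[[1, 2]] = "list key"        # lists are unhashable
except TypeError as exc:
    error = str(exc)

grid[(1, 2)] = "tuple key"           # tuples of hashable items work
coords = [3, 4]
grid[tuple(coords)] = "converted"    # convert a list before using it as a key

print(error)
print(grid)
```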

7️⃣ Slicing Strings Is NOT Free

s = "abcdefghijklmnopqrstuvwxyz"
t = s[5:15]
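  • Creates a new string
  • Copies the selected characters
  • O(k) time and memory for a slice of length k

CPython does not share a buffer between a string and its slices, so every slice is an independent copy.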
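
For binary data, where slice copies can add up, memoryview offers slicing without copying. This is a sketch of that alternative for bytes; it does not apply to str, which has no view type.

```python
data = b"abcdefghijklmnopqrstuvwxyz"

view = memoryview(data)[5:15]   # a window onto data, no bytes copied yet
copied = data[5:15]             # an ordinary slice, which copies

shares_buffer = view.obj is data
as_bytes = bytes(view)          # copying happens only here, on request

text = "abcdefghijklmnopqrstuvwxyz"
t = text[5:15]                  # str slices are always copies

print(shares_buffer, as_bytes, t)
```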


8️⃣ Common Performance Trap (VERY COMMON)

❌ Bad:

s = ""
for i in range(10000):
    s += str(i)

Why bad?

  • Potentially a new, longer string every iteration
  • Up to O(n²) total copying (CPython can sometimes extend in place, but do not count on it)

✅ Good:

parts = []
for i in range(10000):
    parts.append(str(i))
s = "".join(parts)
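
Two other common ways to write the "good" version are a generator passed to join and an io.StringIO buffer. The sketch below only checks that all approaches build the same string; the function names are hypothetical, and actual timings depend on the interpreter and input size.

```python
import io

def build_with_concat(n: int) -> str:
    s = ""
    for i in range(n):
        s += str(i)              # may copy the whole string each time
    return s

def build_with_join(n: int) -> str:
    return "".join(str(i) for i in range(n))

def build_with_stringio(n: int) -> str:
    buf = io.StringIO()          # in-memory text buffer
    for i in range(n):
        buf.write(str(i))
    return buf.getvalue()

results = {f.__name__: f(1000) for f in
           (build_with_concat, build_with_join, build_with_stringio)}
print(len(set(results.values())))   # number of distinct results
```

To compare speed on a particular machine, the timeit module can be pointed at each function.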

9️⃣ str() vs repr() (INTERVIEW CLASSIC)

s = "hello"
print(str(s))   # hello
print(repr(s))  # 'hello'
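  • str() → readable output for end users
  • repr() → unambiguous output for developers and debugging

Rule:

Where practical, repr(obj) should be a valid Python expression that recreates the object, so that eval(repr(obj)) == obj.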
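
For a user-defined type, the rule might be followed like this. Point is a hypothetical class used only for illustration.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Written as a constructor call so it can recreate the object.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Friendlier form for display.
        return f"({self.x}, {self.y})"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

p = Point(1, 2)
readable = str(p)
debug = repr(p)
rebuilt = eval(debug)   # acceptable here only because the input is our own repr

print(readable, debug, rebuilt == p)
```

eval should never be applied to untrusted text; it appears here only to demonstrate the round trip.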


🔟 Comparing Strings: == vs is

a = "py"
b = "py"
a is b     # maybe True

c = "".join(["p", "y"])
a is c     # False

✔ Use == to compare string values
❌ Never use is to compare string values (it tests object identity)


11️⃣ Strings & Memory Sharing

Because strings are immutable:

  • Safe to share
  • Safe to cache
  • Safe as dict keys
  • Safe across threads

This design enables:

  • String interning
  • Fast dictionary lookups
  • Reduced memory usage
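
Dictionary lookups rely on hash and equality, not identity, so two separate string objects with the same contents find the same entry. A minimal sketch with illustrative names:

```python
key_literal = "user_id"
key_runtime = "".join(["user", "_", "id"])   # equal contents, built at runtime

cache = {key_literal: 42}

same_hash = hash(key_literal) == hash(key_runtime)
found = cache[key_runtime]                   # lookup succeeds via hash + ==

print(same_hash, found)
```

String hashes are randomized between interpreter runs by default, but within one process equal strings always hash equally, which is what the dictionary needs.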

🔥 INTERVIEW TRAPS (DAY 7)

Q1

a = "hello"
b = "he" + "llo"
print(a is b)

✔ Often True (compile-time optimization)


Q2

a = "hello"
b = "he"
b += "llo"
print(a is b)

✔ False in CPython (b is built at runtime, so it is a separate object)


Q3

print("a" * 1000 == "".join(["a"] * 1000))

✔ True (the contents are equal)
But the two are distinct objects, so an is comparison would give False.


🧠 DAY 7 MENTAL CHECKLIST

Before string operations, ask:

  1. Is this in a loop?
  2. Am I creating many temporary strings?
  3. Do I need bytes instead?
  4. Is identity being tested incorrectly?

📝 DAY 7 ASSIGNMENT (IMPORTANT)

1️⃣ Predict output (no running):

a = "data"
b = "da" + "ta"
c = "".join(["da", "ta"])

print(a is b)
print(a is c)

2️⃣ Explain clearly:

  • Why strings are immutable
  • Why Python 3 made Unicode default

3️⃣ Design question:

How does string immutability help dictionary performance?


🔜 DAY 8 PREVIEW

DAY 8 — Lists (Dynamic Arrays, Over-allocation & Copy Traps)

You’ll learn:

  • How lists grow
  • Why append() is amortized O(1)
  • Shallow vs deep copy disasters
  • Hidden list performance traps

When ready, say 👉 “START DAY 8”