Let’s do this step-by-step: first build a deep mental model of Python strings, then walk through all the exercises with solid patterns (brute-force → optimal where it matters).
1️⃣ Python Strings – Deep Mental Model
1.1 What actually is a string in Python?
- Type:
str - Conceptually: an immutable sequence of Unicode code points.
- Implementation (CPython):
- Stored as a
PyUnicodeObject(C struct). - Contains:
- Length (
len(s)) - Hash (cached after first computation)
- Pointer to raw character data
- Encoding details (internally can be 1/2/4 bytes per char depending on the max code point → “compact unicode representation”).
- Length (
- Stored as a
Immutability
- Once created, contents cannot change.
- Operations like
replace,upper,stripreturn new strings. - Doing:
s = "hello" s = s + "!"creates a new string"hello!"and rebindssto it.
This is why repeated concatenation in a loop is bad:
result = ""
for part in parts:
result += part # O(n^2) overall
Better:
result = "".join(parts) # O(total_length)
1.2 Interning (why a is b sometimes True)
Python may intern some strings:
- All identifiers (variable names, keywords).
- Short strings, or strings that look like identifiers.
- You can also manually intern:
sys.intern(s).
Interning means: keep a single copy in memory, reuse it → faster comparisons because you can sometimes compare pointers before content.
But never rely on this behavior in logic (== is correct, is is not).
1.3 Indexing & slicing
s[i]is O(1) in CPython (offset arithmetic plus correct width).s[i:j]creates a new string (copy of that slice).- Python strings are 0-indexed; negative indexes wrap from the end.
s = "abcdef"
s[1:4] # "bcd"
s[-2:] # "ef"
1.4 Common operations & inner behavior
Key point: most string methods are O(n) in length of the string (or n + m for substring).
len(s)– O(1) (Python stores length).in(substr in s) – O(n*m) worst case, but CPython uses optimized search (like Boyer–Moore variants) so it’s much faster in practice.s.find(sub),s.index(sub)– same underlying C search.s.replace(old, new)– single pass scanning, building new output.s.split(sep)– scan and cut into pieces; each piece is a new string.s.join(list_of_strings)– allocated once with correct final size, then copies parts → very efficient.
Case conversions:
s.lower(),s.upper()– iterate characters, apply Unicode mapping. For non-ASCII, rules are more complex.
Comparison:
s1 == s2,<, etc. – lexicographical comparison, char by char.
Hashing:
- String hash is computed on first use, then cached; dictionary/set lookups are very fast.
1.5 Encodings & bytes
- Python
stris Unicode. - Binary data →
bytes(immutable) orbytearray(mutable).
Conversions:
s = "नमस्ते"
b = s.encode("utf-8") # str -> bytes
s2 = b.decode("utf-8") # bytes -> str
Errors here are due to encoding mismatch, not “string issues”.
2️⃣ Key String Methods & How They’re Typically Implemented
High level (pseudo-internals):
find(sub)→ efficient substring search (e.g., Boyer–Moore / Two-way algorithm).startswith(prefix)/endswith(suffix)→ compare slices directly; no scanning of full string.strip(chars)→ scan from left & right while characters are inchars, then slice.replace(old, new):- Count occurrences to know final size.
- Allocate final string.
- Copy characters, substituting
newfor eachold.
You can mentally model them as linear scans plus some cleverness.
3️⃣ Exercises – Basics 🔤
I’ll give: problem → core idea → clean solution.
3.1 Is It a Vowel?
def is_vowel(ch: str) -> bool:
return len(ch) == 1 and ch.lower() in "aeiou"
3.2 Reverse String
def reverse_string(s: str) -> str:
return s[::-1]
# manual (for understanding):
def reverse_string_manual(s: str) -> str:
chars = []
for i in range(len(s) - 1, -1, -1):
chars.append(s[i])
return "".join(chars)
3.3 Remove Vowels from a String
def remove_vowels(s: str) -> str:
vowels = set("aeiouAEIOU")
return "".join(ch for ch in s if ch not in vowels)
3.4 Replace Spaces with Hyphens
def replace_spaces_with_hyphens(s: str) -> str:
return s.replace(" ", "-")
3.5 Valid Anagram (LeetCode)
from collections import Counter
def is_anagram(s: str, t: str) -> bool:
if len(s) != len(t):
return False
return Counter(s) == Counter(t)
3.6 Remove Adjacent Duplicates (LeetCode style: "abbaca" -> "ca")
Use a stack of characters:
def remove_adjacent_duplicates(s: str) -> str:
stack = []
for ch in s:
if stack and stack[-1] == ch:
stack.pop()
else:
stack.append(ch)
return "".join(stack)
3.7 Pangram Checker (a–z present at least once)
def is_pangram(s: str) -> bool:
letters = {ch.lower() for ch in s if ch.isalpha()}
return len(letters) == 26
3.8 Longest Common Prefix (LeetCode)
def longest_common_prefix(strs) -> str:
if not strs:
return ""
prefix = strs[0]
for s in strs[1:]:
while not s.startswith(prefix):
prefix = prefix[:-1]
if not prefix:
return ""
return prefix
3.9 First Unique Character in String (LeetCode)
from collections import Counter
def first_unique_char(s: str) -> int:
freq = Counter(s)
for i, ch in enumerate(s):
if freq[ch] == 1:
return i
return -1
3.10 String Compression (LeetCode, e.g. "aabcccccaaa" -> "a2b1c5a3")
def compress_string(s: str) -> str:
if not s:
return ""
result = []
count = 1
for i in range(1, len(s)):
if s[i] == s[i-1]:
count += 1
else:
result.append(s[i-1] + str(count))
count = 1
result.append(s[-1] + str(count))
compressed = "".join(result)
return compressed if len(compressed) < len(s) else s
3.11 Longest Palindromic Substring
Use expand around center:
def longest_palindrome(s: str) -> str:
if not s:
return ""
def expand(l, r):
while l >= 0 and r < len(s) and s[l] == s[r]:
l -= 1
r += 1
return l + 1, r # [start, end)
best_start, best_end = 0, 1
for i in range(len(s)):
# odd length
l1, r1 = expand(i, i)
if r1 - l1 > best_end - best_start:
best_start, best_end = l1, r1
# even length
l2, r2 = expand(i, i + 1)
if r2 - l2 > best_end - best_start:
best_start, best_end = l2, r2
return s[best_start:best_end]
3.12 Is Subsequence (LeetCode)
def is_subsequence(s: str, t: str) -> bool:
i = 0
for ch in t:
if i < len(s) and s[i] == ch:
i += 1
return i == len(s)
3.13 Check if Two Strings Are Rotations of Each Other
def are_rotations(s1: str, s2: str) -> bool:
if len(s1) != len(s2):
return False
return s2 in (s1 + s1)
3.14 Check if String is Palindrome (Ignore Case & Non-Alphanumerics)
def is_palindrome_clean(s: str) -> bool:
filtered = [ch.lower() for ch in s if ch.isalnum()]
i, j = 0, len(filtered) - 1
while i < j:
if filtered[i] != filtered[j]:
return False
i += 1
j -= 1
return True
3.15 Count Vowels and Consonants in a String
def count_vowels_consonants(s: str):
vowels_set = set("aeiouAEIOU")
vowels = consonants = 0
for ch in s:
if ch.isalpha():
if ch in vowels_set:
vowels += 1
else:
consonants += 1
return vowels, consonants
3.16 Find First Non-Repeating Character in a String
(Almost same as “First Unique Character”, but return char instead of index.)
from collections import Counter
def first_non_repeating_char(s: str) -> str | None:
freq = Counter(s)
for ch in s:
if freq[ch] == 1:
return ch
return None
3.17 Group Anagrams (LeetCode)
Use sorted string or frequency tuple as key:
from collections import defaultdict
def group_anagrams(strs):
groups = defaultdict(list)
for s in strs:
key = "".join(sorted(s)) # or tuple of counts for O(n)
groups[key].append(s)
return list(groups.values())
4️⃣ Intermediate & Advanced 📚
4.1 Implement strStr() / Substring Search (no built-in search)
Return index of first occurrence of needle in haystack or -1.
Brute force (O(n*m)):
def str_str(haystack: str, needle: str) -> int:
if needle == "":
return 0
n, m = len(haystack), len(needle)
for i in range(n - m + 1):
j = 0
while j < m and haystack[i + j] == needle[j]:
j += 1
if j == m:
return i
return -1
(You can later upgrade to KMP for O(n + m)).
4.2 Longest Substring Without Repeating Characters
Sliding window with a map of last seen indices:
def length_of_longest_substring(s: str) -> int:
last = {}
left = 0
best = 0
for right, ch in enumerate(s):
if ch in last and last[ch] >= left:
left = last[ch] + 1
last[ch] = right
best = max(best, right - left + 1)
return best
4.3 Longest Substring with At Most K Distinct Characters
Sliding window + count of characters:
from collections import Counter
def longest_substring_k_distinct(s: str, k: int) -> int:
if k == 0:
return 0
freq = Counter()
left = 0
best = 0
distinct = 0
for right, ch in enumerate(s):
if freq[ch] == 0:
distinct += 1
freq[ch] += 1
while distinct > k:
left_ch = s[left]
freq[left_ch] -= 1
if freq[left_ch] == 0:
distinct -= 1
left += 1
best = max(best, right - left + 1)
return best
4.4 Minimum Window Substring
Classic sliding window (a bit longer but important):
from collections import Counter
def min_window(s: str, t: str) -> str:
if not s or not t:
return ""
need = Counter(t)
missing = len(t)
left = start = end = 0
for right, ch in enumerate(s, 1): # right is 1-based here
if need[ch] > 0:
missing -= 1
need[ch] -= 1
while missing == 0:
if end == 0 or right - left < end - start:
start, end = left, right
left_ch = s[left]
need[left_ch] += 1
if need[left_ch] > 0:
missing += 1
left += 1
return s[start:end]
4.5 Valid Palindrome II (remove at most one character)
def valid_palindrome_ii(s: str) -> bool:
def is_pal_range(i, j):
while i < j:
if s[i] != s[j]:
return False
i += 1
j -= 1
return True
i, j = 0, len(s) - 1
while i < j:
if s[i] != s[j]:
return is_pal_range(i + 1, j) or is_pal_range(i, j - 1)
i += 1
j -= 1
return True
4.6 Decode String (3[a2[c]] → accaccacc)
Use stack:
def decode_string(s: str) -> str:
num_stack = []
str_stack = []
current = []
num = 0
for ch in s:
if ch.isdigit():
num = num * 10 + int(ch)
elif ch == '[':
num_stack.append(num)
str_stack.append("".join(current))
num = 0
current = []
elif ch == ']':
repeat = num_stack.pop()
prev = str_stack.pop()
current = [prev + "".join(current) * repeat]
else:
current.append(ch)
return "".join(current)
4.7 Basic Calculator (only +, -, (, ), spaces)
Evaluate expression like "1 + (2 - 3)".
Idea: scan characters, use a stack to remember signs before parentheses.
def basic_calculator(expr: str) -> int:
stack = [1] # sign context
sign = 1
result = 0
i = 0
n = len(expr)
while i < n:
ch = expr[i]
if ch.isdigit():
num = 0
while i < n and expr[i].isdigit():
num = num * 10 + int(expr[i])
i += 1
result += sign * num
continue # i already moved
elif ch == '+':
sign = stack[-1] * 1
elif ch == '-':
sign = stack[-1] * -1
elif ch == '(':
stack.append(sign)
elif ch == ')':
stack.pop()
i += 1
return result
4.8 Multiply Two Large Numbers Given as Strings
Manual multiplication:
def multiply_strings(num1: str, num2: str) -> str:
if num1 == "0" or num2 == "0":
return "0"
m, n = len(num1), len(num2)
res = [0] * (m + n)
for i in range(m - 1, -1, -1):
for j in range(n - 1, -1, -1):
mul = (ord(num1[i]) - ord('0')) * (ord(num2[j]) - ord('0'))
total = mul + res[i + j + 1]
res[i + j + 1] = total % 10
res[i + j] += total // 10
# skip leading zeros
i = 0
while i < len(res) and res[i] == 0:
i += 1
return "".join(str(d) for d in res[i:])
4.9 Add Two Binary Strings
def add_binary(a: str, b: str) -> str:
i, j = len(a) - 1, len(b) - 1
carry = 0
result = []
while i >= 0 or j >= 0 or carry:
bit_a = int(a[i]) if i >= 0 else 0
bit_b = int(b[j]) if j >= 0 else 0
total = bit_a + bit_b + carry
result.append(str(total % 2))
carry = total // 2
i -= 1
j -= 1
return "".join(reversed(result))
4.10 Integer to Roman
def int_to_roman(num: int) -> str:
values = [
(1000, "M"), (900, "CM"),
(500, "D"), (400, "CD"),
(100, "C"), (90, "XC"),
(50, "L"), (40, "XL"),
(10, "X"), (9, "IX"),
(5, "V"), (4, "IV"),
(1, "I"),
]
result = []
for val, sym in values:
while num >= val:
num -= val
result.append(sym)
return "".join(result)
4.11 Roman to Integer
def roman_to_int(s: str) -> int:
values = {
"I": 1, "V": 5, "X": 10, "L": 50,
"C": 100, "D": 250, "M": 1000
}
# correct values:
values["D"] = 500 # fix typo
total = 0
i = 0
while i < len(s):
if i + 1 < len(s) and values[s[i]] < values[s[i + 1]]:
total += values[s[i + 1]] - values[s[i]]
i += 2
else:
total += values[s[i]]
i += 1
return total
(Note: I fixed D to 500 inside code.)
5️⃣ How to Actually Master This (not just read)
Here’s how I’d suggest you practice this string pack:
- Phase 1 – Internals
- Play with:
id()of small vs large strings, trysys.intern.- Time
.joinvs+=in a loop usingtimeit.
- Print small experiments to see immutability in action.
- Play with:
- Phase 2 – Basics (All 17 problems)
- Try each one from scratch before looking at the solution.
- For each problem, write:
- Brute force version.
- Improved version using a pattern (two pointers, hash map, stack, etc.).
- Phase 3 – Sliding Window Problems
- Group together:
- Longest substring without repeating.
- Longest substring with at most K distinct.
- Minimum window substring.
- Recognize pattern: window + frequency + shrink while condition broken.
- Group together:
- Phase 4 – Stack & Expression Problems
- Decode string.
- Basic calculator.
- Phase 5 – Number/String Conversions
multiply_strings,add_binary, int↔Roman.
If you like, next I can:
- Turn this into a multi-day roadmap (like “Deep Python Strings in 7 days”), and
- Add diagrams for 2–3 tricky ones (minimum window, decode string, multiply strings) with step-by-step state of variables.