Regular Expressions (Regex) using Python

🔥 Perfect! We’re going to build your interactive “Regex Practice Playground”: a hands-on learning lab for practicing regular expressions in Python, one runnable challenge at a time, with output you can check right away.

Here’s what you’ll get inside the playground:

  • 🧩 20 real-world regex problems (email, logs, phone, NLP cleaning, etc.)
  • 💬 Hints and theory inline
  • 🧠 Explanations after each challenge
  • 🧰 Executable Python code blocks so you can actually run + tweak

Let’s first create a notebook-style Python script that you can run in any IDE, Jupyter, or Colab.


🧠 Regex Practice Playground — Full Python File

Here’s the complete code (you can copy-paste and run directly):

# ===========================================
# 🧩 REGEX PRACTICE PLAYGROUND
# Learn Regular Expressions step-by-step
# ===========================================

import re

def divider(title):
    print("\n" + "="*70)
    print(f"🔹 {title}")
    print("="*70 + "\n")

# 1️⃣ Find All Emails
divider("1️⃣ Extract all emails from text")
text = "Contact us at support@example.com, sales@myshop.in or admin@openai.com"
pattern = r"[\w\.-]+@[\w\.-]+\.\w+"
print(re.findall(pattern, text))

# 2️⃣ Extract Phone Numbers
divider("2️⃣ Extract 10-digit Indian mobile numbers")
text = "Call 9876543210 or 9123456789 for help. Old number 12345 is invalid."
pattern = r"\b\d{10}\b"
print(re.findall(pattern, text))

# 3️⃣ Clean Text for NLP
divider("3️⃣ Clean special characters from text")
text = "Hi!!! My name is Rajeev##...I love Python$@#"
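pattern = r"[^A-Za-z\s]+"
# Replace each run of symbols with a space so "Rajeev##...I" does not fuse into "RajeevI",
# then collapse the leftover whitespace.
print(re.sub(r"\s+", " ", re.sub(pattern, " ", text)).strip())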

# 4️⃣ Extract Hashtags
divider("4️⃣ Extract hashtags from social media text")
text = "Loving #Python #AI and #MachineLearning"
pattern = r"#\w+"
print(re.findall(pattern, text))

# 5️⃣ Extract Date & Time from Log
divider("5️⃣ Extract date and time from log line")
log = "2025-10-29 14:55:33 INFO - Process started"
pattern = r"(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2})"
match = re.search(pattern, log)
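# re.search returns None when nothing matches, so guard before calling .groups()
print(match.groups() if match else "No match")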

# 6️⃣ Validate Email
divider("6️⃣ Validate email address format")
def is_valid_email(email):
    # fullmatch must consume the whole string; with re.match, "$" would also accept a trailing newline
    return bool(re.fullmatch(r"[\w.-]+@[\w.-]+\.\w+", email))

emails = ["rajeev@test.com", "not-an-email", "ai.dev@openai.org"]
for e in emails:
    print(e, "✅" if is_valid_email(e) else "❌")

# 7️⃣ Extract Domain Names
divider("7️⃣ Extract domain from each email")
emails = ["user1@gmail.com", "contact@openai.com", "raj@iitm.ac.in"]
domains = [re.search(r"@([\w\.-]+\.\w+)", email).group(1) for email in emails]
print(domains)

# 8️⃣ Extract Prices
divider("8️⃣ Extract prices from e-commerce text")
text = "Deals: ₹999 only! ₹1500 discounted! ₹49.99 special!"
pattern = r"₹\d+(?:\.\d+)?"
print(re.findall(pattern, text))

# 9️⃣ Replace Multiple Spaces
divider("9️⃣ Replace multiple spaces with single space")
text = "This    is   spaced   out   sentence"
print(re.sub(r"\s+", " ", text))

# 🔟 Extract IP Addresses
divider("🔟 Extract IP addresses from log")
log = "Login from 192.168.1.1 failed. Backup from 10.0.0.5 succeeded."
pattern = r"(?:\d{1,3}\.){3}\d{1,3}"
print(re.findall(pattern, log))

# 11️⃣ Extract URLs
divider("11️⃣ Extract URLs from text")
text = "Visit https://openai.com or http://example.org for info."
pattern = r"https?://[A-Za-z0-9./]+"
print(re.findall(pattern, text))

# 12️⃣ Extract PIN Codes (India)
divider("12️⃣ Extract 6-digit PIN codes")
text = "My home PIN: 560034, office PIN: 400001, invalid: 1234"
pattern = r"\b\d{6}\b"
print(re.findall(pattern, text))

# 13️⃣ Extract Words Starting with Capital Letter
divider("13️⃣ Extract words starting with capital letters")
text = "Python is Popular in India and USA"
pattern = r"\b[A-Z][a-zA-Z]*\b"
print(re.findall(pattern, text))

# 14️⃣ Extract 4-Digit Years
divider("14️⃣ Extract all 4-digit years")
text = "Data from 1999, 2020, and 2025 are used"
pattern = r"\b\d{4}\b"
print(re.findall(pattern, text))

# 15️⃣ Mask Sensitive Data (Show only the last 4 digits of a card)
divider("15️⃣ Mask card numbers except last 4 digits")
text = "Credit card 4539123456785678 used successfully"
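# \b on both sides limits the match to standalone 16-digit numbers, not part of a longer digit run
masked = re.sub(r"\b\d{12}(\d{4})\b", r"************\1", text)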
print(masked)

# 16️⃣ Extract Decimal Numbers
divider("16️⃣ Extract floating point numbers")
text = "The values are 3.14, 6.28 and 42"
pattern = r"\d+\.\d+"
print(re.findall(pattern, text))

# 17️⃣ Find Words of Length > 5
divider("17️⃣ Extract words longer than 5 letters")
text = "Regex makes pattern matching powerful"
pattern = r"\b\w{6,}\b"
print(re.findall(pattern, text))

# 18️⃣ Validate Indian Mobile Numbers (start with 6-9)
divider("18️⃣ Validate Indian mobile numbers")
numbers = ["9876543210", "8123456789", "1234567890"]
pattern = r"[6-9]\d{9}"
for n in numbers:
    # fullmatch requires the whole string to match, so no ^...$ anchors are needed
    print(n, "✅" if re.fullmatch(pattern, n) else "❌")

# 19️⃣ Extract CSV Fields
divider("19️⃣ Split a CSV line by comma using regex")
csv_line = "Rajeev,28,India,Data Engineer"
# Simple split only: quoted fields that contain commas need the csv module instead.
print(re.split(r",", csv_line))

# 20️⃣ Extract HTML Tags
divider("20️⃣ Extract HTML tags from markup")
html = "<div><h1>Title</h1><p>Paragraph</p></div>"
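# The group captures only the tag name. Opening and closing tags both match,
# but tags with attributes (e.g. <a href="...">) do not.
pattern = r"</?(\w+)>"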
print(re.findall(pattern, html))

print("\n🎉 CONGRATULATIONS! You completed the Regex Playground!")
print("👉 Try modifying the patterns and text above to experiment more.")
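
One experiment worth trying once the script runs: the IP pattern in task 10 accepts any four groups of 1 to 3 digits, so a string like 999.10.0.5 would also be reported. Below is a minimal sketch of a stricter check. It assumes you are happy to find candidates with the regex and then filter them in plain Python; the names IP_CANDIDATE and extract_valid_ipv4 and the extra log text are made up for this example and are not part of the script above.

```python
import re

# Same loose shape as task 10, with \b added on both sides (an assumption of this sketch).
IP_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_valid_ipv4(text):
    """Return dotted-quad matches whose four octets all fall in 0-255."""
    valid = []
    for candidate in IP_CANDIDATE.findall(text):
        # The regex only checks the shape; the numeric range is checked here.
        if all(0 <= int(octet) <= 255 for octet in candidate.split(".")):
            valid.append(candidate)
    return valid

# Hypothetical log line: the middle address has an out-of-range octet.
log = "Login from 192.168.1.1 failed. Probe from 999.10.0.5 ignored. Backup from 10.0.0.5 succeeded."
print(extract_valid_ipv4(log))
```

Doing the range check in Python keeps the pattern short and readable. A regex-only version is possible, but it needs a long alternation for each octet.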

🧠 What This Script Does

  • You get 20 real-world regex tasks, each printed with a nice divider.
  • You can edit text or pattern in any task and rerun the script to see how the output changes.
  • It covers cleaning, validation, data extraction, and log parsing, which are common day-to-day uses of regex.
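
For the log parsing tasks, named groups can make the captured fields easier to read than positional tuples. Here is a small sketch built on the log line from task 5. The level and message groups, and the names LOG_LINE and parse_log_line, are assumptions added for illustration and do not appear in the script above.

```python
import re

# Named groups label each captured field. The "level" and "message" groups are
# additions for this sketch; task 5 captures only the date and the time.
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) - "
    r"(?P<message>.*)"
)

def parse_log_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

print(parse_log_line("2025-10-29 14:55:33 INFO - Process started"))
print(parse_log_line("not a log line"))
```

Compiling the pattern once with re.compile is a reasonable choice when the same expression runs over many lines, and returning None for non-matching lines lets the caller decide how to handle them.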

🧱 Next Steps (Optional Add-ons)

Would you like me to:

  1. 🧭 Add guided hints and progressive difficulty (like a regex coding game that waits for user input)?
  2. 📊 Add a score tracker + explanation popup system (ideal for Jupyter Notebook)?
  3. 📘 Add a “Regex Reference Sheet” (mini cheat sheet with all metacharacters, quantifiers, and group examples)?

Which one should I build next for you?