The re module in Python provides functions for working with regular expressions. It allows you to match, search, replace, and split text based on patterns.
Basic Usage
Python
import re
text = "The phone number is 415-555-1212."
pattern = r'\d{3}-\d{3}-\d{4}'
match = re.search(pattern, text)
if match:
    print("Phone number found:", match.group())
else:
    print("Phone number not found")
Common Functions
- re.match(): Matches the pattern at the beginning of the string.
- re.search(): Searches for the pattern anywhere in the string.
- re.findall(): Returns a list of all non-overlapping matches in the string.
- re.sub(): Replaces occurrences of the pattern with a new string.
- re.split(): Splits the string at occurrences of the pattern.
Example:
Python
import re
text = "The quick brown fox jumps over the lazy dog."
# Find all words
words = re.findall(r'\w+', text)
print(words)
# Replace 'fox' with 'cat'
new_text = re.sub(r'fox', 'cat', text)
print(new_text)
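The remaining functions from the list can be sketched in the same way. This is a minimal illustration that reuses the sentence above; the input passed to re.split() is made up for the example.

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# re.match() only succeeds if the pattern matches at the very start of the string
at_start = re.match(r'quick', text)    # no match: the text starts with "The"
# re.search() scans the whole string for the first match
anywhere = re.search(r'quick', text)   # match object for "quick"

print(at_start)
print(anywhere.start())

# re.split() cuts the string wherever the pattern matches (here: runs of whitespace)
parts = re.split(r'\s+', "one  two\tthree")
print(parts)
```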
Match Object
The re.search() and re.match() functions return a match object if there is a match, and None otherwise. This object contains information about the match, such as the matched text, start and end positions, and groups.
Python
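import re
text = "The phone number is 415-555-1212."
# Parentheses create groups that capture each part of the number
match = re.search(r'(\d{3})-(\d{3})-(\d{4})', text)
if match:
    area_code, exchange, number = match.groups()
    print(area_code, exchange, number)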
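The positions mentioned above are also available on the match object. The following is a small sketch using the same phone number text; the group name area is an illustrative choice and not part of the original example.

```python
import re

text = "The phone number is 415-555-1212."
# (?P<area>...) gives the first group a name in addition to its number
match = re.search(r'(?P<area>\d{3})-(\d{3})-(\d{4})', text)

if match:
    print(match.group())               # the whole matched text
    print(match.group(1))              # first group, by number
    print(match.group('area'))         # the same group, by name
    print(match.start(), match.end())  # start and end indices in text
    print(match.span())                # both indices as a tuple
```

Slicing text[match.start():match.end()] gives back the same string as match.group().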
Compiling Regular Expressions
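If you use the same pattern many times, you can compile it once into a pattern object with re.compile() and reuse it. The pattern object provides the same operations as methods, such as search(), findall(), and sub().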
Python
import re
text = "The phone number is 415-555-1212."
phone_pattern = re.compile(r'\d{3}-\d{3}-\d{4}')
match = phone_pattern.search(text)
print(match.group())
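Compiling is mostly useful when one pattern is applied to many strings. A minimal sketch of that reuse, assuming a small list of sample lines invented for this example:

```python
import re

phone_pattern = re.compile(r'\d{3}-\d{3}-\d{4}')

# Hypothetical sample data, made up for illustration
lines = [
    "Office: 415-555-1212",
    "No number here",
    "Home: 212-555-0199, cell: 646-555-0100",
]

# The compiled pattern is built once and reused for every line
found = []
for line in lines:
    found.extend(phone_pattern.findall(line))

print(found)
```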
Additional Features
- Flags: Modify the behavior of regular expressions (e.g., re.IGNORECASE for case-insensitive matching).
- Groups: Capture specific parts of the matched text using parentheses.
- Lookahead and lookbehind assertions: Require that certain text follows or precedes the current position without including that text in the match.
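Each of these features can be shown with a one-line pattern. The strings below are invented for illustration, and the patterns are deliberately simplified sketches rather than production-ready validators.

```python
import re

# Flag: re.IGNORECASE makes the match case-insensitive
case_hits = re.findall(r'python', "Python, PYTHON and python", flags=re.IGNORECASE)

# Groups: parentheses capture the parts before and after the '@'
m = re.search(r'(\w+)@(\w+)\.com', "contact: alice@example.com")
user, domain = m.groups()

# Lookbehind: digits only when preceded by '$'; the '$' is not part of the match
prices = re.findall(r'(?<=\$)\d+', "apples $5, pears $12, 30 plums")

# Lookahead: a word only when followed by ':'; the ':' is not part of the match
keys = re.findall(r'\w+(?=:)', "name: Ada, role: engineer")

print(case_hits, user, domain, prices, keys)
```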
By understanding these core functions and concepts, you can effectively use regular expressions in your Python programs.
Using the re Module
What is the re module in Python?
The re module provides functions for working with regular expressions.
Why use the re module?
To efficiently search, match, and manipulate text strings based on patterns.
What is the difference between re.match and re.search?
re.match matches only at the beginning of the string, while re.search searches the entire string.
How do I find all occurrences of a pattern in a string?
Use re.findall().
How do I replace occurrences of a pattern in a string?
Use re.sub().
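What is a raw string (r'') in Python?
It is a string literal prefixed with r. The prefix prevents backslashes from being interpreted as escape characters, so a pattern such as r'\b' reaches the re module as a backslash followed by b rather than as a backspace character.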
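The difference is easiest to see with \b, which is a backspace in a normal string but a word boundary in a regular expression. A short sketch, using a sample string made up for this example:

```python
import re

# In a normal string, '\b' is a single backspace character
print(len('\b'))
# In a raw string, r'\b' is two characters: a backslash and 'b'
print(len(r'\b'))

text = "cat concat cat"

# Raw string: \b reaches the re module and means "word boundary"
raw_hits = re.findall(r'\bcat\b', text)
# Normal string: the pattern contains literal backspaces, which the text lacks
plain_hits = re.findall('\bcat\b', text)

print(raw_hits)
print(plain_hits)
```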
When should I use regular expressions?
For complex text processing tasks that require pattern matching.
How can I improve the readability of regular expressions?
Compile the pattern with the re.VERBOSE flag, which lets you use comments and whitespace within the expression.
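As a sketch of what this can look like, here is the phone number pattern from earlier written across several lines with the re.VERBOSE flag. The comment wording is illustrative.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment
phone_pattern = re.compile(r"""
    (\d{3})   # area code
    -         # separator
    (\d{3})   # exchange
    -         # separator
    (\d{4})   # line number
""", re.VERBOSE)

match = phone_pattern.search("The phone number is 415-555-1212.")
print(match.groups())
```

Because whitespace is ignored in this mode, a literal space in the pattern has to be escaped or written as \s.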
What are some common pitfalls to avoid?
Overly complex regular expressions, incorrect escaping, and forgetting to handle edge cases.
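Two of these pitfalls can be illustrated briefly: an unescaped metacharacter that matches more than intended, and a failed search that returns None. This is a minimal sketch with made-up input strings; re.escape() is one way to match text literally, not the only one.

```python
import re

# Pitfall: '.' matches any character, so this pattern also matches "3x14"
loose = re.search(r'3.14', "value 3x14")

# re.escape() escapes metacharacters so the text is matched literally
literal = re.escape("3.14")
strict = re.search(literal, "value 3x14")   # no match
exact = re.search(literal, "pi is 3.14")    # match

# Edge case: search() returns None on failure, so check before calling .group()
result = exact.group() if exact else None

print(loose.group(), strict, result)
```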