Regular Expressions
Regular expressions (called REs, or regexes, or regex patterns) are essentially a tiny, highly specialized programming language embedded inside Python and made available through the re module.
Using this little language, you specify the rules for the set of possible strings that you want to match; this set might contain English sentences, or e-mail addresses, or TeX commands, or anything you like. You can then ask questions such as “Does this string match the pattern?”, or “Is there a match for the pattern anywhere in this string?”. You can also use REs to modify a string or to split it apart in various ways.
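The two questions above map onto different functions in the re module. The following is a minimal sketch, with a sample pattern and strings chosen purely for illustration: re.fullmatch asks whether the whole string matches, re.match asks whether the string starts with a match, and re.search asks whether a match occurs anywhere.

```python
import re

pattern = r"[0-9]+"  # one or more digits (illustrative pattern)
text = "abc123"      # illustrative input

# "Does this string match the pattern?" -> the whole string must match
whole = re.fullmatch(pattern, text)
print(whole)  # None, because "abc" is not made of digits

# re.match only looks for a match at the start of the string
at_start = re.match(pattern, text)
print(at_start)  # None, because the string starts with "a"

# "Is there a match anywhere in this string?"
anywhere = re.search(pattern, text)
print(anywhere.group())  # 123
```

Each of these functions returns a match object on success and None on failure, so the result can be used directly in an if statement.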
Find All Matches
import re
pattern = "ab"
content = "abcabcdbab"
results = re.findall(pattern, content)
print(results) # ['ab', 'ab', 'ab']
pattern = "[0-9_]+" # one or more consecutive digits or underscores
content = "56abc789h__31"
results = re.findall(pattern, content)
print(results) # ['56', '789', '__31']
# ignore case
pattern = "ab"
content = "AbcabcdbaBB"
results = re.findall(pattern, content, re.IGNORECASE)
print(results) # ['Ab', 'ab', 'aB']
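One behavior of re.findall worth knowing: when the pattern contains capture groups, it returns the group contents rather than the whole match. The sketch below uses an illustrative key=value string that is not from the examples above, and also shows re.finditer, which yields match objects when positions are needed.

```python
import re

# With two capture groups, findall returns a list of tuples
pairs = re.findall(r"([a-z]+)=([0-9]+)", "a=1, bc=22")
print(pairs)  # [('a', '1'), ('bc', '22')]

# finditer yields match objects, which carry the position of each match
spans = [m.span() for m in re.finditer(r"[0-9_]+", "56abc789h__31")]
print(spans)  # [(0, 2), (5, 8), (9, 13)]
```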
Get First Match
import re
pattern = "[0-9_]+"
content = "56abc789h__31"
result = re.search(pattern, content)
print(result) # <re.Match object; span=(0, 2), match='56'>
print(result.group()) # 56
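re.search returns None when nothing matches, so calling .group() on the result without a check raises AttributeError. A minimal sketch of guarding against that follows; the helper name first_match is hypothetical and only used for illustration.

```python
import re

pattern = r"[0-9_]+"

def first_match(content):
    """Return the first matching substring, or None if nothing matches."""
    result = re.search(pattern, content)
    if result is None:
        return None
    return result.group()

print(first_match("56abc789h__31"))   # 56
print(first_match("no digits here"))  # None
```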
Split Strings by Pattern
import re
pattern = "[0-9_]+"
content = "56abc789h__31hello"
segments = re.split(pattern, content)
print(segments) # ['', 'abc', 'h', 'hello']
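The empty string at the start of the result appears because the content begins with a match, leaving nothing before the first separator. As a small sketch, the empty pieces can be filtered out afterwards, and the optional maxsplit argument limits how many splits are made.

```python
import re

pattern = r"[0-9_]+"
content = "56abc789h__31hello"

# Drop empty strings produced by a match at the start (or end) of the content
segments = [s for s in re.split(pattern, content) if s]
print(segments)  # ['abc', 'h', 'hello']

# Split at most once; the rest of the string is left untouched
limited = re.split(pattern, content, maxsplit=1)
print(limited)  # ['', 'abc789h__31hello']
```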
Substitute Substrings by Pattern
import re
pattern = "[0-9_]+"
content = "56abc789h__31hello"
result = re.sub(pattern, '***', content)
print(result) # ***abc***h***hello
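The replacement passed to re.sub can also be a function that receives each match object and returns the replacement text, and the count argument limits the number of substitutions. The sketch below illustrates both; the function name mask is hypothetical.

```python
import re

pattern = r"[0-9_]+"
content = "56abc789h__31hello"

def mask(match):
    # Replace each match with as many asterisks as it has characters
    return "*" * len(match.group())

masked = re.sub(pattern, mask, content)
print(masked)  # **abc***h****hello

# Replace only the first match
first_only = re.sub(pattern, "***", content, count=1)
print(first_only)  # ***abc789h__31hello
```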
Code Challenge
Try to write a regex pattern that extracts all phone numbers from a string.
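As one possible starting point, the sketch below assumes phone numbers written as NNN-NNN-NNNN. Both that format and the sample text are assumptions made for illustration, since the challenge does not specify them; other formats would need a different pattern.

```python
import re

# Hypothetical sample text; the phone number format is an assumption
content = "Call 555-123-4567 or 555-987-6543 before 5 pm."

# Three digits, three digits, four digits, separated by hyphens
pattern = r"\b[0-9]{3}-[0-9]{3}-[0-9]{4}\b"

phone_numbers = re.findall(pattern, content)
print(phone_numbers)  # ['555-123-4567', '555-987-6543']
```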