The Dark Knight Problem
Here's a quote from The Dark Knight:
quote = (
"Because we have to chase him.Because he's the hero Gotham "
"deserves, but not the one it needs right now, so we'll hunt "
"him.Because he can take it, because he's not a hero.He's a "
"silent guardian, a watchful protector, a Dark Knight."
)
Split the string into a list of distinct sentences. Each sentence should end with and include a period.
Expected result
quotes = [
'Because we have to chase him.',
"Because he's the hero Gotham deserves, but not the one it needs right now, so we'll hunt him.",
"Because he can take it, because he's not a hero.",
"He's a silent guardian, a watchful protector, a Dark Knight."
]
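One possible approach, offered as a minimal sketch rather than the intended solution: treat each sentence as a run of non-period characters followed by a single period, and collect every such run with re.findall. This assumes that periods appear only as sentence terminators, which holds for this quote but not for text containing abbreviations or decimals.

```python
import re

quote = (
    "Because we have to chase him.Because he's the hero Gotham "
    "deserves, but not the one it needs right now, so we'll hunt "
    "him.Because he can take it, because he's not a hero.He's a "
    "silent guardian, a watchful protector, a Dark Knight."
)

# [^.]+ matches one or more characters that are not a period;
# \. then consumes the literal period that ends the sentence.
quotes = re.findall(r"[^.]+\.", quote)

for sentence in quotes:
    print(sentence)
```

Because the sentences in the quote are not separated by spaces, no stripping of leading whitespace is needed here; text with "word. Next" spacing would need an extra cleanup step.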
Regex Functions
Function | Description | Return Value |
---|---|---|
re.findall(pattern, string, flags=0) | Find all non-overlapping occurrences of pattern in string | list of strings, or list of tuples if > 1 capture group |
re.finditer(pattern, string, flags=0) | Find all non-overlapping occurrences of pattern in string | iterator yielding match objects |
re.search(pattern, string, flags=0) | Find first occurrence of pattern in string | match object or None |
re.split(pattern, string, maxsplit=0, flags=0) | Split string by occurrences of pattern | list of strings |
re.sub(pattern, repl, string, count=0, flags=0) | Replace pattern with repl | new string with the replacement(s) |
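To make the return values in the table concrete, here is a small sketch that calls each function on the same input. The sample string and variable names are made up for illustration and are not part of the problem.

```python
import re

text = "cat bat rat"  # hypothetical sample string

# findall: list of strings when the pattern has no capture groups
all_matches = re.findall(r"[a-z]at", text)       # ['cat', 'bat', 'rat']

# findall: list of tuples when the pattern has more than one capture group
grouped = re.findall(r"([a-z])(at)", text)       # [('c', 'at'), ('b', 'at'), ('r', 'at')]

# finditer: iterator of match objects, each carrying position information
starts = [m.start() for m in re.finditer(r"[a-z]at", text)]  # [0, 4, 8]

# search: first match object, or None when nothing matches
first = re.search(r"[br]at", text).group()       # 'bat'
missing = re.search(r"dog", text)                # None

# split: list of the pieces between matches
pieces = re.split(r"\s", text)                   # ['cat', 'bat', 'rat']

# sub: new string with every match replaced
replaced = re.sub(r"[cb]at", "hat", text)        # 'hat hat rat'
```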
Regex Patterns
Pattern | Description |
---|---|
[abc] | a or b or c |
[^abc] | not (a or b or c) |
[a-z] | a or b ... or y or z |
[1-9] | 1 or 2 ... or 8 or 9 |
\d | digits [0-9] |
\D | non-digits [^0-9] |
\s | whitespace [ \t\n\r\f\v] |
\S | non-whitespace [^ \t\n\r\f\v] |
\w | alphanumeric [a-zA-Z0-9_] |
\W | non-alphanumeric [^a-zA-Z0-9_] |
. | any character except a newline (by default) |
x* | zero or more repetitions of x |
x+ | one or more repetitions of x |
x? | zero or one repetitions of x |
x{m} | exactly m repetitions of x |
x{m,n} | m to n repetitions of x |
\\ , \. , \* | backslash, period, asterisk |
\b | word boundary |
^hello | starts with hello |
bye$ | ends with bye |
(...) | capture group |
(po|go) | po or go |
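A few of these patterns in action, as an illustrative sketch. The sample strings are invented for the example, and the results in the comments assume Python's default flags.

```python
import re

# \d+ : one or more digits
digits = re.findall(r"\d+", "room 101, floor 7")     # ['101', '7']

# \. : a literal period, as opposed to . which matches almost any character
dots = re.findall(r"\.", "a.b.c")                    # ['.', '.']

# x? : the u is optional
colors = re.findall(r"colou?r", "color colour")      # ['color', 'colour']

# x{m,n} : between 2 and 3 repetitions of a
runs = re.findall(r"a{2,3}", "a aa aaa")             # ['aa', 'aaa']

# \b : word boundaries keep 'go' from matching inside 'gopher'
words = re.findall(r"\bgo\b", "go gopher go")        # ['go', 'go']

# ^ and $ : anchors for the start and end of the string
starts = bool(re.search(r"^hello", "hello world"))   # True
ends = bool(re.search(r"bye$", "hello world"))       # False

# (po|go) : alternation inside a capture group
syllables = re.findall(r"(po|go)", "pogo stick")     # ['po', 'go']
```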