According to Python's how-to documentation on regular expressions, "Repetitions such as
* are greedy; when repeating a RE, the matching engine will try to repeat it as many times as possible. If later portions of the pattern don’t match, the matching engine will then back up and try again with fewer repetitions."
One way to think of this is that they are greedy, but not so greedy that it will throw away the whole match just to grab a few more characters for themselves.
To illustrate, consider this regular expression:
p = re.compile(r"([ab]*)bb") p.search('aabbaabbaacc')
This matches 'aabbaabb'. The group matched by
([ab]*) is "aabbaa". How's it work?
.*does go all the way through the string matching "a"s or "b"s until it gets to the last "a" or "b". It finds "aabbaabbaa", but since that is not followed with "bb", this check fails to match: aabbaabbaacc
Nat Dunn founded Webucator in 2003 to combine his passion for technical training with his business expertise and to help companies benefit from both. His previous experience was in sales, business and technical training, and management. Nat has an MBA from Harvard Business School and a BA in International Relations from Pomona College.