I have a string from which I am trying to build a regex mask that shows N words, given an offset. Say I have the following string:
"The quick, brown fox jumps over the lazy dog."
I want to show 3 words at a time:
offset 0: "The quick, brown"
offset 1: "quick, brown fox"
offset 2: "brown fox jumps"
offset 3: "fox jumps over"
offset 4: "jumps over the"
offset 5: "over the lazy"
offset 6: "the lazy dog."
I'm using the following simple regex in Python to find the first 3 words:
    >>> import re
    >>> s = "The quick, brown fox jumps over the lazy dog."
    >>> re.search(r'(\w+\W*){3}', s).group()
    'The quick, brown '
Prefix-matching option
You can make this work with a variable-prefix regex that skips the first offset words and then captures the word triplet in a group. Something like this:

    import re

    s = "The quick, brown fox jumps over the lazy dog."

    # Skip 0, 1, or 2 words, then capture the next 3 words in group 1.
    # Each result keeps its trailing separator (here, a space).
    print(re.search(r'(?:\w+\W*){0}((?:\w+\W*){3})', s).group(1))
    # The quick, brown
    print(re.search(r'(?:\w+\W*){1}((?:\w+\W*){3})', s).group(1))
    # quick, brown fox
    print(re.search(r'(?:\w+\W*){2}((?:\w+\W*){3})', s).group(1))
    # brown fox jumps

Let's look at the pattern:

     _"word"_      _"word"_
    /        \    /        \
    (?:\w+\W*){2}((?:\w+\W*){3})
                 \_____________/
                     group 1

This does what it says: match 2 words, then, capturing into group 1, match 3 words. The (?:...) constructs are used for grouping for the repetition, but they are non-capturing.
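The prefix approach can be wrapped in a small helper so the offset and the word count become parameters. This is only a sketch: the names `WORD` and `words_at_offset` are made up for illustration, and it reuses the same loose "word" pattern as the examples above.

```python
import re

# The same loose "word" pattern used in the examples above.
WORD = r"\w+\W*"


def words_at_offset(text, offset, n=3):
    """Skip `offset` words, then return the next `n` words (or None)."""
    # Non-capturing prefix repeated `offset` times, then a captured run of n words.
    pattern = r"(?:%s){%d}((?:%s){%d})" % (WORD, offset, WORD, n)
    match = re.search(pattern, text)
    return match.group(1) if match else None


s = "The quick, brown fox jumps over the lazy dog."
for offset in range(3):
    print(repr(words_at_offset(s, offset)))
# 'The quick, brown '
# 'quick, brown fox '
# 'brown fox jumps '
```

Building the pattern per call keeps the regex itself unchanged; only the two repetition counts vary.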
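Note on the "word" pattern
It should be said that \w+\W* is a poor choice for a "word" pattern, as the following example demonstrates:

    import re

    s = "nothing"
    print(re.search(r'(\w+\W*){3}', s).group())
    # nothing

There are no 3 words here, but the regex was able to match anyway, because \W* allows an empty-string match, so the single word can be split into three pieces.
Perhaps a better pattern is something like this:

    \w+(?:\W+|$)

That is, a \w+ that is followed by either a \W+ or the end of the string $.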
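To see the difference between the two "word" patterns side by side, here is a minimal comparison. The helper `first_n_words` and the constant names are hypothetical, introduced only for this sketch.

```python
import re

LOOSE_WORD = r"\w+\W*"         # separator is optional, so words can be split
STRICT_WORD = r"\w+(?:\W+|$)"  # separator (or end of string) is required


def first_n_words(text, word_pattern, n=3):
    """Return the first run of `n` words under `word_pattern`, or None."""
    match = re.search(r"(?:%s){%d}" % (word_pattern, n), text)
    return match.group() if match else None


print(first_n_words("nothing", LOOSE_WORD))         # nothing
print(first_n_words("nothing", STRICT_WORD))        # None
print(first_n_words("one two three", STRICT_WORD))  # one two three
```

With the stricter pattern, a single word can no longer be carved into three "words" by backtracking, so the search correctly reports no match.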
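Capturing lookahead option
As suggested in a comment, this option is simpler in that you only need one static pattern. It uses findall to capture all the matches:

    import re

    s = "The quick, brown fox jumps over the lazy dog."

    triplets = re.findall(r"\b(?=((?:\w+(?:\W+|$)){3}))", s)

    print(triplets)
    # ['The quick, brown ', 'quick, brown fox ', 'brown fox jumps ',
    #  'fox jumps over ', 'jumps over the ', 'over the lazy ', 'the lazy dog.']

    print(triplets[3])
    # fox jumps over

How this works is that it matches at the zero-width word boundary \b and uses a lookahead to capture 3 "words" in group 1. Because the lookahead consumes no characters, the next match can start at the very next word.

        ______lookahead______
       /      ___"word"__    \
      /      /           \    \
    \b(?=((?:\w+(?:\W+|$)){3}))
         \___________________/
                group 1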
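The lookahead pattern generalizes to any window size by making the repetition count a parameter. The sketch below assumes a hypothetical helper named `word_windows`; the list index then plays the role of the offset.

```python
import re


def word_windows(text, n=3):
    """Return every run of `n` consecutive words, one per starting word."""
    # \b anchors at a word boundary; the lookahead captures n words
    # without consuming them, so overlapping windows are all found.
    pattern = r"\b(?=((?:\w+(?:\W+|$)){%d}))" % n
    return re.findall(pattern, text)


s = "The quick, brown fox jumps over the lazy dog."

windows = word_windows(s)
print(len(windows))      # 7
print(repr(windows[4]))  # 'jumps over the '

pairs = word_windows(s, n=2)
print(repr(pairs[-1]))   # 'lazy dog.'
```

Computing all windows once and indexing into the list avoids rebuilding a regex for every offset.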