I have a question regarding the following regex:
match = /^([^[]+?)(\[.*\])?$/.exec(path);
I don't understand the behavior of the "?" in the first expression:
^([^[]+?)
I mean, if this expression were a standalone regex and path were "abc[def]", I would get "a" as match[1], right (because of the lazy match)? But once I add the second expression, match[1] is "abc". Could you please explain the difference?
Thanks, Li
A ? can mean one of two things:
"may or may not" (optional)
lazy matching
ab? is an a with or without a b (at most one b).
But in this form:
a+? means "search for a's, but don't be greedy",
so only the first a in aaaaaaa will be matched here.
/^([^[]+?)/.exec("abc[def]"); //["a", "a"]
Why is that?
Because you are searching from the start of the string for everything that is not a [, but with the minimum number of occurrences.
That's your a.
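But when you're doing
/^([^[]+?)(\[.*\])?$/.exec("abc[def]");
the part that is confusing you is the second group, with its .*, followed by $.
The whole pattern has to match up to the end of the string, so the lazy first group is extended one character at a time until the second group can match [def].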
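To make the difference concrete, here is a small sketch that runs both patterns from the question against the same input. The variable names (`path`, `alone`, `full`) are only illustrative.

```javascript
const path = "abc[def]";

// Standalone lazy group: nothing after it has to match,
// so a single character is enough for the match to succeed.
const alone = /^([^[]+?)/.exec(path);
console.log(alone[1]); // "a"

// Full pattern: the optional second group and `$` must also match,
// so the lazy group keeps growing until the rest of the pattern succeeds.
const full = /^([^[]+?)(\[.*\])?$/.exec(path);
console.log(full[1]); // "abc"
console.log(full[2]); // "[def]"
```

With "a" or "ab" in the first group, neither `\[` nor `$` can match at the next position, so the engine backtracks and lets the lazy group take one more character each time.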
The ? after the + switches the quantifier to ungreedy (lazy) behaviour. By default, the engine tries to match the longest string possible; with a trailing ?, it tries to match the shortest.
More information is available here: http://www.regular-expressions.info/repeat.html
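As a quick illustration of greedy versus ungreedy, the sketch below applies `a+` and `a+?` to the same string. The input string and variable names are made up for the example.

```javascript
const input = "aaaaaaa";

// Greedy: + takes as many a's as it can.
const greedy = /a+/.exec(input)[0];
console.log(greedy); // "aaaaaaa"

// Ungreedy: +? stops as soon as the pattern as a whole is satisfied.
const ungreedy = /a+?/.exec(input)[0];
console.log(ungreedy); // "a"
```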
The ? in a construction like +? or *? causes the operator preceding it to behave in a non-greedy, or lazy, fashion. This means it will consume as few characters as possible instead of as many as possible (as is the default).
However, in this particular regex, there are no strings for which the ? changes the behavior.
/^([^[]+?)(\[.*\])?$/
Since the first group ([^[]+?) must be followed by either the end of the string or a [, and the first group can't contain a [, it will match either the entire string (if there is no [ in it) or everything up to the first [, or it won't match at all. So in this case, the greediness of + is irrelevant.
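One way to check this is to run the lazy and greedy variants side by side on a few inputs and compare the captured groups. This is only a sketch: the sample strings are arbitrary and of course don't prove the claim for every possible input.

```javascript
const lazyRe = /^([^[]+?)(\[.*\])?$/;
const greedyRe = /^([^[]+)(\[.*\])?$/;

// Arbitrary sample inputs: with and without brackets, plus two non-matching cases.
const samples = ["abc[def]", "abc", "a[b][c]", "[def]", "abc[def"];

const results = samples.map((s) => {
  const lazy = lazyRe.exec(s);
  const greedy = greedyRe.exec(s);
  return {
    input: s,
    // slice(1) keeps only the captured groups; null means "no match".
    lazy: lazy && lazy.slice(1),
    greedy: greedy && greedy.slice(1),
  };
});

for (const r of results) {
  console.log(r.input, r.lazy, r.greedy);
}
```

For each of these inputs, both variants capture the same groups, or both fail to match.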