regexp - match regular expression question

Kenny
Kenny on 30 Sep 2016
Edited: Kenny on 10 Nov 2016
Hi all,
In the MATLAB help text for the function regexp, I'm trying to understand what the vertical line (i.e. | ) means in the pattern below. The example comes directly from MATLAB's help, shown after typing 'help regexp'.
The help documentation indicates:
"|" means Match subexpression before or after the "|"
What does the above mean exactly? At the moment, I'm thinking 'which is it?' I was expecting that a match would be either 'before' or 'after', but not 'before OR after'. And even if it really means 'match before OR after', what does that mean exactly? For example, what does "|" actually represent?
Thanks in advance.
str = 'John Davis; Rogers, James';
pat = '(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)';
n = regexp(str, pat, 'names')
  2 Comments
Stephen23
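Stephen23 on 30 Sep 2016
The | behaves like an exclusive or for any single match: the alternatives are tried from left to right, and exactly one of them is used for that match. Here is an example of how it works, tested on a string with four slightly different "words":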
>> regexp('a123z a%%%z a1%3z a__z','a(\d+|%+)z','match')
ans =
'a123z' 'a%%%z'
The pattern matches all sequences starting with a, ending with z, and containing XOR(digits,%-symbols) in between. The third "word" in the string does not match because it contains both digits and %-symbols; the fourth contains only underscores, so it also does not match the regex. Now let's alter the regex and use two |, to give XOR(digits,%-symbols,underscores):
>> regexp('a123z,a%%%z,a1%3z,a__z','a(\d+|%+|_+)z','match')
ans =
'a123z' 'a%%%z' 'a__z'
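Two details of | are worth trying out alongside the examples above: the order of the alternatives matters, and parentheses limit how much of the pattern each alternative covers. The following is only a small sketch; the variable names are made up for illustration and do not come from the thread.

```matlab
% Alternatives are tried left to right; the first one that matches is used.
firstWins  = regexp('abc', 'a|ab', 'match');   % {'a'}
secondForm = regexp('abc', 'ab|a', 'match');   % {'ab'}

% Parentheses restrict | to the part between 'a' and 'z'.
str = 'a123z a%%%z a1%3z a__z';
grouped = regexp(str, 'a(\d+|%+)z', 'match');  % {'a123z','a%%%z'}

% Without parentheses, | splits the whole pattern: 'a\d+' OR '%+z'.
ungrouped = regexp(str, 'a\d+|%+z', 'match');  % {'a123','%%%z','a1'}

disp(firstWins); disp(secondForm); disp(grouped); disp(ungrouped);
```

In the ungrouped case the leading a and trailing z each belong to only one alternative, which is why the matches come back as fragments.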
Bonus: if you want a convenient way to test and experiment with regular expressions, you can try my FEX submission:
Kenny
Kenny on 30 Sep 2016
Edited: Kenny on 1 Oct 2016
Hi Stephen !! Thanks for going out of your way to help me as well. The example that you gave is truly excellent. Thanks very much for showing this. The regexp function is so powerful, but it helps a great deal when you and S.S. add great, understandable examples. When I first looked at the 'code' patterns in the built-in examples, they didn't come with the nice explanations that let readers follow along and understand. Thanks for mentioning XOR, and the bonus link too! Best regards! Thanks a lot again. Kenny


Accepted Answer

Star Strider
Star Strider on 30 Sep 2016
Edited: Star Strider on 30 Sep 2016
When I’ve used the ‘|’ (‘or’) operator, I’ve used it to match either of two (or more) sub-expressions in the expression string. In this instance, if it detects a comma between the two words, it labels the first word as the last name and the second word as the first name. If it does not detect a comma, it does the reverse. The presence or absence of a comma in the target string determines which sub-expression returns the result: text with a comma does not match the sub-expression without a comma, and text without a comma does not match the sub-expression that requires one.
If you want to see how this works in practice, try it with only one sub-expression (and without the ‘|’ operator). That’s the easiest (and most instructive) way to see how a particular syntax works.
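As a rough sketch of that experiment, the pattern from the question can be split at the ‘|’ and each half run on its own. The variable names below (patSpace, patComma, and so on) are hypothetical and chosen only for this illustration.

```matlab
str = 'John Davis; Rogers, James';

% The two halves of the original pattern, split at the '|'.
patSpace = '(?<first>\w+)\s+(?<last>\w+)';    % "First Last"
patComma = '(?<last>\w+),\s+(?<first>\w+)';   % "Last, First"

% Each half alone finds only the name written in its own layout.
nSpace = regexp(str, patSpace, 'names');   % first: 'John',  last: 'Davis'
nComma = regexp(str, patComma, 'names');   % first: 'James', last: 'Rogers'

% Joined with '|', both names are found: a 1x2 struct array.
nBoth = regexp(str, [patSpace '|' patComma], 'names');

fprintf('%s %s\n', nBoth(1).first, nBoth(1).last);
fprintf('%s %s\n', nBoth(2).first, nBoth(2).last);
```

Each name in the string is matched by exactly one of the two sub-expressions, and the named tokens put the first and last names into the same fields either way.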
EDIT Clarified an ambiguity in the original.
  2 Comments
Kenny
Kenny on 30 Sep 2016
Thanks so much for your help and time, S.S.! That helped me tremendously. Thanks for helping me. Genuinely appreciated, S.S.
Star Strider
Star Strider on 30 Sep 2016
As always, my pleasure!
