Looking for an alternative to regexp.

Bob Thompson on 23 Mar 2021
Edited: Stephen23 on 25 Mar 2021
I'm looking for an alternative way to parse through strings to find bits of information, or for a way to use regexp that doesn't give me nested cells. I'm tired of dealing with the nested cells.
I've got a string that contains node numbers and locations. I would like to capture all of the node numbers, and then put them into a double array. I can identify and extract the numbers with regexp, but any time I use regexp with tokens I end up with cells inside of cells for a reason that I don't entirely understand. Am I doing something to create the extra layer of cells, or is there another command that can parse and extract the information I want?
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);
nodes = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens');
The nodes variable will contain a 1x5 cell matrix, where each cell contains a 1x1 cell, which contains the node number string.
2 Comments
Stephen23 on 24 Mar 2021
Edited: Stephen23 on 25 Mar 2021
Tokens are always returned in a cell array whose size equals the number of tokens (scalar in your case, because you specified only one token). When multiple matches are enabled (the default), every output is in turn nested in an outer cell array whose size equals the number of matches, so you get nested cell arrays of tokens.
FYI, if you only need to match the regular expression exactly once, then you can specify the 'once' option and the outputs are not nested in cell arrays. This does not apply to your example, but is useful in other cases.
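The two layers can be inspected by indexing into the output directly. The following is a minimal sketch that reuses the string and variable names from the question; the helper variable names (outerSize, innerSize, firstNode, tok, firstNodeOnce) are made up for illustration.

```matlab
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);

% Default behaviour (all matches): the outer cell has one element per match,
% and each inner cell has one element per token.
nodes = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens');
outerSize = size(nodes);      % one entry per match
innerSize = size(nodes{1});   % one entry per token
firstNode = nodes{1}{1};      % char vector captured by the first match

% With 'once', only the first match is returned, so the outer layer is gone
% and tok is simply the cell array of tokens for that match.
tok = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens','once');
firstNodeOnce = tok{1};
```

Because 'once' stops after the first match, it only helps when one match per input is all that is needed.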
As well as concatenating the output data or using named tokens as the answers below show, you can also use a look-behind assertion and return the matched string (no nested cell arrays), which makes post-processing much simpler:
nodes = regexp(repeatstrings,'(?<=Nearestnodeis)\d+','match')
nodes = 1×5 cell array
{'7736664'} {'7736664'} {'7736664'} {'7736664'} {'7736664'}
vec = str2double(nodes)
vec = 1×5
7736664 7736664 7736664 7736664 7736664
Bob Thompson on 24 Mar 2021
Thanks, I definitely think this is more smooth than what I usually attempt.


Answers (2)

Star Strider on 23 Mar 2021
See if adding either:
Out = cell2mat([nodes{:}].')
or:
Out = str2num(cell2mat([nodes{:}].'))
to the posted code provides the desired result.
Note that str2num is not generally recommended (it evaluates its input with eval); however, it works when str2double produces an unacceptable result.
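A variation on the same idea, offered as a sketch rather than as the code posted above: the [nodes{:}] step alone already removes the inner layer of cells, and str2double accepts a cell array of character vectors, so cell2mat and str2num can be skipped. This also avoids the requirement that cell2mat places on a column of character vectors, namely that they all have the same length. The variable names flat and vec are made up for illustration.

```matlab
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);
nodes = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens');

% Concatenate the 1x1 inner cells into a single flat cell array
% of character vectors (one per match).
flat = [nodes{:}];

% str2double converts each element of the cell array to a double.
vec = str2double(flat);
```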

Walter Roberson on 23 Mar 2021
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);
nodes = regexp(repeatstrings,'Nearestnodeis(?<NN>\d+)','names');
str2double({nodes.NN})
ans = 1×5
7736664 7736664 7736664 7736664 7736664
3 Comments
Walter Roberson on 23 Mar 2021
(?<WORD>PATTERN)
creates a named token; whatever is matched by PATTERN gets stored in a struct field named WORD, as text. But even though it is called a "named token", oddly enough to get back the struct, you have to ask for "names" instead of for "tokens".
You get back a struct array, with one entry for each time the overall pattern matches; in this case, one for each time Nearestnodeis is followed by a sequence of digits. So you get a 1 x 5 struct here, each entry with a field named as indicated, NN. As usual with struct arrays, you can pull out all of the entries using struct expansion inside {}, creating a cell array of character vectors, and then convert them all at once using str2double() on the cell array.
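The same mechanism extends to more than one named token per match. Here is a minimal sketch, assuming the distance always appears in the exponent form seen in the sample string; the second token name (dist), its pattern, and the variable names expr, s, nodeNums and dists are made up for illustration and are not part of the original answer.

```matlab
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);

% Two named tokens: NN for the node number, dist for the distance.
% The dist pattern assumes a form like 4.6823094e-03 (an assumption).
expr = 'Nearestnodeis(?<NN>\d+)atadistanceof(?<dist>\d+\.\d+e[+-]\d+)';
s = regexp(repeatstrings,expr,'names');

% Struct expansion inside {} gives one cell array per field,
% which str2double converts in a single call.
nodeNums = str2double({s.NN});
dists = str2double({s.dist});
```

Each field is converted independently, so no nested indexing is needed even though every match carries two values.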
Bob Thompson on 24 Mar 2021
Thanks for the explanation. I do like structures better than cells, most of the time.
