Extracting numbers from strings

I am trying to read my experimental data from an Excel file, where the content of one column decides whether a number should be read from another column. This has to happen across multiple rows and multiple files, and the output should be a double array. (I will omit the part of the code that loads the files.)
The code below works well for numerical values.
NegativeView = [];
for nl = 1:size(logs, 1)
    % strfind returns 1 when the entry in column 5 starts with the marker string
    if strfind(logs{nl,5}, 'negative_view_start') == 1
        NegativeView(end+1) = str2num(logs{nl,6});
    end
end
That is, if the entry in column six is a number, I get what I wanted.
However, for another variable I have a mixed string: the value in the column that needs to be read will be output_33, output_66, etc., and I'd like to have a double with just 33 or 66 as a numerical value.
I tried using the regexprep function to transform output_33 to 33, etc., with no success. Help!
An example of what I tried is below:
rate={}
output=[]
for nl=1:length(logs(:,1))
    if strcmp(logs{nl,4},'output_')
        rate(end+1)=(regexprep(logs{nl,5},'output_',''))
        output(end+1)=str2num(rate)
    end;
end

Answers (2)

Adam on 15 Dec 2016
If your strings are always of the form
someString_someNumber
then you can just use something simpler, like
splitStr = strsplit( str, '_' );
n = str2num( splitStr{2} )
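
As a minimal sketch of the split approach applied to several entries: the strings in `labels` are hypothetical samples, not data from the question. The sketch takes the last piece of the split rather than the second, on the assumption that the number always comes after the final underscore, and uses `str2double` instead of `str2num`.

```matlab
% Hypothetical sample strings; only the someString_someNumber shape comes from the answer.
labels = {'output_33', 'output_66', 'negative_view_12'};

values = zeros(1, numel(labels));
for k = 1:numel(labels)
    parts = strsplit(labels{k}, '_');      % e.g. {'output', '33'}
    values(k) = str2double(parts{end});    % assumption: the number is the last piece
end
disp(values)
```

Indexing with `parts{end}` keeps the sketch working when the text part itself contains underscores, where `splitStr{2}` would pick up a piece of the text instead.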

1 Comment

moniken on 15 Dec 2016
I still don't know how to include either this function or the regexprep function in my code so that it does what I want. I need MATLAB to extract the number from the someString_someNumber form in column 5 each time that column 4 in the same row contains the string 'outcome_'. And I need the output to be a double array.
Guillaume on 15 Dec 2016
Using regexprep seems roundabout. Why not use regexp to extract what you need rather than replacing what you don't need?
One possible regex:
output(end+1) = str2double(regexp(logs{nl, 5}, '(?<=output_).*', 'match', 'once'))
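
To show where that line could sit, here is a sketch of the full loop. The `logs` cell array is a hypothetical stand-in for the imported spreadsheet, and the `strcmp` test on column 4 is copied from the question's own attempt, so it assumes column 4 holds exactly 'output_'; adjust the condition if your marker differs.

```matlab
% Hypothetical stand-in for the imported data: 3 rows, 6 columns of strings.
logs = repmat({''}, 3, 6);
logs(1, 4:5) = {'output_', 'output_33'};
logs(2, 4:5) = {'other',   'text_only'};
logs(3, 4:5) = {'output_', 'output_66'};

output = [];
for nl = 1:size(logs, 1)
    if strcmp(logs{nl, 4}, 'output_')      % condition taken from the question (assumption)
        % keep everything that follows 'output_' in column 5
        token = regexp(logs{nl, 5}, '(?<=output_).+', 'match', 'once');
        output(end+1) = str2double(token); % NaN if nothing matched
    end
end
disp(output)
```

Because `str2double` returns NaN for an empty match, a row that passes the column 4 test but has no number in column 5 shows up as NaN in `output` rather than raising an error.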

2 Comments

moniken on 15 Dec 2016
Thank you. Could you explain what these stand for: '(?<=output_).*', 'match', 'once'
The regular expression language is documented in detail in MATLAB's documentation and, if that's not enough, there are plenty of tutorials on the net.
(?<= ) is a lookbehind. It means that the match must be preceded by the expression in the lookbehind, in this case output_.
. matches any character. * is a quantifier which means match 0 or more of the preceding element. Actually, I should have used + (1 or more).
So the regular expression matches a sequence of 0 or more of any character immediately following output_. There are many other ways you could have written the expression, depending on what you want to accept or reject. E.g.:
regexp(logs{nl, 5}, '\d+', 'match', 'once')
may also work for you if you're only looking at integers (it simply extracts the first sequence of numeric digits).
As per the documentation of regexp, 'match' tells it to return the matched text (by default it just returns the start position), and 'once' tells it to only do the matching once. The 'once' option is not strictly necessary in your case.
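
The differences between these variants can be checked on a couple of sample strings. This is only an illustrative sketch: `str` and `mixed` are hypothetical inputs, and the variable names are made up for the comparison.

```matlab
str   = 'output_33';        % hypothetical entry from column 5
mixed = 'run2_output_33';   % hypothetical entry with an earlier digit

a = regexp(str, '(?<=output_).+', 'match', 'once');   % char: text after 'output_'
b = regexp(str, '\d+', 'match', 'once');               % char: first run of digits
c = regexp(str, '\d+', 'match');                       % cell array of all digit runs
d = regexp(str, '\d+');                                % start index of each match

% With an earlier digit in the string the two patterns disagree:
e = regexp(mixed, '\d+', 'match', 'once');             % first digit run in the string
f = regexp(mixed, '(?<=output_)\d+', 'match', 'once'); % digits right after 'output_'

n = str2double(a);   % numeric value of the extracted text
```

Without 'once' the result is a cell array, which `str2double` also accepts, so dropping the option changes the intermediate type but not the final number when there is a single match.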

This question is closed.

Asked: 15 Dec 2016

Closed: 20 Aug 2021
