Remove numbers from a name

I have a name that contains numbers, letters, and characters such as '_'. For example:
name='12345_2323_abc_cd'
I need to remove the numbers, and also remove '_' when it lies between numbers (an underscore between letters should not be removed). The name should then become:
name='abc_cd'
How can I do this conversion in MATLAB? Please help.

Answers (4)

Jan on 8 Jul 2012
Besides the powerful regexp methods, here is a simple approach:
name = '12345_2323_abc_cd';
index = find(isletter(name), 1);
name = name(index:end);

2 Comments

mohammad on 8 Jul 2012
Thanks, really nice Jan
Image Analyst on 8 Jul 2012
But this won't give the answer you wanted for case "a" in your comment to me, where you wanted a leading underline. Jan's method doesn't give you the underline as the first character, as you asked for. You'd have to check for that case and add an "if" block (like I did) to decide whether or not to strip off the leading underline. Did you make a mistake when you explained the different cases to me?
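A minimal sketch of such a check, built on Jan's isletter approach. The flag keepLeadingUnderscore is a hypothetical name introduced here for illustration and is not part of any answer in this thread:

```matlab
name = '12345_2323_abc_cd';
keepLeadingUnderscore = true;   % hypothetical flag: set to false to drop the underline

% Position of the first letter, as in Jan's answer.
index = find(isletter(name), 1);

% If requested, step back one position when an underline sits
% directly before the first letter, so that it is kept.
if keepLeadingUnderscore && ~isempty(index) && index > 1 && name(index-1) == '_'
    index = index - 1;
end

result = name(index:end);   % '_abc_cd' with the flag set, 'abc_cd' without
```

Like Jan's original, this only trims the front of the name, so digits that appear after the first letter are left in place.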


Mark Whirdy on 7 Jul 2012
Welcome to the frustrating world of regular expressions!
regexp(name,'[a-z]\w+','match')
all the best,
Mark

5 Comments

mohammad on 7 Jul 2012
So nice dear!!!
This will not remove all numbers: it will only remove numbers if they are before the first lower-case letter. It also will not remove underscores that are between numbers.
You are right, dear Walter, thanks. I use this:
% Loop backwards so that deleting a character does not shift
% the positions that are still to be visited.
for i = size(namefile,2):-1:1
    if namefile(1,i) >= '0' && namefile(1,i) <= '9'
        namefile(i) = '';
    end
end
% Strip any underscores left at the start of the name.
while ~isempty(namefile) && namefile(1,1) == '_'
    namefile(1) = '';
end
Mark Whirdy on 7 Jul 2012
I'm not sure exactly what you're doing, but this achieves exactly what was asked (as shown below). If the format of the input varies from the example above, the regular expression can easily be adapted. If for some reason this doesn't work for all your test cases, supply those test cases and I'd be happy to adapt the expression. There is definitely no need for any for-loops.
>> regexp('12345_2323_abc_cd','[a-z]\w+','match')
ans =
    'abc_cd'
Your code would not remove the numbers in abcd678, yet the question states that the numbers must be removed.
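The objection can be checked directly. The sketch below applies the pattern '[a-z]\w+' to the original example and to two further inputs; the variable names and the third input are made up here for illustration. The 'once' option is used so that regexp returns a character array rather than a cell array:

```matlab
pattern = '[a-z]\w+';

% Original example: the digits come first, so the match starts at 'a'.
r1 = regexp('12345_2323_abc_cd', pattern, 'match', 'once');   % 'abc_cd'

% Digits after the first lower-case letter are matched by \w and kept.
r2 = regexp('abcd678', pattern, 'match', 'once');             % 'abcd678'

% Hypothetical input with digits in the middle: they are kept as well.
r3 = regexp('12_ab_34_cd', pattern, 'match', 'once');         % 'ab_34_cd'
```

Because \w matches letters, digits, and underscores alike, everything after the first lower-case letter is returned unchanged.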


Walter Roberson on 7 Jul 2012
regexprep(name, '\d[0-9_]+\d', '')
This expression guesses that underscores should only be removed if there are digits on both sides of them, so, for example, 1234_abcd would become _abcd. This guess is supported by the wording of the question.
On the other hand, the expression above has the weakness that several consecutive underscores will be deleted if the group lies between numbers. This is not supported by the wording of the question. I think the expression below might fix that:
regexprep(name, '\d+(?:_(?=\d))?', '')
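To see how the two expressions differ, here is a small comparison sketch. The test names other than the first two are made up for illustration, and the variable names are arbitrary. regexprep accepts a cell array of strings and returns a cell array of the same size:

```matlab
names = {'12345_2323_abc_cd', '1234_abcd', '12__34_ab', 'ab12cd'};

% First expression: a digit, one or more digits/underscores, then a digit.
% It needs at least three characters, so a two-digit number is not matched.
first = regexprep(names, '\d[0-9_]+\d', '');
% first  -> {'_abc_cd', '_abcd', '_ab', 'ab12cd'}

% Second expression: a run of digits, plus one following underscore
% only when that underscore is itself followed by a digit.
second = regexprep(names, '\d+(?:_(?=\d))?', '');
% second -> {'_abc_cd', '_abcd', '___ab', 'abcd'}
```

On the original example both give '_abc_cd'. They diverge on the double underscore (the first expression deletes it, the second keeps it) and on short digit runs such as '12', which only the second expression removes.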

2 Comments

mohammad on 7 Jul 2012
Many thanks, dear Walter.
Hmmm, the first of those won't work either. The second might.


Image Analyst on 7 Jul 2012
Try this:
% Create the name using the only example we have.
name='12345_2323_abc_cd'
% Find locations of all the underlines.
underlineLocations = find(name == '_', 2, 'first')
% Assume numbers occur earlier than the second underline.
% We have no examples to suggest otherwise.
outputName = name(underlineLocations(2)+1:end)

2 Comments

mohammad on 7 Jul 2012
Edited: mohammad on 7 Jul 2012
Thanks, dear, but we don't know how many underlines there are, and in general we know nothing about whether the numbers come before the second underline. We must first detect the numbers and remove all of them (I don't know how to do this), then remove all underlines that occur at the start or at the end.
a) '12345_2323_abc_cd' ---> '_abc_cd'
b) '_abc_cd' ---> 'abc_cd'
Please help me do a and b.
It's always good to give all variants of your input when you first ask the question. This will do what you asked for the examples you gave:
% Example 1
stringIn = '12345_2323_abc_cd' % want '_abc_cd'
nameOut = regexprep(stringIn, '\d[0-9_]+\d', '')
if stringIn(1) == '_'
    nameOut = nameOut(2:end)
end
% Example 2
stringIn = '_abc_cd' % Want 'abc_cd'
nameOut = regexprep(stringIn, '\d[0-9_]+\d', '')
if stringIn(1) == '_'
    nameOut = nameOut(2:end)
end
% Example 3
stringIn = '213231_12345_2323_abc_cd' % want '_abc_cd'
nameOut = regexprep(stringIn, '\d[0-9_]+\d', '')
if stringIn(1) == '_'
    nameOut = nameOut(2:end)
end
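For comparison, the two steps described in cases a and b can also be written as two separate replacements. This is only a sketch: the helper names removeDigits, trimUnderscores, and cleanName are hypothetical, and it assumes that "underlines at the start or at the end" means any run of leading or trailing underscores:

```matlab
% Step a: remove digit runs, plus an underscore that sits between two numbers.
removeDigits = @(s) regexprep(s, '\d+(?:_(?=\d))?', '');

% Step b: remove underscores at the very start or very end of the name.
trimUnderscores = @(s) regexprep(s, '^_+|_+$', '');

% Both steps chained together.
cleanName = @(s) trimUnderscores(removeDigits(s));

a = removeDigits('12345_2323_abc_cd');            % '_abc_cd'
b = trimUnderscores('_abc_cd');                   % 'abc_cd'
c = cleanName('213231_12345_2323_abc_cd');        % 'abc_cd'
```

Unlike the if block above, cleanName always strips the leading underscore, so it fits only if the final name should never start or end with one.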


Asked: 7 Jul 2012
