Replacing only certain instances of text within a MATLAB character array

I have a large character array in MATLAB, 'lineDataA', containing many different numbers.
I would like to replace every instance of the number '6002' with '0', apart from the very first instance.
lineData = replace(lineDataA, '6002', '0');
This replaces all instances, including the first.
And
where6002 = strfind(lineDataA, '6002');
Gives the positions of all the instances. However, I am not sure how to replace all the instances except the first.
Many thanks for your help,
Rob

Accepted Answer

Stephen23 on 20 Jan 2017
Edited: Stephen23 on 20 Jan 2017
Method One: split the string
>> str = '___6002__6002___6002___6002__';
>> idx = regexp(str,'6002','once','end');
>> strcat(str(1:idx),strrep(str(idx+1:end),'6002','0'))
ans =
___6002__0___0___0__
Method Two: use a placeholder
>> str = '___6002__6002___6002___6002__';
>> str = regexprep(str,'6002','\b','once');
>> str = strrep(str,'6002','0');
>> regexprep(str,'\b','6002')
ans =
___6002__0___0___0__
Note that the original string must not already contain a backspace character (char(8)), which is what \b denotes in MATLAB regular expressions.
Method Three: dynamic regular expression
>> str = '___6002__6002___6002___6002__';
>> regexprep(str,'(.*?6002)(.*)','$1${strrep($2,''6002'',''0'')}')
ans =
___6002__0___0___0__
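As a further sketch (not part of the original answer), the strfind indices mentioned in the question can be used directly. It assumes the same example string; the variable names pos, cut and out are only illustrative. Square-bracket concatenation is used instead of strcat because strcat strips trailing whitespace from character array inputs, and the isempty check covers the case where '6002' does not occur at all.

```matlab
str = '___6002__6002___6002___6002__';
pos = strfind(str, '6002');            % start indices of all matches
if isempty(pos)
    out = str;                         % no match: nothing to replace
else
    cut = pos(1) + numel('6002') - 1;  % index of the last character of the first match
    % Keep everything up to and including the first match, replace in the rest
    out = [str(1:cut), strrep(str(cut+1:end), '6002', '0')];
end
disp(out)
```

For the example string this should display ___6002__0___0___0__, the same result as the three methods above.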
  2 Comments
John Leal on 16 Oct 2017
I have a similar problem. I need to replace some words with others in a large array. I have code that works, but it is too slow. Can you help me find a way to make it faster?
textData = regexprep(textData, '[@$/#.-:-&*+=[]?!(){},''">_<;%]|', ' ');
% Remove any non alphanumeric characters
textData = regexprep(textData, '[^a-zA-Zñ ]', '');
textData = regexprep(textData, '[0-9]+', ' ');
textData = regexprep(textData, '<[^<>]+>', ' ');
textData = regexprep(textData, 'á', 'a');
textData = regexprep(textData, 'é', 'e');
textData = regexprep(textData, 'í', 'i');
textData = regexprep(textData, 'ó', 'o');
textData = regexprep(textData, 'ú', 'u');
textData = regexprep(textData, 'ñ', 'n');
textData = regexprep(textData, 'x', 's');
textData = regexprep(textData, 'cc', 'c');
textData = regexprep(textData, 'ci', 'si');
% deletedWords = ["helllo","hello";"moter","mother"] ... 50000 rows
% excludedWords = ["father","three", "tree"]... words I don't want to replace
% textData = ["my mother lives with my father";"hello Word"]... 2 million rows.
m = length(deletedWords(:,1));
for idx=1:m
    w_new = deletedWords{idx,1};
    w_ok = deletedWords{idx,2};
    f = find(excludedWords==w_new, 1);
    % only if it is not in excludedWords
    if isempty(f)
        % Replace EXACT word match
        textData = regexprep(textData,"(?<![\w])"+w_new+"(?![\w])" ,w_ok );
    end
end
John Leal on 16 Oct 2017
The main idea is to correct misspelled words in SPANISH. It is like a handmade stemmer adjusted to my specific data. deletedWords contains each misspelled word and its correct form. These words are extracted from the same textData, using Jaro-Winkler similarity to map a less frequent word to a more frequent word with more than 95% similarity.
Ty
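One possible direction for the speed problem, offered only as a sketch and not as a tested solution for the real data: instead of running regexprep over all of textData once per correction pair, split each row into words once and look the words up with ismember. This assumes the earlier cleanup has left only letters and spaces, so splitting on a single space is equivalent to the exact-word match; the small arrays below are hypothetical stand-ins for the real inputs, and keep, wrong and right are illustrative names.

```matlab
% Hypothetical small inputs standing in for the real data
textData      = ["my moter lives with my father"; "helllo Word"];
deletedWords  = ["helllo","hello"; "moter","mother"; "three","tree"];
excludedWords = ["father","three","tree"];

% Drop correction pairs whose misspelled form is in the exclusion list
keep  = ~ismember(deletedWords(:,1), excludedWords);
wrong = deletedWords(keep,1);
right = deletedWords(keep,2);

for r = 1:numel(textData)
    words = split(textData(r), " ");      % words of this row as a string array
    [tf, loc] = ismember(words, wrong);   % which words are known misspellings
    words(tf) = right(loc(tf));           % swap in the corrected forms
    textData(r) = join(words, " ");       % rebuild the row
end
disp(textData)
```

The loop still visits every row, but each row is processed once rather than once per correction pair, so the 50000 regular-expression passes over the whole array are avoided.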
