How can I compare two strings, ignoring any white space or punctuation characters?
MathWorks Support Team
on 21 Oct 2009
Edited: MathWorks Support Team
on 2 May 2023
I would like to compare two strings. The strings may have varying numbers of spaces and punctuation characters (not necessarily at the beginning or end of the string), which I would like to ignore.
Accepted Answer
MathWorks Support Team
on 30 Dec 2021
Edited: MathWorks Support Team
on 16 Nov 2021
You can use regular expressions to remove the characters from the strings that you would like to ignore in the comparison. You can then use the modified strings to perform the comparison.
The following example uses the REGEXP function to extract every run of characters that is not white space or punctuation, and then concatenates those runs into a single string:
a = 'test';
b = 'te s.t';
%Create a regular expression
%(note: the variable name exp shadows the built-in exp function in this workspace)
%This expression matches any run of characters other than whitespace, comma, period, semicolon, or colon
exp = '[^ \f\n\r\t\v.,;:]*';
%Find the matches within the string
b1 = regexp(b, exp, 'match');
%Concatenate all matches into a single string
b1 = [b1{:}];
%Repeat above for the other string
a1 = regexp(a, exp, 'match');
a1 = [a1{:}];
%Compare the modified strings
match = strcmp(a1, b1)
To learn more about creating regular expressions and using the REGEXP function, run the following command to open the documentation page:
>>web(fullfile(docroot, 'matlab/ref/regexp.html'))
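As a more compact variant of the approach above, the unwanted characters can be deleted directly with regexprep instead of collecting the matches and concatenating them. This is only a minimal sketch, not the method from the answer itself. The helper name stripIgnored is hypothetical, and the character class is the one used above with the negation removed.

```matlab
% Hypothetical helper (not part of MATLAB): delete whitespace and the
% punctuation characters . , ; : from a character vector.
stripIgnored = @(s) regexprep(s, '[\s.,;:]+', '');

a = 'test';
b = 'te s.t';

% Compare the cleaned strings; match is true for these inputs
match = strcmp(stripIgnored(a), stripIgnored(b));
```

Any other character to be ignored would have to be added to the bracket expression by hand.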
3 Comments
Walter Roberson
on 2 Mar 2019
John: No. If you 'split' with that expression, you would be left with only empty strings and the white space and punctuation characters. You would need to remove the ^ from inside the [] to use 'split'.
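The difference between the two forms can be sketched as follows. The variable names negated, delim, leftover, and parts are chosen here for illustration only, and the + quantifier on the delimiter class is an assumption made so that consecutive delimiters are treated as one.

```matlab
b = 'te s.t';

% Negated class from the answer: with 'split', the wanted text is consumed
% as the delimiter, so only the ignored characters and empty pieces remain
negated  = '[^ \f\n\r\t\v.,;:]*';
leftover = regexp(b, negated, 'split');

% Same class without the ^: now the ignored characters are the delimiters
delim = '[ \f\n\r\t\v.,;:]+';
parts = regexp(b, delim, 'split');   % pieces 'te', 's', 't'

% Concatenate the pieces into a single character vector
b1 = [parts{:}];                     % 'test'
```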
Walter Roberson
on 2 Mar 2019
Edited: MathWorks Support Team
on 2 May 2023
exp = '[^ \f\n\r\t\v.,;:]*';
can also be written
exp = '[^\s.,;:]*'
The \s is documented as being equivalent to [ \f\n\r\t\v]
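A quick way to check that the two spellings behave the same is to run both on a string that contains several kinds of white space. This is just an illustrative check with made-up variable names (longForm, shortForm), not a proof of equivalence for every input.

```matlab
% Test string containing a tab, a space, and a newline
s = sprintf('te\ts. t;\n');

longForm  = '[^ \f\n\r\t\v.,;:]*';
shortForm = '[^\s.,;:]*';

m1 = regexp(s, longForm,  'match');
m2 = regexp(s, shortForm, 'match');

% Both patterns return the same matches for this input
sameResult = isequal(m1, m2);
```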
Any time you use the [^] construct, you need to be careful about Unicode, which has a number of ways to express white space (https://jkorpela.fi/chars/spaces.html) and punctuation (https://www.compart.com/en/unicode/category/Po). For example, '᛬' is not a colon ':' but U+16EC, Runic Multiple Punctuation.
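The following sketch illustrates the concern and one possible way around it. The first part shows that the negated class keeps U+16EC because that character is not listed in it. The second part is an assumption, not something proposed in this thread: it whitelists letters and digits with isstrprop instead of blacklisting punctuation. The helper name keepAlnum is hypothetical, and how isstrprop classifies any particular Unicode character should be checked on your own data.

```matlab
% Build a string containing U+16EC, which looks like a colon but is not one
runic = char(hex2dec('16EC'));
s = ['te', runic, 'st'];

% The negated class does not list U+16EC, so the character is kept
kept = regexp(s, '[^\s.,;:]*', 'match');
kept = [kept{:}];
survives = any(kept == runic);

% Whitelist alternative (an assumption): keep only the characters that
% isstrprop reports as letters or digits
keepAlnum = @(str) str(isstrprop(str, 'alphanum'));
match = strcmp(keepAlnum('test'), keepAlnum('te s.t'));
```

Note that a whitelist of this kind also drops characters such as underscores and hyphens, which the original expression keeps, so it suits cases where only letters and digits matter.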