How do I get my MATLAB editor to read UTF-8 characters? UTF-8 characters show up as blank squares in the editor, but display fine in the Command Window and Workspace.

I have a project whose comments contain UTF-8 characters.
I tried changing the system locale on Windows 10, but the MATLAB editor still does not recognize the UTF-8 characters (they appear as blank squares). I'm not sure what to do here.
If I open the same .m file in a text editor, it displays fine.
How do I get my MATLAB editor to read UTF-8? Thank you
feature('DefaultCharacterSet')
feature('locale')
ans =
UTF-8
ans =
ctype: 'en_US.windows-1252'
collate: 'en_US.windows-1252'
time: 'en_US.windows-1252'
numeric: 'en_US_POSIX.windows-1252'
monetary: 'en_US.windows-1252'
messages: 'en_US.windows-1252'
encoding: 'windows-1252'
terminalEncoding: 'windows-949'
jvmEncoding: 'Cp1252'
status: 'MathWorks locale management system initialized.'
warning: 'System locale setting, ko_KR, is different from user locale setti…'
  3 Comments
Hongyu Shi on 1 May 2016
I've had the same issue for a long time. MATLAB sometimes seems to handle Chinese characters when I switch my system's default language to Chinese, but on an English system they just become a lot of question marks.
Keith Hooks on 30 Aug 2016
I'm having the same problem. It seems that editing the lcdata.xml file to change the encoding of en_US has no effect at all. It stays windows-1252 no matter what I change it to in the file.


Accepted Answer

Jinghao Lei on 20 Oct 2016
I have a rather hacky way to solve this problem, and it seems to work. In my case (Windows, MATLAB R2016b x64),
feature('locale')
always returned the following, even after I had modified lcdata.xml:
ctype: 'zh_CN.GBK'
...
So I deleted this entry from lcdata.xml (in the codeset section):
<encoding name="GBK">
<encoding_alias name="936"/>
</encoding>
Then I changed the following
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
</encoding>
to
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
<encoding_alias name="GBK"/>
</encoding>
The point is to trick MATLAB into treating GBK as just an alias of UTF-8.
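Since this hack involves hand-editing a file that MATLAB depends on, it may be worth saving a copy first. The following is only a sketch: it assumes lcdata.xml sits in the `bin` folder under `matlabroot` (the location given in a later comment in this thread), and the backup file name and the choice of `tempdir` are hypothetical.

```matlab
% Sketch only: back up lcdata.xml before hand-editing it.
lcFile     = fullfile(matlabroot, 'bin', 'lcdata.xml');  % location per this thread
backupFile = fullfile(tempdir, 'lcdata_backup.xml');     % hypothetical backup name

if exist(lcFile, 'file') == 2
    copyfile(lcFile, backupFile);
    fprintf('Backed up %s to %s\n', lcFile, backupFile);
else
    fprintf('No lcdata.xml found at %s\n', lcFile);
end

% Record the encoding MATLAB reports before the edit (undocumented call,
% used as in the question above).
loc = feature('locale');
fprintf('Encoding before the edit: %s\n', loc.encoding);
```

Copying to `tempdir` avoids needing write access to the installation folder just to make the backup; editing the file itself may still require administrator rights.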
  9 Comments
Isidora Timkov-Glumac on 14 Sep 2023
@Jinghao Lei I have the same version of MATLAB as you, but my ctype is 'en_US_POSIX.US-ASCII'. How should I modify the lcdata file so that this hack works for me? Thank you!
Simon Diehl on 30 Nov 2023
This solution worked for me as well. I'm using MATLAB R2023b on Windows Server 2019. My encoding was windows-1252.
This is not necessary on newer operating systems. I don't experience this problem with R2023b on Windows 10.
My steps were:
  1. Go to Program Files\MATLAB\R2023b\bin
  2. Rename lcdata.xml to lcdata_old.xml
  3. Copy lcdata_utf8.xml and rename it to lcdata.xml
  4. Open the file and go to the codeset section (<!-- Codeset entry -->)
  5. Comment out <encoding name="windows-1252" ...
  6. Go to <encoding name="UTF-8"> and add the alias <encoding_alias name="1252"/>
The final sections look like this:
<!-- <encoding name="windows-1252" jvm_encoding="Cp1252">
<encoding_alias name="1252"/>
</encoding> -->
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
<encoding_alias name="1252"/>
</encoding>
And the result is:
feature("locale")
ans =
struct with fields:
ctype: 'de_DE.UTF-8'
collate: 'de_DE.UTF-8'
time: 'de_DE.UTF-8'
numeric: 'en_US_POSIX.UTF-8'
monetary: 'de_DE.UTF-8'
messages: 'de_DE.UTF-8'
encoding: 'UTF-8'
terminalEncoding: 'IBM850'
jvmEncoding: 'UTF-8'
status: 'MathWorks locale management system initialized.'
warning: ''
The editor now works as expected. Thank you very much for the solution, @Jinghao Lei.
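The result above can also be checked in a script rather than read off by eye. This is a minimal sketch that relies on the same undocumented `feature('locale')` call and on the `encoding` and `jvmEncoding` fields shown in the output; the variable names are illustrative.

```matlab
% Sketch: check whether MATLAB now reports UTF-8 as its encoding.
loc = feature('locale');   % undocumented call, as used in this thread

isUtf8 = strcmpi(loc.encoding, 'UTF-8');
if isUtf8
    disp('MATLAB reports UTF-8 as its encoding.');
else
    fprintf('MATLAB still reports %s (JVM: %s).\n', ...
            loc.encoding, loc.jvmEncoding);
end
```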


More Answers (1)

Michael Cappello on 31 Oct 2017
Edited: Rik on 30 Mar 2023
% read in the file
fID = fopen(filename, 'r', 'n', 'UTF-8');
bytes = fread(fID);
fclose(fID);
The data read from the file can then be converted into Unicode characters, like so:
unic = native2unicode(bytes, 'UTF-8');
If you want, remove the carriage returns (13) and replace the line feeds (10) with spaces:
unic(unic == 13) = []; unic(unic == 10) = ' ';
disp(unic'); % display the Unicode text
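One way to convince yourself that this byte-level approach decodes correctly is a round trip: encode a known string to UTF-8 bytes, write them to a file, read them back, and decode. The sketch below does that with a temporary file. All variable names are illustrative, and the non-ASCII characters are built from code points so the script itself contains only ASCII.

```matlab
% Illustrative round trip (variable names are hypothetical, not from the answer).
sample   = [char(44032) char(45208) ' abc'];   % U+AC00, U+B098, then ASCII text
bytesOut = unicode2native(sample, 'UTF-8');    % UTF-8 bytes as uint8

% Write the raw bytes to a temporary file.
tmpFile = [tempname '.txt'];
fID = fopen(tmpFile, 'w');
fwrite(fID, bytesOut, 'uint8');
fclose(fID);

% Read the raw bytes back, keeping them as a uint8 row vector.
fID = fopen(tmpFile, 'r');
bytesIn = fread(fID, Inf, '*uint8')';
fclose(fID);
delete(tmpFile);

% Decode the bytes as UTF-8 and compare with the original string.
decoded     = native2unicode(bytesIn, 'UTF-8');
roundTripOk = isequal(decoded, sample);
```

Reading with `'*uint8'` keeps the data as bytes instead of doubles, which makes it explicit that decoding happens only in `native2unicode`.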
