Removing double empty lines from a text file

If a file contains more than one consecutive empty line, they are replaced by a single empty line.
% reading file
fid=fopen(outFile,'rt');
Data = textscan(fid,'%s','Delimiter','\n');
Data=Data{1}; % get rid of nesting
k=1; emptylines_occured=0;
for j=1:numel(Data)
    if ~strcmp(Data(j),'') % not empty line
        if emptylines_occured
            newData{k}=''; k=k+1;
            emptylines_occured=0;
        end
        newData(k)=Data(j); k=k+1;
    else % empty line
        emptylines_occured=1;
    end
end
fclose(fid);
% writing file
fid=fopen(outFile,'wt');
for j=1:numel(newData)
    fprintf(fid, '%s\n',newData{j});
end
fclose(fid);
Is there a more concise way?

2 Comments

Can you share your file?
This may be any text file, e.g. an m-file.


Accepted Answer

Stephen23 on 8 Feb 2018
Edited: Stephen23 on 9 Feb 2018
You can easily write the new file at the same time as you read the old one, which is faster and uses much less memory. Here is a simple version that creates the new file with at most one empty line between any two non-empty lines:
[f1d,msg] = fopen('test_old.txt','rt');
assert(f1d>=3,msg)
[f2d,msg] = fopen('test_new.txt','wt');
assert(f2d>=3,msg)
prv = 'X';
while ~feof(f1d)
    new = fgetl(f1d);
    if numel(new) || numel(prv)
        fprintf(f2d,'%s\n',new);
    end
    prv = new;
end
fclose(f1d);
fclose(f2d);
The test files are attached. To also drop any leading empty lines, initialize prv as an empty char ('') instead of 'X'.
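A minimal sketch of that variant, for anyone who wants to try it without the attachments. The demo file names and the `result` variable are illustrative assumptions, not part of the original answer, and the extra `ischar` guard is an addition that stops the loop if `fgetl` returns -1 at the end of the file.

```matlab
% Build a small demo input (file names are illustrative assumptions).
inFile  = fullfile(tempdir,'demo_old.txt');
outFile = fullfile(tempdir,'demo_new.txt');
fid = fopen(inFile,'wt');
fprintf(fid,'\n\nfirst\n\n\n\nsecond\nthird\n');
fclose(fid);

[f1d,msg] = fopen(inFile,'rt');
assert(f1d>=3,msg)
[f2d,msg] = fopen(outFile,'wt');
assert(f2d>=3,msg)
prv = '';                         % empty char: leading empty lines are skipped
while ~feof(f1d)
    new = fgetl(f1d);
    if ~ischar(new), break, end   % fgetl returns -1 once nothing is left
    if ~isempty(new) || ~isempty(prv)
        fprintf(f2d,'%s\n',new);  % copy the line unless it repeats an empty one
    end
    prv = new;
end
fclose(f1d);
fclose(f2d);

% Read the result back for inspection.
result = fileread(outFile);
disp(result)
```

With `prv = 'X'` instead, a single leading empty line would be kept.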

3 Comments

Be careful: using 'wt' on MS Windows would introduce \r characters in the file that might not have been there before.
@Walter Roberson: the original question uses the t option for both reading and writing, so presumably this is not a problem.
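To see whether text mode changes the line endings on a particular machine, one quick check is to write the same string in both modes and compare the raw bytes. This is only a sketch: the temporary file names and variable names are made up for the demo, and the carriage-return count for 'wt' is expected to depend on the platform.

```matlab
% Write the same two lines in text mode ('wt') and in binary mode ('w').
fText = fullfile(tempdir,'eol_text.txt');   % illustrative file names
fBin  = fullfile(tempdir,'eol_bin.txt');
fid = fopen(fText,'wt'); fprintf(fid,'a\nb\n'); fclose(fid);
fid = fopen(fBin,'w');   fprintf(fid,'a\nb\n'); fclose(fid);

% Read both files back as raw bytes.
fid = fopen(fText,'r'); bytesText = fread(fid,Inf,'uint8')'; fclose(fid);
fid = fopen(fBin,'r');  bytesBin  = fread(fid,Inf,'uint8')'; fclose(fid);

% Count carriage returns (byte value 13) in each file.
nCRtext = sum(bytesText == 13);
nCRbin  = sum(bytesBin == 13);
fprintf('CR count: wt = %d, w = %d\n', nCRtext, nCRbin);
```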
bbb_bbb on 9 Feb 2018
Edited: Stephen23 on 9 Feb 2018
This works excellently. Thanks.


More Answers (1)

Walter Roberson on 8 Feb 2018
Edited: Walter Roberson on 8 Feb 2018
%read the file _and_ do the work of deleting extra empty lines.
new_text = regexprep( fileread(outFile), '(\r?\n)(\r?\n)+', '$1');
%write the result to a new file
fid = fopen('text_new.txt', 'w');
fwrite(fid, new_text);
fclose(fid);

3 Comments

bbb_bbb on 8 Feb 2018
Edited: bbb_bbb on 8 Feb 2018
This deletes all empty lines and garbles Russian characters!
new_text = regexprep( fileread(outFile), '(\r?\n\r?\n)(\r?\n)+', '$1');
bbb_bbb on 8 Feb 2018
Edited: bbb_bbb on 8 Feb 2018
There is still a problem with non-English characters. They are turned into 0xFF.
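One possible workaround, sketched under the assumption that the file is UTF-8 encoded (the thread does not say which encoding it uses): open the file with an explicit encoding through fopen instead of relying on the default used by fileread. The demo file names and the sample text are made up; the regular expression is the corrected one from the comment above.

```matlab
% Demo input containing non-ASCII text (file names are illustrative).
inFile  = fullfile(tempdir,'utf8_old.txt');
outFile = fullfile(tempdir,'utf8_new.txt');
original = sprintf('%s\n\n\n\nend\n', char([1087 1088 1080 1074 1077 1090]));
fid = fopen(inFile,'w','n','UTF-8');
fwrite(fid,original,'char');
fclose(fid);

% Read the whole file as characters, decoding it as UTF-8 (assumed encoding).
fid = fopen(inFile,'r','n','UTF-8');
txt = fread(fid,[1 Inf],'*char');
fclose(fid);

% Collapse runs of two or more empty lines into a single empty line.
new_text = regexprep(txt,'(\r?\n\r?\n)(\r?\n)+','$1');

% Write the result back using the same encoding.
fid = fopen(outFile,'w','n','UTF-8');
fwrite(fid,new_text,'char');
fclose(fid);
```

If the file uses a different encoding (for example Windows-1251), the same pattern should apply with that encoding name passed to fopen instead.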

