How to make a small change to a big text file efficiently

Hi, I have a large plain-text file (about 5000 lines) that contains mixed content: data, strings, and symbols. I want to change the data in only two locations.
E.g. the data file content:
lines 1 to 1000: ......
line 1001: ABCD
line 1002: 3 5 7
lines up to 5000: ......
What I know is to first read the file line by line:
tline = fgetl(fid); data01{i} = tline;
then find the line after "ABCD" and replace it with my new data:
data01{1002} = [9 0 0];
and then write data01 line by line into a new file.
However, this takes too long because every line has to be read, and I have a lot of files to work with.
Is there any faster way to do this? Any comment or hint will be appreciated! Thank you!
/Pengfei

Accepted Answer

Pengfei on 14 May 2012
Hi, the following code works fine. Strangely, it did not take nearly as long as my first attempt.
fin = fopen('inp.txt','r');    % source file
fout = fopen('out.txt','w');   % modified copy
idk = 0;                       % current line number
while ~feof(fin)
    idk = idk+1;
    s = fgetl(fin);
    % replace the contents of the selected lines
    if idk==250
        s = num2str(5);
    end
    if idk==262
        s = num2str(6);
    end
    if idk==1373
        s = num2str([1 1 1]);
    end
    if idk==1380
        s = num2str([8 8 8]);
    end
    fprintf(fout,'%s\n',s);    % copy the (possibly changed) line
end
fclose(fin);
fclose(fout);
  3 Comments
Hischam Hendy on 18 Sep 2017
How can I stay in the same txt file? I mean, how can I modify fin itself?
Walter Roberson on 18 Sep 2017
Edited: Walter Roberson on 18 Sep 2017
You can only edit a text file "in place" if you are replacing strings within a line with other strings that are exactly the same size. Otherwise you need to work like the above where you copy from the input file to a new file making changes as you go. I detailed the in-place editing below https://www.mathworks.com/matlabcentral/answers/38300-how-to-make-a-small-change-to-a-big-text-file-efficiently#answer_47747
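
When the replacement has a different length, the copy-and-modify approach can still end up with the same file name by writing to a scratch file and then moving it over the original. The following is only a sketch of that idea: the function name replaceLineViaTempFile and its arguments are hypothetical, and it assumes the caller is happy to have the original overwritten.

```matlab
function replaceLineViaTempFile(fileName, lineNo, newText)
% Replace line number lineNo of fileName with newText (a char vector).
% The new line may have any length, because the whole file is rewritten.
% Hypothetical helper, not part of MATLAB.
    tmpName = [tempname, '.txt'];          % scratch file in the temp folder
    fin = fopen(fileName, 'r');
    if fin == -1
        error('Could not open %s for reading.', fileName);
    end
    fout = fopen(tmpName, 'w');
    if fout == -1
        fclose(fin);
        error('Could not open %s for writing.', tmpName);
    end
    k = 0;
    s = fgetl(fin);
    while ischar(s)                        % fgetl returns -1 at end of file
        k = k + 1;
        if k == lineNo
            s = newText;                   % swap in the replacement line
        end
        fprintf(fout, '%s\n', s);
        s = fgetl(fin);
    end
    fclose(fin);
    fclose(fout);
    movefile(tmpName, fileName, 'f');      % overwrite the original file
end
```

For example, replaceLineViaTempFile('inp.txt', 1002, '9 0 0') would rewrite inp.txt with line 1002 changed. The original is only replaced after the copy has been written completely, which is a little safer than editing in place.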


More Answers (2)

Walter Roberson on 14 May 2012
If your new data is exactly the same length as the old, not a single character difference, then:
1. Use the 'rt+' permission when you fopen() the file (this is important).
2. fgetl() as many times as you need to skip over the data you wish to leave the same.
3. Before you read in the line that is to be changed, use ftell() and record the value.
4. fgetl() the line you will be changing. Compute the new line as a string: it must be exactly the same length as the existing line. Warning: 'ABCD' is not the same length as '9 0 0' !
5. fseek() on the file, relative to the beginning of the file, with "offset" the value you got from ftell(). This will reposition you to the beginning of the line you wish to change.
6. fprintf(fid, '%s\n', TheNewLine)
7. fseek() on the file, 0 bytes relative to your current position. This is needed in order to switch from writing mode to reading mode.
8. You are now positioned at the beginning of the line after the one you changed, and can fgetl() or whatever is needed to change the second line.
9. After changing the last line you need to change, you can fclose().
WARNING: if anything goes wrong your file is likely to be ruined.
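
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a tested production routine: the function name overwriteLineInPlace is hypothetical, the target line is identified by its line number, and, as a small deviation from the fprintf step, only the characters of the line are written so that the existing line terminator is left untouched.

```matlab
function overwriteLineInPlace(fileName, lineNo, newLine)
% Overwrite line lineNo of fileName with newLine, which must have exactly
% the same number of characters as the existing line.
% Hypothetical helper, not part of MATLAB.
    fid = fopen(fileName, 'rt+');          % read/update, text mode
    if fid == -1
        error('Could not open %s.', fileName);
    end
    closer = onCleanup(@() fclose(fid));   % close the file even on error
    for k = 1:lineNo-1
        fgetl(fid);                        % skip the lines to keep
    end
    pos = ftell(fid);                      % start of the target line
    oldLine = fgetl(fid);
    if ~ischar(oldLine)
        error('File has fewer than %d lines.', lineNo);
    end
    if numel(newLine) ~= numel(oldLine)
        error('New line has %d characters, old line has %d.', ...
              numel(newLine), numel(oldLine));
    end
    fseek(fid, pos, 'bof');                % back to the start of the line
    fprintf(fid, '%s', newLine);           % overwrite the text only
    fseek(fid, 0, 'cur');                  % required before reading again
end
```

A call such as overwriteLineInPlace('Input.txt', 1002, '9 0 0') would work for an old line '3 5 7', since both are five characters long. The length check runs before anything is written, so a mismatch leaves the file unchanged.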
Please consider rewriting the whole operation in perl. perl is provided with MATLAB and is designed for efficiency in these kinds of operations.
  2 Comments
Pengfei on 14 May 2012
Following your suggestions, I wrote the code below, which seems to have problems. Please point out the mistakes in it! I find the ftell, fseek, and fwrite commands hard to understand; my head spins when I read the documentation.
file='Input.txt';
NewValue = 60.865;
fid = fopen(file,'r+');
while ~feof(fid)
    tline = fgetl(fid);
    if strfind(tline, 'ABCD') > 0
        ui = ftell(fid);
        fseek(fid,ui,'bof');
        fprintf(fid, '%f\n', NewValue)
    end
end
P.S. I want to replace the line right after the "ABCD" line, e.g. to replace the value 254.80 with 60.865.
Thanks!
Walter Roberson on 14 May 2012
Using %.3f would be safer; otherwise you do not know how many decimal places it is going to emit. Safer yet is to use a string,
NewValue = '60.865';
and
fprintf(fid,'%s\n', NewValue);
You are not using feof() properly. Please see http://www.mathworks.com/matlabcentral/answers/21210-arrarys
After you do the fprintf() you have to fseek() to the current location (0 bytes from 'cur') to switch back to reading mode.
The fseek() / ftell() requirements are not something that is obvious. They have to do with the standards about how reading and writing files actually works. Each time you switch between reading and writing (or writing and reading), you need to fseek(), even if it is an fseek() that leaves you in the same position.
You could replace the ftell() / fseek() that you have in your code with fseek(fid, 0, 'cur')
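
Putting these remarks together, a corrected version of the loop might look like the sketch below. The function name replaceLineAfterMarker is hypothetical, and the sketch assumes the new value is passed as a string of exactly the same length as the line it replaces (for instance '60.865' for '254.80', both six characters). It writes only the characters of the line and leaves the existing line terminator alone.

```matlab
function replaceLineAfterMarker(fileName, marker, newValue)
% Overwrite the line that follows the first line containing marker.
% newValue is a char vector of exactly the same length as the old line.
% Hypothetical helper, not part of MATLAB.
    fid = fopen(fileName, 'rt+');
    if fid == -1
        error('Could not open %s.', fileName);
    end
    closer = onCleanup(@() fclose(fid));   % close the file even on error
    tline = fgetl(fid);
    while ischar(tline)                    % test the fgetl result, not feof
        if ~isempty(strfind(tline, marker))
            pos = ftell(fid);              % start of the line to replace
            oldLine = fgetl(fid);
            if ~ischar(oldLine) || numel(oldLine) ~= numel(newValue)
                error('Replacement must match the old line length.');
            end
            fseek(fid, pos, 'bof');        % reposition, switch to writing
            fprintf(fid, '%s', newValue);
            fseek(fid, 0, 'cur');          % switch back to reading mode
            return                         % only the first match is changed
        end
        tline = fgetl(fid);
    end
    error('Marker "%s" not found.', marker);
end
```

It would be called as replaceLineAfterMarker('Input.txt', 'ABCD', '60.865'). Unlike the simplification suggested above, this sketch keeps the ftell() / fseek() pair, because it reads the old line first in order to check its length.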



Titus Edelhofer on 14 May 2012
Hi,
if the new data is the same length, I would suggest using memmapfile: open the file using memmapfile, use e.g. strfind to find where you want to replace something ('ABCD'), then just overwrite the data and close the file. Open the file with write permissions, though ;-).
Titus
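
A minimal sketch of the memmapfile idea, under a few assumptions: the function name overwriteWithMemmap is hypothetical, the file is plain single-byte (ASCII) text, and the text to be replaced is known in advance so that it can be located with strfind.

```matlab
function n = overwriteWithMemmap(fileName, oldText, newText)
% Replace every occurrence of oldText with newText directly in the file.
% Both are char vectors of equal length; n is the number of replacements.
% Hypothetical helper, not part of MATLAB. Assumes one byte per character.
    if numel(oldText) ~= numel(newText)
        error('oldText and newText must have the same length.');
    end
    m = memmapfile(fileName, 'Format', 'uint8', 'Writable', true);
    idx = strfind(char(m.Data'), oldText); % start positions of the matches
    for k = idx
        % overwrite the mapped bytes; this changes the file itself
        m.Data(k:k+numel(newText)-1) = uint8(newText(:));
    end
    n = numel(idx);
    clear m                                % release the mapping
end
```

To change the value on the line after "ABCD", the marker line can be included in the search text, e.g. overwriteWithMemmap('Input.txt', sprintf('ABCD\n254.80'), sprintf('ABCD\n60.865')). That particular call assumes the file uses plain '\n' line endings.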
