
How to read text files from sub-sub folders

Hi,
I want to read text files from sub-sub folders:
Architecture:
Mainfolder
    Tool1
        sub-subFolder1
        sub-subFolder2
        .....
    Tool2
        sub-subFolder1
        sub-subFolder2
        .....
    ......
1. Read the text files under each sub-folder (i.e., Tool1, Tool2, etc.).
2. Output one file per sub-folder: Tool1.xlsx, Tool2.xlsx, etc.
I use the following code, but I cannot access the sub-sub folders.
% - Define output header.
header = {'RainFallID', 'IINT', 'Rain Result', 'Start Time', 'Param1.pipe', ...
    '10 Un Para2.pipe', 'Verti 2 mixing.dis', 'Rate.alarm times'} ;
Mainfolder='Mainfolder';
outLocatorFolder='OutputFolder';
nHeaderCols = numel( header ) ;
% - Build listing sub-folders of main folder.
% D_main = dir( 'D:\Mekala_Backupdata\Matlab2010\Mainfolder' ) ;
D_main = dir(Mainfolder ) ;
D_main = D_main(3:end) ; % Eliminate "." and ".."
% - Iterate through sub-folders and process.
for dId = 1 : numel( D_main )
    % - Build listing files of sub-folder.
    D_sub = dir( fullfile(Mainfolder, D_main(dId).name, '*.txt' )) ;
    nFiles = numel( D_sub ) ;
    keyboard
    % - Prealloc output cell array.
    data = cell( nFiles, nHeaderCols ) ;
    % - Iterate through files and process.
    for fId = 1 : nFiles
        % - Read input text file.
        inLocator = fullfile(Mainfolder, D_main(dId).name, D_sub(fId).name ) ;
        content = fileread( inLocator ) ;
        % - Extract relevant data.
        rainfallId = str2double( regexp( content, '(?<=RainFallID\s+:\s*)\d+', 'match', 'once' )) ;
        iint = regexp( content, '(?<=IINT\s+:\s*)\S+', 'match', 'once' ) ;
        rainResult = regexp( content, '(?<=Rain Result\s+:\s*)\S+', 'match', 'once' ) ;
        startTime = strtrim( regexp( content, '(?<=Start Time\s+:\s*).*?(?= -)', 'match', 'once' )) ;
        param1Pipe = str2double( regexp( content, '(?<=Param1.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        tenUn = str2double( regexp( content, '(?<=10 Un Para2.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        verti2 = regexp( content, '(?<=Verti 2 mixing.dis\s+\S+\s%\s+)\S+', 'match', 'once' ) ;
        rateAlarm = strtrim( regexp( content, '(?<=Rate.alarm times\s+\S+\s+)[^\r\n]+', 'match', 'once' )) ;
        % - Populate data cell array.
        data(fId,:) = {rainfallId, iint, rainResult, startTime, ...
            param1Pipe, tenUn, verti2, rateAlarm} ;
    end
    % - Output to XLSX.
    % outLocator = fullfile( 'D:\Mekala_Backupdata\Matlab2010\OutputFolder', sprintf( '%s.xlsx', D_main(dId).name )) ;
    outLocator = fullfile(outLocatorFolder, sprintf( '%s.xlsx', D_main(dId).name )) ;
    fprintf( 'Output XLSX: %s ..\n', outLocator ) ;
    xlswrite( outLocator, [header; data] ) ;
end
Many thanks in advance.

Accepted Answer

Image Analyst on 4 Oct 2017
You need to use ** in dir() instead of *; the ** wildcard makes dir() search sub-folders recursively. See attached demo.
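The attached demo is not reproduced here, so the following is only a minimal sketch of how the recursive wildcard might be applied to the loop from the question. It assumes MATLAB R2016b or later (the release that introduced '**' and the folder field in DIR output), keeps the placeholder folder name from the question, and only prints the file locators instead of parsing them.

```matlab
% Sketch only: assumes MATLAB R2016b or later ('**' and the 'folder' field).
Mainfolder = 'Mainfolder' ;   % placeholder name, as in the question

% List the tool folders, keeping real folders only and dropping "." and "..".
D_main = dir( Mainfolder ) ;
D_main = D_main([D_main.isdir] & ~ismember( {D_main.name}, {'.', '..'} )) ;

for dId = 1 : numel( D_main )
    % '**' makes DIR descend into every sub-sub-folder of this tool folder.
    D_sub = dir( fullfile( Mainfolder, D_main(dId).name, '**', '*.txt' )) ;
    for fId = 1 : numel( D_sub )
        % Files can sit in different folders, so use each entry's own folder.
        inLocator = fullfile( D_sub(fId).folder, D_sub(fId).name ) ;
        fprintf( '%s: %s\n', D_main(dId).name, inLocator ) ;
    end
end
```

With this variant the rest of the original loop body (FILEREAD, the REGEXP calls, XLSWRITE) could stay as it is, provided inLocator is built from the folder field as shown.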

More Answers (1)

Cedric on 4 Oct 2017
Edited: Cedric on 4 Oct 2017
Look at the EDIT 4:09pm block in the thread and update its pseudo-code
Iterate through sub folders of 'Mainfolder'
    Iterate through files of sub folder
        Extract data from file and store in data array
    Export data array to relevant Excel file
specifically for your new problem; it should show you how to restructure and update the former code. First, remove all the code that is not necessary for crawling through the folders and files, and run what remains to check that it crawls as desired.
Big hint: you should be able to add a level of FOR loop. Define D_sub at a strategic place:
for dmId = 1 : numel( D_main )
    D_sub = dir( fullfile( Mainfolder, D_main(dmId).name )) ;
    D_sub = D_sub(3:end) ; % Eliminate "." and ".."
iterate through its elements (sub-sub-folders):
    for dsId = 1 : numel( D_sub )
        D_subsub = dir( fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, '*.txt' )) ;
        nFiles = numel( D_subsub ) ;
and finally iterate through D_subsub elements (the text files):
        for fId = 1 : nFiles
            inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
            content = fileread( inLocator ) ;
Note that if you have a recent version of MATLAB (R2016b or later), you can replace most calls to FULLFILE with the value of the folder field of the relevant output of a former DIR, e.g.:
inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
could be replaced by:
inLocator = fullfile( D_subsub(fId).folder, D_subsub(fId).name ) ;
Finally, note that if you have a lot of different situations with varying depths of nested folders, a better approach would be to build a recursive crawler, but this is a bit more complex.
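As a rough illustration of what such a recursive crawler could look like, here is a minimal sketch. The function name crawlTxt is hypothetical (it is not from the thread), and the sketch only collects the full paths of the .txt files; it avoids '**' and the folder field, so it should not depend on a recent release.

```matlab
function files = crawlTxt( folder )
% CRAWLTXT  Hypothetical helper: return, as a cell column, the full paths
% of all .txt files found in FOLDER and in its nested sub-folders.

    % Text files directly inside this folder.
    D_txt = dir( fullfile( folder, '*.txt' )) ;
    D_txt = D_txt(~[D_txt.isdir]) ;
    files = cell( numel( D_txt ), 1 ) ;
    for k = 1 : numel( D_txt )
        files{k} = fullfile( folder, D_txt(k).name ) ;
    end

    % Sub-folders of this folder, without "." and "..".
    D = dir( folder ) ;
    D = D([D.isdir] & ~ismember( {D.name}, {'.', '..'} )) ;

    % Recurse: the function calls itself on each sub-folder and appends
    % whatever it finds there, whatever the nesting depth.
    for k = 1 : numel( D )
        files = [files ; crawlTxt( fullfile( folder, D(k).name )) ] ; %#ok<AGROW>
    end
end
```

Calling, for example, files = crawlTxt( fullfile( Mainfolder, 'Tool1' )) would then return every text file below Tool1, and the existing parsing code could loop over that list.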
4 Comments
Cedric on 4 Oct 2017
Edited: Cedric on 4 Oct 2017
You should index D_main with dmId when you generate the output locator. When I wrote the hints above with an additional level of loop, I changed the name of the loop index variables to make them more consistent: dmId for "dir main ID" and dsId for "dir sub ID".
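To make the indexing concrete, the hints can be assembled into a crawl-only skeleton such as the sketch below. It is not the code posted in the thread: the data extraction and XLSWRITE calls are left out, the folder names are the placeholders from the question, and the "." / ".." filter uses ISMEMBER instead of (3:end), since those two entries are not guaranteed to come first in the listing.

```matlab
% Crawl-only sketch: prints what would be read and written, nothing more.
Mainfolder       = 'Mainfolder' ;     % placeholders, as in the question
outLocatorFolder = 'OutputFolder' ;

D_main = dir( Mainfolder ) ;
D_main = D_main([D_main.isdir] & ~ismember( {D_main.name}, {'.', '..'} )) ;

for dmId = 1 : numel( D_main )                 % tool folders
    D_sub = dir( fullfile( Mainfolder, D_main(dmId).name )) ;
    D_sub = D_sub([D_sub.isdir] & ~ismember( {D_sub.name}, {'.', '..'} )) ;

    for dsId = 1 : numel( D_sub )              % sub-sub-folders
        D_subsub = dir( fullfile( Mainfolder, D_main(dmId).name, ...
            D_sub(dsId).name, '*.txt' )) ;

        for fId = 1 : numel( D_subsub )        % text files
            inLocator = fullfile( Mainfolder, D_main(dmId).name, ...
                D_sub(dsId).name, D_subsub(fId).name ) ;
            fprintf( '  Input: %s\n', inLocator ) ;
        end
    end

    % One workbook per tool folder: index D_main with dmId (not dsId or fId).
    outLocator = fullfile( outLocatorFolder, sprintf( '%s.xlsx', D_main(dmId).name )) ;
    fprintf( 'Output XLSX: %s\n', outLocator ) ;
end
```

Once this prints the expected inputs and outputs, the extraction code can be put back inside the innermost loop, with the data array collected across all sub-sub-folders of a tool before it is written.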
