How to know the range automatically identified by readtable when importing an Excel file?

76 visualizzazioni (ultimi 30 giorni)
I am using readtable in an app to import data from an Excel spreadsheet, which contains several lines of heading and then the actual table with information: something like the example below
readtable successfully manages to skip the headers and to pull out the data. My problem however is that I also need to copy all the information in the header section because then I need to create a second file with the same header and only some of the elements of the table below. I cannot simply copy a given range (say A1:G6 in the example above) because the number of rows/columns of the header changes depending on the file.
Is there a way to know which range was automatically identified as a table by readtable, so that I could then deduct the range of the header that I need?

Risposta accettata

dpb
dpb il 13 Apr 2023
It will be only as good as the ability of the detectImportOptions bot's ability to recognize the start of the data, but
opt=detectImportOptions(yourfile);
disp(opt.DataRange)
disp(opt.VariableNamesRange)
disp(opt.VariableNames)
will show you what it determined automagically...the two ranges are the starting cell (upper LH corner of the range); the number of elements in the VariableNames array will let you deduce the number of columns. It does not return the Excel UsedRange range address; that could be a useful adjunct to ask for as an enhancement; as is, all you can do is then read the table and see how many rows were returned. That size, of course, is dependent upon the setting of the 'ImportError' and 'Missing' rules and the data content of the file itself.
IF (the proverbial "big if") the format is always similar to the shown table, it's probably going to be pretty reliable; less well formatted or more eclectic data collections may confuse the scanning and not always get the same answer for what appear to be very similar worksheets. In particular, missing or nonnumeric data in columns that should be recognized as numeric often will cause the whole column to be treated as a cell or may change the 'DataRange' value returned.
  7 Commenti
dpb
dpb il 14 Apr 2023
Modificato: dpb il 14 Apr 2023
I don't recall ever seeing it documented in the doc itself; I remember it being discussed here in the forum when behavior was changing fairly frequently and questions arose by poster's prior code not acting the same...but, I don't download/install every release and don't visit the release notes on other releases very often, either, so well could have missed what there might have been.
dpb
dpb il 15 Apr 2023
Modificato: dpb il 15 Apr 2023
function usedRange=GetUsedRange(Excel, sheetNameOrNumber)
% Returns used range address string for input sheet.
% Can pass in the number of the sheet (1,2,3,etc.) or name ('Results').
try
usedRange=Excel.Worksheets.Item(sheetNameOrNumber).UsedRange; % get range reference
usedRange=usedRange.Address; % return the address string
catch ME
errorMessage = sprintf('Error in function GetUsedRange.\n\nError Message:\n%s', ME.message);
fprintf('%s\n', errorMessage);
end
return;
end
To use, first create the Excel actx engine with actxserver and open the file; these were designed to be used to do various things to a given workbook, opening the workbook only once, do the work, and then closing and cleaning up the object rather than as complete standalone, create a new instance every time.
>> excel = actxserver('Excel.Application');
>> d=dir('TestAnnualReportScholarshipData.xls');
>> fn=fullfile(d.folder,d.name);
>> Workbook = excel.Workbooks.Open(fn)
>> GetUsedRange(excel,'Endowed Funds')
>> ans =
'$A$1:$P$74'
>> excel.ActiveWorkbook.Close(false);
>> excel.Quit
>> clear excel

Accedi per commentare.

Più risposte (1)

Walter Roberson
Walter Roberson il 13 Apr 2023
No. However you should first do a detectImportOptions on the file, and then pass the options to readtable() . The options object will indicate the range over which data was imported.
  1 Commento
dpb
dpb il 13 Apr 2023
" The options object will indicate the range over which data was imported."
Not exactly the range, Walter. It will return the ULH corner of the data and you can deduce the number of columns, but it does not return a length indication; the 'UsedRange' from Excel could be a useful addition to the returned output; it's got to be used internally but isn't reported to the user.

Accedi per commentare.

Prodotti


Release

R2023a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by