detectImportOptions
Create import options based on file content
Description
locates a table in a file and returns its import options. You can modify the options
object and use it with opts
= detectImportOptions(filename
)readtable
to control how MATLAB® imports tabular data. The type of the options returned depends on the
file extension. For example, the function returns a SpreadsheetImportOptions
object if
filename
is a spreadsheet file. However, the function returns
a DelimitedTextImportOptions
or FixedWidthImportOptions
object if
filename
is a text file, an XMLImportOptions
object if filename
is an XML file,
an HTMLImportOptions
object if filename
is an HTML
file, and a WordDocumentImportOptions
object if filename
is a
Microsoft® Word document.
locates a table in a file with the help of additional parameters specified by one or
more opts
= detectImportOptions(filename
,Name,Value
)Name,Value
pair arguments.
Examples
Read Spreadsheet File Using Import Options
Configure how readtable interprets your file using an import options object. For example, use an import options object to read only specified variables from a spreadsheet file.
First, create an import options object from a file by using detectImportOptions
to detect aspects of your spreadsheet file, including variable names and types. In this case, detectImportOptions
creates a SpreadsheetImportOptions
object.
opts = detectImportOptions("patients.xls")
opts = SpreadsheetImportOptions with properties: Sheet Properties: Sheet: '' Replacement Properties: MissingRule: 'fill' ImportErrorRule: 'fill' MergedCellColumnRule: 'placeleft' MergedCellRowRule: 'placetop' Variable Import Properties: Set types by name using setvartype VariableNames: {'LastName', 'Gender', 'Age' ... and 7 more} VariableTypes: {'char', 'char', 'double' ... and 7 more} SelectedVariableNames: {'LastName', 'Gender', 'Age' ... and 7 more} VariableOptions: [1-by-10 matlab.io.VariableImportOptions] Access VariableOptions sub-properties using setvaropts/getvaropts VariableNamingRule: 'modify' Range Properties: DataRange: 'A2' (Start Cell) VariableNamesRange: 'A1' RowNamesRange: '' VariableUnitsRange: '' VariableDescriptionsRange: '' To display a preview of the table, use preview
Specify which variables to import by modifying the import options object. Then, import the specified variables using readtable
with the import options object. Display the first 5 rows of the table.
opts.SelectedVariableNames = ["Systolic","Diastolic"]; T = readtable("patients.xls",opts); T(1:5,:)
ans=5×2 table
Systolic Diastolic
________ _________
124 93
109 77
125 83
117 75
122 80
Read Subset of Text File Using Import Options
Configure how readtable
interprets your file using an import options object. For example, use an import options object to read only a subset of a text file.
First, create an import options object by using detectImportOptions
to detect aspects of your text file, including variable names and types, delimiters, and white-space characters. In this case, detectImportOptions
creates a DelimitedTextImportOptions
object.
opts = detectImportOptions("airlinesmall.csv")
opts = DelimitedTextImportOptions with properties: Format Properties: Delimiter: {','} Whitespace: '\b\t ' LineEnding: {'\n' '\r' '\r\n'} CommentStyle: {} ConsecutiveDelimitersRule: 'split' LeadingDelimitersRule: 'keep' TrailingDelimitersRule: 'ignore' EmptyLineRule: 'skip' Encoding: 'ISO-8859-1' Replacement Properties: MissingRule: 'fill' ImportErrorRule: 'fill' ExtraColumnsRule: 'addvars' Variable Import Properties: Set types by name using setvartype VariableNames: {'Year', 'Month', 'DayofMonth' ... and 26 more} VariableTypes: {'double', 'double', 'double' ... and 26 more} SelectedVariableNames: {'Year', 'Month', 'DayofMonth' ... and 26 more} VariableOptions: [1-by-29 matlab.io.VariableImportOptions] Access VariableOptions sub-properties using setvaropts/getvaropts VariableNamingRule: 'modify' Location Properties: DataLines: [2 Inf] VariableNamesLine: 1 RowNamesColumn: 0 VariableUnitsLine: 0 VariableDescriptionsLine: 0 To display a preview of the table, use preview
Specify the subset of variables to import by modifying the import options object. Then, import the subset of data using readtable
with the import options object.
opts.SelectedVariableNames = ["TaxiIn","TaxiOut"]; T = readtable("airlinesmall.csv",opts);
Detect and Use Import Options for Microsoft Word Document File
Detect import options for a Microsoft Word document file, specify the table to import, and then read the data.
The file MaintenanceReport.docx
contains two tables. The last row of the second table contains a cell with merged columns that do not match the table variables.
Detect the import options using the detectImportOptions
function. Specify to read from the second table by setting TableIndex
to 2
.
filename = "MaintenanceReport.docx"; opts = detectImportOptions(filename,'TableIndex',2)
opts = WordDocumentImportOptions with properties: Replacement Properties: MissingRule: "fill" ImportErrorRule: "fill" EmptyRowRule: "skip" MergedCellColumnRule: "placeleft" MergedCellRowRule: "placetop" ExtraColumnsRule: "addvars" Variable Import Properties: Set types by name using setvartype VariableNames: ["Description" "Category" "Urgency" "Resolution" "Cost"] VariableTypes: ["string" "string" "string" "string" "string"] SelectedVariableNames: ["Description" "Category" "Urgency" "Resolution" "Cost"] VariableOptions: [1-by-5 matlab.io.VariableImportOptions] Access VariableOptions sub-properties using setvaropts/getvaropts VariableNamingRule: "preserve" Location Properties: TableSelector: "(//w:tbl)[2]" DataRows: [2 Inf] VariableNamesRow: 1 VariableUnitsRow: 0 VariableDescriptionsRow: 0 RowNamesColumn: 0
To skip reading rows that have cells with merged columns, set the MergedCellColumnRule
property to 'omitrow'
.
opts.MergedCellColumnRule = 'omitrow';
Read the table from the Microsoft Word document file using the readtable
function with the options object.
filename = "MaintenanceReport.docx";
T = readtable(filename,opts)
T=3×5 table
Description Category Urgency Resolution Cost
_____________________________________________________________________ ____________________ ________ __________________ ________
"Items are occasionally getting stuck in the scanner spools." "Mechanical Failure" "Medium" "Readjust Machine" "$45"
"Loud rattling and banging sounds are coming from assembler pistons." "Mechanical Failure" "Medium" "Readjust Machine" "$35"
"There are cuts to the power when starting the plant." "Electronic Failure" "High" "Full Replacement" "$16200"
Detect and Use Import Options for HTML File
Detect import options for an HTML file, specify the table to import, and then read the data.
Detect the import options of the first table from the URL https://www.mathworks.com/help/matlab/text-files.html
containing the text "readtable". Detect the import options using the detectImportOptions
function and specify the table to read using the XPath query "//TABLE[contains(.,'readtable')]"
. Specify to not read variable names by setting ReadVariableNames
to false
.
url = "https://www.mathworks.com/help/matlab/text-files.html"; opts = detectImportOptions(url,'TableSelector',"//TABLE[contains(.,'readtable')]",'ReadVariableNames',false)
opts = HTMLImportOptions with properties: Replacement Properties: MissingRule: "fill" ImportErrorRule: "fill" EmptyRowRule: "skip" MergedCellColumnRule: "placeleft" MergedCellRowRule: "placetop" ExtraColumnsRule: "addvars" Variable Import Properties: Set types by name using setvartype VariableNames: ["Var1" "Var2"] VariableTypes: ["string" "string"] SelectedVariableNames: ["Var1" "Var2"] VariableOptions: Show all 2 VariableOptions Access VariableOptions sub-properties using setvaropts/getvaropts VariableNamingRule: "preserve" Location Properties: TableSelector: "//TABLE[contains(.,'readtable')]" DataRows: [1 Inf] VariableNamesRow: 0 VariableUnitsRow: 0 VariableDescriptionsRow: 0 RowNamesColumn: 0
Read the table using the readtable
function.
T = readtable(url,opts)
T=4×2 table
Var1 Var2
________________ ____________________________
"readtable" "Create table from file"
"writetable" "Write table to file"
"readtimetable" "Create timetable from file"
"writetimetable" "Write timetable to file"
Designate Data Type for Imported Text Data
Import text data as a string data type by specifying import options.
Create an options object for the file.
opts = detectImportOptions('outages.csv');
Specify which variables to import using readtable
, and then show a summary. The data type of the selected variables is char
.
opts.SelectedVariableNames = {'Region','Cause'}; T = readtable('outages.csv',opts); summary(T)
T: 1468x2 table Variables: Region: cell array of character vectors Cause: cell array of character vectors Statistics for applicable variables: NumMissing Region 0 Cause 0
Import text data as a string
data type, and then create import options by specifying the TextType
name-value pair.
opts = detectImportOptions('outages.csv','TextType','string');
Specify which variables to import using readtable
, and then show a summary. The data type of the selected variables is now string
.
opts.SelectedVariableNames = {'Region','Cause'}; T = readtable('outages.csv',opts); summary(T)
T: 1468x2 table Variables: Region: string Cause: string Statistics for applicable variables: NumMissing Region 0 Cause 0
Read XML File as Table
Import the contents of an XML file into a table.
The students.xml
file has seven sibling nodes named Student
, which each contain the same child nodes and attributes.
type students.xml
<?xml version="1.0" encoding="utf-8"?> <Students> <Student ID="S11305"> <Name FirstName="Priya" LastName="Thompson" /> <Age>18</Age> <Year>Freshman</Year> <Address> <Street xmlns="https://www.mathworks.com">591 Spring Lane</Street> <City>Natick</City> <State>MA</State> </Address> <Major>Computer Science</Major> <Minor>English Literature</Minor> </Student> <Student ID="S23451"> <Name FirstName="Conor" LastName="Cole" /> <Age>18</Age> <Year>Freshman</Year> <Address> <Street xmlns="https://www.mathworks.com">4641 Pearl Street</Street> <City>San Francisco</City> <State>CA</State> </Address> <Major>Microbiology</Major> <Minor>Public Health</Minor> </Student> <Student ID="S119323"> <Name FirstName="Morgan" LastName="Yang" /> <Age>21</Age> <Year>Senior</Year> <Address> <Street xmlns="https://www.mathworks.com">30 Highland Road</Street> <City>Detriot</City> <State>MI</State> </Address> <Major>Political Science</Major> </Student> <Student ID="S201351"> <Name FirstName="Salim" LastName="Copeland" /> <Age>19</Age> <Year>Sophomore</Year> <Address> <Street xmlns="https://www.mathworks.com">3388 Moore Avenue</Street> <City>Fort Worth</City> <State>TX</State> </Address> <Major>Business</Major> <Minor>Japanese Language</Minor> </Student> <Student ID="S201351"> <Name FirstName="Salim" LastName="Copeland" /> <Age>20</Age> <Year>Sophomore</Year> <Address> <Street xmlns="https://www.mathworks.com">3388 Moore Avenue</Street> <City>Fort Worth</City> <State>TX</State> </Address> <Major>Business</Major> <Minor>Japanese Language</Minor> </Student> <Student ID="54600"> <Name FirstName="Dania" LastName="Burt" /> <Age>22</Age> <Year>Senior</Year> <Address> <Street xmlns="https://www.mathworks.com">22 Angie Drive</Street> <City>Los Angeles</City> <State>CA</State> </Address> <Major>Mechanical Engineering</Major> <Minor>Architecture</Minor> </Student> <Student ID="453197"> <Name FirstName="Rikki" LastName="Gunn" /> <Age>21</Age> <Year>Junior</Year> <Address> <Street xmlns="https://www.mathworks.com">65 Decatur Lane</Street> <City>Trenton</City> <State>ME</State> </Address> <Major>Economics</Major> <Minor>Art History</Minor> </Student> </Students>
First, create an XMLImportOptions
object by using detectImportOptions
to detect aspects of your XML file. Read just the street names into a table by specifying the VariableSelectors
name-value argument as the XPath expression of the Street
element node. Register a custom namespace prefix to the existing namespace URL by setting the RegisteredNamespaces
name-value argument.
opts = detectImportOptions("students.xml",RegisteredNamespaces=["myPrefix","https://www.mathworks.com"], ... VariableSelectors="//myPrefix:Street");
Then, import the specified variable using readtable
with the import options object.
T = readtable("students.xml",opts)
T=7×1 table
Street
___________________
"591 Spring Lane"
"4641 Pearl Street"
"30 Highland Road"
"3388 Moore Avenue"
"3388 Moore Avenue"
"22 Angie Drive"
"65 Decatur Lane"
Input Arguments
filename
— Name of file to read
character vector | string scalar
Name of the file to read, specified as a character vector or string scalar.
Depending on the location of your file, filename
can
take on one of these forms.
Location | Form | ||||||||
---|---|---|---|---|---|---|---|---|---|
Current folder or folder on the MATLAB path | Specify the name of the file in
Example:
| ||||||||
File in a folder | If the file is not in the current folder or in
a folder on the MATLAB path, then specify the full or
relative path name in
Example:
Example:
| ||||||||
Internet URL | If the file is specified as an internet uniform
resource locator (URL), then
Example:
| ||||||||
Remote Location | If the file is stored at a remote location,
then
Based on the remote location,
For more information, see Work with Remote Data. Example:
|
If filename
includes the file extension, then
detectImportOptions
determines the file format from
the extension. Otherwise, you must specify the 'FileType'
name-value pair to indicate the type of file.
The detectImportOptions
function supports these file
extensions: .txt
, .dat
,
.csv
, .xls
,
.xlsb
, .xlsm
,
.xlsx
, .xltm
,
.xltx
, .ods
,
.xml
, .docx
,
.html
, .xhtml
, and
.htm
.
Note
File extensions .xlsb
and .ods
are only supported on platforms with Excel® for Windows®.
Data Types: char
| string
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name
in quotes.
Example: 'FileType','spreadsheet'
FileType
— Type of file
'spreadsheet'
| 'text'
| 'delimitedtext'
| 'fixedwidth'
| 'xml'
| 'worddocument'
| 'html'
Type of file, specified as the comma-separated pair consisting of
'FileType'
and one of these values.
Value | Import Options for File |
---|---|
'spreadsheet' | Return a |
'text' | Return a |
'delimitedtext' | Return a |
'fixedwidth' | Return a |
'xml' | Return an |
'worddocument' | Return a |
'html' | Return a |
Use the 'FileType'
name-value pair argument when
filename
does not include the file extension, or
when the extension is not one of these:
.txt
,.dat
, or.csv
for text files.xls
,.xlsb
,.xlsm
,.xlsx
,.xltm
,.xltx
, or.ods
for spreadsheet files.xml
, for XML files.docx
for Microsoft Word document files.html
,.xhtml
, or.htm
for HTML files
File extensions .xlsb
and .ods
are only supported on platforms with Excel for Windows.
Example: 'FileType','text'
Data Types: char
| string
TextType
— Type for imported text data
"string"
| "char"
Type for imported text data, specified as one of these values:
"string"
— Import text data as string arrays."char"
— Import text data as character vectors.
Example: "TextType","char"
DatetimeType
— Type for imported date and time data
"datetime"
(default) | "text"
| "exceldatenum"
(spreadsheet files only)
Type for imported date and time data, specified as one of these values:
Value | Description |
---|---|
"datetime" | MATLAB
For more information, see
|
"text" | If
|
"exceldatenum"
| Excel serial date numbers The value
|
ReadVariableNames
— Read first row as variable names
true
| false
Indicator for reading the first row as variable names, specified as the comma-separated pair consisting of 'ReadVariableNames'
and either true
or false
. If unspecified, readtable
automatically detects the presence of variable names.
Indicator | Description |
---|---|
| Use when the first row of the region to read contains the variable names for the table. |
| Use when the first row of the region to read contains data in the table. |
unspecified | When left unspecified, the importing function automatically detects true or false and proceeds accordingly. |
Data Types: logical
VariableNamingRule
— Flag to preserve variable names
"modify"
(default) | "preserve"
Flag to preserve variable names, specified as either "modify"
or
"preserve"
.
"modify"
— Convert invalid variable names (as determined by theisvarname
function) to valid MATLAB identifiers."preserve"
— Preserve variable names that are not valid MATLAB identifiers such as variable names that include spaces and non-ASCII characters.
Starting in R2019b, variable names and row names can include any characters, including
spaces and non-ASCII characters. Also, they can start with any characters, not just
letters. Variable and row names do not have to be valid MATLAB identifiers (as determined by the isvarname
function). To preserve these variable names and row names, set
the value of VariableNamingRule
to "preserve"
.
Variable names are not refreshed when the value of VariableNamingRule
is changed from "modify"
to "preserve"
.
Data Types: char
| string
ReadRowNames
— Indicator for reading the first column as row names
false
(default) | true
Indicator for reading first column as row names, specified as the
comma-separated pair consisting of 'ReadRowNames'
and
either false
or
true
.
Indicator | Description |
---|---|
| Use when the first column of the region to read contains data, and not the row names for the table. |
| Use when the first column of the region to read contains the row names for the table. |
unspecified | When left unspecified, the importing function
assumes false . |
Data Types: logical
MissingRule
— Procedure to manage missing data
'fill'
(default) | 'error'
| 'omitrow'
| 'omitvar'
Procedure to manage missing data, specified as one of the values in this table.
Missing Rule | Behavior |
---|---|
'fill' | Replace missing data with the contents of the The |
'error' | Stop importing and display an error message showing the missing record and field. |
'omitrow' | Omit rows that contain missing data. |
'omitvar' | Omit variables that contain missing data. |
Example: opts.MissingRule = 'omitrow';
Data Types: char
| string
ImportErrorRule
— Procedure to handle import errors
'fill'
(default) | 'error'
| 'omitrow'
| 'omitvar'
Procedure to handle import errors, specified as one of the values in this table.
Import Error Rule | Behavior |
---|---|
'fill' | Replace the data where the error occurred with the contents of the
The
|
'error' | Stop importing and display an error message showing the error-causing record and field. |
'omitrow' | Omit rows where errors occur. |
'omitvar' | Omit variables where errors occur. |
Example: opts.ImportErrorRule = 'omitvar';
Data Types: char
| string
WebOptions
— HTTP
or HTTPS
request options
weboptions
object
HTTP
or HTTPS
request options,
specified as a weboptions
object. The
weboptions
object determines how to import data
when the specified filename
is an internet URL
containing the protocol type "http://"
or
"https://"
.
ExpectedNumVariables
— Expected number of variables
positive integer
Expected number of variables, specified as the comma-separated pair consisting of
'ExpectedNumVariables'
and a positive integer. If unspecified,
the importing function automatically detects the number of variables.
Data Types: single
| double
NumHeaderLines
— Number of header lines
positive integer
Number of header lines in the file, specified as the comma-separated pair consisting of 'NumHeaderLines'
and a positive integer. If unspecified, the importing function automatically detects the number of header lines in the file.
Example: 'NumHeaderLines',7
Data Types: single
| double
Range
— Portion of data to read
character vector | string scalar | numeric vector
Portion of the data to read from text or spreadsheet files, specified as the comma
separated pair consisting of 'Range'
and a character vector, string
scalar, or numeric vector in one of these forms.
Ways to specify Range | Description |
---|---|
Starting Cell
| Specify the starting cell for the data as a character vector or string scalar or a two element numeric vector.
Using the starting cell, the importing function automatically detects the extent of the data by beginning the import at the start cell and ending at the last empty row or footer range. Example:
|
Rectangular Range
| Specify the exact range to read using the rectangular range in one of these forms.
The importing function only reads the data contained in the specified range. Any empty fields within the specified range are imported as missing cells. |
Row Range or Column Range
| Specify the range by identifying the beginning and ending rows using Excel row numbers. Using the specified row range, the importing function automatically detects the column extent by reading from the first nonempty column to the end of the data, and creates one variable per column. Example:
Alternatively, specify the range by identifying the beginning and ending columns using Excel column letters or numbers. Using the specified column range, the import function automatically detects the row extent by reading from the first nonempty row to the end of the data or the footer range. The number of columns in
the specified range must match the number specified in the
Example:
|
Starting Row Number
| Specify the first row containing the data using the positive scalar row index. Using the specified row index, the importing function automatically detects the extent of the data by reading from the specified first row to the end of the data or the footer range. Example: |
Excel’s Named Range
| In Excel, you can create names to identify ranges in the
spreadsheet. For instance, you can select a rectangular portion of
the spreadsheet and call it Example:
|
Unspecified or Empty
| If unspecified, the importing function automatically detects the used range. Example:
Note: Used Range refers to the rectangular portion of the spreadsheet that actually contains data. The importing function automatically detects the used range by trimming any leading and trailing rows and columns that do not contain data. Text that is only white space is considered data and is captured within the used range. |
Data Types: char
| string
| double
Sheet
— Sheet to read from
''
empty character array (default) | character vector | string scalar | positive scalar integer
Sheet to read from, specified as an empty character array, a character vector or string scalar
containing the sheet name, or a positive scalar integer denoting the sheet
index. Based on the value specified for the Sheet
property, the import function behaves as described in the table.
Specification | Behavior |
---|---|
'' (default) | Import data from the first sheet. |
Name | Import data from the matching sheet name, regardless of order of sheets in the spreadsheet file. |
Integer | Import data from sheet in the position denoted by the integer, regardless of the sheet names in the spreadsheet file. |
Data Types: char
| string
| single
| double
DataRange
— Location of data
character vector | string scalar | positive scalar integer | array of positive scalar integers | cell array of character vector | string array
Location of data to be imported, specified as a character vector, string scalar, cell array of character vectors, string array, positive scalar integer or an N
-by-2
array of positive scalar integers. Specify DataRange
using one of these forms.
Specified by | Behavior |
---|---|
Starting Cell or Starting Row | Specify the starting cell for the data, using Excel Using the starting cell, the importing function automatically detects the extent of the data, by beginning the import at the start cell and ending at the last empty row or footer range. Alternatively, specify the first row containing the data using the positive scalar row index. Using the specified row index, the importing function automatically detects the extent of the data by reading from the specified first row to the end of the data or the footer range. Example: |
Rectangular Range | Specify the exact range to read using the rectangular range form, where The importing function only reads the data contained in the specified range. Any empty fields within the specified range are imported as missing cells. The number of columns must match the number specified in the Example: |
Row Range or Column Range | Specify the range by identifying the beginning and ending rows using Excel row numbers. Using the specified row range, the importing function automatically detects the column extent by reading from the first nonempty column to the end of the data, and creates one variable per column. Example: Alternatively, specify the range by identifying the beginning and ending columns using Excel column letters or numbers. Using the specified column range, the import function automatically detects the row extent by reading from the first nonempty row to the end of the data or the footer range. The number of columns in the specified range must match the number specified in the Example: |
Multiple Row Ranges | Specify multiple row ranges to read with an A valid array of multiple row ranges must:
Use of Example: |
Unspecified or Empty | Do not fetch any data. Example: |
Data Types: char
| string
| cell
| single
| double
RowNamesRange
— Location of row names
character vector | string scalar | positive scalar integer | ''
empty character array
Location of row names, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify RowNamesRange
as one of the values in this table.
Specified by | Behavior |
---|---|
| Specify the starting cell for the data, using Excel The importing function identifies a name for each variable in the data. Example: |
Rectangular Range | Specify the exact range to read using the rectangular range form, where The number of rows contained in Example: |
Row Range | Specify range by identifying the beginning and ending rows using Excel row numbers. Row names must be in a single column. Example: |
Number Index | Specify the column containing the row names using a positive scalar column index. Example: |
Unspecified or Empty | Indicate that there are no row names. Example: |
Data Types: char
| single
| double
VariableNamesRange
— Location of variable names
character vector | string scalar | positive scalar integer | ''
empty character array
Location of variable names, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify VariableNamesRange
as one of the values in this table.
Specified by | Behavior |
---|---|
| Specify the starting cell for the data, using Excel The importing function reads a name for each variable in the data. Example: |
Rectangular Range | Specify the exact range to read using the rectangular range form, where The number of columns must match the number specified in the Example: |
Row Range | Specify range by identifying the beginning and ending rows using Excel row numbers. Must be a single row. Example: |
Number Index | Specify the row containing the variable names using a positive scalar row index. Example: |
Unspecified or Empty | Indicate that there are no variable names. Example: |
Data Types: char
| single
| double
VariableUnitsRange
— Location of variable units
character vector | string scalar | positive scalar integer | ''
empty character array
Location of variable units, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify VariableUnitsRange
as one of the values in this table.
Specified by | Behavior |
---|---|
| Specify the starting cell for the data, using Excel The importing function reads a unit for each variable in the data. Example: |
Rectangular Range | Specify the exact range to read using the rectangular range form, where The number of columns must match the number specified in the Example: |
Row Range | Specify range by identifying the beginning and ending rows using Excel row numbers. Must be a single row. Example: |
Number Index | Specify the row containing the data units using a positive scalar row index. Example: |
Unspecified or Empty | Indicate that there are no variable units. Example: |
Data Types: char
| string
| single
| double
VariableDescriptionsRange
— Location of variable descriptions
character vector | string scalar | ''
empty character array
Location of variable descriptions, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify VariableDescriptionRange
as one of the values in this table.
Specified by | Behavior |
---|---|
| Specify the starting cell for the data, using Excel The importing function reads a description for each variable in the data. Example: |
Rectangular Range | Specify the exact range to read using the rectangular range form, where The number of columns must match the number specified in the Example: |
Row Range | Specify range by identifying the beginning and ending rows using Excel row numbers. Must be a single row. Example: |
Number Index | Specify the row containing the descriptions using a positive scalar row index. Example: |
Unspecified or Empty | Indicate that there are no variable descriptions. Example: |
Data Types: char
| string
| single
| double
MergedCellColumnRule
— Rule for cells merged across columns
"placeleft"
(default) | "placeright"
| "duplicate"
| "omitrow"
| "error"
Since R2024b
Rule for cells merged across columns, specified as one of the values in this table.
Import Rule | Behavior |
---|---|
"placeleft" | Place the data in the leftmost cell and
fill the remaining cells with the contents of the
You can specify the
|
"placeright" | Place the data in the rightmost cell and
fill the remaining cells with the contents of the
You can specify the
|
"duplicate" | Duplicate the data in all cells. |
"omitrow" | Omit rows where merged cells occur. |
"error" | Display an error message and cancel the import operation. |
MergedCellRowRule
— Rule for cells merged across rows
"placetop"
(default) | "placebottom"
| "duplicate"
| "omitvar"
| "error"
Since R2024b
Rule for cells merged across rows, specified as one of the values in this table.
Import Rule | Behavior |
---|---|
"placetop" | Place the data in the top cell and fill the
remaining cells with the contents of the
You can specify the
|
"placebottom" | Place the data in the bottom cell and fill
the remaining cells with the contents of the
You can specify the
|
"duplicate" | Duplicate the data in all cells. |
"omitvar" | Omit variables where merged cells occur. |
"error" | Display an error message and cancel the import operation. |
Delimiter
— Field delimiter characters
string array | character vector | cell array of character vectors
Field delimiter characters in a delimited text file, specified as a string array, character vector, or cell array of character vectors.
Example: "Delimiter","|"
Example: "Delimiter",[";","*"]
LeadingDelimitersRule
— Procedure to manage leading delimiters
"keep"
| "ignore"
| "error"
Procedure to manage leading delimiters in a delimited text file, specified as one of the values in this table.
Value | Behavior |
---|---|
"keep" | Keep the delimiter. |
"ignore" | Ignore the delimiter. |
"error" | Return an error and cancel the import operation. |
TrailingDelimitersRule
— Procedure to manage trailing delimiters
'keep'
| 'ignore'
| 'error'
Procedure to manage trailing delimiters in a delimited text file, specified as one of the values in this table.
Leading Delimiters Rule | Behavior |
---|---|
'keep' | Keep the delimiter. |
'ignore' | Ignore the delimiter. |
'error' | Return an error and abort the import operation. |
ConsecutiveDelimitersRule
— Procedure to manage consecutive delimiters
"split"
| "join"
| "error"
Procedure to manage consecutive delimiters in a delimited text file, specified as one of the values in this table.
Value | Behavior |
---|---|
"split" | Split the consecutive delimiters into multiple fields. |
"join" | Join the delimiters into one delimiter. |
"error" | Return an error and cancel the import operation. |
VariableWidths
— Field widths of variables
vector of positive integer values
Field widths of variables in a fixed-width text file, specified as a vector of positive
integer values. Each positive integer in the vector corresponds to the number of
characters in a field that makes up the variable. The VariableWidths
property contains an entry corresponding to each variable specified in the
VariableNames
property.
Whitespace
— Characters to treat as white space
character vector | string scalar
Characters to treat as white space, specified as a character vector or string scalar containing one or more characters.
Example: 'Whitespace',' _'
Example: 'Whitespace','?!.,'
LineEnding
— End-of-line characters
["\n","\r","\r\n"]
(default) | string array | character vector | cell array of character vectors
End-of-line characters, specified as a string array, character vector, or cell array of character vectors.
Example: "LineEnding","\n"
Example: "LineEnding","\r\n"
Example: "LineEnding",["\b",":"]
Encoding
— Character encoding scheme
''
| 'UTF-8'
| 'system'
| 'ISO-8859-1'
| 'windows-1251'
| 'windows-1252'
| ...
Character encoding scheme associated with the file, specified as the comma-separated
pair consisting of 'Encoding'
and 'system'
or a
standard character encoding scheme name.
When you do not specify any encoding, the function uses automatic character set detection to determine the encoding when reading the file.
Example: 'Encoding','system'
uses the system default
encoding.
Data Types: char
| string
CommentStyle
— Style of comments
string array | character vector | cell array of character vectors
Style of comments, specified as a string array, character vector, or cell array of character vectors. For single- and multi-line comments, the starting identifier must be the first non-white-space character. For single-line comments, specify a single identifier to treat lines starting with the identifier as comments. For multi-line comments, lines from the starting (first) identifier to the ending (second) identifier are treated as comments. No more than two character vectors of identifiers can be specified.
For example, to ignore the line following a percent symbol as the first
non-white-space character, specify CommentStyle
as
"%"
.
Example: "CommentStyle",["/*"]
Example: "CommentStyle",["/*","*/"]
DurationType
— Output data type of duration data
'duration'
(default) | 'text'
Output data type of duration data from text files, specified as the comma-separated pair consisting of 'DurationType'
and either 'duration'
or 'text'
.
Value | Type for Imported Duration Data |
---|---|
'duration' | MATLAB For more information, see |
'text' | If
|
Data Types: char
| string
ExtraColumnsRule
— Procedure to handle extra columns
'addvars'
| 'ignore'
| 'wrap'
| 'error'
Procedure to handle extra columns in the data, specified as one of the values in this table.
Extra Columns Rule | Behavior |
---|---|
'addvars' | To import extra columns, create new variables. If there are |
'ignore' | Ignore the extra columns of data. |
'wrap' | Wrap the extra columns of data to new records. This action does not change the number of variables. |
'error' | Display an error message and abort the import operation. |
Data Types: char
| string
TreatAsMissing
— Text to interpret as missing data
character vector | string scalar | cell array of character vectors | string array
Text to interpret as missing data, specified as a character vector, string scalar, cell array of character vectors, or string array.
When the importing function finds missing instances, it uses the
specification in the MissingRule
property to
determine the appropriate action.
Example: 'TreatAsMissing',{'NA','TBD'}
instructs the
importing function to treat any occurrence of NA
or
TBD
as a missing fields.
Data Types: char
| string
| cell
DateLocale
— Locale for reading dates
character vector | string scalar
Locale for reading dates, specified as the comma-separated pair consisting of
'DateLocale'
and a character vector or a string scalar of the
form
, where:xx
_YY
YY
is an uppercase ISO 3166-1 alpha-2 code indicating a country.xx
is a lowercase ISO 639-1 two-letter code indicating a language.
This table lists some common values for the locale.
Locale | Language | Country |
---|---|---|
'de_DE' | German | Germany |
'en_GB' | English | United Kingdom |
'en_US' | English | United States |
'es_ES' | Spanish | Spain |
'fr_FR' | French | France |
'it_IT' | Italian | Italy |
'ja_JP' | Japanese | Japan |
'ko_KR' | Korean | Korea |
'nl_NL' | Dutch | Netherlands |
'zh_CN' | Chinese (simplified) | China |
When using the %D
format specifier to read text as
datetime
values, use DateLocale
to specify the
locale in which the importing function should interpret month and day-of-week names and
abbreviations.
If you specify the DateLocale
argument in addition to
opts
the import options, then the importing function uses the
specified value for the DateLocale
argument, overriding the locale
defined in the import options.
Example: 'DateLocale','ja_JP'
ThousandsSeparator
— Characters that indicate the thousands grouping
character vector | string scalar
Characters that indicate the thousands grouping in numeric variables,
specified as a character vector or string scalar. The thousands grouping
characters act as visual separators, grouping the number at every three
place values. The importing function uses the characters in the
ThousandsSeparator
property to interpret the
numbers being imported.
Data Types: char
| string
DecimalSeparator
— Characters indicating decimal separator
character vector | string scalar
Characters indicating the decimal separator in numeric variables,
specified as a character vector or string scalar. The importing function
uses the DecimalSeparator
property to distinguish the
integer part of a number from the decimal part.
When converting to integer data types, numbers with a decimal part are rounded to the nearest integer.
Data Types: char
| string
TrimNonNumeric
— Remove nonnumeric characters
false
(default) | true
Remove nonnumeric characters from a numeric variable, specified as a
logical true
or false
.
Data Types: logical
HexType
— Output data type of hexadecimal data
'auto'
(default) | 'text'
| 'int8'
| 'int16'
| ...
Output data type of hexadecimal data, specified as the comma-separated pair consisting of 'HexType'
and one of the values listed in the table.
The input file represents hexadecimal values as text, using either 0x
or
0X
as a prefix and the characters
0
-9
,
a
-f
, and A
-F
as digits. (Uppercase and lowercase letters represent the same digits—for example,
'0xf'
and '0xF'
both represent
15
.)
The importing function converts the hexadecimal values to the data type specified by
the value of 'HexType'
.
Value of | Data Type of Output Table Variables |
---|---|
| data type detected automatically |
| unaltered input text |
| 8-bit integer, signed |
| 16-bit integer, signed |
| 32-bit integer, signed |
| 64-bit integer, signed |
| 8-bit integer, unsigned |
| 16-bit integer, unsigned |
| 32-bit integer, unsigned |
| 64-bit integer, unsigned |
Example: 'HexType','uint16'
converts text representing hexadecimal values (such as '0xFF'
) to unsigned 16-bit integers (such as 255
) in the output table.
Data Types: char
| string
BinaryType
— Output data type of binary data
'auto'
(default) | 'text'
| 'int8'
| 'int16'
| ...
Output data type of binary data, specified as the comma-separated pair consisting of
'BinaryType'
and one of the values listed in the table.
The input file represents binary values as text, using either 0b
or
0B
as a prefix and the characters 0
and
1
as digits.
The importing function converts the binary values to the data type specified by the
value of 'BinaryType'
.
Value of | Data Type of Output Table Variables |
---|---|
| data type detected automatically |
| unaltered input text |
| 8-bit integer, signed |
| 16-bit integer, signed |
| 32-bit integer, signed |
| 64-bit integer, signed |
| 8-bit integer, unsigned |
| 16-bit integer, unsigned |
| 32-bit integer, unsigned |
| 64-bit integer, unsigned |
Example: 'BinaryType','uint16'
converts text representing binary
values (such as '0b11111111'
) to unsigned 16-bit integers (such as
255
) in the output table.
Data Types: char
| string
ImportAttributes
— Import attributes
logical 1
or true
(default) | logical 0
or false
Import attributes, specified as the comma-separated pair consisting of
'ImportAttributes'
and either 1
(true
) or 0
(false
). If you
specify false
, then the reading function will not import the XML
attributes in the input file as variables in the output table.
Example: 'ImportAttributes',false
AttributeSuffix
— Attribute suffix
'Attribute'
(default) | character vector | string scalar
Attribute suffix, specified as the comma-separated pair consisting of
'AttributeSuffix'
and either a character vector or string scalar.
This argument specifies the suffix the reading function appends to all table variables
that correspond to attributes in the input XML file. If you do not specify
'AttributeSuffix'
, then the reading function defaults to
appending the suffix 'Attribute'
to all variable names corresponding
to attributes in the input XML file.
Example: 'AttributeSuffix','_att'
RowNodeName
— Table row XML node name
character vector | string array
Table row XML node name, specified as the comma-separated pair consisting of
'RowNodeName'
and either a character vector or string scalar.
This argument specifies the XML node name that delineates rows of the output
table.
Example: 'RowNodeName','XMLNodeName'
RowSelector
— Table row XPath expression
character vector | string scalar
Table row XPath expression, specified as a character vector or string scalar that the
reading function uses to select individual rows of the output table. You must specify
RowSelector
as a valid XPath version 1.0 expression.
Example: 'RowSelector','/RootNode/ChildNode'
VariableNodeNames
— Table variable XML node names
cell array of character vectors | string array
Table variable XML node names, specified as the comma-separated pair consisting of
'VariableNodeNames'
and either a cell array of character vectors
or string array. This argument specifies the XML node name that the reading function
uses to identify the XML nodes to read as variables in the output table.
Example: 'VariableNodeNames',{'XMLNodeName1','XMLNodeName2'}
Example: 'VariableNodeNames',"XMLNodeName"
Example: 'VariableNodeNames',["XMLNodeName1","XMLNodeName2"]
VariableSelectors
— Table variable XPath expressions
cell array of character vectors | string array
Table variable XPath expressions, specified as a cell array of character vectors or
string array that the reading function uses to select table variables. You must specify
VariableSelectors
as valid XPath version 1.0 expressions.
Example: 'VariableSelectors',{'/RootNode/ChildNode'}
Example: 'VariableSelectors',"/RootNode/ChildNode"
Example: 'VariableSelectors',["/RootNode/ChildNode1","/RootNode/ChildNode2"]
TableNodeName
— Table XML node name
character vector | string scalar
Table XML node name, specified as the comma-separated pair consisting of
'TableNodeName'
and either a character vector or string scalar.
This argument specifies the node in the input structure that the reading function should
read to a table.
Example: 'TableNodeName','NodeName'
VariableUnitsSelector
— Variable units XPath expression
character vector | string scalar
Variable units XPath, specified as a character vector or string scalar that the
reading function uses to select the table variable units. You must specify
VariableUnitsSelector
as a valid XPath version 1.0
expression.
Example: 'VariableUnitsSelector','/RootNode/ChildNode'
VariableDescriptionsSelector
— Variable descriptions XPath expression
character vector | string scalar
Variable descriptions XPath expression, specified as a character vector or string
scalar that the reading function reads uses to select the table variable descriptions.
You must specify VariableDescriptionsSelector
as a valid XPath
version 1.0 expression.
Example: 'VariableDescriptionsSelector','/RootNode/ChildNode'
RowNamesSelector
— Table row names XPath expression
character vector | string scalar
Table row names XPath expression, specified as a character vector or string scalar
that the reading function uses to select the names of the table rows. You must specify
RowNamesSelector
as a valid XPath version 1.0 expression.
Example: 'RowNamesSelector','/RootNode/ChildNode'
RepeatedNodeRule
— Procedure to handle repeated XML nodes
'addcol'
(default) | 'ignore'
| 'error'
Procedure to handle repeated XML nodes in a given row of a table, specified as
'addcol'
, 'ignore'
, or
'error'
.
Repeated Node Rule | Behavior |
---|---|
'addcol' | Add columns for the repeated nodes under the variable header in
the table. Specifying the value of
|
'ignore' | Skip importing the repeated nodes. |
'error' | Display an error message and abort the import operation. |
Example: 'RepeatedNodeRule','ignore'
RegisteredNamespaces
— Set of registered XML namespace prefixes
string array
Set of registered XML namespace prefixes, specified as the comma-separated pair
consisting of RegisteredNamespaces
and an array of prefixes. The
reading function uses these prefixes when evaluating XPath expressions on an XML file.
Specify the namespace prefixes and their associated URLs as an Nx2 string array.
RegisteredNamespaces
can be used when you also evaluate an XPath
expression specified by a selector name-value argument, such as
StructSelector
for readstruct
, or
VariableSelectors
for readtable
and
readtimetable
.
By default, the reading function automatically detects namespace prefixes to register
for use in XPath evaluation, but you can also register new namespace prefixes using the
RegisteredNamespaces
name-value argument. You might register a
new namespace prefix when an XML node has a namespace URL, but no declared namespace
prefix in the XML file.
For example, evaluate an XPath expression on an XML file called
example.xml
that does not contain a namespace prefix. Specify
'RegisteredNamespaces'
as ["myprefix",
"https://www.mathworks.com"]
to assign the prefix
myprefix
to the URL
https://www.mathworks.com
.
T = readtable("example.xml", "VariableSelector", "/myprefix:Data",... "RegisteredNamespaces", ["myprefix", "https://www.mathworks.com"])
Example: 'RegisteredNamespaces',["myprefix",
"https://www.mathworks.com"]
TableIndex
— Index of table to read
1
(default) | positive integer
Index of table to read from Microsoft Word document or HTML file containing multiple tables, specified as a positive integer.
When you specify TableIndex
, the software automatically sets TableSelector
to the equivalent XPath expression.
Data Types: single
| double
| int8
| int16
| int32
| int64
| uint8
| uint16
| uint32
| uint64
MergedCellColumnRule
— Rule for cells merged across columns
"placeleft"
(default) | "placeright"
| "duplicate"
| "omitrow"
| "error"
Rule for cells merged across columns, specified as one of the values in this table.
Import Rule | Behavior |
---|---|
"placeleft" | Place the data in the leftmost cell and
fill the remaining cells with the contents of the
You can specify the
|
"placeright" | Place the data in the rightmost cell and
fill the remaining cells with the contents of the
You can specify the
|
"duplicate" | Duplicate the data in all cells. |
"omitrow" | Omit rows where merged cells occur. |
"error" | Display an error message and cancel the import operation. |
MergedCellRowRule
— Rule for cells merged across rows
"placetop"
(default) | "placebottom"
| "duplicate"
| "omitvar"
| "error"
Rule for cells merged across rows, specified as one of the values in this table.
Import Rule | Behavior |
---|---|
"placetop" | Place the data in the top cell and fill the
remaining cells with the contents of the
You can specify the
|
"placebottom" | Place the data in the bottom cell and fill
the remaining cells with the contents of the
You can specify the
|
"duplicate" | Duplicate the data in all cells. |
"omitvar" | Omit variables where merged cells occur. |
"error" | Display an error message and cancel the import operation. |
VariableNamesRow
— Row containing variable names
nonnegative integer
Row containing variable names, specified as a nonnegative integer.
If you do not specify
VariableNamesRow
, then the software reads variable names according to theReadVariableNames
argument.If
VariableNamesRow
is0
, then the software does not import the variable names.Otherwise, the software imports the variable names from the specified row.
Data Types: single
| double
| int8
| int16
| int32
| int64
| uint8
| uint16
| uint32
| uint64
VariableUnitsRow
— Row containing variable units
0
(default) | nonnegative integer
Row containing variable units, specified as a nonnegative integer.
If VariableUnitsRow
is 0
, then the software does
not import the variable units. Otherwise, the software imports the variable units from
the specified row.
Data Types: single
| double
| int8
| int16
| int32
| int64
| uint8
| uint16
| uint32
| uint64
VariableDescriptionsRow
— Row containing variable descriptions
0
(default) | nonnegative integer
Row containing variable descriptions, specified as a nonnegative integer.
If VariableDescriptionsRow
is 0
, then the
software does not import the variable descriptions. Otherwise, the software imports the
variable descriptions from the specified row.
Data Types: single
| double
| int8
| int16
| int32
| int64
| uint8
| uint16
| uint32
| uint64
EmptyRowRule
— Rule to apply to empty rows in the table
"skip"
(default) | "read"
| "error"
Rule to apply to empty rows in the table, specified as one of the following:
"skip"
– Skip empty rows."read"
– Read empty rows."error"
– Ignore empty rows during table detection and error when reading.
EmptyColumnRule
— Rule to apply to empty columns in the table
"skip"
(default) | "read"
| "error"
Rule to apply to empty columns in the table, specified as one of the following:
"skip"
– Skip empty columns."read"
– Read empty columns."error"
– Ignore empty columns during table detection and error when reading.
TableSelector
— Table data XPath expression
character vector | string scalar
Table data XPath expression, specified as a character vector or string scalar that the
reading function uses to select the output table data. You must specify
TableSelector
as a valid XPath version 1.0 expression.
Example: 'TableSelector','/RootNode/ChildNode'
Output Arguments
opts
— Import options for file
SpreadsheetImportOptions
| DelimitedtextImportOptions
| FixedWidthImportOptions
| XMLImportOptions
Import options for the specified file, returned as a
SpreadsheetImportOptions
,
DelimitedTextImportOptions
,
FixedWidthImportOptions
, or
XMLImportOptions
object. The type of options object
depends on the type of file specified. For text files
(.txt
, .dat
, or
.csv
), the detectImportOptions
function returns a DelimitedTextImportOptions
or
FixedWidthImportOptions
object. For spreadsheet files (.xls
,
.xlsb
, .xlsm
,
.xlsx
, .xltm
,
.xltx
, or .ods
), the
detectImportOptions
function returns a SpreadsheetImportOptions
object. For XML files (.xml
),
detectImportOptions
returns an XMLImportOptions
object.
Tips
Updating Property Values After Creating the Import Options Object: Use of dot notation is not recommended to update the properties of the import options object created by
detectImportOptions
. When you set properties using dot notation, MATLAB does not re-detect all the import options for the file. Therefore, to update and re-detect all the properties, you must specify the new values by using name-value arguments. For example, update the value for theConsecutiveDelimitersRule
property and re-detect the import options as follows.opts = detectImportOptions(__,'ConsecutiveDelimitersRule','join')
Use XPath selectors to specify which elements of the XML input document to import. For example, suppose you want to import the XML file
myFile.xml
, which has the following structure:This table provides the XPath syntaxes that are supported for XPath selector name-value arguments, such as<data> <table category="ones"> <var>1</var> <var>2</var> </table> <table category="tens"> <var>10</var> <var>20</var> </table> </data>
VariableSelectors
orTableSelector
.Selection Operation Syntax Example Result Select every node whose name matches the node you want to select, regardless of its location in the document. Prefix the name with two forward slashes ( //
).data = readtable('myFile.xml', 'VariableSelectors', '//var')
data = 4×1 table var ___ 1 2 10 20
Read the value of an attribute belonging to an element node. Prefix the attribute with an at sign ( @
).data = readtable('myFile.xml', 'VariableSelectors', '//table/@category')
data = 2×1 table categoryAttribute _________________ "ones" "tens"
Select a specific node in a set of nodes. Provide the index of the node you want to select in square brackets ( []
).data = readtable('myFile.xml', 'TableSelector', '//table[1]')
data = 2×1 table var ___ 1 2
Specify precedence of operations. Add parentheses around the expression you want to evaluate first. data = readtable('myFile.xml', 'VariableSelectors', '//table/var[1]')
data = 2×1 table var ___ 1 10
data = readtable('myFile.xml', 'VariableSelectors', '(//table/var)[1]')
data = table var ___ 1
Version History
Introduced in R2016bR2024b: Specify how to import merged cells in spreadsheets
When importing data from spreadsheets, you can specify how
detectImportOptions
imports cells that are merged across rows
and columns by using the MergedCellRowRule
and
MergedCellColumnRule
name-value arguments.
MATLAB Command
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)