delimitedTextImportOptions
Import options object for delimited text
Description
A DelimitedTextImportOptions object enables you to specify how MATLAB® imports tabular data from delimited text files. The object contains properties that control the data import process, including the handling of errors and missing data.
Creation
You can create a DelimitedTextImportOptions object using either the detectImportOptions function or the delimitedTextImportOptions function (described here):
Use detectImportOptions to detect and populate the import properties based on the contents of the delimited text file specified in filename.
opts = detectImportOptions(filename);
Use delimitedTextImportOptions to define the import properties based on your import requirements.
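The two creation routes can be compared side by side. The following is a minimal sketch, not a definitive workflow: the temporary file, its contents, and the variable names are made up for illustration.

```matlab
% Write a small sample file (hypothetical contents, for illustration only).
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','LastName,Age,Height','Smith,38,71','Jones,40,67');
fclose(fid);

% Route 1: let MATLAB detect the import properties from the file.
detected = detectImportOptions(fname);

% Route 2: define the import properties yourself.
manual = delimitedTextImportOptions('NumVariables',3);
manual.VariableNames = {'LastName','Age','Height'};
manual.Delimiter = ',';
manual.DataLines = 2;   % data starts after the header line

T1 = readtable(fname,detected);
T2 = readtable(fname,manual);
delete(fname);
```

Detection is convenient when the file layout is unknown; defining the options by hand makes the import independent of what detection happens to infer.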
Syntax
Description
opts = delimitedTextImportOptions creates a DelimitedTextImportOptions object with one variable.
opts = delimitedTextImportOptions('NumVariables',numVars) creates the object with the number of variables specified in numVars.
opts = delimitedTextImportOptions(___,Name,Value) specifies additional properties for the DelimitedTextImportOptions object using one or more name-value pair arguments.
Input Arguments
numVars
— Number of variables
positive scalar integer
Number of variables, specified as a positive scalar integer.
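As a brief illustration of the syntaxes, the sketch below creates an object with three variables and then repeats the call with extra name-value pairs. The delimiter and starting line are arbitrary choices made for the example.

```matlab
% Three variables, all other properties left at their defaults.
opts = delimitedTextImportOptions('NumVariables',3);
disp(opts.VariableNames)   % default names follow the Var1,...,VarN pattern

% Same call with additional properties set through name-value pairs.
opts2 = delimitedTextImportOptions('NumVariables',3, ...
    'Delimiter',';', ...
    'DataLines',2);
```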
Properties
Variable Properties
VariableNames
— Variable names
cell array of character vectors | string array
Variable names, specified as a cell array of character vectors or string array. The
VariableNames
property contains the names to use when importing
variables.
If the data contains N
variables, but no variable names are specified, then
the VariableNames
property contains
{'Var1','Var2',...,'VarN'}
.
To support invalid MATLAB identifiers as variable names, such as variable names containing spaces
and non-ASCII characters, set the value of VariableNamingRule
to
'preserve'
.
Example: opts.VariableNames
returns the current
(detected) variable names.
Example: opts.VariableNames(3)
= {'Height'}
changes the name of the third variable to Height
.
Data Types: char
| string
| cell
VariableNamingRule
— Flag to preserve variable names
"modify"
(default) | "preserve"
Flag to preserve variable names, specified as either "modify" or "preserve".
- "modify": Convert invalid variable names (as determined by the isvarname function) to valid MATLAB identifiers.
- "preserve": Preserve variable names that are not valid MATLAB identifiers, such as variable names that include spaces and non-ASCII characters.
Starting in R2019b, variable names and row names can include any characters, including
spaces and non-ASCII characters. Also, they can start with any characters, not just
letters. Variable and row names do not have to be valid MATLAB identifiers (as determined by the isvarname
function). To preserve these variable names and row names, set
the value of VariableNamingRule
to "preserve"
.
Variable names are not refreshed when the value of VariableNamingRule
is changed from "modify"
to "preserve"
.
Data Types: char
| string
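Because variable names are not refreshed when the rule is changed afterward, one option is to request "preserve" at detection time. The sketch below assumes a made-up file whose header contains spaces and parentheses, and uses the VariableNamingRule name-value argument of detectImportOptions.

```matlab
% Sample file with header text that is not a valid MATLAB identifier (made-up data).
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','Last Name,Age (yrs)','Smith,38','Jones,40');
fclose(fid);

% Ask for the 'preserve' rule while the options are being detected.
opts = detectImportOptions(fname,'VariableNamingRule','preserve');
disp(opts.VariableNames)

T = readtable(fname,opts);
disp(T.Properties.VariableNames)
delete(fname);
```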
VariableTypes
— Data types of variable
cell array of character vectors | string array
Data types of variables, specified as a cell array of character vectors or string array
containing a set of valid data type names. The VariableTypes
property
designates the data types to use when importing variables.
To update the VariableTypes
property, use the setvartype
function.
Example: opts.VariableTypes
returns the current variable data
types.
Example: opts = setvartype(opts,'Height',{'double'})
changes the
data type of the variable Height
to
double
.
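A short sketch of updating types with setvartype follows. The variable names are illustrative only.

```matlab
opts = delimitedTextImportOptions('NumVariables',3);
opts.VariableNames = {'LastName','Height','Weight'};

% Change several variables at once, then a single variable.
opts = setvartype(opts,{'Height','Weight'},'double');
opts = setvartype(opts,'LastName','string');

disp(opts.VariableTypes)
```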
SelectedVariableNames
— Subset of variables to import
character vector | string scalar | cell array of character vectors | string array | array of numeric indices
Subset of variables to import, specified as a character vector, string scalar, cell array of character vectors, string array or an array of numeric indices.
SelectedVariableNames
must be a subset of
names contained in the VariableNames
property.
By default, SelectedVariableNames
contains all
the variable names from the VariableNames
property,
which means that all variables are imported.
Use the SelectedVariableNames
property to
import only the variables of interest. Specify a subset of variables
using the SelectedVariableNames
property and use readtable
to import only that subset.
To support invalid MATLAB identifiers as variable names, such as variable names
containing spaces and non-ASCII characters, set the value of
VariableNamingRule
to
'preserve'
.
Example: opts.SelectedVariableNames = {'Height','LastName'}
selects
only two variables, Height
and LastName
,
for the import operation.
Example: opts.SelectedVariableNames
= [1 5]
selects only two variables, the first variable and
the fifth variable, for the import operation.
Example: T = readtable(filename,opts)
returns
a table containing only the variables specified in the SelectedVariableNames
property
of the opts
object.
Data Types: uint16
| uint32
| uint64
| char
| string
| cell
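The following sketch shows a subset import end to end. The file and its contents are invented for the example; only the variables named in SelectedVariableNames reach the output table.

```matlab
% Made-up sample file with a header line and two data lines.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','LastName,Age,Height','Smith,38,71','Jones,40,67');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',3,'DataLines',2);
opts.VariableNames = {'LastName','Age','Height'};
opts = setvartype(opts,{'Age','Height'},'double');

% Select by name; numeric indices such as [1 3] select the same variables.
opts.SelectedVariableNames = {'LastName','Height'};

T = readtable(fname,opts);
delete(fname);
```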
VariableOptions
— Type specific variable import options
array of variable import options objects
Type specific variable import options, returned as an array
of variable import options objects. The array contains an object corresponding
to each variable specified in the VariableNames
property.
Each object in the array contains properties that support the importing
of data with a specific data type.
Variable options support these data types: numeric, text, logical
, datetime
,
or categorical
.
To query the current (or detected) options for a variable, use
the getvaropts
function.
To set and customize options for a variable, use the setvaropts
function.
Example: opts.VariableOptions
returns a collection
of VariableImportOptions
objects, one corresponding
to each variable in the data.
Example: getvaropts(opts,'Height')
returns
the VariableImportOptions
object for the Height
variable.
Example: opts = setvaropts(opts,'Height','FillValue',0)
sets
the FillValue
property for the variable Height
to 0
.
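As a sketch of how getvaropts and setvaropts work together, the example below sets a fill value for one variable and reads it back. The variable is first given a numeric type so that a numeric FillValue applies; the names are illustrative.

```matlab
opts = delimitedTextImportOptions('NumVariables',2);
opts.VariableNames = {'LastName','Height'};
opts = setvartype(opts,'Height','double');

% Customize the options of a single variable.
opts = setvaropts(opts,'Height','FillValue',0);

% Query the options object for that variable.
heightOpts = getvaropts(opts,'Height');
disp(heightOpts.FillValue)
```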
Location Properties
DataLines
— Data location
positive scalar integer | array of positive scalar integers
Data location, specified as a positive scalar integer or an N-by-2 array of positive scalar integers.
Specify DataLines using one of these forms.
Specify as | Description |
---|---|
Positive scalar integer | Specify the first line that contains the data. The import reads from that line to the end of the file. |
Two-element array | Specify the line range that contains the data, given as the first and last line. |
N-by-2 array | Specify multiple line ranges to read, one range per row of the array, as in the example [1 3; 5 6; 8 inf], where the ranges are listed in increasing order, do not overlap, and only the last range uses inf. |
Example: opts.DataLines = 5 sets the DataLines property to the value [5 inf]. Read all rows of data starting from row 5 to the end-of-file.
Example: opts.DataLines = [2 6] sets the property to read lines 2 through 6.
Example: opts.DataLines = [1 3; 5 6; 8 inf] sets the property to read rows 1, 2, 3, 5, 6, and all rows between 8 and the end-of-file.
Data Types: single
| double
| uint8
| uint16
| uint32
| uint64
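A minimal sketch of reading multiple line ranges follows. The six-line file is made up; the ranges skip the header on line 1 and the data on line 4.

```matlab
% Made-up file: one header line followed by five data lines.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','x,y','1,10','2,20','3,30','4,40','5,50');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2);
opts.VariableNames = {'x','y'};
opts = setvartype(opts,'double');

% Read file lines 2 to 3, then line 5 through the end of the file.
opts.DataLines = [2 3; 5 Inf];
T = readtable(fname,opts);
disp(T)
delete(fname);
```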
RowNamesColumn
— Row names location
0
(default) | positive scalar integer
Row names location, specified as a positive scalar integer.
The RowNamesColumn
property specifies the location
of the column containing the row names.
If RowNamesColumn
is specified as 0, then
do not import the row names. Otherwise, import the row names from
the specified column.
Example: opts.RowNamesColumn = 2;
Data Types: single
| double
| uint8
| uint16
| uint32
| uint64
VariableNamesLine
— Variable names location
0
(default) | positive scalar integer
Variable names location, specified as a positive scalar integer.
The VariableNamesLine
property specifies the line
number where variable names are located.
If VariableNamesLine
is specified as 0, then
do not import the variable names. Otherwise, import the variable names
from the specified line.
Example: opts.VariableNamesLine = 6;
Data Types: single
| double
| uint8
| uint16
| uint32
| uint64
VariableDescriptionsLine
— Variable description location
0
(default) | positive scalar integer
Variable description location, specified as a positive scalar
integer. The VariableDescriptionsLine
property
specifies the line number where variable descriptions are located.
If VariableDescriptionsLine
is specified
as 0, then do not import the variable descriptions. Otherwise, import
the variable descriptions from the specified line.
Example: opts.VariableDescriptionsLine = 7;
Data Types: single
| double
| uint8
| uint16
| uint32
| uint64
VariableUnitsLine
— Variable units location
0
(default) | positive scalar integer
Variable units location, specified as a positive scalar integer.
The VariableUnitsLine
property specifies the line
number where variable units are located.
If VariableUnitsLine
is specified as 0, then
do not import the variable units. Otherwise, import the variable units
from the specified line.
Example: opts.VariableUnitsLine = 8;
Data Types: single
| double
| uint8
| uint16
| uint32
| uint64
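The location properties can be combined when a file carries metadata lines above the data. The sketch below assumes a made-up file with names on line 1, units on line 2, and data from line 3 onward.

```matlab
% Made-up file layout: names, units, then data.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','Time,Temp','s,degC','0,20.5','1,21.0');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2);
opts = setvartype(opts,'double');   % treat every variable as numeric
opts.VariableNamesLine = 1;
opts.VariableUnitsLine = 2;
opts.DataLines = 3;

T = readtable(fname,opts);
disp(T.Properties.VariableNames)
disp(T.Properties.VariableUnits)
delete(fname);
```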
Delimited Text Properties
Delimiter
— Field delimiter characters
string array | character vector | cell array of character vectors
Field delimiter characters in a delimited text file, specified as a string array, character vector, or cell array of character vectors.
Example: "Delimiter","|"
Example: "Delimiter",[";","*"]
Whitespace
— Characters to treat as white space
character vector | string scalar
Characters to treat as white space, specified as a character vector or string scalar containing one or more characters.
Example: 'Whitespace',' _'
Example: 'Whitespace','?!.,'
LineEnding
— End-of-line characters
["\n","\r","\r\n"]
(default) | string array | character vector | cell array of character vectors
End-of-line characters, specified as a string array, character vector, or cell array of character vectors.
Example: "LineEnding","\n"
Example: "LineEnding","\r\n"
Example: "LineEnding",["\b",":"]
CommentStyle
— Style of comments
string array | character vector | cell array of character vectors
Style of comments, specified as a string array, character vector, or cell array of character vectors. For single- and multi-line comments, the starting identifier must be the first non-white-space character. For single-line comments, specify a single identifier to treat lines starting with the identifier as comments. For multi-line comments, lines from the starting (first) identifier to the ending (second) identifier are treated as comments. No more than two character vectors of identifiers can be specified.
For example, to ignore the line following a percent symbol as the first
non-white-space character, specify CommentStyle
as
"%"
.
Example: "CommentStyle",["/*"]
Example: "CommentStyle",["/*","*/"]
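As an illustration of Delimiter and CommentStyle used together, the sketch below reads a semicolon-delimited file that contains percent-style comment lines. The file contents are invented for the example.

```matlab
% Made-up file: semicolon-delimited numbers with single-line comments.
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','% written by a hypothetical logger','1;2','% mid-file note','3;4');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2, ...
    'Delimiter',';', ...
    'CommentStyle','%', ...
    'DataLines',1);
opts = setvartype(opts,'double');

T = readtable(fname,opts);
disp(T)
delete(fname);
```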
ConsecutiveDelimitersRule
— Procedure to manage consecutive delimiters
"split"
| "join"
| "error"
Procedure to manage consecutive delimiters in a delimited text file, specified as one of the values in this table.
Value | Behavior |
---|---|
"split" | Split the consecutive delimiters into multiple fields. |
"join" | Join the delimiters into one delimiter. |
"error" | Return an error and cancel the import operation. |
LeadingDelimitersRule
— Procedure to manage leading delimiters
"keep"
| "ignore"
| "error"
Procedure to manage leading delimiters in a delimited text file, specified as one of the values in this table.
Value | Behavior |
---|---|
"keep" | Keep the delimiter. |
"ignore" | Ignore the delimiter. |
"error" | Return an error and cancel the import operation. |
TrailingDelimitersRule
— Procedure to manage trailing delimiters
'keep'
| 'ignore'
| 'error'
Procedure to manage trailing delimiters in a delimited text file, specified as one of the values in this table.
Trailing Delimiters Rule | Behavior |
---|---|
'keep' | Keep the delimiter. |
'ignore' | Ignore the delimiter. |
'error' | Return an error and abort the import operation. |
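The effect of the delimiter rules is easiest to see on a small file. The sketch below contrasts the "split" and "join" values of ConsecutiveDelimitersRule on made-up data in which every line contains two commas in a row.

```matlab
% Made-up file: each line has two consecutive commas.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','1,,3','4,,6');
fclose(fid);

% 'split': each delimiter starts a new field, so every line has three fields.
splitOpts = delimitedTextImportOptions('NumVariables',3);
splitOpts = setvartype(splitOpts,'double');
splitOpts.ConsecutiveDelimitersRule = 'split';
Tsplit = readtable(fname,splitOpts);

% 'join': the run of delimiters counts as one, so every line has two fields.
joinOpts = delimitedTextImportOptions('NumVariables',2);
joinOpts = setvartype(joinOpts,'double');
joinOpts.ConsecutiveDelimitersRule = 'join';
Tjoin = readtable(fname,joinOpts);

disp(Tsplit)
disp(Tjoin)
delete(fname);
```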
Encoding
— Character encoding scheme
''
| 'UTF-8'
| 'system'
| 'ISO-8859-1'
| 'windows-1251'
| 'windows-1252'
| ...
Character encoding scheme associated with the file, specified as the comma-separated
pair consisting of 'Encoding'
and 'system'
or a
standard character encoding scheme name.
When you do not specify any encoding, the function uses automatic character set detection to determine the encoding when reading the file.
Example: 'Encoding','system'
uses the system default
encoding.
Data Types: char
| string
Replacement Rules
MissingRule
— Procedure to manage missing data
'fill'
(default) | 'error'
| 'omitrow'
| 'omitvar'
Procedure to manage missing data, specified as one of the values in this table.
Missing Rule | Behavior |
---|---|
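'fill' | Replace missing data with the contents of the FillValue property. The FillValue property is specified in the VariableImportOptions object of the variable being imported. |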
'error' | Stop importing and display an error message showing the missing record and field. |
'omitrow' | Omit rows that contain missing data. |
'omitvar' | Omit variables that contain missing data. |
Example: opts.MissingRule = 'omitrow';
Data Types: char
| string
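The sketch below compares two MissingRule values on a made-up file with one empty field. The fill value of -1 is an arbitrary choice for the example.

```matlab
% Made-up file: the middle field of the second line is empty.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','1,2,3','4,,6','7,8,9');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',3);
opts = setvartype(opts,'double');

% Drop any row that contains missing data.
opts.MissingRule = 'omitrow';
Tomit = readtable(fname,opts);

% Keep the row and substitute the variable's FillValue instead.
opts.MissingRule = 'fill';
opts = setvaropts(opts,'Var2','FillValue',-1);
Tfill = readtable(fname,opts);
delete(fname);
```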
EmptyLineRule
— Procedure to handle empty lines
'skip'
| 'read'
| 'error'
Procedure to handle empty lines in the data, specified as 'skip'
, 'read'
,
or 'error'
. The importing function interprets white
space as empty.
Empty Line Rule | Behavior |
---|---|
'skip' | Skip the empty lines. |
'read' | Import the empty lines. The importing function parses the empty line using the values specified in VariableWidths, VariableOptions, MissingRule, and other relevant properties, such as Whitespace. |
'error' | Display an error message and abort the import operation. |
Example: opts.EmptyLineRule = 'skip';
Data Types: char
| string
ImportErrorRule
— Procedure to handle import errors
'fill'
(default) | 'error'
| 'omitrow'
| 'omitvar'
Procedure to handle import errors, specified as one of the values in this table.
Import Error Rule | Behavior |
---|---|
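'fill' | Replace the data where the error occurred with the contents of the FillValue property. The FillValue property is specified in the VariableImportOptions object of the variable being imported. |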
'error' | Stop importing and display an error message showing the error-causing record and field. |
'omitrow' | Omit rows where errors occur. |
'omitvar' | Omit variables where errors occur. |
Example: opts.ImportErrorRule = 'omitvar';
Data Types: char
| string
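An import error occurs, for example, when text cannot be converted to the numeric type declared for a variable. The sketch below uses a made-up file with one such field and compares the 'omitrow' and 'fill' rules.

```matlab
% Made-up file: 'oops' cannot be converted to a number.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','1,2','oops,4','5,6');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2);
opts = setvartype(opts,'double');

% Drop the row where the conversion fails.
opts.ImportErrorRule = 'omitrow';
Tomit = readtable(fname,opts);

% Keep the row and substitute the variable's FillValue (left at its default here).
opts.ImportErrorRule = 'fill';
Tfill = readtable(fname,opts);
delete(fname);
```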
ExtraColumnsRule
— Procedure to handle extra columns
'addvars'
| 'ignore'
| 'wrap'
| 'error'
Procedure to handle extra columns in the data, specified as one of the values in this table.
Extra Columns Rule | Behavior |
---|---|
'addvars' | To import extra columns, create new variables. |
'ignore' | Ignore the extra columns of data. |
'wrap' | Wrap the extra columns of data to new records. This action does not change the number of variables. |
'error' | Display an error message and abort the import operation. |
Data Types: char
| string
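To see the rule in action, the sketch below defines two variables for a made-up file that has three columns per line, then imports it with 'ignore' and with 'addvars'.

```matlab
% Made-up file: three columns, while the options define only two variables.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','1,2,3','4,5,6');
fclose(fid);

opts = delimitedTextImportOptions('NumVariables',2);
opts = setvartype(opts,'double');

% Discard the third column.
opts.ExtraColumnsRule = 'ignore';
Tignore = readtable(fname,opts);

% Import the third column as an additional variable.
opts.ExtraColumnsRule = 'addvars';
Tadd = readtable(fname,opts);

disp(Tadd.Properties.VariableNames)
delete(fname);
```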
Object Functions
getvaropts | Get variable import options |
setvaropts | Set variable import options |
setvartype | Set variable data types |
preview | Preview eight rows from file using import options |
Examples
Define Import Options for Variables in Delimited Text File
Define an import options object to read multiple variables from patients.dat
.
Based on the contents of your file, define these variable properties: names, types, delimiter character, data starting location, and the extra column rule.
varNames = {'LastName','Gender','Age','Location','Height','Weight','Smoker'};
varTypes = {'char','categorical','int32','char','double','double','logical'};
delimiter = ',';
dataStartLine = 2;
extraColRule = 'ignore';
Use the delimitedTextImportOptions
function and your variable information to initialize the import options object opts
.
opts = delimitedTextImportOptions('VariableNames',varNames,...
                                'VariableTypes',varTypes,...
                                'Delimiter',delimiter,...
                                'DataLines',dataStartLine,...
                                'ExtraColumnsRule',extraColRule);
Use the preview
function with the import options object to preview the data.
preview('patients.dat',opts)
ans=8×7 table
LastName Gender Age Location Height Weight Smoker
____________ ______ ___ _____________________________ ______ ______ ______
{'Smith' } Male 38 {'County General Hospital' } 71 176 false
{'Johnson' } Male 43 {'VA Hospital' } 69 163 false
{'Williams'} Female 38 {'St. Mary's Medical Center'} 64 131 false
{'Jones' } Female 40 {'VA Hospital' } 67 133 false
{'Brown' } Female 49 {'County General Hospital' } 64 119 false
{'Davis' } Female 46 {'St. Mary's Medical Center'} 68 142 false
{'Miller' } Female 33 {'VA Hospital' } 64 142 false
{'Wilson' } Male 40 {'VA Hospital' } 68 180 false
Import the data using readtable
.
T = readtable('patients.dat',opts);
whos T
  Name        Size            Bytes  Class    Attributes

  T         100x7             33987  table
Version History
Introduced in R2016b
R2018b: Create options object using delimitedTextImportOptions function
Use the delimitedTextImportOptions
function to create a
DelimitedTextImportOptions
object. Previously, you could
create this object only by using the detectImportOptions
function.