int2nt
Convert nucleotide sequence from integer to letter representation
Syntax
SeqChar = int2nt(SeqInt)
SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...)
SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...)
SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...)
Input Arguments
SeqInt | Row vector of integers specifying a nucleotide sequence. For valid integers, see the table Mapping Nucleotide Integers to Letter Codes. Integers are arbitrarily assigned to IUB/IUPAC letters. |
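AlphabetValue | Character vector or string specifying a nucleotide alphabet. Choices are 'DNA' (default), which uses the symbols A, C, G, and T, or 'RNA', which uses the symbols A, C, G, and U. |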
UnknownValue | Character to represent unknown nucleotides, that is, 0 or integers ≥ 17. Choices are any character other than the nucleotide characters A, C, G, T, and U and the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, and V. Default is *. |
CaseValue | Character vector or string specifying the case of the output letters. Choices are 'upper' (default) or 'lower'. |
Output Arguments
SeqChar | Nucleotide sequence specified by a character vector of codes. |
Description
SeqChar = int2nt(SeqInt) converts SeqInt, a row vector of integers specifying a nucleotide sequence, to SeqChar, a character vector of codes specifying the same nucleotide sequence. For valid codes, see the table Mapping Nucleotide Integers to Letter Codes.
Mapping Nucleotide Integers to Letter Codes
Nucleotide | Integer | Code |
---|---|---|
Adenosine | 1 | A |
Cytidine | 2 | C |
Guanine | 3 | G |
Thymidine | 4 | T |
Uridine (if 'Alphabet' set to 'RNA') | 4 | U |
Purine (A or G) | 5 | R |
Pyrimidine (T or C) | 6 | Y |
Keto (G or T) | 7 | K |
Amino (A or C) | 8 | M |
Strong interaction (3 H bonds) (G or C) | 9 | S |
Weak interaction (2 H bonds) (A or T) | 10 | W |
Not A (C or G or T) | 11 | B |
Not C (A or G or T) | 12 | D |
Not G (A or C or T) | 13 | H |
Not T or U (A or C or G) | 14 | V |
Any nucleotide (A or C or G or T or U) | 15 | N |
Gap of indeterminate length | 16 | - |
Unknown (any integer not in table) | 0 or ≥ 17 | * (default) |
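The table above amounts to a positional lookup: integer k selects the k-th code in a 16-character list, and anything outside 1 through 16 becomes the unknown symbol. The following is a minimal sketch of that idea in plain MATLAB for the default DNA alphabet. It is not the toolbox implementation, and the variable names codes, seqInt, known, and seqChar are illustrative assumptions.

```matlab
% Letter codes in table order: index 1 is A, index 16 is the gap character.
codes = 'ACGTRYKMSWBDHVN-';

seqInt = [1 2 4 16 15 0 20];           % includes a gap, an N, and two unknowns

% Start with the default unknown symbol everywhere, then fill in known codes.
seqChar = repmat('*', size(seqInt));
known = seqInt >= 1 & seqInt <= 16;    % integers that appear in the table
seqChar(known) = codes(seqInt(known));

disp(seqChar)                          % displays ACT-N**
```

For real work, call int2nt itself; the sketch only illustrates how the integers line up with the letter codes.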
SeqChar = int2nt(SeqInt, ...'PropertyName', PropertyValue, ...) calls int2nt with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:
SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...) specifies a nucleotide alphabet. AlphabetValue can be 'DNA', which uses the symbols A, C, G, and T, or 'RNA', which uses the symbols A, C, G, and U. Default is 'DNA'.
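As a short illustration of the 'Alphabet' property, the sketch below converts the same integers with each alphabet. It assumes Bioinformatics Toolbox is installed so that int2nt is available; the variable names seqInt, dna, and rna are illustrative.

```matlab
% Requires Bioinformatics Toolbox for int2nt.
seqInt = [1 2 3 4];

dna = int2nt(seqInt)                      % default 'DNA' alphabet: integer 4 maps to T
rna = int2nt(seqInt, 'Alphabet', 'RNA')   % 'RNA' alphabet: integer 4 maps to U
```

According to the mapping table, dna should be ACGT and rna should be ACGU; only the letter for integer 4 differs between the two alphabets.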
SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...) specifies the character to represent unknown nucleotides, that is, 0 or integers ≥ 17. UnknownValue can be any character other than the nucleotide characters A, C, G, T, and U and the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, and V. Default is *.
SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...) specifies the case of the output letters. CaseValue can be 'upper' (default) or 'lower'.
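Because properties can be given in any order, several can be combined in one call. The sketch below, which assumes Bioinformatics Toolbox is installed, requests lowercase output and a custom unknown symbol together; the input values and the choice of ? as the unknown symbol are illustrative.

```matlab
% Requires Bioinformatics Toolbox for int2nt.
seqInt = [1 2 4 0 3];                  % the 0 is not in the mapping table

% Lowercase letters, with ? standing in for unknown integers.
s = int2nt(seqInt, 'Case', 'lower', 'Unknown', '?')
```

Based on the mapping table, the expected result is act?g: the letters are lowercased and the unknown integer 0 is shown as the chosen symbol.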
Examples
Convert a nucleotide sequence from integer to letter representation.
    s = int2nt([1 2 4 3 2 4 1 3 2])

    s =
    ACTGCTAGC
Convert a nucleotide sequence from integer to letter representation and define # as the symbol for unknown numbers 17 and greater.
    si = [1 2 4 20 2 4 40 3 2];
    s = int2nt(si, 'unknown', '#')

    s =
    ACT#CT#GC
Version History
Introduced before R2006a
See Also
aa2int | baselookup | int2aa | nt2int