aa2int

Convert amino acid sequence from letter to integer representation

    Description

    SeqInt = aa2int(SeqChar) converts a character vector or string containing single-letter codes specifying an amino acid sequence to a row vector of integers specifying the same amino acid sequence. For valid letter codes, see Mapping Amino Acid Letter Codes to Integers.

    SeqInt = aa2int(SeqChar,Unknown=unknownAA) specifies the number used to represent an unknown amino acid.

    Examples

    Create a random amino acid sequence.

    SeqChar = randseq(20,Alphabet="amino")
    SeqChar = 
    'TYNYMRQLVVDVVITNHYSV'
    

    Convert the sequence from letter to integer representation.

    SeqInt = aa2int(SeqChar)
    SeqInt = 1x20 uint8 row vector
    
       17   19    3   19   13    2    6   11   20   20    4   20   20   10   17    3    9   19   16   20
    
    

    Input Arguments

    SeqChar: Amino acid sequence, specified as one of the following:

    • Character vector or string scalar containing single-letter codes specifying an amino acid sequence. For valid letter codes, see Mapping Amino Acid Letter Codes to Integers. Unknown characters are mapped to 0 by default. Integers are arbitrarily assigned to IUB/IUPAC letters.

    • MATLAB® structure containing a Sequence field that contains an amino acid sequence, such as the output returned by fastaread, getgenpept, genpeptread, getpdb, and pdbread.
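
    As an illustration of the structure form, the following minimal sketch builds a structure by hand rather than reading one from a file. The field values are hypothetical, and the sketch assumes that any structure with a Sequence field is accepted, as the description above states.

```matlab
% Hypothetical structure shaped like the output of fastaread.
S.Header = 'example peptide';   % illustrative only
S.Sequence = 'MKV';             % Methionine, Lysine, Valine

% Convert the Sequence field to its integer representation.
% Requires Bioinformatics Toolbox.
SeqInt = aa2int(S);
```

    According to the mapping table, M, K, and V correspond to 13, 12, and 20.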

    unknownAA: Number representing an unknown amino acid character, specified as a numeric scalar.
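
    A short sketch of how the Unknown option changes the result. The input sequence is hypothetical, and the value 26 is an assumption: it was chosen because it lies outside the 1 to 25 range used by the mapping table, and this page does not state which values are accepted.

```matlab
% Hypothetical input: '?' is not a valid amino acid letter code.
SeqChar = 'AR?N';

% Default behavior: the unrecognized character maps to 0.
defaultInt = aa2int(SeqChar);

% Assumed-valid custom value for unknown characters (outside 1-25).
customInt = aa2int(SeqChar, Unknown=26);
```

    With the default, the third element is 0. With the custom value, the third element should be 26, while the valid codes A, R, and N keep their table values 1, 2, and 3.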

    More About

    Mapping Amino Acid Letter Codes to Integers

    Amino Acid                                                      Code   Integer
    Unknown character (any character or symbol not in this table)   ?      0
    Alanine                                                         A      1
    Arginine                                                        R      2
    Asparagine                                                      N      3
    Aspartic acid (Aspartate)                                       D      4
    Cysteine                                                        C      5
    Glutamine                                                       Q      6
    Glutamic acid (Glutamate)                                       E      7
    Glycine                                                         G      8
    Histidine                                                       H      9
    Isoleucine                                                      I      10
    Leucine                                                         L      11
    Lysine                                                          K      12
    Methionine                                                      M      13
    Phenylalanine                                                   F      14
    Proline                                                         P      15
    Serine                                                          S      16
    Threonine                                                       T      17
    Tryptophan                                                      W      18
    Tyrosine                                                        Y      19
    Valine                                                          V      20
    Asparagine or Aspartic acid (Aspartate)                         B      21
    Glutamine or Glutamic acid (Glutamate)                          Z      22
    Unknown amino acid (any amino acid)                             X      23
    Translation stop                                                *      24
    Gap of indeterminate length                                     -      25
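
    The table can be read as a position lookup: the integer for each valid code is its position in the ordered list of codes, and any other character gets 0. The following minimal sketch mirrors the table in base MATLAB. It does not call aa2int and is not the toolbox implementation. The variable name codes is illustrative, and the sketch handles uppercase codes only.

```matlab
% Letter codes in table order, so position k holds the code for integer k.
codes = 'ARNDCQEGHILKMFPSTWYVBZX*-';

% Sequence taken from the example on this page.
SeqChar = 'TYNYMRQLVVDVVITNHYSV';

% ismember returns each character's position in codes, or 0 when the
% character is absent, which matches the "unknown character" row.
[~, SeqInt] = ismember(SeqChar, codes);
```

    For this sequence the sketch yields 17 19 3 19 13 2 6 11 20 20 4 20 20 10 17 3 9 19 16 20, the same integers as the aa2int output in the Examples section, although as double values rather than uint8.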

    Version History

    Introduced before R2006a
