bioinfo.blastplus.MakeDatabaseOptions

Specify options to make BLAST database

Since R2024a

Description

A MakeDatabaseOptions object contains options to create a BLAST+ database [1][2].

Creation

Description

optionsObj = bioinfo.blastplus.MakeDatabaseOptions creates a MakeDatabaseOptions object with default property values.

optionsObj = bioinfo.blastplus.MakeDatabaseOptions(Name=Value) sets properties using one or more name-value arguments. Name is the property name and Value is the property value. For example, set ParseSequenceIDs=true to parse bar-delimited sequence identifiers.
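The two creation syntaxes can be sketched as follows. This is a minimal illustration, assuming Bioinformatics Toolbox is installed; the title text is a made-up value, and the property names are the ones shown in the object display on this page.

```matlab
% Set two properties at construction time by using name-value arguments.
% The title string is illustrative only.
opts = bioinfo.blastplus.MakeDatabaseOptions(ParseSequenceIDs=true, ...
    Title="Example nucleotide database");

% Equivalent approach: start from the defaults, then assign properties.
opts2 = bioinfo.blastplus.MakeDatabaseOptions;
opts2.ParseSequenceIDs = true;
opts2.Title = "Example nucleotide database";
```

Both objects should end up with the same property values. The name-value form is convenient for one-off use, while property assignment is easier when options are built up conditionally.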

Properties

ExtraCommand

Additional commands, specified as a character vector or string scalar.

The commands must be in the native syntax (prefixed by one dash). Use this option to apply undocumented flags and flags without corresponding MATLAB® properties.

Example: "-lcase_masking"

Data Types: char | string
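As a hedged sketch, the following sets the additional-commands property (shown as ExtraCommand in the object display on this page) and inspects the translated options with getCommand. The flag -hash_index is a native makeblastdb option chosen here only for illustration; whether it is useful for your database is an assumption to check against the BLAST+ manual.

```matlab
opts = bioinfo.blastplus.MakeDatabaseOptions;

% Pass a native flag that has no corresponding MATLAB property.
% "-hash_index" is an illustrative choice, not a recommendation.
opts.ExtraCommand = "-hash_index";

% Translate the options to the native syntax to see what will be passed on.
cmd = getCommand(opts);
disp(cmd)
```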

IncludeAll

Flag to include all object properties with their corresponding default values when converting to the original option syntax, specified as a numeric or logical 1 (true) or 0 (false). You can convert properties to the original syntax prefixed by a dash (such as -dbtype nucl) by using the getCommand function.

When IncludeAll=false (default), getCommand(optionsObj) converts only the properties that you have set. When IncludeAll=true, getCommand converts all available properties to the original syntax, using default values for the properties that you have not set.

Note

If you set IncludeAll to true, the software translates all available properties, using default values for unspecified properties. The one exception is a property whose default value is NaN, Inf, [], '', or "": the software does not translate such a property.

Example: true

Data Types: logical
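The effect of this flag can be seen by calling getCommand twice on the same object. The sketch below assumes Bioinformatics Toolbox is installed and uses a made-up title; the exact text that getCommand returns is not reproduced here because it depends on the release.

```matlab
opts = bioinfo.blastplus.MakeDatabaseOptions(Title="Example DB");

% Flag off (the default): only the properties you set are converted.
opts.IncludeAll = false;
cmdSetOnly = getCommand(opts);

% Flag on: unset properties are also converted, using their defaults,
% except those whose default is NaN, Inf, [], '', or "".
opts.IncludeAll = true;
cmdAll = getCommand(opts);

disp(cmdSetOnly)
disp(cmdAll)
```

Comparing the two results is a quick way to check which options will be passed to the underlying program explicitly rather than left to its own defaults.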

InputType

Input file type, specified as one of the following:

  • "fasta" — FASTA format

  • "blastdb" — BLAST database format

  • "asn1_txt" — Seq-entries in the text ASN.1 format

  • "asn1_bin" — Seq-entries in the binary ASN.1 format

Data Types: char | string

ParseSequenceIDs

Flag to parse bar-delimited sequence identifiers, such as gi|129295, in a FASTA input, specified as a numeric or logical 1 (true) or 0 (false).

When the reference file is a FASTA file and ParseSequenceIDs=true, BLAST+ extracts database identifiers from the sequence IDs in the FASTA file and saves them in the created database. These identifiers are useful for filtering or limiting search results, for instance, by taxonomy. For details, see the BLAST Command Line Applications User Manual. When ParseSequenceIDs=false (default), BLAST+ treats each sequence ID in the file only as a unique identifier for that sequence.

Data Types: double | logical

Title

Title for the BLAST database, specified as a character vector or string scalar.

Data Types: char | string

Version

This property is read-only.

Supported version of the original makeblastdb software, specified as a string scalar.

Data Types: string

Object Functions

getCommand         Translate object properties to original options syntax
getOptionsTable    Return table with all properties and equivalent options in original syntax
reset              Reset BLAST database options to default values
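A brief sketch of getOptionsTable, assuming Bioinformatics Toolbox is installed. The title is a made-up value, and the table's column names are not listed here because they are not stated on this page.

```matlab
opts = bioinfo.blastplus.MakeDatabaseOptions(Title="Example DB");

% Return a table that maps each property to its equivalent native option.
tbl = getOptionsTable(opts);
disp(tbl)
```

The table is a convenient lookup when you know a native makeblastdb flag and want to find the matching MATLAB property, or the reverse.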

Examples

Download paired-end sequencing data in the FASTA format using the run accession number SRR26273031.

databaseFasta = srafasterqdump("SRR26273031",FastaOutput=true)

Create a local nucleotide database using the downloaded FASTA file. Specify "SRR26273031_nucl_db" as the base name of the output database. When creating the database, the function also generates multiple index files with the same base name. The blastplus function uses these index files automatically when you search the database later in this example.

blastplusdatabase("nucleotide","SRR26273031.fasta","SRR26273031_nucl_db");

You can also specify additional database creation options using a MakeDatabaseOptions object. For instance, specify the title of the database.

dbopts = bioinfo.blastplus.MakeDatabaseOptions;
dbopts.Title = "SRR26273031_Nucleotide_DB"
dbopts = 
  MakeDatabaseOptions with properties:

   Default properties:
        ExtraCommand: ""
          IncludeAll: 0
           InputType: "fasta"
    ParseSequenceIDs: 0
             Version: "2.14.0"

   Modified properties:
               Title: "SRR26273031_Nucleotide_DB"

You can then use the options object to make the database.

blastplusdatabase("nucleotide","SRR26273031.fasta","SRR26273031_nucl_db",dbopts);

Alternatively, you can specify options, such as the title of the database, by using name-value arguments. For example:

blastplusdatabase("nucleotide","SRR26273031.fasta","SRR26273031_nucl_db",Title="SRR26273031_Nucleotide_DB");

To reset the property values to their default values, use the reset function.

dopts2 = reset(dbopts)
dopts2 = 
  MakeDatabaseOptions with properties:

   Default properties:
        ExtraCommand: ""
          IncludeAll: 0
           InputType: "fasta"
    ParseSequenceIDs: 0
               Title: [1×0 string]
             Version: "2.14.0"

   Modified properties:
    No properties.

Search the database using the FASTA file queryFile.fasta, which contains two nucleotide query sequences. This file is provided with the toolbox. Use the blastn query program, which lets you search nucleotide queries against a nucleotide database. Specify "search1" as the name of the output report file. By default, the report file uses the traditional BLAST pairwise format, which presents each query-subject pair alignment in detail.

blastplus("blastn","queryFile.fasta","SRR26273031_nucl_db","search1");

Open the file to review the search results. The first query sequence returns no hits, while the second query sequence returns multiple hits.

open search1;

You can also modify search options by creating a corresponding options object for the blastn query program. Use blastplusoptions or bioinfo.blastplus.*Options to create the options object. For instance, change the report format to an XML format.

bnopts = blastplusoptions("blastn"); % Or use bioinfo.blastplus.BLASTNOptions
bnopts.ReportFormat = "BLASTXML";
blastplus("blastn","queryFile.fasta","SRR26273031_nucl_db","search2_xml",bnopts);
open search2_xml;

Alternatively, you can set the value of a property of the options object, such as ReportFormat, using name-value argument syntax. For example:

blastplus("blastn","queryFile.fasta","SRR26273031_nucl_db","search2_xml",ReportFormat="BLASTXML");

You can use other query programs to search the database. For instance, use tblastx to search translated nucleotide queries against a translated nucleotide database. Both query sequences return hits for this search. Use the compact tabular format for the report. For details about the generated columns and other report formats, see ReportFormat.

blastplus("tblastx","queryFile.fasta","SRR26273031_nucl_db","search3_tab",ReportFormat="Tabular");
open search3_tab;

Delete the reports and downloaded FASTA file.

delete search1 search2_xml search3_tab SRR26273031.fasta

References

[1] Camacho, Christiam, George Coulouris, Vahram Avagyan, Ning Ma, Jason Papadopoulos, Kevin Bealer, and Thomas L. Madden. “BLAST+: Architecture and Applications.” BMC Bioinformatics 10, no. 1 (December 2009): 421. https://doi.org/10.1186/1471-2105-10-421.

[2] “BLAST: Basic Local Alignment Search Tool.” https://blast.ncbi.nlm.nih.gov/Blast.cgi.

Version History

Introduced in R2024a