blastread
Read data from NCBI BLAST report file
Description
blastdata = blastread(blastreport) reads the NCBI BLAST report data from an XML-formatted file, blastreport, and returns blastdata, a structure containing the corresponding BLAST data.
Examples
Perform BLAST search
Perform a BLAST search on a protein sequence and save the results to an XML file.
Get a sequence from the Protein Data Bank and create a MATLAB structure.
S = getpdb('1CIV');
Use the structure as input for the BLAST search with a significance threshold of 1e-10. The first output is the request ID, and the second output is the estimated time (in minutes) until the search is completed.
[RID1,ROTE] = blastncbi(S,'blastp','expect',1e-10);
Get the search results from the report. You can save the XML-formatted report to a file for offline access. Use ROTE as the wait time to retrieve the results.
report1 = getblast(RID1,'WaitTime',ROTE,'ToFile','1CIV_report.xml')
Blast results are not available yet. Please wait ...
report1 = struct with fields:
                RID: 'R49TJMCF014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
Use blastread to read BLAST data from the XML-formatted BLAST report file.
blastdata = blastread('1CIV_report.xml')
blastdata = struct with fields:
                RID: ''
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
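Once the report is loaded, the hits can be inspected directly from the returned structure. The following is a minimal sketch that continues the example and assumes blastdata is still in the workspace; the limit of five hits is an arbitrary choice for display, not part of blastread.

```matlab
% Continues the example above: blastdata is the structure returned by
% blastread('1CIV_report.xml').
nHits = numel(blastdata.Hits);
fprintf('Number of hits: %d\n', nHits);

% Print the first few hits; showing at most five is an arbitrary choice.
for k = 1:min(5, nHits)
    hit = blastdata.Hits(k);
    % num2str leaves character input unchanged, so this line works whether
    % Length is stored as a number or as text.
    fprintf('%s (length %s): %s\n', hit.Accession, num2str(hit.Length), hit.Definition);
end
```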
Alternatively, run the BLAST search with an NCBI accession number.
RID2 = blastncbi('AAA59174','blastp','expect',1e-10)
RID2 = 'R49WAPMH014'
Get the search results from the report.
report2 = getblast(RID2)
Blast results are not available yet. Please wait ...
report2 = struct with fields:
                RID: 'R49WAPMH014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'AAA59174.1'
    QueryDefinition: 'insulin receptor precursor [Homo sapiens]'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
Input Arguments
blastreport
— Name of BLAST report file
character vector | string
Name of an XML-formatted BLAST report file, specified as a character vector or string.
Example: 'blastreport.xml'
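Because blastread expects the name of an existing report file, it can help to confirm that the file is present before reading it. This is only a sketch: the file name is the placeholder from the example above, and isfile assumes MATLAB R2017b or later.

```matlab
% Placeholder file name taken from the example above; replace it with the
% path to your own XML-formatted BLAST report.
reportFile = 'blastreport.xml';

if isfile(reportFile)   % isfile is available in R2017b and later
    blastdata = blastread(reportFile);
else
    warning('BLAST report file "%s" was not found.', reportFile);
end
```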
Output Arguments
blastdata
— BLAST report data
structure
BLAST report data, returned as a structure that contains the following fields:
Field | Description |
---|---|
RID | Request ID for retrieving results from a specific NCBI BLAST search |
Algorithm | NCBI algorithm used to perform the BLAST search |
Database | All databases searched |
QueryID | Identifier of the query sequence |
QueryDefinition | Definition of the query sequence |
Hits | Structure containing information on the hit sequences, such as IDs, accession numbers, lengths, and HSPs (high-scoring segment pairs) |
Parameters | Structure containing information on the input parameters used to perform the search |
Statistics | Summary of statistical details about the performed search, such as lambda, kappa, and entropy values |
More About
Hits
This table lists each field of blastdata.Hits.
Field | Description |
---|---|
ID | ID of the subject sequence that matched the query sequence |
Definition | Description of the subject sequence |
Accession | Accession of the subject sequence |
Length | Length of the subject sequence |
Hsps | Structure containing information on the high-scoring segment pairs (HSPs) |
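Since Hits is a structure array, its fields can be gathered across all hits with standard MATLAB indexing. The sketch below uses a small hypothetical stand-in for blastdata.Hits so that it runs without a report file; every value in it is a placeholder, not real BLAST output, and Length is assumed to be numeric.

```matlab
% Hypothetical stand-in for blastdata.Hits (placeholder values only).
% With a real report, use: hits = blastdata.Hits;
hsp  = struct('Score', 100, 'Expect', 1e-20);
hits = struct('ID', {'id1', 'id2'}, ...
              'Definition', {'placeholder A', 'placeholder B'}, ...
              'Accession', {'ACC0001', 'ACC0002'}, ...
              'Length', {120, 95}, ...
              'Hsps', {hsp, [hsp hsp]});

accessions = {hits.Accession};                % cell array of accessions
lengths    = [hits.Length];                   % assumes Length is numeric
nHsps      = arrayfun(@(h) numel(h.Hsps), hits);   % HSP count per hit

for k = 1:numel(hits)
    fprintf('%s: length %d, %d HSP(s)\n', accessions{k}, lengths(k), nHsps(k));
end
```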
Hits.Hsps
This table summarizes the fields of Hits.Hsps.
Field | Description |
---|---|
Score | Pairwise alignment score for a high-scoring segment pair between the query sequence and a subject sequence. |
BitScore | Bit score for a high-scoring segment pair. |
Expect | Expectation value for a high-scoring segment pair. |
Identities | Number of identical residues for a high-scoring segment pair between the query sequence and a subject sequence. |
Positives | Number of identical or similar residues for a high-scoring segment pair between the query sequence and a subject amino acid sequence. This field applies only to translated nucleotide or amino acid query sequences and databases. |
Gaps | Nonaligned residues for a high-scoring segment pair. |
AlignmentLength | Length of the alignment for a high-scoring segment pair. |
QueryIndices | Indices of the query sequence residue positions for a high-scoring segment pair. |
SubjectIndices | Indices of the subject sequence residue positions for a high-scoring segment pair. |
Frame | Reading frame of the translated nucleotide sequence for a high-scoring segment pair. |
Alignment | 3-by-N character array showing the alignment for a high-scoring segment pair between the query sequence and a subject sequence. The first row is the query sequence, the second row is the alignment, and the third row is the subject sequence. |
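The HSP fields can be used to filter hits and view alignments. The following sketch keeps hits whose best HSP has an Expect value at or below a cutoff and displays one alignment. It uses a hypothetical stand-in for blastdata.Hits with placeholder values, assumes Expect is numeric and that every hit has at least one HSP, and the cutoff of 1e-50 is an arbitrary choice.

```matlab
% Hypothetical stand-in for blastdata.Hits (placeholder values only).
% With a real report, use: hits = blastdata.Hits;
hspA = struct('Expect', 1e-80, 'Alignment', ['ACDE'; 'ACDE'; 'ACDE']);
hspB = struct('Expect', 1e-12, 'Alignment', ['ACDE'; 'AC E'; 'ACNE']);
hspC = struct('Expect', 1e-15, 'Alignment', ['GHIK'; 'GH K'; 'GHLK']);
hits = struct('Accession', {'ACC0001', 'ACC0002'}, ...
              'Hsps', {[hspA hspB], hspC});

threshold = 1e-50;   % arbitrary cutoff chosen for illustration

% Smallest Expect value among the HSPs of each hit.
bestExpect = arrayfun(@(h) min([h.Hsps.Expect]), hits);
strongHits = hits(bestExpect <= threshold);

% Alignment is a 3-by-N character array: query row, alignment row,
% subject row.
fprintf('Best-scoring alignment for %s:\n', strongHits(1).Accession);
disp(strongHits(1).Hsps(1).Alignment)
```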
Version History
Introduced before R2006a