BLAST-OER

prev toc next

Interpreting BLAST output

The NCBI BLAST output first reports a ‘search summary’ (click to open), which gives a summary of the parameter settings of the BLAST run. Then, for protein BLAST, an overview of recognised known protein domains is given, if any. Then, the hits are displayed in a graphical overview, where each hit has a colour code depending on its score. To save space, hits from different parts of the input sequence may be placed next to each other. These are not connected with a line. When connected with a thin line, hits represent multiple sub hits to the same sequence in the database.

Below the graphical overview, the hits are presented in a tabular format, indicating the name of the hit (‘Description’), the Max score, Total score, Query Cover, E value, Ident(ity), and Accession code. The columns have the following interpretation:

The expect value E of a score S is the number of alignments with scores greater than or equal to S that are expected to occur by chance in the specific database search. This is related to the probability that the alignment occurred by chance, given the formula p = 1 – e-E. For very small E-values, these can be interpreted as the probability that the hit occurred by random chance (for example, for E = 0.0001, the corresponding p = 0.0001), for high E-values it represents the number of hits to be expected by random chance (for example, for E = 10 one expects 10 hits with at least the same score by chance).

The score and E-value are not always easy to interpret as longer less exact matches and short exact matches will lead to similar values. It requires interpretation of the results (and of the biology) to determine the relevance of each hit. Also, hits may only be against known-protein domains that occur in many proteins, which are not necessarily related or homologous to the input protein.

Max and Total scores

Let’s take the following example:

tggcttagta gaagttgaaa aaggcgtttt gcctcaactt gaacagccct atgtgttcat
|||||||||| ||||||||||            |||||||||| |||||||||| ||||||||||
tggcttagta gaagttgaaa ggccccaaaa gcctcaactt gaacagccct atgtgttcat

Here, the max score is 15, while the total score is 25.

Only if the alignment consists of one continued alignment without gaps, then the max and total score is identical.

prev toc next