BLAST and Its Types

Special kinds of BLASTs

In addition to the standard BLAST algorithms (BLASTn, BLASTp, BLASTx, tBLASTn, tBLASTx), there are several special kinds of BLASTs that have been developed to address specific needs in sequence analysis. Here are a few examples:

Characteristics of BLAST

BLAST (Basic Local Alignment Search Tool) possesses a number of essential characteristics that contribute to its efficacy and pervasive application in sequence analysis. Here are some distinguishing features of BLAST:

How BLAST Works

The BLAST algorithm is a heuristic program, which means it uses intelligent shortcuts to perform the search more quickly.
BLAST performs “local” alignments. Most proteins contain functional domains, and the same domain is frequently repeated within a single protein as well as across proteins from different species.
The BLAST algorithm is optimized to identify these domains or shorter sequence-similar segments. Local alignment also allows an mRNA to be aligned with a fragment of genomic DNA, which is frequently necessary for genome assembly and analysis.
If BLAST instead attempted to align two sequences along their entire lengths (known as a global alignment), fewer similarities would be detected, particularly in terms of domains and motifs.
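To make the local-versus-global distinction concrete, here is a minimal sketch that scores the same pair of sequences both ways. It uses textbook dynamic programming (Smith-Waterman for local, Needleman-Wunsch for global), not BLAST's heuristic, and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, score only, no traceback)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def global_score(a, b, match=2, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch): both sequences end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


# Two made-up sequences that share only the segment "GATTACA".
seq_a = "AAAAGATTACA"
seq_b = "GATTACACCCCCC"
print("local :", local_score(seq_a, seq_b))
print("global:", global_score(seq_a, seq_b))
```

The local score rewards the shared segment on its own, while the global score is dragged down by the unrelated flanking residues, which is why a shared domain is easier to detect with a local method.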
When a query is submitted through one of the BLAST Web pages, the sequence, along with any other input information such as the database to be searched, word size, expected value, etc., is supplied to the algorithm on the BLAST server.
BLAST operates by first creating a look-up table of all the “words” (brief subsequences, which for proteins have a default length of three letters) in the query sequence, together with their “neighboring words,” i.e., words that are similar to the query words and score above a threshold when aligned with them.
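The look-up table idea can be sketched in a few lines. This is only an illustration: the function name `build_lookup`, the toy scoring function (5 for an identical residue, -1 otherwise), and the threshold value are assumptions. Real BLAST scores protein words with a substitution matrix such as BLOSUM62, which is not reproduced here.

```python
from collections import defaultdict
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(x, y):
    # ASSUMPTION: toy scoring for illustration, not a real substitution matrix.
    return 5 if x == y else -1


def build_lookup(query, word_size=3, threshold=9, score=toy_score):
    """Map every word and neighboring word to the query positions it seeds."""
    table = defaultdict(list)
    for pos in range(len(query) - word_size + 1):
        word = query[pos:pos + word_size]
        # Try every possible word of this length; keep those scoring at or
        # above the threshold when aligned against the query word.
        for candidate in product(AMINO_ACIDS, repeat=word_size):
            total = sum(score(a, b) for a, b in zip(word, candidate))
            if total >= threshold:
                table["".join(candidate)].append(pos)
    return table


table = build_lookup("MKV")
print(len(table), "words in the table")
print("MKA" in table, "AAV" in table)
```

With this toy scoring, a threshold of 9 admits the query word itself and every word that differs from it at exactly one position. Raising the threshold shrinks the table and speeds up the search at the cost of sensitivity.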
The sequence database is then searched for these “hot spots.” When a match is found, it is used as a seed to generate gap-free and gapped extensions of the “word.” Directly searching GenBank flatfiles (or any subset of GenBank flatfiles) is not supported by BLAST.
Sequences are instead added to BLAST databases. Each entry is divided into two files, one containing only the header information and the other containing only the sequence information.
These are the data utilized by the algorithm. If BLAST is to be executed in “stand-alone” mode, the data file may contain local, private data, downloaded NCBI BLAST databases, or a combination of both.
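As a rough picture of that header/sequence split, the sketch below separates FASTA-formatted text into two parallel lists. It is a simplification under stated assumptions: real BLAST databases are binary files built by a formatting tool, and the function name `split_fasta` and the two example records are made up for illustration.

```python
def split_fasta(text):
    """Split FASTA text into parallel lists of headers and sequences."""
    headers, sequences = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record: store the header, start an empty sequence.
            headers.append(line[1:])
            sequences.append("")
        elif sequences:
            # Sequence lines may wrap, so append to the current record.
            sequences[-1] += line
    return headers, sequences


# Made-up records, for illustration only.
fasta = """>seq1 hypothetical protein
MKVLAA
GGT
>seq2 another hypothetical protein
MSTNPK
"""
headers, sequences = split_fasta(fasta)
for index, header in enumerate(headers):
    print(index, header, len(sequences[index]))
```

Keeping the two parts separate means the search only has to scan the sequence data, while the headers are looked up by index when results are reported.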
After the algorithm has searched for and maximally extended all possible “words” from the query sequence, it assembles the best alignment for each query–sequence pair and writes this information to a SeqAlign data structure. The SeqAlign structure does not contain sequence information; instead, it references the sequences in the BLAST database.
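The extension step and the idea of a result record that stores coordinates rather than residues can be sketched together. This is a minimal, gap-free extension with a simple drop-off rule. The `Hit` class is a hypothetical stand-in for a SeqAlign, and the scoring values (match 1, mismatch -3, drop-off 5) are assumptions for illustration, not the values any particular BLAST program uses.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for a SeqAlign: coordinates only, no residues."""
    subject_id: str
    query_start: int
    query_end: int      # exclusive
    subject_start: int
    subject_end: int    # exclusive
    score: int


def extend_ungapped(query, subject, q_pos, s_pos, word_size, subject_id,
                    match=1, mismatch=-3, x_drop=5):
    """Extend a word match in both directions without gaps."""
    def pair(a, b):
        return match if a == b else mismatch

    best = sum(pair(query[q_pos + k], subject[s_pos + k]) for k in range(word_size))

    # Extend to the right; stop once the score falls x_drop below the best seen.
    running = best
    q_end, s_end = q_pos + word_size, s_pos + word_size
    i, j = q_end, s_end
    while i < len(query) and j < len(subject):
        running += pair(query[i], subject[j])
        i += 1
        j += 1
        if running > best:
            best, q_end, s_end = running, i, j
        elif best - running > x_drop:
            break

    # Extend to the left from the best right-hand end point.
    running = best
    q_start, s_start = q_pos, s_pos
    i, j = q_pos - 1, s_pos - 1
    while i >= 0 and j >= 0:
        running += pair(query[i], subject[j])
        if running > best:
            best, q_start, s_start = running, i, j
        elif best - running > x_drop:
            break
        i -= 1
        j -= 1

    return Hit(subject_id, q_start, q_end, s_start, s_end, best)


query = "CCGATTACAGG"
subject = "AAGATTACATT"
hit = extend_ungapped(query, subject, q_pos=4, s_pos=4, word_size=3,
                      subject_id="subject_1")
print(hit)
# The record holds only positions; the residues are recovered on demand.
print(query[hit.query_start:hit.query_end])
```

Because the record holds only an identifier and coordinates, the aligned residues have to be fetched from the stored sequences whenever the hit is displayed.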
The BLAST Formatter, which resides on the BLAST server, can utilize the information in the SeqAlign to retrieve and display similar sequences in a variety of ways. Therefore, once a query has been executed, the results can be reformatted without rerunning the search. This is made feasible by the QBLAST system.
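The reformat-without-rerunning idea shows up in how results are requested from the NCBI BLAST web service: a search is identified by a request ID (RID), and the same RID can be fetched in different output formats. The sketch below only builds the request URLs and sends nothing. The helper name `result_url` and the RID value are placeholders; the `CMD`, `RID`, and `FORMAT_TYPE` parameter names follow the NCBI BLAST URL API as I understand it, so check them against the current documentation before relying on them.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def result_url(rid, format_type="Text"):
    """Build a URL that asks for the results of an existing search."""
    params = {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    return BLAST_URL + "?" + urlencode(params)


# Placeholder RID, for illustration only; a real one is returned when a
# search is submitted.
rid = "ABC123"
for fmt in ("Text", "XML", "Tabular"):
    print(result_url(rid, fmt))
```

All three URLs refer to the same completed search; only the requested presentation changes.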


DECODE The Script of LIFE
