MuSeqBox Tutorial

For tutorial purposes, MuSeqBox online uses a local BLAST output file called my_blastx_output, created by running a BLASTX search of the entries in the file query against mypept, a database of 28 protein sequences. By following the BLAST usage instructions below, users may produce their own BLAST outputs and save them as local files to use in place of the provided my_blastx_output.

BLAST USAGE:

1. Formatting and searching the mypept database using BLASTX:

makeblastdb -in mypept -dbtype prot -parse_seqids
blastx -db mypept -query query -out my_blastx_output -show_gis

Notes: The blastx command runs a BLASTX search of the entries in query against mypept and saves the results in my_blastx_output. The "-show_gis" option displays the NCBI identifiers (GIs) of the matching peptide sequences.

2. Searching the NCBI non-redundant nucleic acid database using the net-client BLASTN:

blastcl3 -p blastn -d nr -i query -e 1e-10 -o my_blastx_output -I

MUSEQBOX ONLINE USAGE:

MuSeqBox online offers a choice of input files. Users may either search the outputs provided on our server (e.g., maize EST BLASTX output, maize contig BLASTX output, and soybean contig BLASTX output) for queries of interest or supply their own BLAST outputs. If a BLAST search produces a large output file, we strongly suggest downloading the stand-alone version of MuSeqBox to post-process the BLAST results. The following tutorial shows how to use MuSeqBox online with the local file my_blastx_output provided in the MuSeqBox distribution package.

1. Creating default tabulated output in your browser from a BLAST input file:

  • Check the checkbox to the left of "Supply your own BLAST output file"
  • Click Browse to locate your local BLAST output, e.g., my_blastx_output
  • Click the submit button (see output derived from my_blastx_output)

    Notes: By default, MuSeqBox online uses the options "-n 3" and "-p 4" to create the output (for details, see the manual in the distributed package). These options correspond to the online parameters Display hits and Print format, respectively. Users can change them by choosing different values from the pull-down menus. For example, to retain only the top two BLAST hits for each query in condensed print format (pstyle=3), follow the first two steps above and then:

  • Click the Display hits pull-down menu and select the "Top 2 if any" option
  • Click the Print format pull-down menu and select the "Condensed" option
  • Click the submit button

    Note: Users may request that the MuSeqBox output be sent to their email address. To do so, check the checkbox to the left of "Send the (text format) output to this email address:" and fill in the blank with the email address.
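
The effect of the Display hits setting can be pictured with a small Python sketch. It is not MuSeqBox code: the function name top_hits and the (query, subject) pairs are hypothetical, and it assumes hits arrive in the order in which BLAST reported them, so that "top" simply means "first listed".

```python
def top_hits(hits, n=3):
    """Keep at most the first n subjects listed for each query.

    `hits` is an iterable of (query, subject) pairs, assumed to be in
    BLAST report order (an assumption of this sketch).
    """
    kept = {}
    for query, subject in hits:
        bucket = kept.setdefault(query, [])
        if len(bucket) < n:
            bucket.append(subject)
    return kept


# Hypothetical hit list: q1 has three hits, q2 has one.
example_hits = [("q1", "pA"), ("q1", "pB"), ("q1", "pC"), ("q2", "pD")]

# Analogous to choosing "Top 2 if any": q1 keeps pA and pB, q2 keeps pD.
print(top_hits(example_hits, n=2))
```

A query with fewer than n hits keeps whatever it has, which is one reading of the "if any" wording in the menu option.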

2. Selecting queries that satisfy user-specified criteria and sending the output to your browser:

  • Follow the steps in 1 to provide the basic online settings (i.e., Display hits and Print format) and to select a local BLAST output file for MuSeqBox
  • Check the checkbox to the left of "Select queries based on the following criteria:"
  • Check the checkboxes to the left of the criteria you want and fill in the blanks with the values. For example, to select the queries with a query sequence length larger than 600 and an expectation value less than 1e-10, first check the checkboxes to the left of QLen and Eval, and then fill in the blanks with 600 and 1e-10, respectively
  • Click the submit button
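
As a rough illustration of the QLen and Eval example above, the selection can be thought of as a filter over hit records. The sketch below is an assumption-laden stand-in, not the MuSeqBox implementation: the record fields (query, qlen, evalue), the function name select_queries, and the sample values are all hypothetical, and the comparisons are strict because the text says "larger than" and "less than".

```python
def select_queries(hits, min_qlen=600, max_evalue=1e-10):
    """Return the hits whose query length exceeds min_qlen and whose
    E-value is below max_evalue (strict comparisons, by assumption)."""
    return [
        hit for hit in hits
        if hit["qlen"] > min_qlen and hit["evalue"] < max_evalue
    ]


# Hypothetical records; the identifiers and values are made up.
sample_hits = [
    {"query": "est1", "qlen": 750, "evalue": 1e-40},  # passes both tests
    {"query": "est2", "qlen": 450, "evalue": 1e-40},  # query too short
    {"query": "est3", "qlen": 900, "evalue": 1e-5},   # E-value too high
]

selected = select_queries(sample_hits)
print([hit["query"] for hit in selected])
```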

3. Identification of potential retained introns in EST queries on the basis of matching peptide BLAST hits:

  • Follow the steps in 1 to provide the basic online settings (i.e., Display hits and Print format) and to select a local BLAST output file (e.g., my_blastx_output) for MuSeqBox
  • Check the checkbox to the left of "Select queries that represent potential alternatively spliced transcripts:"
  • Fill in the blank for the indel parameter with the minimum insertion segment size, for example, indel >=40 nt
  • Click the type pull-down menu and choose "Insertion in query relative to subject"
  • Click the submit button (see the output)

4. Identification of potential skipped exons in the EST queries on the basis of matching peptide BLAST hits:

  • Follow the first two steps in 3
  • Fill in the blank with the minimum deletion segment size, for example, indel=90 nt
  • Click the type pull-down menu and choose "Deletion in query relative to subject"
  • Click the submit button
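
One way to picture the insertion and deletion types used in 3 and 4 is to compare the gap between two consecutive HSPs in query coordinates (nucleotides) with the gap in subject coordinates (amino acids, times three). The Python sketch below is only an interpretation for illustration. It is not taken from the MuSeqBox source, the HSP type and the function indel_sizes are hypothetical, and it assumes forward-strand, colinear HSPs with 1-based inclusive coordinates.

```python
from typing import NamedTuple


class HSP(NamedTuple):
    q_start: int  # query start, nucleotides
    q_end: int    # query end, nucleotides
    s_start: int  # subject start, amino acids
    s_end: int    # subject end, amino acids


def indel_sizes(hsps):
    """For each pair of consecutive HSPs, return (insertion, deletion)
    in nucleotides, both relative to the subject peptide."""
    ordered = sorted(hsps, key=lambda h: h.q_start)
    sizes = []
    for prev, nxt in zip(ordered, ordered[1:]):
        q_gap = nxt.q_start - prev.q_end - 1        # unaligned query nt
        s_gap = 3 * (nxt.s_start - prev.s_end - 1)  # unaligned subject aa, as nt
        sizes.append((max(q_gap - s_gap, 0), max(s_gap - q_gap, 0)))
    return sizes


# Hypothetical HSP pairs.
# 90 extra nt in the query, none missing in the subject: a candidate retained intron.
insertion_case = [HSP(1, 300, 1, 100), HSP(391, 600, 101, 170)]
# 30 subject residues (90 nt) unmatched in the query: a candidate skipped exon.
deletion_case = [HSP(1, 300, 1, 100), HSP(301, 600, 131, 230)]

print(indel_sizes(insertion_case))
print(indel_sizes(deletion_case))
```

Under this reading, a threshold such as indel >=40 nt would be applied to the first value of each pair for the insertion type and to the second value for the deletion type.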

5. Identification of potential (near) full-length transcripts among EST queries on the basis of matching peptide BLAST hits:

  • Follow the steps in 1 to provide the basic online settings (i.e., Display hits and Print format) and to select a local BLAST output file for MuSeqBox
  • Check the checkbox to the left of "Select queries that potentially encode full-length coding sequences:"
  • Fill in the parameter blanks (click help for details on the parameters). For example, the option "-F 10 10 95.0 40.0" corresponds to d5p <=10, d3p <=10, scv >=95.0%, and qsc >=40.0%, respectively.
  • Click the submit button (see output)

    Note: This example selects those hits for which the matching peptide sequence has HSPs covering at least 95% of the peptide sequence, with the terminal HSPs starting from within the first 10 amino acids of the peptide and extending into the last 10 amino acids, respectively. Moreover, a query sequence coverage of at least 40% is also required. In the example, it is clear that the maize EST AW065755 encodes the entire homolog of the Arabidopsis 60S ribosomal protein L18A.
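
The four thresholds in this example combine as a simple conjunction, which the following Python sketch spells out. The parameter names d5p, d3p, scv, and qsc come from the text above. The function name is_full_length and the sample values are hypothetical, and the sketch takes the four quantities as given rather than guessing how MuSeqBox derives them from the HSPs.

```python
def is_full_length(d5p, d3p, scv, qsc,
                   max_d5p=10, max_d3p=10, min_scv=95.0, min_qsc=40.0):
    """Apply the thresholds of the "-F 10 10 95.0 40.0" example.

    d5p, d3p: distances (amino acids) of the terminal HSPs from the
              start and end of the peptide
    scv:      subject (peptide) coverage, in percent
    qsc:      query sequence coverage, in percent
    """
    return (d5p <= max_d5p and d3p <= max_d3p
            and scv >= min_scv and qsc >= min_qsc)


# Hypothetical values, not taken from my_blastx_output.
print(is_full_length(d5p=3, d3p=5, scv=98.2, qsc=55.0))   # all four tests pass
print(is_full_length(d5p=25, d3p=5, scv=98.2, qsc=55.0))  # 5' distance too large
```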
