Example 1: Single Annotation with Evidence (the simple life)

perl xGAEVAL.pl --clean_db --GFF examples/example1.gff3

Description:

This example illustrates the simplest and most frequently encountered use of GAEVAL. The example1.gff3 file encodes a single gene annotation (the Arabidopsis thaliana locus At3g20090.1). In GFF3 terms, this comprises the gene, mRNA, exon, and CDS lines describing the structure and properties of this gene-encoding transcript. The file also contains match lines describing the alignments of three cDNA sequences (gi numbers: 42463391, 42460729, and 17381119) and two EST sequences (gi numbers: 19877306 and 59927596).
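To check which feature types a GFF3 file contains before running GAEVAL, something like the following Python sketch can help. It relies only on the general GFF3 layout (nine tab-separated columns, feature type in column 3). The sample lines are illustrative: the source name, strand, and ID/Parent values are assumptions, not copied from example1.gff3.

```python
from collections import Counter

def count_feature_types(gff3_lines):
    """Count GFF3 feature types (column 3), skipping comments and blank lines."""
    counts = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[columns[2]] += 1
    return counts

# Hypothetical sample lines; only the coordinates come from the report in this example.
SAMPLE = [
    "##gff-version 3",
    "3\texample\tgene\t7015803\t7018366\t.\t+\t.\tID=gene1",
    "3\texample\tmRNA\t7015803\t7018366\t.\t+\t.\tID=mrna1;Parent=gene1",
    "3\texample\texon\t7015803\t7015913\t.\t+\t.\tParent=mrna1",
    "3\texample\texon\t7016352\t7016407\t.\t+\t.\tParent=mrna1",
]

print(count_feature_types(SAMPLE))
```

The function accepts any iterable of lines, so an open file handle for examples/example1.gff3 can be passed in place of SAMPLE.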

GAEVAL Results:

Annotation Summary

1: Annotation: At3g20090.1
2:  Genomic Source: 3
3:  Structure: join(7015803..7015913,7016352..7016407,7016733..7017561,7017643..7018366)
4:  Open Reading Frame: 7017052 to 7018293   
5:  5` UTR length:  484
6:  CDS length:     1159
7:  3` UTR length:  73
8:  Total length:   1716
    

This section is included in the basic report for each individual annotation. It gives the annotation's genomic source, exon structure, open reading frame, and the lengths of the 5` UTR, CDS, and 3` UTR.
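The reported lengths can be reproduced from the Structure and Open Reading Frame lines. The Python sketch below is an observation about the numbers in this report, not a description of GAEVAL's internals: it assumes a forward-strand annotation and measures each segment as end minus start (without adding 1), which is what matches the figures shown. The function names are hypothetical.

```python
# Exon coordinates and ORF bounds copied from the Annotation Summary above.
EXONS = [
    (7015803, 7015913),
    (7016352, 7016407),
    (7016733, 7017561),
    (7017643, 7018366),
]
ORF = (7017052, 7018293)

def clipped_span(exon, lo, hi):
    """Length (end - start) of the part of an exon lying between lo and hi."""
    start = max(exon[0], lo)
    end = min(exon[1], hi)
    return max(0, end - start)

def region_lengths(exons, orf):
    """Split the transcript into 5' UTR, CDS, and 3' UTR (forward strand assumed)."""
    first, last = exons[0][0], exons[-1][1]
    return {
        "utr5": sum(clipped_span(e, first, orf[0]) for e in exons),
        "cds": sum(clipped_span(e, orf[0], orf[1]) for e in exons),
        "utr3": sum(clipped_span(e, orf[1], last) for e in exons),
        "total": sum(end - start for start, end in exons),
    }

print(region_lengths(EXONS, ORF))
# {'utr5': 484, 'cds': 1159, 'utr3': 73, 'total': 1716}
```

Counting inclusive base positions (end - start + 1) would give slightly larger values, so the convention matters when comparing these numbers with other tools.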

Structure Analysis

 1:  Integrity Score (0-1): 0.99
 2:  Exon Sequence Coverage: 100%
 3:  5` Terminus
 4:   Evidence supports the extension of this annotation boundary by 88 bases
 5:  3` Terminus
 6:   Evidence supports the extension of this annotation boundary by 2382 bases
 7:  Introns (total|confirmed|unsupported): 3 | 3 | 0
 8:  Individual Intron Support:
 9:   Intron 1 ( 7015914..7016351 )
10:    Supporting Alignments: 1
11:   Intron 2 ( 7016408..7016732 )
12:    Supporting Alignments: 1
13:   Intron 3 ( 7017562..7017642 )
14:    Supporting Alignments: 3
    

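This section is also included in the basic report for each individual annotation. It reports the integrity score, exon sequence coverage, evidence-supported extensions of the 5` and 3` termini, and the alignment support for each annotated intron.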
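The intron coordinates in this section follow directly from the exon structure: each intron spans the gap between two consecutive exons. A minimal Python sketch of that relationship and of the total | confirmed | unsupported tally is given below. It assumes an intron counts as confirmed when it has at least one supporting alignment, and the function names are hypothetical.

```python
# Exon coordinates from the Structure line of the Annotation Summary.
EXONS = [
    (7015803, 7015913),
    (7016352, 7016407),
    (7016733, 7017561),
    (7017643, 7018366),
]

def introns_from_exons(exons):
    """Return the gaps between consecutive exons as (start, end) intron coordinates."""
    return [
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(exons, exons[1:])
    ]

def intron_tally(support_counts):
    """Return (total, confirmed, unsupported); confirmed means >= 1 supporting alignment."""
    total = len(support_counts)
    confirmed = sum(1 for n in support_counts if n > 0)
    return total, confirmed, total - confirmed

print(introns_from_exons(EXONS))
print(intron_tally([1, 1, 3]))  # supporting-alignment counts from the report above
```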

Incongruency Analysis

 1: Incongruency Analysis:
 2:  No Ambiguously Overlapping Annotations Detected

 3:  Incongruent Introns Detected:
 4:   7016408..7016732
 5:     Conflicting Intron
 6:      Conflicting evidence: 1
 7:      Supporting evidence: 1
 8:   7018300..7018455
 9:     Alternative Intron
10:      Supporting evidence: 1
11:   7018469..7018493
12:     Additional Intron
13:      Supporting evidence: 1
14:   7018584..7018620
15:     Additional Intron
16:      Supporting evidence: 1
17:   7018701..7019102
18:     Additional Intron
19:      Supporting evidence: 1
20:   7019938..7020031
21:     Additional Intron
22:      Supporting evidence: 1

23:  No Complex Transcript Processing Detected
    

This section describes incongruencies found between the annotation and the supplied evidence alignments.
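When many annotations are evaluated, it can be convenient to pull the incongruent introns out of the report text. The Python sketch below parses the coordinate and category lines in the layout shown above. It is an assumption that the layout is stable across reports; the regular expressions and function name are illustrative, not part of GAEVAL.

```python
import re
from collections import Counter

# Coordinate and category lines copied from the Incongruency Analysis above.
REPORT = """\
 4:   7016408..7016732
 5:     Conflicting Intron
 8:   7018300..7018455
 9:     Alternative Intron
11:   7018469..7018493
12:     Additional Intron
14:   7018584..7018620
15:     Additional Intron
17:   7018701..7019102
18:     Additional Intron
20:   7019938..7020031
21:     Additional Intron
"""

COORD = re.compile(r"(\d+)\.\.(\d+)\s*$")
CATEGORY = re.compile(r"(\w+) Intron\s*$")

def parse_incongruent_introns(text):
    """Pair each coordinate line with the category line that follows it."""
    records, current = [], None
    for line in text.splitlines():
        coord = COORD.search(line)
        if coord:
            current = (int(coord.group(1)), int(coord.group(2)))
            continue
        category = CATEGORY.search(line)
        if category and current is not None:
            records.append((current[0], current[1], category.group(1)))
            current = None
    return records

records = parse_incongruent_introns(REPORT)
print(Counter(category for _, _, category in records))
```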

Discussion:

This example shows how GAEVAL can be used to quickly evaluate the evidence for an annotation structure. In this scenario, the At3g20090.1 locus is 100% covered by cDNA/EST alignments, and all three of its annotated introns are confirmed by at least one alignment. Together with the acceptable lengths of its 5` and 3` UTRs (484 and 73 bases), this gives the annotation an exceptional integrity score of 0.99. However, the GAEVAL report also details possible changes and additions to the annotation: support for extending the 5` and 3` annotation boundaries, and evidence for possible alternatively spliced isoforms or additional introns.
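To see how the suggested boundary extensions relate to the additional introns, the arithmetic can be sketched in Python as follows. This assumes a forward-strand annotation and that "extension by N bases" means moving the boundary outward by N positions. Both are assumptions made for illustration, and the names are hypothetical.

```python
# Annotated transcript span and extension sizes taken from the report above.
ANNOTATED_SPAN = (7015803, 7018366)
EXTEND_5P, EXTEND_3P = 88, 2382

# "Additional Intron" coordinates from the Incongruency Analysis.
ADDITIONAL_INTRONS = [
    (7018469, 7018493),
    (7018584, 7018620),
    (7018701, 7019102),
    (7019938, 7020031),
]

def extended_span(span, extend_5p, extend_3p):
    """Move both boundaries outward (forward strand assumed)."""
    return span[0] - extend_5p, span[1] + extend_3p

def within_3p_extension(intron, span, extended):
    """True if the intron lies past the annotated 3' end but inside the extended span."""
    return intron[0] > span[1] and intron[1] <= extended[1]

extended = extended_span(ANNOTATED_SPAN, EXTEND_5P, EXTEND_3P)
print(extended)
print(all(within_3p_extension(i, ANNOTATED_SPAN, extended) for i in ADDITIONAL_INTRONS))
```

Under these assumptions, all four additional introns lie in the region covered by the suggested 3` extension, which is consistent with the evidence alignments continuing past the annotated 3` boundary.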