07/17/2013 - Sanger's HAVANA group announces rat genome annotation initiative

Manual Annotation, the Vega Genome Browser and the Rat Reference Gene Set:

Havana, the Sanger Institute's "human and vertebrate analysis and annotation" group, announces a proposal to produce a rat reference gene set with a depth of annotation similar to the GENCODE gene build (e.g. including long non-coding RNAs as well as protein-coding genes) [1]. The GENCODE gene set is the reference set for the ENCODE project which aims to annotate all evidence-based gene features on the human genome to a high level of accuracy. As the rat gene set matures and more transcriptomic data are incorporated, Havana will examine the possibility of producing a consensus gene set within the rat annotation community. The Havana group has considerable experience with community-based annotation [2] and recently manually annotated 1,369 immunity-related genes in the swine genome [3].

Towards this aim, Havana organised a Rat Annotation Jamboree at the Wellcome Trust Sanger Institute earlier this year. This meeting brought together community experts, RGD faculty, Havana manual annotators and Ensembl developers to discuss methods to improve rat gene annotation.

Havana's manually annotated gene sets are initially published in the Vega Genome Browser (http://vega.sanger.ac.uk/index.html) [4] before being incorporated into Ensembl. The delay between Havana making changes to annotation and those changes becoming publicly visible in Vega has in the past been considerable, particularly for species such as zebrafish and mouse, which are only updated 1-2 times a year. To address this issue, an automated pipeline has been developed that publishes changes to annotations far more rapidly than was previously possible. It now runs every two weeks on human and mouse, with the aim of moving to weekly releases and incorporating other species such as rat. Annotations identified by this pipeline are presented in Vega as a separate track (Havana Update) with a distinct colour scheme, while the genes and transcripts have the usual Gene and Transcript Views.


[1]  GENCODE: the reference human genome annotation for The ENCODE Project. Harrow J, et al. Genome Research. 2012;22(9):1760-74.  PUBMED: 22955987; PMC: 3431492; doi: 10.1101/gr.135350.111.

[2]  Community gene annotation in practice. Loveland JE, Gilbert JG, Griffiths E, Harrow JL.  Database (Oxford). 2012 Mar 20;2012:bas009.  PUBMED: 22434843; PMC: 3308165; doi: 10.1093/database/bas009.

[3]  Structural and functional annotation of the porcine immunome. Dawson HD, et al. BMC Genomics. 2013 14:332.  PUBMED: 23676093; PMC: 3658956; doi: 10.1186/1471-2164-14-332.

[4]  The vertebrate genome annotation (Vega) database. Wilming LG, Gilbert JGR, Howe K, Trevanion S, Hubbard T, Harrow JL.  Nucleic Acids Res. 2008 Jan; Advance Access published on November 14, 2007.  PUBMED: 18003653; PMC: 2238886; doi: 10.1093/nar/gkm987.


Based on input from the rat research community, the Havana group has selected, reviewed and annotated its first region of rat Rnor5.0: the region from 80 to 85 Mb on chromosome 13.  These annotations were released in September 2013. Click here to learn more!

To submit your genomic region of interest for review and annotation by the Havana group, click here to contact us.