Coffee Genome Project

In Colombia, the fundamental goals of the Coffee Genome Project are to learn where genes of interest (cup quality, resistance to diseases and pests, yield, etc.) are located on the chromosomes, to determine their sequences (structural genomics) and to understand their function (functional genomics). CENICAFE has developed this initiative in collaboration with Cornell University, the University of Maryland and, more recently, The Institute for Genomic Research (TIGR), with the financial support of the Colombian Ministry of Agriculture.

One of the main targets of the project is the development of a coffee variety with improved genetic resistance to the coffee berry borer, Hypothenemus hampei, the major coffee pest in Colombia. As part of the project we are generating genomic and proteomic data on the interaction between coffee and H. hampei and on the interaction between H. hampei and its biological control agent, Beauveria bassiana.

Coffee Sequencing and Bioinformatics

We have implemented a web-based bioinformatics platform that functions as a genomics information resource for coffee and other organisms studied at CENICAFE.
The platform includes a Laboratory Integrated Management System (LIMS), an installation of wEMBOSS, in-house Perl tools for data analysis, InterProScan for the annotation of sequence domains, and wBLAST and wNetBLAST, among other tools. The backbone of the system is an adaptation of the SOL Genomics Network (SGN) databases, developed at Cornell University, for the storage and analysis of ESTs, molecular markers and BAC sequences. The system is built on the PostgreSQL relational database, Perl scripts for data manipulation, and the Apache web server with the integrated mod_perl Perl interpreter; the servers run the Debian distribution of the GNU/Linux operating system. Although SGN was developed mainly as a plant genomics resource, the Cenicafe platform has added several new tools and databases for the analysis of sequence data from other organisms, such as fungi and insects.

The Cenicafe databases currently contain over 35,000 coffee EST sequences, around 6,000 Beauveria bassiana EST sequences and more than 4,000 Hypothenemus hampei (coffee berry borer) EST sequences. The sequences are annotated by comparison with Solanaceae, Arabidopsis, Swiss-Prot and GenBank sequences using BLAST homology searches; amino acid sequences are predicted using ESTScan, domains are annotated using InterProScan, and gene families are annotated using a Perl script developed at SGN.
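As an illustration of the BLAST-based annotation step, the sketch below picks the best hit for each EST from BLAST tabular output (the standard 12-column format: query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). It is not the Cenicafe or SGN pipeline, which is written in Perl; the function name best_hits, the e-value cutoff of 1e-5 and the sequence identifiers in the example are all assumptions made for this example.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    query: str      # EST identifier (column 1)
    subject: str    # matched database sequence (column 2)
    evalue: float   # expectation value (column 11)
    bitscore: float # bit score (column 12)


def best_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Return the highest-scoring hit per query from BLAST tabular lines.

    The cutoff max_evalue is an assumed value, not one taken from the project.
    """
    best: Dict[str, Hit] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete 12-column tabular record
        hit = Hit(fields[0], fields[1], float(fields[10]), float(fields[11]))
        if hit.evalue > max_evalue:
            continue  # too weak to use for annotation
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


# Hypothetical records; the identifiers are invented for this example.
example = [
    "EST0001\tAT1G01010\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180.0",
    "EST0001\tSGN-U1\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-60\t210.0",
    "EST0002\tP12345\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t30.1",
]
annotations = best_hits(example)
```

Here EST0001 keeps its higher-scoring hit and EST0002 is left unannotated because its only hit is above the cutoff. A real pipeline would also record which database each hit came from.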
In the near future the system will add a database of coffee genetic resources developed at Cenicafe, a proteomics platform, and a microarray database. We will also incorporate other components into the platform, especially for the visualization of genetic maps, from the GMOD project (GBrowse), the SGN system, TIGR, and other open-source projects.

A number of the sequences generated from C. arabica will be published on SGN in the near future.

