Running CEGMA on All Assemblies

CEGMA (Core Eukaryotic Genes Mapping Approach)

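CEGMA relies on a set of 458 core proteins that are present in a wide range of eukaryotic taxa. Because these proteins are highly conserved, sequence alignment methods can reliably identify their exon-intron structures in genomic sequences. The resulting gene set can be used to train a gene finder or to assess the completeness of a genome assembly or its annotation. The completeness reports on this page are based on a subset of 248 ultra-conserved CEGs, as stated in each report header. The table below lists the %Completeness value for each assembly; for the assemblies detailed further down, it matches the Complete row of the report.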

Assembly %Completeness
Hv387CLCTag 44.35
Hv494CLCTag 35.08
HvCatCLCTag 39.92
HvDQ952CLCTag 31.85
HvH_179CLCTag 31.05
HvH_569CLCTag 37.50
HvH_701CLCTag 56.85
HvHMarCLCTag 22.18
CLCAssemblyAll454HvCatIllum 57.26
miraHvTotal9sff 14.92
ThirdHybridAssembly 91.94
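
A minimal sketch of how the per-assembly runs behind this table could be scripted. It assumes the `cegma` executable is on the PATH and that each run writes its `output.*` files into the current working directory, as the `less output.completeness_report` steps on this page suggest. The helper names `plan_runs` and `run_all` and the `cegma_runs` directory are hypothetical; only the `cegma -g <fasta>` invocation comes from this page.

```python
import subprocess
from pathlib import Path


def plan_runs(fasta_paths, work_root):
    """Pair each assembly FASTA with its own working directory and command."""
    runs = []
    for fasta in map(Path, fasta_paths):
        # Hypothetical layout: one directory per assembly, named after the FASTA stem.
        workdir = Path(work_root) / fasta.stem
        # Same invocation as used on this page: cegma -g <assembly.fasta>.
        # The path is made absolute so it stays valid when the run changes directory.
        cmd = ["cegma", "-g", str(fasta.resolve())]
        runs.append((workdir, cmd))
    return runs


def run_all(fasta_paths, work_root):
    """Run CEGMA once per assembly (requires cegma to be installed)."""
    for workdir, cmd in plan_runs(fasta_paths, work_root):
        workdir.mkdir(parents=True, exist_ok=True)
        # Running inside workdir keeps each assembly's output.* files separate.
        subprocess.run(cmd, cwd=workdir, check=True)


# Show the planned commands without executing anything.
assemblies = [
    "/data/process/Roya/assemblies/illuminaAssembly/CLCAssembly/Hv494CLCTag.fasta",
    "/data/process/Roya/assemblies/illuminaAssembly/CLCAssembly/HvCatCLCTag.fasta",
]
for workdir, cmd in plan_runs(assemblies, "cegma_runs"):
    print(workdir, " ".join(cmd))
```

Calling `run_all(assemblies, "cegma_runs")` would then leave one `output.completeness_report` per assembly directory, so that reports from different assemblies do not overwrite each other.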

CEGMA Details

ThirdHybridAssembly

less output.completeness_report 

#      Statistics of the completeness of the genome based on 248 CEGs      #

              #Prots  %Completeness  -  #Total  Average  %Ortho 

  Complete      228       91.94      -   338     1.48     39.91

   Group 1       57       86.36      -    90     1.58     50.88
   Group 2       51       91.07      -    75     1.47     33.33
   Group 3       59       96.72      -    84     1.42     35.59
   Group 4       61       93.85      -    89     1.46     39.34

   Partial      235       94.76      -   400     1.70     49.79

   Group 1       60       90.91      -   100     1.67     56.67
   Group 2       51       91.07      -    86     1.69     47.06
   Group 3       60       98.36      -   103     1.72     45.00
   Group 4       64       98.46      -   111     1.73     50.00

#    These results are based on the set of genes selected by Genis Parra   #

#    Key:                                                                  #
#    Prots = number of 248 ultra-conserved CEGs present in genome          #
#    %Completeness = percentage of 248 ultra-conserved CEGs present        #
#    Total = total number of CEGs present including putative orthologs     #
#    Average = average number of orthologs per CEG                         #
#    %Ortho = percentage of detected CEGS that have more than 1 ortholog   #
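
The key above can be checked against the numbers in the report. The sketch below recomputes %Completeness as Prots divided by the number of CEGs, and Average as Total divided by Prots, using the ThirdHybridAssembly rows. The per-group sizes (66, 56, 61, 65) are not stated on this page; they are inferred here from the reported group percentages and are only an assumption, although they do sum to 248.

```python
TOTAL_CEGS = 248  # size of the ultra-conserved set named in the report header

# Assumption: group sizes inferred from the reported percentages, not from the page.
GROUP_SIZES = {1: 66, 2: 56, 3: 61, 4: 65}


def completeness(prots, n_cegs=TOTAL_CEGS):
    """Percentage of the CEG set that was found."""
    return round(100.0 * prots / n_cegs, 2)


def average_orthologs(total, prots):
    """Average number of hits (including putative orthologs) per detected CEG."""
    return round(total / prots, 2)


# "Complete" row of ThirdHybridAssembly: 228 proteins, 338 total hits.
print(completeness(228))            # 91.94
print(average_orthologs(338, 228))  # 1.48

# "Complete" Group 1 row of ThirdHybridAssembly: 57 proteins.
print(completeness(57, GROUP_SIZES[1]))  # 86.36
```

The same arithmetic reproduces the group rows of the other reports on this page, for example 18 of 66 Group 1 CEGs for Hv494CLCTag gives 27.27.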

Hv494CLCTag

cegma -g /data/process/Roya/assemblies/illuminaAssembly/CLCAssembly/Hv494CLCTag.fasta

less output.completeness_report 

#      Statistics of the completeness of the genome based on 248 CEGs      #

              #Prots  %Completeness  -  #Total  Average  %Ortho 

  Complete       87       35.08      -   197     2.26     70.11

   Group 1       18       27.27      -    43     2.39     83.33
   Group 2       15       26.79      -    41     2.73     73.33
   Group 3       26       42.62      -    58     2.23     76.92
   Group 4       28       43.08      -    55     1.96     53.57

   Partial      144       58.06      -   311     2.16     67.36

   Group 1       29       43.94      -    64     2.21     75.86
   Group 2       33       58.93      -    82     2.48     69.70
   Group 3       41       67.21      -    89     2.17     75.61
   Group 4       41       63.08      -    76     1.85     51.22

#    These results are based on the set of genes selected by Genis Parra   #

HvCatCLCTag

cegma -g /data/process/Roya/assemblies/illuminaAssembly/CLCAssembly/HvCatCLCTag.fasta

less output.completeness_report 

#      Statistics of the completeness of the genome based on 248 CEGs      #

              #Prots  %Completeness  -  #Total  Average  %Ortho 

  Complete       99       39.92      -   213     2.15     66.67

   Group 1       29       43.94      -    69     2.38     72.41
   Group 2       22       39.29      -    50     2.27     77.27
   Group 3       25       40.98      -    52     2.08     64.00
   Group 4       23       35.38      -    42     1.83     52.17

   Partial      144       58.06      -   325     2.26     69.44

   Group 1       37       56.06      -    96     2.59     78.38
   Group 2       28       50.00      -    63     2.25     71.43
   Group 3       37       60.66      -    83     2.24     72.97
   Group 4       42       64.62      -    83     1.98     57.14

#    These results are based on the set of genes selected by Genis Parra   #

HvH_179CLCTag

cegma -g ../../assemblies/illuminaAssembly/CLCAssembly/HvH_179CLCTag.fasta

less output.completeness_report 

#      Statistics of the completeness of the genome based on 248 CEGs      #

              #Prots  %Completeness  -  #Total  Average  %Ortho 

  Complete       77       31.05      -   185     2.40     80.52

   Group 1       20       30.30      -    41     2.05     80.00
   Group 2       14       25.00      -    40     2.86     92.86
   Group 3       15       24.59      -    39     2.60     80.00
   Group 4       28       43.08      -    65     2.32     75.00

   Partial      143       57.66      -   317     2.22     73.43

   Group 1       30       45.45      -    61     2.03     76.67
   Group 2       30       53.57      -    73     2.43     76.67
   Group 3       37       60.66      -    81     2.19     72.97
   Group 4       46       70.77      -   102     2.22     69.57

#    These results are based on the set of genes selected by Genis Parra   #

CLCAssemblyAll454HvCatIllum

cegma -g ../assemblies/CLCAssemblyAll454HvCatIllum/CLCAssemblyAll454HvCatIllum.fasta

less output.completeness_report 

#      Statistics of the completeness of the genome based on 248 CEGs      #

              #Prots  %Completeness  -  #Total  Average  %Ortho 

  Complete      142       57.26      -   305     2.15     70.42

   Group 1       34       51.52      -    80     2.35     82.35
   Group 2       31       55.36      -    65     2.10     70.97
   Group 3       44       72.13      -    96     2.18     70.45
   Group 4       33       50.77      -    64     1.94     57.58

   Partial      191       77.02      -   431     2.26     73.30

   Group 1       42       63.64      -    99     2.36     80.95
   Group 2       42       75.00      -    91     2.17     73.81
   Group 3       53       86.89      -   128     2.42     77.36
   Group 4       54       83.08      -   113     2.09     62.96

#    These results are based on the set of genes selected by Genis Parra   #

miraHvTotal9sff (MiraAssembly454)

cegma -g ../../assemblies/miraAssemblyAll454HvCatIllum/miraHvTotal9sff_assembly/miraHvTotal9sff_d_results/miraHvTotal9sff_out.unpadded.fasta

less output.completeness_report 

#      Statistics of the completeness of the genome based on 248 CEGs      #

              #Prots  %Completeness  -  #Total  Average  %Ortho 

  Complete       37       14.92      -    89     2.41     78.38

   Group 1        6        9.09      -    10     1.67     66.67
   Group 2        8       14.29      -    19     2.38     75.00
   Group 3       12       19.67      -    32     2.67     83.33
   Group 4       11       16.92      -    28     2.55     81.82

   Partial       84       33.87      -   181     2.15     69.05

   Group 1       11       16.67      -    22     2.00     72.73
   Group 2       13       23.21      -    28     2.15     69.23
   Group 3       28       45.90      -    62     2.21     67.86
   Group 4       32       49.23      -    69     2.16     68.75
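
Since all reports above share the same row layout, the summary table at the top of the page could be rebuilt by parsing them. The following is a sketch only: the regular expression is written against the rows as pasted on this page, and `parse_report` is a hypothetical helper, not part of CEGMA. The sample text is an excerpt of the miraHvTotal9sff report above.

```python
import re

# One data row: label, #Prots, %Completeness, "-", #Total, Average, %Ortho.
ROW = re.compile(
    r"^\s*(Complete|Partial|Group \d)\s+(\d+)\s+([\d.]+)\s+-\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_report(text):
    """Return the Complete and Partial sections with their per-group rows."""
    result = {}
    section = None
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, key and blank lines are skipped
        label, prots, pct, total, avg, ortho = match.groups()
        row = {
            "prots": int(prots),
            "completeness": float(pct),
            "total": int(total),
            "average": float(avg),
            "ortho": float(ortho),
        }
        if label in ("Complete", "Partial"):
            section = label
            result[section] = {"summary": row, "groups": {}}
        elif section is not None:
            result[section]["groups"][label] = row
    return result


sample = """
#      Statistics of the completeness of the genome based on 248 CEGs      #

              #Prots  %Completeness  -  #Total  Average  %Ortho

  Complete       37       14.92      -    89     2.41     78.38

   Group 1        6        9.09      -    10     1.67     66.67
   Group 3       12       19.67      -    32     2.67     83.33

   Partial       84       33.87      -   181     2.15     69.05

   Group 1       11       16.67      -    22     2.00     72.73
"""

report = parse_report(sample)
print(report["Complete"]["summary"]["completeness"])  # 14.92
print(report["Partial"]["groups"]["Group 1"]["prots"])  # 11
```

Reading each assembly's `output.completeness_report` into `parse_report` and collecting the Complete summary value would give one table row per assembly.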

References

1. Cegma
2. File:Bioinformatics-2007-Parra-1061-7.pdf