View Process Ca03

BGI Reads

grep -c '@FCB068PABXX' Coffea_arabica_l1_1_All.fq
89742338

grep -c '@FCB068PABXX' Coffea_arabica_l1_2_All.fq
89742338

Total reads from BGI: 179,484,676 (2 x 89,742,338)
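As a cross-check on the grep counts, the reads can also be counted from the line total, since each FASTQ record takes four lines. A minimal sketch, assuming the sequence and quality lines are not wrapped; count_fastq_reads is a hypothetical helper name, not part of any tool used on this page.

```shell
# Count reads in a FASTQ file as (number of lines) / 4.
# Assumption: exactly four lines per record (no wrapped sequence/quality lines).
count_fastq_reads() {
    awk 'END { printf "%d\n", NR / 4 }' "$1"
}
```

Running count_fastq_reads on Coffea_arabica_l1_1_All.fq and Coffea_arabica_l1_2_All.fq should reproduce the grep counts if both files are well formed.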

Quality Trim

quality_trim -p Coffea_arabica_BGI_trim.fasta -r -i Coffea_arabica_l1_1_All.fq Coffea_arabica_l1_2_All.fq
Input reads: 179484676
Input residues: 16153620840

Output reads: 154552014 86.11 %
Output residues: 13119605124 81.22 %

Quality range: 2 to 40

real 36m40.642s
user 34m42.720s
sys 1m47.980s
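The retained percentages reported by quality_trim can be recomputed from the counts above. A small sketch; pct is a hypothetical helper, and the numbers are copied from the output on this page.

```shell
# Print 100 * part / whole with two decimals.
pct() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.2f\n", 100 * part / whole }'
}

pct 154552014 179484676        # reads kept after trimming
pct 13119605124 16153620840    # residues kept after trimming
```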

Hybrid Assembly

Hybrid assembly of the clean reads from Cenicafe with the trimmed reads from BGI

time clc_novo_assemble -o CarabicaBGItrim_CenicafeCArabica.fasta -q ReadsCenicafeCArabica.clean.fasta -q -p fb ss 200 400 Coffea_arabica_BGI_trim.fasta --cpus 4
Progress: 100.0 %

real 80m30.000s
user 304m26.520s
sys 0m51.190s

sequence_info -n -r CarabicaBGItrim_CenicafeCArabica.fasta

File                           CarabicaBGItrim_CenicafeCArabica.fasta

Number of sequences                 55554

Residue counts:
  Number of A's                  11851538   25.82 %
  Number of C's                   9896111   21.56 %
  Number of G's                  10187527   22.19 %
  Number of T's                  11888282   25.90 %
  Number of N's                   2081647    4.53 %
  Total                          45905105

Sequence lengths:
  Minimum                             200
  Maximum                           26359
  Average                             826.32
  N50                                1168
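For reference, the N50 can be recomputed directly from the contig FASTA. This is a sketch using the usual definition (the length L such that contigs of length L or longer hold at least half of all residues); whether sequence_info applies exactly the same definition is an assumption, and fasta_n50 is a hypothetical helper name.

```shell
# Print the N50 of a FASTA file.
# Step 1: emit one length per sequence (multi-line records are summed).
# Step 2: sort lengths in descending order.
# Step 3: walk down the list until the running sum reaches half of the total.
fasta_n50() {
    awk '/^>/ { if (len) print len; len = 0; next }
         { len += length($0) }
         END { if (len) print len }' "$1" |
    sort -rn |
    awk '{ lens[NR] = $1; total += $1 }
         END {
             for (i = 1; i <= NR; i++) {
                 run += lens[i]
                 if (run >= total / 2) { print lens[i]; exit }
             }
         }'
}
```

It would be called as fasta_n50 CarabicaBGItrim_CenicafeCArabica.fasta.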

Mapping reads on assembly

clc_ref_assemble_long -o CarabicaBGItrim_CenicafeCArabica.fasta.cas -d CarabicaBGItrim_CenicafeCArabica.fasta -q ReadsCenicafeCArabica.clean.fasta -q -p fb ss 200 400 Coffea_arabica_BGI_trim.fasta

assembly_info -p fb ss 200 400 CarabicaBGItrim_CenicafeCArabica.fasta.cas > CarabicaBGItrim_CenicafeCArabica.fasta.cas.txt

General info:

  Program name         clc_ref_assemble_long
  Program version      4.01beta.59919
  Program parameters   -o CarabicaBGItrim_CenicafeCArabica.fasta.cas -d CarabicaBGItrim_CenicafeCArabica.fasta -q ReadsCenicafeCArabica.clean.fasta -q -p fb ss 200 400 Coffea_arabica_BGI_trim.fasta

  Contig files:
    CarabicaBGItrim_CenicafeCArabica.fasta [ 55554 / 45905105 ]

  Read files:
    ReadsCenicafeCArabica.clean.fasta [ 38858 / 31674382 ]
    Coffea_arabica_BGI_trim.fasta [ 154552014 / 13119605124 ] <paired>

Read info:

  Contigs                         55554
  Reads                       154590872
    Unmapped reads             22163641   14.34 %
    Mapped reads              132427231   85.66 %
      Multi hit reads          23230903   17.54 %
    Paired                     57095608   36.94 %
    Unpaired                   97456406   63.06 %

Paired end info:

  Paired reads                 57095658   36.93 %
    Average distance                215.95
    99.9 % of pairs between         200 - 399
    99.0 % of pairs between         200 - 399
    95.0 % of pairs between         200 - 399

  Unpaired reads               97495214   63.07 %
    Both seqs not matching     13977426   14.34 %
    One seq not matching       16372430   16.79 %
    Both seqs matching         67145358   68.87 %
      Different contigs        18708806   27.86 %
      Wrong directions          6015774    8.96 %
      Too close                42302502   63.00 %
      Too far                    118276    0.18 %

Coverage info:

  Mapped nucleotides        10940734249   83.19 %
  Total sites                  45905105
  Average coverage                  238.33
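The coverage figures follow from the counts above: average coverage is mapped nucleotides divided by total contig sites, and the mapped percentage appears to be taken relative to the residues in both read files combined (31674382 + 13119605124). A sketch of that arithmetic; coverage_summary is a hypothetical helper, and the interpretation of the percentage is inferred from the numbers, not from the tool's documentation.

```shell
# Print average coverage and percent of read residues mapped.
# $1 = mapped nucleotides, $2 = total contig sites, $3 = total read residues
coverage_summary() {
    awk -v mapped="$1" -v sites="$2" -v residues="$3" 'BEGIN {
        printf "%.2f %.2f\n", mapped / sites, 100 * mapped / residues
    }'
}

# Values copied from the assembly_info output on this page.
coverage_summary 10940734249 45905105 $((31674382 + 13119605124))
```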

Blast With NR

time blastall -p blastx -i CarabicaBGItrim_CenicafeCArabica.fasta -d /opt/DBs/nr -e 1e-5 -o CarabicaBGItrim_CenicafeCArabica.fastaVsNr.bx.xml -m 7 -v 10 -b 10 -a 4
