Bb9119 Clean CLC

Assembly of the cleaned Bb9119 data with CLC

time clc_novo_assemble -q -p fb ss 100 400 ../Bb9119_Clean_Velvet/Bb9119_1_trim.fasta -o Bb9119_trim.fasta
Progress: 100.0 %
real 70m15.477s
user 86m19.170s
sys 0m18.300s

time clc_ref_assemble_long -o Bb9119_trim.fasta.cas -d Bb9119_trim.fasta -q -p fb ss 100 400 -q ../Bb9119_Clean_Velvet/Bb9119_1_trim.fasta
Progress: 100.0 %
real 20m50.551s
user 27m11.870s
sys 0m18.690s

sequence_info -n -r Bb9119_trim.fasta

File                           Bb9119_trim.fasta

Number of sequences                  3717

Residue counts:
  Number of A's                   8140386   24.76 %
  Number of C's                   8268193   25.15 %
  Number of G's                   8264957   25.14 %
  Number of T's                   8112495   24.67 %
  Number of N's                     93367    0.28 %
  Total                          32879398

Sequence lengths:
  Minimum                             200
  Maximum                          148910
  Average                            8845.68
  N50                               19561
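
The N50 and average length reported above can be cross-checked independently of the CLC tools. The following is a minimal Python sketch, assuming the common definition of N50 (the contig length at which the contigs, sorted from longest to shortest, first cover at least half of the total assembly size); sequence_info may handle edge cases slightly differently. The function names `n50` and `read_fasta_lengths` are made up for this example, and the toy lengths are not the Bb9119 contigs.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed definition: the length L such that contigs of length >= L
    together cover at least half of the summed length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no contig lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def read_fasta_lengths(path):
    """Yield the sequence length of every record in a FASTA file."""
    length = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line)
    if length is not None:
        yield length


# Toy example with made-up lengths (total 54, half is 27).
toy = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy))  # 8, because 10 + 9 + 8 = 27 reaches half of the total
```

On the real assembly this would be called as `n50(read_fasta_lengths("Bb9119_trim.fasta"))`. The average length in the report is simply total residues divided by the number of sequences, 32879398 / 3717 = 8845.68.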

assembly_info -p fb ss 100 400 Bb9119_trim.fasta.cas > Bb9119_trim.fasta.cas.txt

General info:

  Program name         clc_ref_assemble_long
  Program version      4.01beta.59919
  Program parameters   -o Bb9119_trim.fasta.cas -d Bb9119_trim.fasta -q -p fb ss 100 400 -q ../Bb9119_Clean_Velvet/Bb9119_1_trim.fasta

  Contig files:
    Bb9119_trim.fasta [ 3717 / 32879398 ]

  Read files:
    ../Bb9119_Clean_Velvet/Bb9119_1_trim.fasta [ 33188332 / 2865501772 ] <paired>

Read info:

  Contigs                          3717
  Reads                        33188332
    Unmapped reads               214511    0.65 %
    Mapped reads               32973821   99.35 %
      Multi hit reads            163571    0.50 %
    Paired                     30637448   92.31 %
    Unpaired                    2550884    7.69 %

Paired end info:

  Paired reads                 30637454   92.31 %
    Average distance                278.88
    99.9 % of pairs between         107 - 400
    99.0 % of pairs between         130 - 393
    95.0 % of pairs between         159 - 370

  Unpaired reads                2550878    7.69 %
    Both seqs not matching        45972    1.80 %
    One seq not mathing          337078   13.21 %
    Both seqs matching          2167828   84.98 %
      Different contigs         1844328   85.08 %
      Wrong directions           179950    8.30 %
      Too close                   21620    1.00 %
      Too far                    121930    5.62 %
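
The percentages in the "Unpaired reads" block are nested: each one is relative to the line one level above it, not to the total number of reads. A small Python sketch, using only the counts printed above, makes that explicit and checks that the sub-categories add up. The variable names are chosen for this example.

```python
# Counts copied from the "Unpaired reads" block of the assembly_info output.
unpaired = 2550878
both_not_matching = 45972
one_not_matching = 337078
both_matching = 2167828
both_matching_reasons = {
    "Different contigs": 1844328,
    "Wrong directions": 179950,
    "Too close": 21620,
    "Too far": 121930,
}


def pct(part, whole):
    """Percentage of `part` relative to `whole`."""
    return 100.0 * part / whole


# The three first-level categories account for every unpaired read.
assert both_not_matching + one_not_matching + both_matching == unpaired
# The four reasons account for every pair in which both reads matched.
assert sum(both_matching_reasons.values()) == both_matching

# First level: relative to all unpaired reads.
print(f"Both seqs matching {pct(both_matching, unpaired):6.2f} %")
# Second level: relative to "Both seqs matching".
for label, count in both_matching_reasons.items():
    print(f"  {label:<18}{pct(count, both_matching):6.2f} %")
```

Read this way, most pairs that fail to pair up do so because the two reads map to different contigs. Only a small share is rejected as too close or too far, presumably relative to the 100 to 400 range given with -p.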

Coverage info:

  Mapped nucleotides         2818255248   98.35 %
  Total sites                  32879398
  Average coverage                   85.71
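
The coverage figures follow directly from numbers already in the report. The Python sketch below reproduces them, assuming that the second number in the "Read files" bracket (2865501772) is the total number of nucleotides in the reads, by analogy with the "Contig files" line where it equals the total residue count.

```python
# Values copied from the assembly_info output above.
mapped_nucleotides = 2818255248
total_read_nucleotides = 2865501772  # assumed meaning of the second bracket value
total_sites = 32879398               # summed contig length
reads = 33188332

# Average coverage: mapped nucleotides spread over every site in the contigs.
average_coverage = mapped_nucleotides / total_sites

# Share of all read nucleotides that ended up mapped.
mapped_fraction = 100.0 * mapped_nucleotides / total_read_nucleotides

# Mean length of the trimmed reads.
mean_read_length = total_read_nucleotides / reads

print(f"Average coverage   {average_coverage:.2f}")   # 85.71
print(f"Mapped nucleotides {mapped_fraction:.2f} %")  # 98.35 %
print(f"Mean read length   {mean_read_length:.2f}")   # 86.34
```

Both reported values (85.71 and 98.35 %) come out as printed, which supports the assumption about the bracket value.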

R Statistics

a <- read.table("Bb9119_trim.fasta.cas.txt", header=TRUE)
View(a)
summary(a)

    Contig         Sites            Reads            Coverage       
 Min.   :   1   Min.   :   200   Min.   :      2   Min.   :    0.57  
 1st Qu.: 930   1st Qu.:  1045   1st Qu.:    929   1st Qu.:   49.58  
 Median :1859   Median :  4328   Median :   3795   Median :   64.84  
 Mean   :1859   Mean   :  8846   Mean   :   8871   Mean   :  102.66  
 3rd Qu.:2788   3rd Qu.: 12193   3rd Qu.:  10649   3rd Qu.:   81.43  
 Max.   :3717   Max.   :148910   Max.   :2914453   Max.   :25621.76  

library(ggplot2)
m=ggplot(a,aes(Reads))
m + geom_histogram(aes(x=Reads),binwidth=5)+xlab("Number of Reads in Contigs")+xlim(0,500)

ReadsBb9119CLCZoom.png

With Zoom
ReadsBb9119CLC.png

Contig Size

m=ggplot(a,aes(x=Sites))
m + geom_histogram(aes(x=Sites),binwidth=70)+xlab("Contig Size")+xlim(0,5000)

ContigSizeBb9119CLC.png

ContigSizeBb9119CLCZoom.png
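
To make clear what binwidth=70 and xlim(0,5000) do to the data behind the zoomed contig-size histogram, here is a rough Python equivalent of the binning step. It is only a sketch: the function name `bin_counts` and the toy sizes are made up, and ggplot2 may align its bin edges differently, so the counts per bar need not match the plot exactly.

```python
from collections import Counter


def bin_counts(values, binwidth, lower, upper):
    """Count values in fixed-width bins, ignoring values outside [lower, upper].

    Each bin is keyed by its left edge, so a key k covers k <= value < k + binwidth.
    """
    counts = Counter()
    for value in values:
        if lower <= value <= upper:
            counts[(value // binwidth) * binwidth] += 1
    return dict(sorted(counts.items()))


# Made-up contig sizes; 12193 lies outside the 0-5000 window and is dropped,
# much as xlim() drops out-of-range rows before the histogram is drawn.
toy_sizes = [200, 250, 269, 270, 1045, 4328, 12193]
print(bin_counts(toy_sizes, 70, 0, 5000))
# {140: 1, 210: 3, 980: 1, 4270: 1}
```

The point to keep in mind when reading the zoomed plots is that contigs longer than the upper limit are not shown at all rather than being piled into the last bar.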

Coverage Representation

# add_labelsFromCoverage() is a helper function that is not shown on this page
categoriesCoverage=add_labelsFromCoverage(a)
newDataFrame=merge(as.data.frame(categoriesCoverage),as.data.frame(a))
hist_cut = ggplot(newDataFrame, aes(x=Sites, fill=CoverageCategory))
hist_cut + geom_histogram(position="fill",binwidth=150) + xlab("Contig Size") + ylab("Proportion of Coverage Category")

CoverageRepresentBb9119CLC.png
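
The helper add_labelsFromCoverage is not defined on this page, so its exact behaviour is unknown. The Python sketch below only illustrates the general idea of attaching a CoverageCategory label to each contig by cutting the Coverage column at fixed thresholds. The cut points (50 and 100), the labels, and the function names are all hypothetical and are not taken from the original analysis.

```python
from bisect import bisect_right

# Hypothetical cut points (fold coverage) and labels. The page does not say
# which thresholds add_labelsFromCoverage actually uses.
CUTS = [50, 100]
LABELS = ["low", "medium", "high"]


def coverage_category(coverage, cuts=CUTS, labels=LABELS):
    """Return the label of the interval that `coverage` falls into."""
    return labels[bisect_right(cuts, coverage)]


def add_coverage_labels(rows):
    """Return copies of the contig rows with a CoverageCategory field added."""
    return [
        dict(row, CoverageCategory=coverage_category(row["Coverage"]))
        for row in rows
    ]


# Made-up rows using the column names of the contig table (Contig, Sites, Coverage).
toy_rows = [
    {"Contig": 1, "Sites": 200, "Coverage": 0.57},
    {"Contig": 2, "Sites": 4328, "Coverage": 64.84},
    {"Contig": 3, "Sites": 148910, "Coverage": 250.0},
]
for row in add_coverage_labels(toy_rows):
    print(row["Contig"], row["Sites"], row["CoverageCategory"])
```

With labels like these in place, the stacked plot above shows, for each contig-size bin, what proportion of contigs falls into each coverage category.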