View Process Hh 12

Preparing Reads

cat ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD01.RL6_clean.fasta ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.fasta > readsFalsaBroca.fasta

cat ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD01.RL6_clean.fasta.qual ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.fasta.qual > readsFalsaBroca.fasta.qual
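The two cat commands only give a usable FASTA/quality pair if both list the input files in the same order, so that each record in the .qual file describes the matching record in the FASTA file. A minimal Python sketch of that check follows. The helper names record_ids and same_order are hypothetical, and it assumes header lines start with ">" and that the read ID is the first whitespace-delimited token.

```python
from itertools import zip_longest
from typing import Iterable, Iterator


def record_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield the ID (first token after '>') of every FASTA-style header line."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            yield fields[0] if fields else ""


def same_order(fasta_lines: Iterable[str], qual_lines: Iterable[str]) -> bool:
    """True when both inputs hold the same record IDs in the same order."""
    # zip_longest pads the shorter input with None, so a missing record fails too.
    pairs = zip_longest(record_ids(fasta_lines), record_ids(qual_lines))
    return all(a == b for a, b in pairs)


# Toy records for illustration, not taken from the real read files.
toy_fasta = [">read1 length=4", "ACGT", ">read2", "GGCC"]
toy_qual = [">read1 length=4", "40 40 38 12", ">read2", "30 31 29 40"]
print(same_order(toy_fasta, toy_qual))
```

On the real data the same function could be given two open file handles, for example open("readsFalsaBroca.fasta") and open("readsFalsaBroca.fasta.qual"), since it only iterates over lines.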

root@tunebo:/data/process/Broca/Transcriptoma/Ensamblajes/FalsaBrocaCLC# /opt/CLC/clc-assembly-cell-4.0.1beta-linux_64/quality_trim -r readsFalsaBroca.fasta -q readsFalsaBroca.fasta.qual -o readsFalsaBroca_trim.fasta
Input reads: 1223893
Input residues: 609633940
Output reads: 775043 63.33 %
Output residues: 291732395 47.85 %
Quality range: 0 to 40
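
The percentages in the quality_trim report can be re-derived from the raw counts. The Python sketch below does that and also computes the mean read length before and after trimming, which the tool does not print. The variable names are illustrative only.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts copied from the quality_trim report above.
input_reads, output_reads = 1_223_893, 775_043
input_residues, output_residues = 609_633_940, 291_732_395

reads_kept = percent(output_reads, input_reads)
residues_kept = percent(output_residues, input_residues)

# Mean read length before and after trimming (derived, not printed by the tool).
mean_len_in = input_residues / input_reads
mean_len_out = output_residues / output_reads

print(f"reads kept:     {reads_kept:.2f} %")     # 63.33 %
print(f"residues kept:  {residues_kept:.2f} %")  # 47.85 %
print(f"mean length:    {mean_len_in:.1f} -> {mean_len_out:.1f} residues")
```

Residues drop faster than reads because the surviving reads are also shorter on average, roughly 498 residues before trimming against roughly 376 after.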

/opt/seqclean/seqclean readsFalsaBroca_trim.fasta -m 70 -o readsFalsaBroca_trim_seqclean.fasta -N -A
seqclean running options:
seqclean readsFalsaBroca_trim.fasta -m 70 -o readsFalsaBroca_trim_seqclean.fasta -N -A
Standard log file: seqcl_readsFalsaBroca_trim.fasta.log
Error log file: err_seqcl_readsFalsaBroca_trim.fasta.log
Using 1 CPUs for cleaning
= Rebuilding readsFalsaBroca_trim.fasta cdb index =
Launching actual cleaning process:
psx -p 1 -n 1000 -i readsFalsaBroca_trim.fasta -d cleaning -C '/data/process/Broca/Transcriptoma/Ensamblajes/FalsaBrocaCLC/readsFalsaBroca_trim.fasta:LMS100:::11:0' -c '/opt/seqclean/bin/seqclean.psx'
Collecting cleaning reports
**************************************************
Sequences analyzed: 775043
--------------------------------------------------
valid: 771302 (0 trimmed)
trashed: 3741
**************************************************
= Trashing summary =
by 'short': 3733
by 'dust': 8
--------------------------------------------------
Output file containing only valid and trimmed sequences: readsFalsaBroca_trim_seqclean.fasta
For trimming and trashing details see cleaning report: readsFalsaBroca_trim.fasta.cln
--------------------------------------------------
seqclean (readsFalsaBroca_trim.fasta) finished on machine in /data/process/Broca/Transcriptoma/Ensamblajes/FalsaBrocaCLC, without a detectable error.
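
A quick consistency check on the seqclean tallies, sketched in Python: the number of sequences analyzed should equal the quality_trim output, valid plus trashed should equal analyzed, and the per-reason trash counts should sum to the trashed total. All numbers are copied from the logs above; the variable names are illustrative.

```python
# Counts copied from the seqclean report above.
analyzed = 775_043
valid = 771_302
trashed = 3_741
trashed_by = {"short": 3_733, "dust": 8}

# Output reads reported by the preceding quality_trim step.
quality_trim_output_reads = 775_043

checks = {
    "seqclean saw every trimmed read": analyzed == quality_trim_output_reads,
    "valid + trashed == analyzed": valid + trashed == analyzed,
    "trash reasons sum to trashed": sum(trashed_by.values()) == trashed,
}
for label, ok in checks.items():
    print(f"{'OK  ' if ok else 'FAIL'} {label}")

trashed_pct = 100.0 * trashed / analyzed
print(f"trashed fraction: {trashed_pct:.2f} %")
```

All three checks pass for this run, and under half a percent of the trimmed reads were discarded. The 'short' category presumably reflects the -m 70 minimum length given on the command line.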

time /opt/tgicl_linux/bin/mdust readsFalsaBroca_trim_seqclean.fasta > readsFalsaBroca_trim_seqclean_mdust.fasta
real 11m8.162s
user 9m19.630s
sys 0m2.700s

Assembly

time /opt/CLC/clc-assembly-cell-4.0.1beta-linux_64/clc_novo_assemble -o FalsaBroca.fasta -q readsFalsaBroca_trim_seqclean_mdust.fasta
Progress: 100.0 %
real 14m24.037s
user 16m10.880s
sys 0m2.280s

sequence_info -n -r FalsaBroca.fasta

File                           FalsaBroca.fasta

Number of sequences                  7934

Residue counts:
  Number of A's                   1477534   27.81 %
  Number of C's                   1151524   21.68 %
  Number of G's                   1155508   21.75 %
  Number of T's                   1470120   27.67 %
  Number of N's                     57911    1.09 %
  Total                           5312597

Sequence lengths:
  Minimum                             200
  Maximum                            4118
  Average                             669.60
  N50                                 743
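
For reference, the summary figures in the sequence_info report can be reproduced from the counts. The Python sketch below uses the common definition of N50 (the length L such that sequences of length L or longer contain at least half of all residues); whether sequence_info applies exactly this definition is an assumption. The n50 helper is hypothetical and is shown on toy lengths, because the individual contig lengths are not listed on this page.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all residues."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Toy lengths, not the real contigs: total 20, and 6 + 5 = 11 >= 10.
print(n50([2, 3, 4, 5, 6]))  # 5

# Figures copied from the sequence_info report above.
counts = {"A": 1_477_534, "C": 1_151_524, "G": 1_155_508, "T": 1_470_120, "N": 57_911}
n_sequences = 7_934
total = sum(counts.values())

mean_length = total / n_sequences
gc_percent = 100.0 * (counts["G"] + counts["C"]) / total

print(total)                 # 5312597, matches the reported Total
print(f"{mean_length:.2f}")  # 669.60, matches the reported Average
print(f"{gc_percent:.2f}")   # GC content, derived (not in the report)
```

The residue counts sum exactly to the reported total, and the derived average matches the report. GC content here is computed over all residues, N's included.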

time /opt/CLC/clc-assembly-cell-4.0.1beta-linux_64/clc_ref_assemble_long -o FalsaBroca.fasta.cas -q readsFalsaBroca.fasta -d FalsaBroca.fasta
Progress: 100.0 %
real 7m21.980s
user 13m28.490s
sys 0m3.370s
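
The three time blocks above can be compared by converting them to seconds. The Python sketch below parses the NmS.SSSs format printed by the shell and reports CPU time (user + sys) relative to wall-clock time for each step. The to_seconds helper is hypothetical, and reading a ratio above 1 as multi-core use is an interpretation, not something the tools report.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '14m24.037s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# (real, user, sys) copied from the three timed commands above.
timings = {
    "mdust": ("11m8.162s", "9m19.630s", "0m2.700s"),
    "clc_novo_assemble": ("14m24.037s", "16m10.880s", "0m2.280s"),
    "clc_ref_assemble_long": ("7m21.980s", "13m28.490s", "0m3.370s"),
}

cpu_ratio = {}
for step, (real, user, system) in timings.items():
    ratio = (to_seconds(user) + to_seconds(system)) / to_seconds(real)
    cpu_ratio[step] = ratio
    print(f"{step:<22} {to_seconds(real):8.1f} s wall, CPU/wall = {ratio:.2f}")
```

For this run mdust stays below 1, while both CLC steps exceed 1, with clc_ref_assemble_long the highest. That would be consistent with the CLC tools using more than one core.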

assembly_info FalsaBroca.fasta.cas > FalsaBroca.fasta.cas.txt

General info:

  Program name         clc_ref_assemble_long
  Program version      4.01beta.59919
  Program parameters   -o FalsaBroca.fasta.cas -q readsFalsaBroca.fasta -d FalsaBroca.fasta

  Contig files:
    FalsaBroca.fasta [ 7934 / 5312597 ]

  Read files:
    readsFalsaBroca.fasta [ 1223893 / 609633940 ]

Read info:

  Contigs                          7934
  Reads                         1223893
    Unmapped reads               163312   13.34 %
    Mapped reads                1060581   86.66 %
      Multi hit reads             29558    2.79 %

Coverage info:

  Mapped nucleotides          387047648   63.49 %
  Total sites                   5312597
  Average coverage                   72.85
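
The mapping and coverage figures can be re-derived from the counts in the report. The Python sketch below shows which denominator each percentage appears to use, inferred by matching the printed values. The variable names are illustrative.

```python
# Figures copied from the assembly_info report above.
reads_total = 1_223_893
reads_unmapped = 163_312
reads_mapped = 1_060_581
reads_multi_hit = 29_558
mapped_nucleotides = 387_047_648
read_residues_total = 609_633_940  # from the "Read files" line
total_sites = 5_312_597

# Mapped and unmapped reads should account for every input read.
assert reads_mapped + reads_unmapped == reads_total

mapped_pct = 100.0 * reads_mapped / reads_total
multi_of_mapped = 100.0 * reads_multi_hit / reads_mapped
multi_of_all = 100.0 * reads_multi_hit / reads_total
mapped_nt_pct = 100.0 * mapped_nucleotides / read_residues_total
average_coverage = mapped_nucleotides / total_sites

print(f"mapped reads:        {mapped_pct:.2f} %")       # 86.66 %
print(f"multi hit / mapped:  {multi_of_mapped:.2f} %")  # 2.79 %, as reported
print(f"multi hit / all:     {multi_of_all:.2f} %")     # 2.42 %
print(f"mapped nucleotides:  {mapped_nt_pct:.2f} %")    # 63.49 %
print(f"average coverage:    {average_coverage:.2f}")   # 72.85
```

The reported 2.79 % for multi hit reads is relative to mapped reads, not to all reads. Average coverage is mapped nucleotides divided by total contig sites. Per the program parameters, the mapping used the untrimmed readsFalsaBroca.fasta, so all percentages are relative to the raw read set and not to the cleaned reads used for the assembly.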