View Process Hh 12
Preparing Reads
cat ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD01.RL6_clean.fasta ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.fasta > readsFalsaBroca.fasta
cat ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD01.RL6_clean.fasta.qual ../../../RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.fasta.qual > readsFalsaBroca.fasta.qual
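A quick way to confirm that the concatenation kept every read is to count FASTA header lines before and after merging. The sketch below is only an illustration: `count_records` is a hypothetical helper name and the demo files are made-up toy data written to a temporary directory. Since .qual files use the same `>` header lines, the same count should apply to them.

```shell
#!/usr/bin/env bash
# count_records FILE: number of records, i.e. lines starting with '>'.
# (hypothetical helper, not part of any tool used on this page)
count_records() {
    grep -c '^>' "$1"
}

# Toy demonstration with made-up reads in a temporary directory.
tmp=$(mktemp -d)
printf '>r1\nACGT\n>r2\nGGCC\n' > "$tmp/a.fasta"
printf '>r3\nTTAA\n' > "$tmp/b.fasta"
cat "$tmp/a.fasta" "$tmp/b.fasta" > "$tmp/merged.fasta"

expected=$(( $(count_records "$tmp/a.fasta") + $(count_records "$tmp/b.fasta") ))
if [ "$(count_records "$tmp/merged.fasta")" -eq "$expected" ]; then
    echo "merge kept all $expected records"
fi
rm -rf "$tmp"
```

On the real data, `count_records readsFalsaBroca.fasta` would be expected to match the "Input reads" figure that quality_trim prints.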
root@tunebo:/data/process/Broca/Transcriptoma/Ensamblajes/FalsaBrocaCLC# /opt/CLC/clc-assembly-cell-4.0.1beta-linux_64/quality_trim -r readsFalsaBroca.fasta -q readsFalsaBroca.fasta.qual -o readsFalsaBroca_trim.fasta
Input reads: 1223893
Input residues: 609633940
Output reads: 775043 63.33 %
Output residues: 291732395 47.85 %
Quality range: 0 to 40
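The percentages printed by quality_trim can be re-derived from the raw counts as a sanity check. This is a minimal sketch; `pct` is a hypothetical helper name introduced here, and the numbers are copied from the output above.

```shell
#!/usr/bin/env bash
# pct PART TOTAL: print PART as a percentage of TOTAL with two decimals.
# (hypothetical helper for illustration; not part of the CLC tools)
pct() {
    LC_ALL=C awk -v part="$1" -v total="$2" \
        'BEGIN { printf "%.2f\n", 100 * part / total }'
}

pct 775043 1223893        # share of reads kept by quality_trim
pct 291732395 609633940   # share of residues kept by quality_trim
```

Roughly a third of the reads, and about half of the residues, are discarded at this step.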
/opt/seqclean/seqclean readsFalsaBroca_trim.fasta -m 70 -o readsFalsaBroca_trim_seqclean.fasta -N -A
seqclean running options:
seqclean readsFalsaBroca_trim.fasta -m 70 -o readsFalsaBroca_trim_seqclean.fasta -N -A
Standard log file: seqcl_readsFalsaBroca_trim.fasta.log
Error log file: err_seqcl_readsFalsaBroca_trim.fasta.log
Using 1 CPUs for cleaning
= Rebuilding readsFalsaBroca_trim.fasta cdb index =
Launching actual cleaning process:
psx -p 1 -n 1000 -i readsFalsaBroca_trim.fasta -d cleaning -C '/data/process/Broca/Transcriptoma/Ensamblajes/FalsaBrocaCLC/readsFalsaBroca_trim.fasta:LMS100:::11:0' -c '/opt/seqclean/bin/seqclean.psx'
Collecting cleaning reports
**************************************************
Sequences analyzed: 775043
-----------------------------------
valid: 771302 (0 trimmed)
trashed: 3741
**************************************************
= Trashing summary =
by 'short': 3733
by 'dust': 8
------------------------------
Output file containing only valid and trimmed sequences: readsFalsaBroca_trim_seqclean.fasta
For trimming and trashing details see cleaning report : readsFalsaBroca_trim.fasta.cln
--------------------------------------------------
seqclean (readsFalsaBroca_trim.fasta) finished on machine in /data/process/Broca/Transcriptoma/Ensamblajes/FalsaBrocaCLC, without a detectable error.
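The seqclean report is internally consistent, and this can be checked mechanically: valid plus trashed should equal the sequences analyzed, and the per-reason trash counts should add up to the trashed total. A small sketch, with `check_sum` as a hypothetical helper and the counts copied from the report above:

```shell
#!/usr/bin/env bash
analyzed=775043; valid=771302; trashed=3741
by_short=3733; by_dust=8

# check_sum EXPECTED PART...: succeed only if the parts add up to EXPECTED.
# Written as a subshell function so its variables do not leak.
check_sum() (
    expected=$1
    shift
    sum=0
    for part in "$@"; do
        sum=$((sum + part))
    done
    [ "$sum" -eq "$expected" ]
)

check_sum "$analyzed" "$valid" "$trashed" && echo "valid + trashed = analyzed"
check_sum "$trashed" "$by_short" "$by_dust" && echo "short + dust = trashed"
```

The analyzed count (775043) also matches the "Output reads" figure from quality_trim, so no reads were lost between the two steps.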
time /opt/tgicl_linux/bin/mdust readsFalsaBroca_trim_seqclean.fasta > readsFalsaBroca_trim_seqclean_mdust.fasta
---
real 11m8.162s
user 9m19.630s
sys 0m2.700s
Assembly
time /opt/CLC/clc-assembly-cell-4.0.1beta-linux_64/clc_novo_assemble -o FalsaBroca.fasta -q readsFalsaBroca_trim_seqclean_mdust.fasta
Progress: 100.0 %
---
real 14m24.037s
user 16m10.880s
sys 0m2.280s
sequence_info -n -r FalsaBroca.fasta
File                 FalsaBroca.fasta
Number of sequences  7934

Residue counts:
  Number of A's  1477534  27.81 %
  Number of C's  1151524  21.68 %
  Number of G's  1155508  21.75 %
  Number of T's  1470120  27.67 %
  Number of N's    57911   1.09 %
  Total          5312597

Sequence lengths:
  Minimum   200
  Maximum   4118
  Average   669.60
  N50       743
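For reference, N50 is commonly defined as the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size. The sketch below follows that common definition; whether sequence_info uses exactly the same tie-breaking convention is an assumption. `fasta_lengths` and `n50` are hypothetical helper names, and the demo lengths are toy values, not taken from FalsaBroca.fasta.

```shell
#!/usr/bin/env bash
# fasta_lengths [FILE...]: print one sequence length per FASTA record.
# Reads stdin when no file is given.
fasta_lengths() {
    awk '/^>/ { if (seen) print n; seen = 1; n = 0; next }
         { n += length($0) }
         END { if (seen) print n }' "$@"
}

# n50: read one length per line on stdin and print the N50.
n50() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                running += len[i]
                # first contig where the running sum reaches half the total
                if (2 * running >= total) { print len[i]; exit }
            }
        }'
}

# Toy lengths (made up): total 1500, half 750; 500 + 400 = 900 >= 750.
printf '%s\n' 100 200 300 400 500 | n50
```

Under these assumptions, `fasta_lengths FalsaBroca.fasta | n50` would be the way to cross-check the reported N50 on the real assembly.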
time /opt/CLC/clc-assembly-cell-4.0.1beta-linux_64/clc_ref_assemble_long -o FalsaBroca.fasta.cas -q readsFalsaBroca.fasta -d FalsaBroca.fasta
Progress: 100.0 %
real 7m21.980s
user 13m28.490s
sys 0m3.370s
assembly_info FalsaBroca.fasta.cas > FalsaBroca.fasta.cas.txt
General info:
  Program name        clc_ref_assemble_long
  Program version     4.01beta.59919
  Program parameters  -o FalsaBroca.fasta.cas -q readsFalsaBroca.fasta -d FalsaBroca.fasta

Contig files:
  FalsaBroca.fasta [ 7934 / 5312597 ]

Read files:
  readsFalsaBroca.fasta [ 1223893 / 609633940 ]

Read info:
  Contigs          7934
  Reads            1223893
  Unmapped reads    163312  13.34 %
  Mapped reads     1060581  86.66 %
  Multi hit reads    29558   2.79 %

Coverage info:
  Mapped nucleotides  387047648  63.49 %
  Total sites           5312597
  Average coverage        72.85
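The derived figures in the assembly_info report follow from the raw counts: mapped reads are total reads minus unmapped reads, and the average coverage is the number of mapped nucleotides divided by the total contig sites. A short sketch that recomputes them, with `ratio` as a hypothetical helper and the inputs copied from the report above:

```shell
#!/usr/bin/env bash
reads=1223893; unmapped=163312
mapped_nt=387047648; total_sites=5312597

# ratio A B SCALE: print SCALE * A / B with two decimals.
# (hypothetical helper for illustration)
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" -v scale="$3" \
        'BEGIN { printf "%.2f\n", scale * a / b }'
}

mapped=$((reads - unmapped))
echo "Mapped reads: $mapped"

ratio "$mapped" "$reads" 100         # percentage of reads mapped
ratio "$mapped_nt" "$total_sites" 1  # average coverage
```

Note that the mapping was run against the untrimmed readsFalsaBroca.fasta, so these percentages are relative to all 1223893 raw reads rather than to the cleaned set used for the assembly.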