Rust Illumina Assembly of Different Strains
Quality Trim
/data/process/Roya/Ensamblajes/illuminaAssembly$ time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/Hv387_1.fastq ../../RawDataRoya/Illumina/Originales/Hv387_2.fastq -p Hv387_trim.fasta
time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/Hv494_1.fastq ../../RawDataRoya/Illumina/Originales/Hv494_2.fastq -p Hv494.fasta &
time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/HvCat_1.fastq ../../RawDataRoya/Illumina/Originales/HvCat_2.fastq -p HvCat.fasta &
time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/HvDQ952_1.fastq ../../RawDataRoya/Illumina/Originales/HvDQ952_2.fastq -p HvDQ952.fasta &
time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/HvH_179_1.fastq ../../RawDataRoya/Illumina/Originales/HvH_179_2.fastq -p HvH_179.fasta &
time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/ ../../RawDataRoya/Illumina/Originales/HvH_569_2.fastq -p HvH_569.fasta &
time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/HvH_701_1.fastq ../../RawDataRoya/Illumina/Originales/HvH_701_2.fastq -p HvH_701.fasta &
time quality_trim -r -i ../../RawDataRoya/Illumina/Originales/HvMar_1.fastq ../../RawDataRoya/Illumina/Originales/HvMar_2.fastq -p HvMar.fasta &
Removing Duplicates
remove_duplicates -p -r Hv387_trim.fasta -o Hv387_trim_NoDupicates.fasta
remove_duplicates -p -r Hv494_trim.fasta -o Hv494_trim_NoDupicates.fasta &
remove_duplicates -p -r HvCat_trim.fasta -o HvCat_trim_NoDupicates.fasta &
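The same command can be generated for every strain with a small loop. The sketch below only prints the commands (a dry run) so they can be checked before anything is launched. The helper name dedup_command is hypothetical, and it assumes every trimmed file is named <strain>_trim.fasta, which the notes above confirm only for the three strains listed.

```shell
# Dry run: print one remove_duplicates command per strain.
# dedup_command is a hypothetical helper, not part of any toolkit.
# Assumption: trimmed reads are in <strain>_trim.fasta.
dedup_command() {
    echo "remove_duplicates -p -r ${1}_trim.fasta -o ${1}_trim_NoDupicates.fasta"
}

for strain in Hv387 Hv494 HvCat HvDQ952 HvH_179 HvH_569 HvH_701 HvMar; do
    dedup_command "$strain"
done
```

Once the printed commands look right, the output can be piped to sh to run them one after another.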
Estimated genome size = 110 Mb = 110000000 bp
Velvet Assembly
sequence_info Hv387_trim_NoDupicates.fasta
File Hv387_trim_NoDupicates.fasta
  Number of sequences 50782526
  Residue counts:
    Total 5405317372
  Sequence lengths:
    Minimum 55
    Maximum 110
    Average 106.44
exp_cov = (106 * 50782526) / 110000000 ≈ 49
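The coverage arithmetic can be reproduced on the command line. This is a minimal sketch using awk: the three input values come from the sequence_info output and the genome size estimate, while the helper name expected_coverage is hypothetical and not part of Velvet.

```shell
# Values from the notes: average read length (truncated to whole bases),
# number of reads after duplicate removal, estimated genome size in bp.
avg_read_len=106
num_reads=50782526
genome_size=110000000

# expected_coverage is a hypothetical helper:
# (read length * number of reads) / genome size, rounded to an integer.
expected_coverage() {
    awk -v len="$1" -v n="$2" -v g="$3" 'BEGIN { printf "%.0f\n", (len * n) / g }'
}

exp_cov=$(expected_coverage "$avg_read_len" "$num_reads" "$genome_size")
echo "exp_cov=$exp_cov"   # prints exp_cov=49
```

The result is the value passed to velvetg through -exp_cov.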
time /opt/velvet_1.1.05/velveth HvH_179VelvetAsmK33 33 -fasta -shortPaired HvH_179_trim_NoDupicates.fasta
real 78m37.828s
user 74m26.870s
sys 1m15.360s
/opt/velvet_1.1.06/velvetg HvH_179VelvetAsmK33 -exp_cov 49 -ins_length 400 -cov_cutoff auto -shortMatePaired yes -amos_file yes
/opt/scripts/fastasizefilter.pl HvH_179VelvetAsmK33/contigs.fa 200
sequence_info -n -r HvH_179VelvetAsmK33/contigs.fa.200
File HvH_179VelvetAsmK33/contigs.fa.200
  Number of sequences 166629
  Residue counts:
    Number of A's 40142129 29.87 %
    Number of C's 19473978 14.49 %
    Number of G's 19450796 14.48 %
    Number of T's 40096195 29.84 %
    Number of N's 15204657 11.32 %
    Total 134367755
  Sequence lengths:
    Minimum 200
    Maximum 12986
    Average 806.39
    N50 1073
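The N50 reported above can be cross-checked independently. The sketch below uses the common definition (the length of the contig at which the running sum of lengths, sorted from longest to shortest, first reaches half of the total). Both helper names, fasta_lengths and n50, are hypothetical, and sequence_info may use a slightly different definition, so small differences are possible.

```shell
# fasta_lengths (hypothetical helper): print the length of each record
# in a FASTA file, one per line, summing wrapped sequence lines.
fasta_lengths() {
    awk '/^>/ { if (seen) print n; n = 0; seen = 1; next }
         { n += length($0) }
         END { if (seen) print n }' "$1"
}

# n50 (hypothetical helper): read one length per line on stdin and print
# the length at which the cumulative sum reaches half of the total.
n50() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                running += len[i]
                if (running >= half) { print len[i]; exit }
            }
        }'
}

# Toy example: lengths 6, 5, 4, 3, 2 sum to 20; 6 + 5 = 11 >= 10, so N50 is 5.
printf '%s\n' 2 3 4 5 6 | n50
```

On the filtered assembly this would be run as: fasta_lengths HvH_179VelvetAsmK33/contigs.fa.200 | n50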
CLC Assembly
time clc_novo_assemble -q -p fb ss 100 400 Hv387_trim_NoDuplicatesTag.fasta -o Hv387CLCTag.fasta
Progress: 100.0 %
real 628m53.249s
user 956m36.390s
sys 1m14.670s
File | Hv387 | Hv494 | HvCat | HvDQ952 | HvH_179 | HvH_569 | HvH_701 | HvMar |
Minimum | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
Maximum | 45464 | 45679 | 31396 | 46233 | 59893 | 31087 | 46102 | 45663 |
Average | 712.58 | 653.25 | 622.21 | 611.94 | 645.70 | 659.64 | 734.10 | 618.68 |
N50 | 1072 | 921 | 847 | 823 | 888 | 928 | 1131 | 832 |
Contigs | 211495 | 211700 | 197394 | 197927 | 203770 | 202168 | 215628 | 203360 |