View Process Hh 07
Extracting Reads from Sff
/data/process/Broca/PipeLineTranscripFalsaBroca# /opt/scripts/sff_extract.pl -c --min_left_clip=4 /data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD0* -o TotalFalsaBroca
********************************************************************************
WARNING: weird sequences in file /data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD01.RL6_clean.sff
After applying left clips, 332115 sequences (=57%) start with these bases:
G
This does not look sane.
Countermeasures you probably must take:
1) Make your sequence provider aware of that problem and ask whether this can be corrected in the SFF.
2) If you decide that this is not normal and your sequence provider does not react, use the --min_left_clip of sff_extract. (Probably '--min_left_clip=17' but you should cross-check that)
********************************************************************************
Working on '/data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.sff':
Converting '/data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.sff' ... done.
Converted 637967 reads into 637967 sequences.
********************************************************************************
WARNING: weird sequences in file /data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.sff
After applying left clips, 364804 sequences (=57%) start with these bases:
G
This does not look sane.
Countermeasures you probably must take:
1) Make your sequence provider aware of that problem and ask whether this can be corrected in the SFF.
2) If you decide that this is not normal and your sequence provider does not react, use the --min_left_clip of sff_extract. (Probably '--min_left_clip=17' but you should cross-check that)
********************************************************************************
/opt/scripts/sff_extract.pl --min_left_clip=17 /data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD0* -o TotalFalsaBroca
Working on '/data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD01.RL6_clean.sff':
Converting '/data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD01.RL6_clean.sff' ... done.
Converted 585926 reads into 585926 sequences.
Working on '/data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.sff':
Converting '/data/process/Broca/RawDataBroca/TranscriptomaFalsaBroca/G4GNOKD02.RL6_clean.sff' ... done.
Converted 637967 reads into 637967 sequences.
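The warning above asks for the suggested clip value to be cross-checked. One possible way to do that is to tabulate the first base of every record in an extracted FASTA file and see whether one base still dominates. The following is only a sketch: first_base_counts and the demo reads are made up for illustration, and it looks at the first base exactly as written in the file, so it is only meaningful on output where the clips were actually applied (for example, extracted with -c as in the first run).

```shell
# Hypothetical helper (not part of sff_extract): tabulate the first base
# of every record in a FASTA file, most frequent base first.
first_base_counts() {
    awk '
        /^>/ { want = 1; next }          # header: next non-empty line starts the read
        want && length($0) > 0 {
            counts[toupper(substr($0, 1, 1))]++   # case-insensitive tally
            total++
            want = 0                     # ignore continuation lines of the same read
        }
        END {
            for (b in counts)
                printf "%s\t%d\t%.1f%%\n", b, counts[b], 100 * counts[b] / total
        }
    ' "$1" | sort -k2,2nr
}

# Tiny made-up FASTA so the sketch runs without the real data.
demo_fasta=$(mktemp)
cat > "$demo_fasta" <<'EOF'
>demo_read_1
GACGT
ACGT
>demo_read_2
GTTAA
>demo_read_3
ACCGT
>demo_read_4
gCATG
EOF

first_base_summary=$(first_base_counts "$demo_fasta")
printf '%s\n' "$first_base_summary"
rm -f "$demo_fasta"
```

On the real data the call would presumably be first_base_counts TotalFalsaBroca.fasta, run once per candidate clip value and compared.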
grep 'Singleton' ../Transcriptoma/Ensamblajes/EnsambajeTransFalsaBroca/EnsTransFalsaBroca/assembly/454ReadStatus.txt | sed 's/Singleton//' > IdsSingleton
cdbfasta TotalFalsaBroca.fasta
cdbyank TotalFalsaBroca.fasta.cidx < IdsSingleton > SingletonsTotalFalsaBroca.fasta
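The grep/sed step above keeps whatever whitespace separated the read ID from the word Singleton. A slightly stricter variant, sketched here under the assumption that 454ReadStatus.txt holds the read accession in column 1 and the status in column 2, prints only the ID column. The helper name singleton_ids and the demo rows are hypothetical.

```shell
# Hypothetical helper: print the accession of every read whose status
# (assumed to be column 2) is exactly "Singleton".
singleton_ids() {
    awk '$2 == "Singleton" { print $1 }' "$1"
}

# Made-up status table standing in for 454ReadStatus.txt.
demo_status=$(mktemp)
printf '%s\t%s\n' \
    'Accno'       'Read Status' \
    'demo_read_1' 'Assembled' \
    'demo_read_2' 'Singleton' \
    'demo_read_3' 'Repeat' \
    'demo_read_4' 'Singleton' > "$demo_status"

singleton_id_list=$(singleton_ids "$demo_status")
printf '%s\n' "$singleton_id_list"
rm -f "$demo_status"
```

The resulting list has one bare ID per line, which is the form the cdbyank step reads from standard input.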
cat ../Transcriptoma/Ensamblajes/EnsambajeTransFalsaBroca/EnsTransFalsaBroca/assembly/454AllContigs.fna SingletonsTotalFalsaBroca.fasta > TRANSCRIPTOMAFALSABROCA.fasta
/opt/RepeatModeler/BuildDatabase -name TRANSCRIPTOMAFALSABROCA_RMOD.fasta TRANSCRIPTOMAFALSABROCA.fasta
Building database TRANSCRIPTOMAFALSABROCA_RMOD.fasta:
Adding TRANSCRIPTOMAFALSABROCA.fasta to database
Number of sequences (bp) added to database: 213621 ( 96608318 bp )
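The sequence count reported by BuildDatabase can be compared with a direct count of FASTA headers, and the combined file should hold as many records as the contigs and singletons files together. A minimal sketch of that check follows; count_fasta_records and the demo files are made up for illustration.

```shell
# Hypothetical helper: number of FASTA records (header lines) in a file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Made-up contigs and singletons standing in for the real files.
demo_dir=$(mktemp -d)
printf '>contig1\nACGT\n>contig2\nGGCC\n' > "$demo_dir/contigs.fna"
printf '>single1\nTTAA\n' > "$demo_dir/singletons.fasta"
cat "$demo_dir/contigs.fna" "$demo_dir/singletons.fasta" > "$demo_dir/combined.fasta"

n_contigs=$(count_fasta_records "$demo_dir/contigs.fna")
n_singletons=$(count_fasta_records "$demo_dir/singletons.fasta")
n_combined=$(count_fasta_records "$demo_dir/combined.fasta")
echo "contigs=$n_contigs singletons=$n_singletons combined=$n_combined"
rm -rf "$demo_dir"
```

With the real files, the same helper would be pointed at 454AllContigs.fna, SingletonsTotalFalsaBroca.fasta and TRANSCRIPTOMAFALSABROCA.fasta.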
time /opt/RepeatModeler/RepeatModeler -database TRANSCRIPTOMAFALSABROCA_RMOD.fasta > run.out