A little Perl one-liner I borrowed from The Edwards Lab that converts FASTQ to FASTA. Please note I had to split the command across several lines to make it show properly in this blog entry.
$ cat file_to_convert.fq | perl -e \
'$i=0;while(<>){if(/^\@/&&$i==0){s/^\@/\>/;print;}elsif($i==1){print;$i=-3}$i++;}' \
> output.fasta

Thanks Edwards Lab!
Update: after I wrote the above post, a reader rightly pointed out that the one-liner is incorrect (see the post comments below). I solved the problem of converting FASTQ to FASTA with the following BioPerl script, which seems to work fine:
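use strict;
use warnings;
use Bio::SeqIO;

# Usage: perl script.pl input.fq output.fasta
my ($file1, $file2) = @ARGV;

my $seqin  = Bio::SeqIO->new(-format => 'fastq', -file => $file1);
my $seqout = Bio::SeqIO->new(-format => 'fasta', -file => ">$file2");

# Copy each record across; Bio::SeqIO does the FASTQ parsing.
while (my $seq_obj = $seqin->next_seq) {
    $seqout->write_seq($seq_obj);
}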
kbradnam
An awk solution:
cat file.fq | awk '{print ">" substr($0,2); getline; print; getline; getline}' > file.fa
gasstationwithoutpumps
Also buggy. There is no guarantee that FASTQ files have a simple alternation of lines. Line wrap can occur (and is allowed). The FASTQ format is rather a crummy design. As it says on the Wikipedia FASTQ page:
“The original Sanger FASTQ files also allowed the sequence and quality strings to be wrapped (split over multiple lines), but this is generally discouraged as it can make parsing complicated due to the unfortunate choice of “@” and “+” as markers (these characters can also occur in the quality string).”
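For anyone who wants to avoid the line-counting trap without pulling in BioPerl, here is a minimal sketch of a wrap-tolerant converter in plain Perl. It assumes that sequence lines never begin with "+" and that the quality string has exactly as many characters as the sequence, so it reads quality lines by length instead of looking for the next "@". The name fastq_to_fasta is a made-up helper, not part of any library, and this has not been tested against real sequencer output.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTQ from $in, write FASTA to $out.
# Tolerates wrapped sequence/quality lines and quality lines starting with '@'.
sub fastq_to_fasta {
    my ($in, $out) = @_;
    my $line = <$in>;
    while (defined $line) {
        $line =~ s/\r?\n\z//;
        if ($line eq '') { $line = <$in>; next; }    # skip blank lines
        $line =~ /^\@(.*)/ or die "Expected '\@' header, got: $line\n";
        my $header = $1;

        # Sequence lines run until the '+' separator line.
        my $seq = '';
        while (defined($line = <$in>)) {
            $line =~ s/\r?\n\z//;
            last if $line =~ /^\+/;
            $seq .= $line;
        }
        die "Truncated record: $header\n" unless defined $line;

        # Quality lines run until they cover the whole sequence.
        my $qual_len = 0;
        while ($qual_len < length $seq) {
            $line = <$in>;
            die "Truncated quality: $header\n" unless defined $line;
            $line =~ s/\r?\n\z//;
            $qual_len += length $line;
        }
        die "Quality longer than sequence: $header\n"
            if $qual_len > length $seq;

        print {$out} ">$header\n$seq\n";
        $line = <$in>;
    }
}

# Run as a script: perl script.pl reads.fq > reads.fa
if (!caller && @ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    fastq_to_fasta($in, \*STDOUT);
    close $in;
}
```

Because the end of each record is decided by counting quality characters, an "@" or "+" at the start of a quality line is never mistaken for a marker. The sketch writes each sequence on a single FASTA line and dies on records it cannot make sense of, so nothing is silently mangled.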
gasstationwithoutpumps
Sorry, this code is buggy. It is possible for a FASTQ quality line to begin with an at-sign, which completely messes up this simplistic approach.
admin
Thanks for your comment