Converting FASTQ to FASTA

May 21, 2012 · Genomics, Tutorials

A little Perl one-liner I borrowed from The Edwards Lab that converts FASTQ to FASTA. Please note I had to split the command across several lines to make it display properly in this blog entry.
$ cat file_to_convert.fq | perl -e \
'$i=0;while(<>){if(/^\@/&&$i==0){s/^\@/\>/;print;}elsif($i==1){print;$i=-3}$i++;}' \
> output.fasta
Thanks Edwards Lab!

After I wrote the above post, a reader rightly pointed out that the one-liner is incorrect (see the post comments below). I solved the problem of converting FASTQ to FASTA with the following BioPerl script, which seems to work fine:

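#!/usr/bin/env perl
# Convert a FASTQ file (first argument) to a FASTA file (second argument).
use strict;
use warnings;
use Bio::SeqIO;

my ($file1, $file2) = @ARGV;
die "Usage: $0 input.fq output.fasta\n" unless defined $file2;

# Bio::SeqIO parses the FASTQ records; the leading '>' opens $file2 for writing.
my $seqin  = Bio::SeqIO->new(-format => 'fastq', -file => $file1);
my $seqout = Bio::SeqIO->new(-format => 'fasta', -file => ">$file2");

while (my $seq_obj = $seqin->next_seq) {
    $seqout->write_seq($seq_obj);
}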
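
For readers without BioPerl installed, the issue raised in the comments (quality lines may start with "@", and sequence or quality strings may be wrapped) can also be handled in plain Perl. The following is only a minimal sketch, not what Bio::SeqIO does internally: the function name next_fastq_record is hypothetical, input is assumed to arrive on STDIN, and the sketch assumes each record's quality string has exactly as many characters as its sequence. It reads sequence lines up to the "+" separator, then reads quality lines until their combined length matches the sequence length, so it never has to guess whether a leading "@" starts a new record.

```perl
#!/usr/bin/env perl
# Sketch: FASTQ-to-FASTA in plain Perl, tolerant of wrapped records.
use strict;
use warnings;

# Read one FASTQ record from a filehandle.
# Returns (header_without_at_sign, sequence), or an empty list at end of input.
# NOTE: next_fastq_record is a hypothetical helper name, not a library function.
sub next_fastq_record {
    my ($fh) = @_;

    # Skip blank lines; the next non-blank line must be a header.
    my $header;
    while (defined($header = <$fh>)) {
        $header =~ s/\r?\n\z//;
        last if length($header);
    }
    return unless defined $header;
    die "Expected '\@' header, got: $header\n" unless $header =~ s/^\@//;

    # Sequence lines run until the '+' separator line.
    my $seq      = '';
    my $saw_plus = 0;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\r?\n\z//;
        if ($line =~ /^\+/) { $saw_plus = 1; last }
        $seq .= $line;
    }
    die "Truncated record: $header\n" unless $saw_plus;

    # Quality lines run until we have one character per base, so a
    # quality line that starts with '@' is never mistaken for a header.
    my $qual_len = 0;
    while ($qual_len < length($seq)) {
        my $line = <$fh>;
        die "Truncated quality for: $header\n" unless defined $line;
        $line =~ s/\r?\n\z//;
        $qual_len += length($line);
    }
    die "Quality/sequence length mismatch for: $header\n"
        if $qual_len != length($seq);

    return ($header, $seq);
}

# Run as a script: FASTQ on STDIN, FASTA on STDOUT.
# The 'unless (caller)' guard skips this when the file is loaded by other code.
unless (caller) {
    while (my ($id, $seq) = next_fastq_record(\*STDIN)) {
        print ">$id\n$seq\n";
    }
}
```

Saved under any file name, it would be run as: perl that_file.pl < input.fq > output.fasta. Each sequence is written on a single line, whatever the wrapping in the input.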


4 responses

kbradnam

June 8, 2012 at 5:44 pm

An awk solution:

cat file.fq | awk '{print ">" substr($0,2); getline; print; getline; getline}' > file.fa

1. gasstationwithoutpumps
  
  June 9, 2012 at 9:11 pm
  
  Also buggy. There is no guarantee that FASTQ files have a simple alternation of lines. Line wrap can occur (and is allowed). The FASTQ format is rather a crummy design. As it says on the Wikipedia FASTQ page:
  “The original Sanger FASTQ files also allowed the sequence and quality strings to be wrapped (split over multiple lines), but this is generally discouraged as it can make parsing complicated due to the unfortunate choice of “@” and “+” as markers (these characters can also occur in the quality string).”
  
gasstationwithoutpumps

May 22, 2012 at 3:35 am

Sorry, this code is buggy. It is possible for a FASTQ quality line to begin with an at-sign, which completely messes up this simplistic approach.

1. admin
  
  May 22, 2012 at 12:39 pm
  
  Thanks for your comment
  

Welcome to Manuel Corpas homepage.
