Corpas Family Exome Data Available For Public Download

Readers may remember the crowdfunding campaign that we ran to collect funds to sequence the genomes of the Corpas family. We are pleased to announce the immediate release of our personal exomes (the coding portions of our genes), currently under a CC-BY license, chosen for reasons of license compatibility. You have permission to use these data in any way you wish, as long as you attribute them to the Corpas family.

Where is it available?

We have decided to make the data available through figshare because it makes the data immediately citable by providing a DOI. Here is where the trio data can be downloaded:

Please note that the above data include only the latest sequencing data from our family: exome data from mother, father and daughter. Previously released data from the son’s exome are here:

Why do we release our personal exomes?

When my family and I made our genotypes available on the Internet, we immediately received results from researchers around the world who took our data for analysis and came back with interesting insights. As a result, we have been able to learn much about ourselves. I have reported this in a previous entry on this blog entitled “Benefits for Publishing Family Genomes on the Internet”. We now follow the same principle: if we make our exomes available for people to analyse, we can expect that some researchers may come back with interesting results.

What new data do we actually release?

FASTQ files for whole exome sequencing from the Corpas family: mother, father, daughter. The data come from 3 saliva samples. Exome capture was performed using Agilent SureSelect Human All Exon 44.

The captured material was sequenced using Illumina’s HiSeq technology.

The data are expected to have 30X effective mean depth per sample, after removal of adaptor contamination and low-quality sequence.
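For anyone who wants to sanity-check the depth figure after downloading, here is a minimal Python sketch. It assumes standard four-line FASTQ records (as typically produced by Illumina pipelines) and handles both plain and gzip-compressed files. The function names and the `target_size_bp` parameter are illustrative and not part of our release; you would need to take the size of the captured region from the capture kit's documentation. Note that this gives raw depth only (total sequenced bases divided by target size), which is an upper bound: the effective depth is lower once off-target and duplicate reads are discounted.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz compression."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_bases(path):
    """Return (number of reads, total bases) for a 4-line-per-record FASTQ file."""
    reads = 0
    bases = 0
    with open_fastq(path) as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every 4-line record.
            if line_number % 4 == 1:
                reads += 1
                bases += len(line.strip())
    return reads, bases


def mean_depth(total_bases, target_size_bp):
    """Raw mean depth: sequenced bases divided by the size of the target region.

    target_size_bp is an assumption supplied by the caller (hypothetical
    parameter); look it up for the capture kit that was used.
    """
    return total_bases / target_size_bp


if __name__ == "__main__":
    import sys

    # Usage: python depth.py <file.fastq[.gz]> <target size in base pairs>
    fastq_path = sys.argv[1]
    target_size = int(sys.argv[2])
    n_reads, n_bases = count_bases(fastq_path)
    print(f"reads: {n_reads}")
    print(f"bases: {n_bases}")
    print(f"raw mean depth: {mean_depth(n_bases, target_size):.1f}X")
```

For paired-end data, run it on both files of a sample and add the base counts before dividing by the target size.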

What do we ask in return?

We appeal to the goodwill of potential users to report back to us anything interesting they might find.

How big are the files?

They are huge. On average each file is about 1 Gb, and there are 6 of them, so each file can take several hours to download. Please be patient!

Where can I get them?


The top link is for mother, father and daughter. The bottom link is for son.

How did we get our personal exome sequenced?

Completely independently. If you want to know the story of how I did it myself, please refer to my blog entries “Getting My Genome Sequencing Done” Part I and Part II. As implied there, we managed to get my personal genome sequenced by knocking on quite a few doors until we found someone who would sponsor us. In fact, part of the aim of this exercise was to prove that it is now possible for ordinary citizens to get their genomes sequenced if they so wish. We now go a step further by publishing our whole exomes on the Internet.
