Manuel Corpas' Blog

Genomes, Web 2.0 and Bioethics

My Personal Exome Now Publicly Released

Five months after having my personal exome sequenced, I am now making it available to the community for public use. I release it under a CC BY-SA 3.0 license, which gives you permission to use this data in any way, as long as you attribute the source and share any derived work under a similar license.

What is an exome?

An exome is the ~1% of a genome that codes for proteins.

Why do I release my personal exome?

When my family and I made our genotypes available on the Internet, researchers around the world quickly took our data for analysis and came back with interesting findings. As a result, we have been able to learn much about ourselves. I reported this in a previous entry on this blog entitled “Benefits for Publishing Family Genomes on the Internet”. I now follow the same principle: if I make my exome available for people to analyse, I can expect that some researchers may come back with interesting results.

What data do I actually release?

I release the 4 FastQ files that were given to me by my sequencing provider. This is the same kind of information that 23andMe gives in their current exome analysis offer. The files consist of raw reads that need to be aligned to a reference assembly. Once the reads are aligned, variation data can be inferred from them.
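For readers who have not opened a FastQ file before, here is a minimal Python sketch of how the raw reads can be iterated over before any alignment is done. It assumes the common four-lines-per-record layout (header, sequence, separator, quality); the function names and the tiny example reads are made up for illustration and are not taken from my files.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FastQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records or at the end
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FastQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        yield header[1:].strip(), sequence, quality


def summarize(handle: TextIO) -> Tuple[int, int]:
    """Return (number of reads, total number of bases) for a FastQ stream."""
    reads = 0
    bases = 0
    for _, sequence, _ in read_fastq(handle):
        reads += 1
        bases += len(sequence)
    return reads, bases


def open_fastq(path: str) -> TextIO:
    """Open a FastQ file as text, transparently handling .gz compression."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


# Two invented reads, only to show the record layout.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCCA\n+\nIIIII\n")
print(summarize(example))
```

On a downloaded file you would call `summarize(open_fastq(path))` with your own local path. Counting reads and bases is only a sanity check; alignment and variant calling need dedicated tools.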

What do I ask in return?

Nothing. I do appeal, though, to the goodwill of potential users to report back to me anything interesting they find.

How big are the files?

They are huge. On average they are about 0.6 GB each, and there are 4 of them. Depending on your connection, each file can take several hours to download. Be patient!
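To get a rough feel for the waiting time, here is a small Python sketch of the arithmetic. It assumes the 0.6 figure means gigabytes and that 1 GB is 1000 megabytes; the connection speeds are illustrative guesses, not measurements of the actual download server.

```python
def download_hours(size_gb: float, speed_mbit_per_s: float) -> float:
    """Estimate download time in hours for a file of size_gb gigabytes."""
    size_megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    return size_megabits / speed_mbit_per_s / 3600


FILE_SIZE_GB = 0.6  # approximate size per file, from the post
N_FILES = 4

# Hypothetical sustained speeds in Mbit/s
for speed in (0.5, 1.0, 8.0):
    per_file = download_hours(FILE_SIZE_GB, speed)
    print(f"{speed:>4} Mbit/s: {per_file:.2f} h per file, "
          f"{per_file * N_FILES:.2f} h for all {N_FILES}")
```

At a sustained 1 Mbit/s this works out to roughly an hour and 20 minutes per file, so "several hours" applies to slower or shared connections.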

Where can I get them?

Here:

  1. File 1
  2. File 2
  3. File 3
  4. File 4

How did I get my personal exome sequenced?

Completely independently. If you want to know the story of how I did it, please refer to my blog entries “Getting My Genome Sequencing Done” Part I and Part II. As implied there, I managed to get my personal genome sequenced by knocking on quite a few doors until I found someone who would sponsor me. In fact, part of the aim of this exercise was to prove that it is now possible for ordinary citizens to get their genomes sequenced if they so wish.

Filed under: Genomics, Personal, Personal Genomes

10 Responses

  1. [...] Exome Analysis (Part I): First Findings You may have read in a previous entry about the release of my raw personal exome data. Although users were not required to report back any finding derived from this data, my hope was [...]

  2. Dan Gaston says:

    Thanks Daniel for the info, I’ll definitely pass it along and keep that in mind as our project moves forward.

    • admin says:

I chose the CC BY-SA license mainly to be consistent with my blog’s license. Is this license a problem for you?

    • Pepetideo says:

It is not a problem… I am asking because there is a discussion about your blog post on Google Plus, and one person commented that placing it under CC3.0 SA would make the data more difficult to integrate into already existing public databases, because it requires that the data be provided under the same license you selected, so it would be better to use the least restrictive license. I do not know enough on the subject to have an opinion on that :)

  3. Dan Gaston says:

Out of curiosity, what was coverage like? I’m part of a disease genomics group and exome sequencing is part of our overall pipeline. One problem we sometimes run into is that the exon capture kits can result in poor coverage of some exons, or no coverage at all.

    • admin says:

      Hi Dan,
      thanks for your comment. I’ve been told by the provider that the coverage for this exome is at least 30x.

    • Daniel Swan says:

      Dan, I can’t quote exact coverage figures because I don’t know what capture kit was used, but if we assume an Agilent SureSelect 50Mb kit, then mean target coverage is >37x for Manuel according to my analysis of the data. I think there are always going to be issues with biased capture. It’s easier to get this right for more focused targeted re-sequencing where the baits can be designed for more even coverage. We find most people ask for 50x mean target coverage for rare-disease studies.

    • Dan Gaston says:

Thanks for the info, good to know. It has turned out oddly with two or three of our projects that the most likely causal variant was found in some random exon somewhere that had just not been captured at all, so it is something we now pay attention to when we get our data back.

    • Daniel Swan says:

Dan, this is one reason we don’t include depth filters in our pipeline. No coverage is one thing, but I’m quite wary of throwing away low coverage variants, especially if the read quality and mapping quality are good. For trio analysis where we need to be confident about genotype calls we do filter by depth by proxy. We recommend that people start with variants of 20x depth or higher, but if nothing is found then we don’t discourage exploration of the lower coverage data for potential causal variants. It just implies that a little more confirmation is required, but following up a tranche of low-coverage variants of interest with other genotyping methods isn’t a terrible burden.

Disclaimer

Any views expressed here are the author's alone and do not necessarily form part of the official positions of his employer.
Creative Commons License
Unless otherwise stated, this work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
