How Facebook Helped Me Discover I’m A Red Hair Gene Carrier
April 12, 2013 § 4 Comments
Is it weird for a Spaniard to have red hair? The typical stereotype for a Mediterranean person is brown-skinned, not too tall and with dark hair. I do not seem to fit all those stereotypes very well, except for the dark hair. At least so I thought until I posted this picture of my beautiful family on my Facebook profile:

How is it possible that my children have the red hair gene expressed so dominantly? I have dark hair and so do my parents!
Of the five of us I am the only one without red hair. Seeing this picture really brought it home to me: it was strange that every one of my three children had inherited my wife’s ‘recessive’ red hair!
I did not think much of this until my colleague and Facebook friend Dave Adams, who happens to lead a research group at the Wellcome Trust Sanger Institute, asked me whether I had checked the MC1R gene.
The protein encoded by the MC1R gene is found in melanocytes, the cells that give hair and skin their color. The variants associated with red hair alter the protein’s function, tipping the balance of pigment production in melanocytes from black-brown eumelanin to red-yellow pheomelanin [1].
Dave is well aware of my efforts to crowdsource the analysis of my genome data and that of my blood relatives (parents, siblings, aunts and uncles). Since I have had my exome sequenced, I followed Dave’s suggestion and looked for the amino acid changes he mentioned (R151C, R160W and D294H) in the MC1R gene. Below you can see part of our conversation on Facebook:

Facebook chat showing Dave Adams’s conversation with me about finding the origin of the red hair in my offspring.
I have a VCF file with all the variants in my genome available on figshare for public download. I searched the file for the interval 89978527-89987385 on chromosome 16, where the MC1R gene is located, and found:
16 89986091 rs11547464 G A
This indicates that at position 89986091 there is a single-letter change (SNP rs11547464) that makes my DNA differ from the human reference genome at that position. The reference genome has a G whereas I have an A.
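For readers who want to repeat this kind of lookup, here is a minimal Python sketch of an interval search over VCF text. It is not the exact procedure I used. The function name `variants_in_region`, the `Variant` tuple and the sample lines are made up for illustration, and it assumes an uncompressed VCF whose first five columns are chromosome, position, ID, reference allele and alternate allele.

```python
from typing import Iterable, Iterator, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    rsid: str
    ref: str
    alt: str


def variants_in_region(lines: Iterable[str], chrom: str,
                       start: int, end: int) -> Iterator[Variant]:
    """Yield variants on `chrom` whose position lies in [start, end]."""
    for line in lines:
        # Skip blank lines and VCF header lines (they start with '#').
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        name = fields[0]
        # Accept both "16" and "chr16" naming styles.
        if name.startswith("chr"):
            name = name[3:]
        pos = int(fields[1])
        if name == chrom and start <= pos <= end:
            yield Variant(name, pos, fields[2], fields[3], fields[4])


# Illustrative input: only the rs11547464 line comes from my data;
# the first record is a hypothetical variant outside the interval.
sample = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "16\t100\t.\tC\tT",
    "16\t89986091\trs11547464\tG\tA",
]

hits = list(variants_in_region(sample, "16", 89978527, 89987385))
for v in hits:
    print(v.chrom, v.pos, v.rsid, v.ref, v.alt)
```

With a real file you would pass an open file handle instead of the `sample` list. The interval above is the one quoted for MC1R in this post; the coordinates depend on the reference genome build used for the VCF.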
I also looked at my 23andMe genotype using myKaryoView, which also includes this rs11547464 SNP, and found that my genotype is ‘AG’. Doing some research, I found that the A allele of rs11547464 encodes a missense change in the protein sequence (R142H), so my AG genotype puts me in the ‘carrier’ state for ‘red hair’ [2].
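The step from a two-letter genotype to ‘carrier’ can be spelled out in a few lines of Python. This is only a sketch: the `zygosity` function is a hypothetical helper, and it assumes the genotype and the VCF report alleles on the same strand.

```python
def zygosity(genotype: str, ref: str, alt: str) -> str:
    """Classify a two-letter genotype given the reference and alternate alleles."""
    alleles = list(genotype.upper())
    if len(alleles) != 2 or any(a not in (ref, alt) for a in alleles):
        raise ValueError(f"unexpected genotype {genotype!r} for {ref}>{alt}")
    # Count how many of the two copies carry the alternate allele.
    n_alt = alleles.count(alt)
    return ("homozygous reference", "heterozygous", "homozygous alternate")[n_alt]


# Values from this post: reference G, alternate A, genotype 'AG'.
status = zygosity("AG", ref="G", alt="A")
print(status)
```

One copy of the A allele and one copy of the reference G is what ‘carrier’ means here.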
Further reading on the relation of this SNP to the phenotype showed that this mutation has been reported to be deleterious [3] and that this MC1R variant is “functional” [4].
According to Dave, I am a carrier of this red hair mutation; presumably my wife is homozygous for another variant, which would make my kids compound heterozygous. In other words, my wife probably carries a different variant that also contributes to my children having red hair.
This explains, at least partly, why my offspring’s red hair is so strong, something that in principle should be self-evident from the picture above. There is something satisfying though about being able to confirm the obvious with scientific evidence.
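To make Dave’s hypothesis concrete, the short Python sketch below enumerates the possible allele combinations under simple Mendelian inheritance. It is purely illustrative: "other variant" is a placeholder for my wife’s presumed and still unidentified MC1R variant, and her being homozygous for it is an assumption, not a result.

```python
from collections import Counter
from itertools import product

# Father: one R142H allele and one unaffected copy (from the genotype above).
father = ("R142H", "wild type")
# Mother: ASSUMED homozygous for a different, unidentified variant.
mother = ("other variant", "other variant")


def offspring_genotypes(parent1, parent2):
    """Return the probability of each unordered allele pair in a child."""
    counts = Counter(tuple(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


probs = offspring_genotypes(father, mother)
for genotype, p in probs.items():
    print(genotype, p)
```

Under these assumptions each child has an even chance of inheriting R142H alongside the maternal variant (the compound heterozygous case) or of inheriting the maternal variant alone.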
References
[2] http://www.ianlogan.co.uk/23andme/open/nancy-grossman.htm
[3] http://www.medwelljournals.com/fulltext/?doi=javaa.2011.928.931
iAnn: Scientific Events Should Be Curated Only Once!
April 8, 2013 § Leave a Comment
In this presentation I introduce iAnn, an open source community-driven platform for dissemination of life science events, such as courses, conferences and workshops. This presentation was given at the GMOD workshop in Cambridge on April 6th, 2013.
BioJS Presentation at GMOD Meeting April 2013 in Cambridge
April 5, 2013 § Leave a Comment
This talk was given on the morning of April 5th 2013 at the GMOD meeting, preceding the Biocuration 2013 conference. GMOD is the Generic Model Organism Database project, a collection of open source software tools for creating and managing genome-scale biological databases.
BioJS: Web 2.0 Reusable Components For Visualization of Biological Data
March 4, 2013 § 1 Comment
Despite many Bio* projects being developed (BioPerl, BioJava, etc.), to date no coordinated Bio* community effort has been established for JavaScript. JavaScript is the language of choice for implementation of dynamic and interactive web applications. BioJS provides a catalogue of open source modules in JavaScript for Life Sciences. These modules include many commonly used functionalities, available for developers or scientists to download. This consistency of development promotes reuse of existing components and a genuine one-stop shop for development of bioinformatics web applications. Resource discovery is enabled by BioJS’s registry, which includes all of BioJS’s source code libraries, documentation and guidelines. These are freely available for public use in what we believe is to date the most extensive catalogue of open source JavaScript biological widgets. As more bioinformaticians continue to develop modern web applications, we expect the BioJS community to continue to grow.
The BioJS publication is now out in the Bioinformatics journal.
Corpas Family Exome Data Available For Public Download
January 21, 2013 § Leave a Comment
Readers may remember the Crowdfunding Campaign that we ran to collect funds to sequence the genomes of the Corpas family. We are pleased to announce the immediate release of our personal exomes (the coding portions of our genes), currently under a CC-BY license, chosen for reasons of license compatibility. At this point you have permission to use these data in any way you wish as long as you attribute them to the Corpas family.
Where is it available?
We have decided to make the data available through figshare because it makes the data immediately citable by providing a DOI. So here is where the trio data can be downloaded:
http://dx.doi.org/10.6084/m9.figshare.106340
Please note that the above data only include the latest sequencing data from our family: exome data from mother, father and daughter. Previously released data from the son’s exome are here:
http://dx.doi.org/10.6084/m9.figshare.92584
Why do we release our personal exomes?
When my family and I made our genotypes available on the Internet, we immediately received results from researchers around the world who took our data for analysis and came back with interesting insights. As a result, we have been able to learn much about ourselves. I have reported this in a previous entry on this blog entitled “Benefits for Publishing Family Genomes on the Internet”. We now follow the same principle: if we make our exomes available for people to analyse, we can expect that some researchers may come back with interesting results.
What new data do we actually release?
Fastq files for whole exome sequencing from the Corpas family: mother, father, daughter. The data comes from 3 saliva samples. Exome capture was performed using Agilent SureSelect Human All Exon 44.
The captured material was sequenced using Illumina’s HiSeq technology.
The data are expected to have 30X effective mean depth per sample after removal of adaptor contamination and low-quality sequence.
What do we ask in return?
We do appeal to the good will of potential users to report back to us anything interesting they might find.
How big are the files?
They are huge. On average they are about 1 Gb per file and we have 6 of these. That means that it can take several hours for each file to be downloaded. Please be patient!
Where can I get them?
Here:
http://dx.doi.org/10.6084/m9.figshare.106340
http://dx.doi.org/10.6084/m9.figshare.92584
The top link is for mother, father and daughter. The bottom link is for son.
How did we get our personal exome sequenced?
Completely independently. If you want to know the story of how I did it myself, please refer to my blog entries “Getting My Genome Sequencing Done” Part I and Part II. As implied there, we managed to get my personal genome sequenced by knocking on quite a few doors and then finding someone who would sponsor us to do so. In fact, part of the aim of this exercise was to prove that it is possible nowadays for ordinary citizens to get their genomes sequenced if they so wish. We now go a step further by publishing our whole exomes on the Internet.




