Benefits of Publishing Family Genomes on the Internet
June 6, 2011
I have wanted for a long while to write about the work that Mike Cariaso, founder of SNPedia, has been doing with my family’s genotypes. Initially, he analyzed the data with Promethease, which assigns traits and annotations to the observed SNPs. More recently, he has also developed a tool for visualizing and comparing genotypes between different people, using my family’s and Manu Sporny’s genotypes as test cases.
This is an unanticipated benefit we have experienced as a family from publishing our genomes on the Internet. Using Promethease’s report, we were able to learn that dad is lactose intolerant. The fact that he did not like milk and had not drunk it in years made sense when we discovered that his genotypes at two SNPs, rs4988235(C;C) and rs182549(C;C), make him unlikely to be able to digest lactose (70% probability). This result regarding lactose intolerance was in fact in the 23andMe report, but we missed it.
It is clear that direct-to-consumer genetics companies try to cater to the non-expert, i.e. the majority of their customer base. The new SNPedia visualization tool will be a useful addition for those of us who strive to make our own DIY discoveries from our personal genome data.
Using his visualization tool to compare all my SNPs with my sister’s, I find that 68% of mine are identical to hers, a total of 389,250 (see below).
Note that the graph uses a logarithmic scale. Of all our analyzed SNPs, 25% are halfmatches (i.e. one of the alleles is common to both of us) and about 2% are conflicts. Examples of conflicts may include different SNPs at the same position. This, according to Mike, may not be an accident. Because I know that we were analyzed on two different array platforms, version 2 and version 3 respectively, I can now tell the number of SNPs that are different between us, i.e. not present in both genotypes. Of the more than 0.5 million SNPs in my genome, 29,082 do not match hers.
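The four categories can be sketched in a few lines of Python. This is only my reading of the definitions given here, not Mike’s actual implementation: the function names, the use of two-letter genotype strings, and the use of None for a SNP that is missing from one array are all assumptions made for illustration.

```python
from collections import Counter

def classify(mine, hers):
    """Classify one SNP given two genotype strings such as "AG" (or None if absent)."""
    if mine is None or hers is None:
        return "different"   # SNP present on only one of the two arrays
    if sorted(mine) == sorted(hers):
        return "match"       # same two alleles, order ignored ("AG" == "GA")
    if set(mine) & set(hers):
        return "halfmatch"   # at least one allele in common
    return "conflict"        # no allele in common

def compare(a, b):
    """Count categories over two {rsid: genotype} dictionaries."""
    counts = Counter()
    for rsid in a.keys() | b.keys():
        counts[classify(a.get(rsid), b.get(rsid))] += 1
    return counts

def shares(counts):
    """Convert category counts into percentages of all compared SNPs."""
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}
```

As a rough consistency check on the figures quoted above: if 389,250 matches are 68% of the compared SNPs, the total is roughly 572,000, and 29,082 is then about 5% of it, so the four categories (68% + 25% + 2% + 5%) add up to about 100%, allowing for rounding.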
The other nice feature of this tool is a graphical representation of chromosomal SNPs as a map of pixels, colored consistently with the categories above: light blue means match, dark blue halfmatch, red conflict, and grey different SNPs:
The above figure shows two representations of the chromosome-to-chromosome comparison between my chromosome 1 and my sister’s. Clearly most of the area is light blue, indicating a complete match. The numbers of differences, halfmatches, and conflicts are also reported. Clicking on any of these links lists the actual SNPs involved, for example the SNPs in conflict, with output that looks like this:
#   SNP         Chr  Position  Me  Sister
1   rs9729550   1    1125105   CC  AA
2   rs12142199  1    1239050   GG  AA
3   rs7531583   1    1696020   GG  AA
4   rs6681938   1    1771080   CC  TT
5   rs41307846  1    1949559   GG  --
6   rs3128296   1    2058766   TT  GG
7   rs262654    1    2079386   AA  GG
8   rs262688    1    2103425   GG  TT
9   rs6659405   1    2362949   TT  GG
10  rs4648482   1    2739781   CC  TT
11  rs2483266   1    3225901   CC  TT
12  rs868688    1    3290667   TT  CC
13  rs10492939  1    3292731   AA  GG
14  rs2493268   1    3298358   TT  CC
15  rs871822    1    3302774   GG  TT
16  rs12024847  1    3310659   TT  CC
17  rs2821017   1    3510731   GG  AA
18  rs3765761   1    3620336   CC  TT
19  rs3765766   1    3624520   TT  CC
20  rs4233262   1    4136842   CC  TT
21  rs966321    1    4215064   GG  TT
22  rs964715    1    4216644   TT  CC
23  rs1390136   1    4241703   CC  TT
24  rs4654545   1    4425464   TT  CC
25  rs446529    1    4695274   CC  TT
This table shows that for the first SNP, rs9729550, I have CC while my sister has AA.
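For anyone who wants to work with such a listing outside the browser, here is a minimal Python sketch that parses rows like the ones above. The column meanings (row number, SNP ID, chromosome, position, my genotype, my sister’s genotype) are my interpretation of the output, and the names ConflictRow and parse_conflicts are hypothetical, not part of Mike’s tool.

```python
from typing import List, NamedTuple

class ConflictRow(NamedTuple):
    index: int
    rsid: str
    chromosome: str
    position: int
    mine: str   # my genotype, e.g. "CC"
    hers: str   # my sister's genotype, e.g. "AA" ("--" appears in one row)

def parse_conflicts(text: str) -> List[ConflictRow]:
    """Parse whitespace-separated conflict rows, skipping anything malformed."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Expect exactly six columns; skip header or partial lines.
        if len(fields) != 6 or not fields[0].isdigit() or not fields[3].isdigit():
            continue
        idx, rsid, chrom, pos, mine, hers = fields
        rows.append(ConflictRow(int(idx), rsid, chrom, int(pos), mine, hers))
    return rows

# A few rows copied from the listing above.
SAMPLE = """
1 rs9729550 1 1125105 CC AA
2 rs12142199 1 1239050 GG AA
5 rs41307846 1 1949559 GG --
"""

rows = parse_conflicts(SAMPLE)
for row in rows:
    print(row.rsid, row.position, row.mine, row.hers)
```

Once the rows are in this form it is easy to sort them by position or to look up individual SNP IDs in SNPedia.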
In conclusion, Promethease and the SNPedia visualization tool are helping me learn more about my SNP genotype results, complementing the information that I initially got from my direct-to-consumer provider. Hopefully I will be able to do some additional research based on these results.
If you want to see my family’s genomes with Mike Cariaso’s tool, you can find them here:
Don’t forget to send me any exciting findings that you might encounter!