BioJS: Web 2.0 Reusable Components For Visualization of Biological Data

March 4, 2013 § 1 Comment

Despite many Bio* projects having been developed (BioPerl, BioJava, etc.), to date no coordinated Bio* community effort has been established for JavaScript. JavaScript is the language of choice for implementing dynamic and interactive web applications. BioJS provides a catalogue of open source JavaScript modules for the life sciences. These modules include many commonly used functionalities, available for developers or scientists to download. This consistency of development promotes reuse of existing components and offers a genuine one-stop shop for development of bioinformatics web applications. Resource discovery is enabled by BioJS’s registry, which includes all of BioJS’s source code libraries, documentation and guidelines. These are freely available for public use in what we believe is to date the most extensive catalogue of open source JavaScript biological widgets. As more bioinformaticians continue to develop modern web applications, we expect the BioJS community to continue to grow.

The BioJS publication is now out in the Bioinformatics journal.


BMC Update Newsletter Features Two Of Our Most Recent Works

July 19, 2012 § Leave a Comment

This week’s BMC Update, the official newsletter for BioMed Central, has featured two of our recent works: the Crowdfunding Genome Project and the ‘How Not To be a Bioinformatician’ article, published in BMC’s Source Code for Biology and Medicine journal. This is a huge achievement, given that BMC Update reaches a great number of recipients worldwide, in the order of hundreds of thousands.

On behalf of all the people who have been involved in our two projects, I would like to thank BioMed Central for their enthusiastic support. We look forward to sending BMC more novel research results and opinions in the near future.

To access the BMC Update newsletter, follow this link:

http://www.biomedcentral.com/update/20120718-update

 

 

A Genome Blogger Manifesto

September 28, 2011 § 9 Comments

Have you ever wondered why some people have no reservations about sharing their genetic profiles? Why do they openly talk about something supposedly so private? I believe that no contradiction exists between wanting to protect one’s privacy and sharing one’s genomic data with the world. I am more concerned about the information that Facebook collects about my profile than about my genome data (provided that I live in a country where there is public healthcare).

Sharing and comparing one’s genome with other personal genomes is a matter of necessity if one is to shed light on the meaning of one’s personal DNA.

This is why I became a genome blogger myself. Why should one be constrained by the information that genomic test reports provide? No personal genome analysis report can ever be complete; it will always be influenced by the biases of whoever provides it.

*   *   *

Although no formal document seems to have been produced yet on what the core values for genome blogging should be, the core beliefs driving personal genome sharing should be made explicit. Here I present an initial and inherently imperfect attempt to put in writing what I believe genome blogger values could be. I do not expect every fellow blogger to agree with them, but I hope that they at least inspire some debate. These are not a fixed set of rules; on the contrary, I expect this thinking to evolve with genomics technology itself. I base some of the ideas below on Marcus Wohlsen’s ‘Biopunk’ book, Meredith Patterson’s ‘biopunk manifesto’, Misha Angrist’s ‘Here is a human being’ book and Pekka Himanen’s ‘Hacker’s ethics’ book.

Core Values for Genome Blogging

  1. Intelligent exploration, experimentation and trial to push the boundaries of knowledge are a right for ordinary people. The days in which genetic science was only done by university professors or people working in corporate labs are now over. Now everyone should have the power and legitimacy to be able to discover, develop and find new things about their own genome data.

Personal Genomic Software: A Review of What Is Available

February 14, 2011 § 5 Comments

Readers may have seen that a few previous entries in Manuel Corpas’ Blog have been dedicated to myKaryoView, free software for personal genome visualization. In this post I review some of the software currently available for analysis of personal genomes. These are all free third-party packages independent of providers such as 23andMe, Navigenics or deCODEme.

Andrew Scheidecker’s Personal Genome Explorer is apparently the first piece of software created for analysis of 23andMe personal data. It is a console application that allows 23andMe data import, deCODEme data import, SNP database import from SNPedia, analysis of a genome based on SNPedia metadata and random genome generation based on population frequency data.

I found that Personal Genome Explorer is a lightweight application that can be easily downloaded and installed. A lot of potential information can be extracted and browsed from a database based on SNPedia data. I tried to upload my own 23andMe chromosome 16 file with the extension ‘.txt’; unfortunately, it neither recognized the file nor gave a clue as to what kind of extension it accepts.

Personal Genome Explorer showing randomly selected SNPs
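For readers who want to prepare a single-chromosome file like the chromosome 16 one I tried to upload, here is a minimal sketch in Python. It assumes the usual 23andMe raw data layout: comment lines starting with ‘#’ followed by tab-separated rsid, chromosome, position and genotype columns. The function name and the sample rows (including the rsids) are made up for illustration and are not part of any of the reviewed tools.

```python
import csv
import io


def read_chromosome(lines, chromosome):
    """Yield (rsid, position, genotype) for rows on the given chromosome.

    Assumes tab-separated columns: rsid, chromosome, position, genotype,
    with header/comment lines starting with '#'.
    """
    # Skip blank lines and '#' comment lines before handing rows to csv
    data_lines = (line for line in lines
                  if line.strip() and not line.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) < 4:
            continue  # ignore malformed rows rather than guessing
        rsid, chrom, position, genotype = row[:4]
        if chrom == chromosome:
            yield rsid, int(position), genotype


# Made-up sample rows, only to show the expected layout
sample = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t16\t12345\tAG\n"
    "rs0000002\t1\t500\tCC\n"
)
chr16 = list(read_chromosome(io.StringIO(sample), "16"))
print(chr16)
```

With a real export, one would pass an open file handle instead of the in-memory sample.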

SNPTips is a Firefox extension that allows customers of 23andMe to access their SNP genotype information. SNPTips allows one to hover the mouse cursor over the SNP id in any article text or webpage. Clicking on the SNP icon it creates opens a pop-up window with one’s genotype (i.e. the DNA letters found in one’s analysis) and links to SNPedia, Google Scholar and dbSNP. I tried to upload my 23andMe chromosome 16 and it worked quickly and neatly. Unfortunately, it does not allow simultaneous visualization of more than one personal genome.

Enlis Genome is another tool that can be downloaded as a console application. The interface is quite intuitive and it managed to upload my chromosome 16 SNPs in a couple of minutes. The report it gave back was very neat. However, it seemed to provide much the same kind of information as is already available to 23andMe customers. The main added value I could find in this tool was that it collated most of the information provided in a 23andMe customer report into a document that can be easily handled. It was unfortunate, though, that the report concluded I was female. How it infers my gender when I only provided autosome data puzzles me slightly.

 

My results for Enlis Genome uploading my 23andMe chromosome 16.

myKaryoView is, to my knowledge, the only personal genomics tool that allows navigation and visualization of this genetic data directly as a genome browser. myKaryoView uses DAS technology, which makes it capable of representing any available DAS source together with one’s genome, such as known genes, OMIM genes, normally variant regions, etc. Currently, adding one’s genome to a DAS source is a process that requires expert knowledge of another tool called easyDAS. Once the DAS source for one’s genome is created, the URL where the DAS source is located can be added to myKaryoView for exploration via its interface. myKaryoView does not require any download or installation, as it is a web tool, and many personal genomes can be navigated at the same time.

myKaryoView showing my personal genome SNPs in green for a subregion of 10q11.23

The Perfect Tool

If I were able to pick the strengths of each of the reviewed software packages and put them together into one piece, I would choose the richness of SNP information from Personal Genome Explorer, the ease of uploading one’s genome from SNPTips, the reporting capabilities of Enlis Genome and the navigation and visualization capabilities of myKaryoView. Since all of these implementations are already available, the winner of this software “market” will be the one that combines all of these strengths in a manner that is easily accessible to lay people. I think 23andMe has a lesson to teach in terms of making the ability to analyze one’s genome accessible to all of us and reporting the relevant information succinctly.

Conclusion

Several tools are now available specifically tailored to the analysis and discovery of information related to one’s personal genome. Not a single tool is perfect and to some extent all require some computer and biology knowledge in order to properly operate and understand them. This is clearly not the ideal situation for lay people who are curious to know a bit more about their own personal genome. Certainly if all the strong points of each of the above were combined a much better tool and service to the community could be rendered. Personal genome coders: it’s time to join forces!

myKaryoView: First Open Source Visualization Software for 23andMe Data

September 1, 2010 § Leave a Comment

myKaryoView Logo

Following my previous post on the First Publicly Available Genome Via DAS, I would like to present open source software that Rafael Jimenez and I have developed for visualization of genomic data. Here we have it configured to display 23andMe data as a test case. We call it myKaryoView and it is available for free use and download. Its website is located at the following address:

http://mykaryoview.com

myKaryoView works in most contemporary browsers without lengthy installations and uses publicly available data distributed throughout the Internet via DAS. This means that there is no need to hold the data locally and that it is capable of visualizing any data as long as it is available via DAS. In order to visualize 23andMe data, myKaryoView requires the setup of a DAS source, which currently limits myKaryoView’s usage to those familiar with this technology. However, configuration and addition of sources are extremely simple, and the amount of data that can be displayed is limited only by the time needed to complete the request and render the data.
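To give a feel for what fetching data via DAS involves, here is a minimal sketch in Python that builds a DAS 1.x ‘features’ request URL, where a region is addressed with a segment parameter of the form reference:start,stop. The server address and source name are hypothetical placeholders, and this is not necessarily how myKaryoView itself issues its requests.

```python
from urllib.parse import urlencode


def das_features_url(server, source, chromosome, start, end):
    """Build a DAS 1.x 'features' request URL for one genomic segment."""
    segment = f"{chromosome}:{start},{end}"
    # safe=":," keeps the segment readable instead of percent-encoding it
    query = urlencode({"segment": segment}, safe=":,")
    return f"{server.rstrip('/')}/{source}/features?{query}"


# Hypothetical server and source name, for illustration only
url = das_features_url("http://example.org/das", "my_genome", "1", 2000000, 6000000)
print(url)
```

A DAS server answering such a request returns an XML document describing the features that overlap the segment, which a client can then render.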

Here we show myKaryoView displaying personal genomics data with a dummy 23andMe genome data source. This source is based on real 23andMe results data from my own genome, randomly modified so that the original is unrecognizable.

The myKaryoView website shows an implementation that allows search of genome data via gene name or genome coordinates. For example, type in the search box 1:2000000,6000000 and hit “Submit Query”.
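As an illustration of the coordinate format used in that example query, here is a small Python sketch that splits a string such as 1:2000000,6000000 into chromosome, start and end. The function name and the validation rules are my own assumptions; it only handles coordinate queries, not gene names, and it is not myKaryoView’s actual parser.

```python
import re

# chromosome : start , end  (whitespace around the separators is tolerated)
REGION_PATTERN = re.compile(r"^\s*(\w+)\s*:\s*(\d+)\s*,\s*(\d+)\s*$")


def parse_region(query):
    """Parse 'chromosome:start,end' into (chromosome, start, end)."""
    match = REGION_PATTERN.match(query)
    if match is None:
        raise ValueError(f"not a coordinate query: {query!r}")
    chromosome = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return chromosome, start, end


region = parse_region("1:2000000,6000000")
print(region)
```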

myKaryoView Zoom and Chromosome views.

The figure above shows the results of that query, with two tracks containing the 23andMe source with dummy data plus genes for a subchromosomal region in chromosome 1, Start: 2000000, End: 6000000. Gene names and SNP data are shown in red and blue respectively. Different color shades indicate the density of annotation at any given point. If the “Gene Names” data track name is clicked, a popup window appears with a link “Display Original Data Source” that allows the download of the raw data from its DAS source. Any feature can be clicked to retrieve the specific information contained in the DAS source. Here a blue SNP mark is clicked and a popup window appears describing the selected SNP, with a link to its corresponding dbSNP entry.
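The density shading can be thought of as counting how many features fall into each slice of the displayed region. The Python sketch below shows one simple way to compute such counts with equal-width bins; the function name, the binning scheme and the sample positions are assumptions made for illustration, not a description of how myKaryoView computes its shades.

```python
def feature_density(positions, region_start, region_end, bin_count):
    """Count features per equal-width bin across an inclusive region."""
    span = region_end - region_start + 1
    counts = [0] * bin_count
    for position in positions:
        if region_start <= position <= region_end:
            # Integer arithmetic maps the position to a bin index
            # in the range 0 .. bin_count - 1
            index = (position - region_start) * bin_count // span
            counts[index] += 1
    return counts


# Made-up positions; a darker shade would correspond to a higher count
density = feature_density([1, 2, 10], region_start=1, region_end=10, bin_count=2)
print(density)
```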

A simple manual explaining how to install and configure myKaryoView to show different data sources is provided from the website. myKaryoView is still in beta testing and any feedback is welcome. We have some plans for the near future for myKaryoView, which we will reveal in due time. Meanwhile I hope you find it interesting and useful.

By the way, the claim that this is the First Open Source Visualization for 23andMe data is, of course, arguable.

Open Tech 2010

July 25, 2010 § Leave a Comment

I will be speaking at Open Tech 2010 in London (UK) on Saturday 11 September 2010. My talk, entitled ‘Who Owns my Genome Data’, will be delivered in the Seminar Room (First Session, 10:30 a.m.). If you are planning to attend Open Tech 2010 this year, let me know and be sure to attend my talk!

Cloud computing: a new standard platform?

February 8, 2009 § 1 Comment

Cloud computing is becoming a technology mature enough for use in genome research experiments. Their large datasets, highly demanding algorithms and sudden need for computational resources make large-scale sequencing experiments an attractive test case for cloud computing. So far I have seen cloud computing demonstrated using R (1). However, a rigorous comparison of its performance using a BLAST (2) search, and of its ability to cope with ever-increasing databases and open source frameworks such as bioperl (3) or bioconductor (4), remains to be seen.

Cloud computing claims to be a resource where IT power is delivered over the Internet as you need it, rather than drawn from a desktop computer (5), in a fashion seemingly similar to having your own virtual servers available over the Internet (6). Some of the most important aspects of cloud computing are:

* Software as a Service (SaaS): where you buy a software license for a determined period of time.
* Utility Computing: storage and virtual servers that IT can access on demand.
* Web Services.

My first exposure to cloud computing came from an email from Matt Wood (7), a newly established group leader at the Sanger Institute (8), announcing the Cloud Computing Group (9) in Cambridge, UK. At that point I had no idea what it meant. When I attended the meeting at Cambridge University’s Centre for Mathematical Sciences (10), to my surprise I found a very select audience there, including the director of IT at Sanger, Phil Butcher (11), one of the Ensembl (12) software coordinators, Glenn Proctor (13), and quite a few local start-up companies.

Among the presenters, we had Simone Brunozzi, from Amazon’s Cloud Computing (14). I think he had an interesting story to tell: how Amazon, a well-known company, is now involved in the business of cloud computing and selling it. Apparently, the technology they sell was developed for Amazon’s own business. Among their main challenges was addressing the capricious shopping habits of customers, with orders peaking around Christmas and remaining quite flat the rest of the year. These trends required rapid adaptability of computational resources. The idea of cloud computing fitted well with their e-commerce business model: you don’t need to care about where your computation is done; the only thing you care about is that you have the resources you need and do not have to pay for them when you don’t need them. One of the things that struck me about Amazon’s presentation was that they would not tell us the number of processors they had at their disposal.

When it comes to using cloud computing for genomics research, costs may become considerable when they add up. The bioinformatics field, greatly influenced by the open-source movement, is not likely to rush to join Amazon’s cloud. Private efforts to make money out of human genome technology have remained rather unsuccessful to date: think of Celera Genomics or Lion Bioscience. I am skeptical of the bioinformatics community adopting cloud computing unless open source ideals are embraced: i) allowing people to develop and contribute to the technology if and when they want to, ii) allowing total openness in terms of its achievements and pitfalls and iii) making it free to use for everyone. I do not think that making it free means there is no margin for profit. Think of the profitability of free-to-use technologies such as Java (15) or MySQL (16), both components of Sun Microsystems’ (17) business.

Despite the promise of potential benefits for the bioinformatics community, the way the cloud is being portrayed does not conform to the ideals of free access and openness. Unless these ideals are implemented to some extent, I find it difficult to see the cloud taking root in the bioinformatics field and becoming a new standard platform for genome research.

References

1. http://www.r-project.org/
2. http://blast.ncbi.nlm.nih.gov/Blast.cgi
3. http://www.bioperl.org/wiki/Main_Page
4. http://www.bioconductor.org/
5. http://www.guardian.co.uk/technology/2008/sep/29/cloud.computing.richard.stallman
6. http://www.infoworld.com/article/08/04/07/15FE-cloud-computing-reality_1.html
7. http://www.sanger.ac.uk/Users/mw4/
8. http://www.sanger.ac.uk/
9. http://cloudcamb.org/
10. http://www.cms.cam.ac.uk/site/
11. http://www.yourgenome.org/people/phil_butcher.shtml
12. http://www.ensembl.org/index.html
13. http://www.ebi.ac.uk/Information/Staff/person_maintx.php?s_person_id=299
14. http://aws.amazon.com/ec2/
15. http://www.java.com/en/
16. http://www.mysql.com/
17. http://www.sun.com/

Where Am I?

You are currently browsing entries tagged with Open Source at Manuel Corpas' Blog.
