Life in early 2011:
- Work centered on a single server, one process, gigabyte datasets.
Life in early 2012:
- Work centered on a cluster, many processes, terabyte datasets.
I remember the old days, when I had to pipette to run an experiment. Today I do not have to pipette; I run a command or a pipeline in a terminal, connected remotely to a cluster of a few thousand nodes. Sometimes it might be quicker to run a PCR than to run my workflow script.
I consider it a privilege to be “drowning” in data. Why? Because this is the future. More data brings more hypotheses, and more hypotheses bring more knowledge. Either one learns to surf the waves, or a tsunami catches up with one soon enough.
How does it feel from the inside? It feels exciting, overwhelmingly exhilarating! It feels like wanting to surf in a sea of data, yet being happy just to stay afloat: this is the inevitable fate of genome bioinformaticians dealing with Next Generation Sequencing data.
What is next on my to-do list?
Cloud computing. I am counting the days until my experiments run in the cloud, not on the cluster.
I look forward to welcoming you to the data feast. Will you join?