It was widely thought that once the Human Genome Project was complete, one would simply have to provide a sample of DNA and the diagnosis of the underlying malaise would be at hand. This is not the case, and will not be the case for some time. Most news is pessimistic, except for news on technology breakthroughs, which is overly optimistic.
The BBC declares: “DNA perfect for digital storage” and you think that molecular computing and DNA drives will soon be reality. DNA is after all Mother Nature’s knowledge management system.
Where does this come from? A paper published in Science in August 2012 describes the encoding of a book onto DNA using an inkjet printer and a piece of glass. The book was then read using an advanced form of the same technique that was used to decode DNA in the Human Genome Project. In January 2013, a group of authors from the European Bioinformatics Institute led by Nick Goldman and Ewan Birney encoded some of Shakespeare’s sonnets onto DNA and transferred the information between locations by shipping the DNA. This time the article was in Nature. For those of you who don’t follow the scientific literature, Science and Nature are leading journals in terms of impact factor, meaning that these findings will most probably steer data storage research toward DNA.
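To make the idea of "encoding a book onto DNA" concrete, here is a minimal sketch in Python. It assumes a made-up mapping of two bits per base (00 = A, 01 = C, 10 = G, 11 = T), chosen purely for illustration. It is not the scheme used in either paper; the published methods are more elaborate and add safeguards against synthesis and reading errors. The function names are hypothetical.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only.
# This is NOT the encoding used in the Science or Nature papers.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(text: str) -> str:
    """Turn text into a string of DNA bases (A, C, G, T)."""
    # Each UTF-8 byte becomes 8 bits, and every 2 bits become 1 base.
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Turn a string of DNA bases back into the original text."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


strand = encode("Sonnet 18")
print(strand)          # four bases per byte of text
print(decode(strand))  # round-trips back to the original text
```

Under this toy mapping every byte costs four bases, so even a short sonnet runs to a few thousand bases. Writing those bases (synthesis) and reading them back (sequencing) are the lab steps that make the real thing slow and expensive.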
So, you begin to anticipate having to get yet another computer with the newest DNA drive technology. You then ask a few knowledge management experts and have the cold water of reality thrown onto your face.
The method described in these publications is way too expensive: $12,000 for 1 MB of storage. Perhaps more importantly, it is too slow (high latency). When you expect websites to respond within milliseconds, waiting for a lab technician to churn out the experiments to decode DNA in order to retrieve data is worse than going back to using audio cassettes to store programs. Even worse, you would have to pay $12,000 for such cassette latency.
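To get a feel for what $12,000 per MB means at everyday scales, here is a back-of-the-envelope sketch. The only figure taken from the article is the price per megabyte; the example sizes (a 5 MB song, 1 GB, 1 TB, in decimal units) are assumptions picked for illustration, and the calculation ignores any economies of scale.

```python
COST_PER_MB_USD = 12_000  # price per megabyte quoted in the article


def dna_storage_cost(megabytes: float) -> float:
    """Cost in US dollars of storing the given number of megabytes on DNA."""
    return megabytes * COST_PER_MB_USD


# Example sizes are illustrative assumptions (decimal units: 1 GB = 1,000 MB).
examples = [("one 5 MB song", 5), ("1 GB", 1_000), ("1 TB", 1_000_000)]

for label, megabytes in examples:
    print(f"{label}: ${dna_storage_cost(megabytes):,.0f}")
```

At that rate a single gigabyte would run to $12 million, which is why nobody is shopping for a DNA drive just yet.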
The real reason why we will need DNA drives
If you listen closely, you can hear the roar of a rapidly expanding data explosion. The data produced between 2009 and 2011 accounts for 90% of all the data ever generated. There is no end in sight. Think about Gmail. You do not need to consider which emails you need to keep. Just keep them all – boom!
Even though it is in the cloud, data still has to be physically stored. Physical storage requires energy and costs money. The New York Times estimates that the equivalent of the output of 30 nuclear power plants is used to run data centers worldwide.
For all the data that is out there, 30 nuclear power plants is not unreasonable. However, when you consider that 90% of all data was produced in just 2 years, it becomes evident that our thirst for data will soon be insatiable. Add to this the recent high-profile calls for the generation of even more data. The Obama administration recently announced an effort to map the human brain. The project is estimated to require 300 petabytes of storage, or the equivalent of 600 million PCs. An alternative to electronic storage is inevitable.
The beauty of storing data on DNA is that it is like carving your data into a stone tablet. It requires no electricity to maintain. Of course, there are many ways to store data without consuming electricity, but DNA beats them all because it enables you to store a lot of data in a very small space.
Small machines need small computers
DNA drives may see more immediate use as data storage systems for nano-devices. DNA has already been used to build nano motors and nano biosensors. A natural follow-on is the use of DNA for nano computing. Any high-performing nano-device needs to be able to store and compute data. E. coli, the same bacteria responsible for Montezuma’s revenge, are being transformed into simple computers by manipulating their DNA.
Human Genome Project does have a return on investment
While those expecting miracles from the Human Genome Project are disappointed, the general agreement is that it is a smashing success. One calculation has it that for every dollar spent there is a return of 140 dollars in new developments. As Lee Hood and David Galas point out, “it illustrated the concept of ‘discovery science’ – the idea that all the elements of the system (that is the complete genome sequence and the entire RNA and protein output encoded by the genome) can be defined, archived in a database, and made available to facilitate hypothesis-driven science and global analyses.”
So, to conclude, you will not be using your DNA sequencer as a computer. However, like the Human Genome Project, storing data on DNA is a forward-looking concept that will enable the unthinkable and solve the inevitable. What one must notice is that the ability to store data does have a limit. Will storage of data on DNA obviate that limit?