When we run out of storage space for digital data, we’ll use DNA

Free space for data storage is running short. The problem has existed for several years, but ordinary people have hardly ever thought about it. Not so long ago, the free space available for digital data was limited by the size of your computer's hard disk. When the limit was reached, we either bought a new hard drive or burned everything to optical media. When those ran out too, we simply deleted old data and recorded new data in its place. But there are those who never delete data.

Many companies, for example, do not, especially those whose business and value depend on the digital information they possess. Times change. Technology evolves. Now information is not deleted; it is moved to the “cloud”. Incidentally, the term “cloud” is itself rather ephemeral and has nothing to do with the actual natural phenomenon. It simply seemed convenient and attractive, so it stuck. Where is the data stored? That is entirely unimportant, at least as long as we can get to it at any moment. Is it likely that we will eventually run out of space in cloud storage? Nobody thinks about it. As long as you pay your subscription, everything is fine. Not enough space? Choose a new plan and get even more room for your information.

This carelessness has made it hard for people even to imagine that one day we could run out of space to store digital data, just as it was once hard to imagine that fresh water on Earth, whose reserves are replenished by the natural water cycle, could sooner or later run out. But here is the reality. In 2018, the water reserves of Cape Town, South Africa, were rapidly approaching total depletion. And we, without giving it a thought, are rapidly approaching a shortage of free space for storing digital data.

Data, data everywhere

The main reason free space is running out is, of course, the pace at which we produce new data. Every day, the world's 3.7 billion Internet users generate about 2.5 quintillion bytes of information. Of all the digital data in existence today, 90 percent was created in the last two years alone. And as the number of smart devices connected to the World Wide Web (the so-called “Internet of Things”) grows, these numbers will soon climb even faster.

“When people talk about cloud storage, they often imagine some kind of endless free space for storing information,” Hyun June Park, head and co-founder of Catalog, a data storage company, told Digital Trends.

“However, the cloud is just another computer that stores your data. People simply do not realize that so much digital data is being generated in the world that the pace of its production far outstrips our ability to preserve it all. In the very near future we will face a huge gap between the amount of useful data and our ability to store it on traditional media.”

Since cloud storage companies are constantly building new data centers or expanding existing ones, it is very difficult to predict when we will actually run out of free space. Nevertheless, according to Park, by 2025 humanity as a whole may generate more than 160 zettabytes of digital information (a zettabyte, for those who do not know, is a trillion gigabytes). How much of this volume will we actually be able to store? About 12.5 percent, says Park.
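To make Park's figures concrete, the arithmetic can be written out in a few lines. This is only a back-of-the-envelope sketch: it assumes decimal (SI) units and takes the 160-zettabyte and 12.5 percent estimates at face value.

```python
# Back-of-the-envelope check of the figures quoted above (decimal SI units assumed).
ZETTABYTE = 10**21  # bytes
GIGABYTE = 10**9    # bytes

generated_zb = 160          # projected data generated by 2025, per Park
storable_fraction = 0.125   # share Park says we can actually store

gb_per_zb = ZETTABYTE // GIGABYTE               # gigabytes in one zettabyte
storable_zb = generated_zb * storable_fraction  # zettabytes we could keep
unstored_zb = generated_zb - storable_zb        # zettabytes with nowhere to go

print(f"1 ZB = {gb_per_zb:,} GB")
print(f"Storable: {storable_zb} ZB, left over: {unstored_zb} ZB")
```

In other words, of 160 zettabytes only about 20 could be stored, leaving roughly 140 without a home.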

This problem clearly needs a solution.

Could DNA be the answer?

Park, Nathaniel Rocket, and their colleagues at the Massachusetts Institute of Technology think so. Together they founded Catalog, which has developed a technology that, according to its creators, could change our understanding of how all our digital data will be stored in the near future. By their account, before long digital data from around the world will fit in a space no larger than a wardrobe.

Catalog's proposed solution is to encode data in DNA. It all sounds like a plot from the American science fiction writer Michael Crichton, but the scalable and affordable solution the company proposes is quite realistic; it has even attracted $9 million in venture capital funding, as well as support from leading professors at Stanford and Harvard universities.

“I often get asked: whose DNA do we use? It's as if people think we take some person's DNA and turn them into mutants or something like that,” Park laughs.

But that is not what Catalog does. The DNA that Catalog uses to encode data is a synthetic polymer. It is not of biological origin, and the series of zeros and ones recorded in its nitrogenous base pairs cannot code for anything living. Nevertheless, the resulting product is, biologically, practically indistinguishable from what we are used to finding in a living cell.

The idea that DNA could serve as an alternative medium for storing digital information arose several decades ago, in fact back when James Watson and Francis Crick had only just arrived at their model of the DNA structure in 1953. Until now, however, a number of significant limitations have kept us from seeing the enormous potential of DNA as a storage medium, let alone from putting it into practice.

In the conventional view, storing information in DNA revolves around synthesizing new DNA molecules: sequences of bits are mapped to sequences of the four DNA bases, and enough molecules are produced to represent all the data you want to keep. The problem with this method is that the process is expensive and slow. In addition, there are many limitations associated with the actual storage of the data.
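The bit-to-base mapping described above can be illustrated with a minimal sketch. The particular table used here (two bits per base, 00→A, 01→C, 10→G, 11→T) is an assumption chosen for clarity, not a scheme the article attributes to anyone; practical encodings typically also add error correction and avoid problematic sequences.

```python
# Illustrative two-bits-per-base mapping (an assumption, not a published scheme).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Recover the original bytes from a string of bases."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Every new message requires a new sequence, which is why, in this conventional approach, each piece of data means synthesizing a new molecule.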

Catalog's approach decouples the synthesis of the molecules from the encoding of the data. Essentially, the company first produces a huge quantity of just a few kinds of molecules (which significantly reduces production costs) and then encodes information by drawing on the variety that these ready-made molecules make possible.

As an analogy, Catalog compares the earlier approach to manufacturing custom hard drives with the information already recorded on them: writing new information means building a new hard drive from scratch. The new approach proposed by Catalog is like mass-producing blank hard drives and writing new encoded information to them as needed.
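The idea of separating synthesis from encoding can be sketched in code, with a heavy caveat: the article does not describe Catalog's actual scheme, so everything below is a hypothetical toy. It assumes a small fixed library of prefabricated components (the names `component_00` and so on are invented) and represents a number by which components are present in a pool.

```python
# Toy illustration only: the library, its size, and the presence/absence
# encoding are assumptions, not Catalog's published method.
LIBRARY = [f"component_{i:02d}" for i in range(16)]  # made once, in bulk


def encode(value: int) -> set:
    """Pick the ready-made components whose bit is set in `value`."""
    if not 0 <= value < 2 ** len(LIBRARY):
        raise ValueError("value does not fit in this library")
    return {name for i, name in enumerate(LIBRARY) if (value >> i) & 1}


def decode(pool: set) -> int:
    """Rebuild the number from which components are present in the pool."""
    return sum(1 << i for i, name in enumerate(LIBRARY) if name in pool)


pool = encode(5)
print(sorted(pool))  # ['component_00', 'component_02']
print(decode(pool))  # 5
```

The point of the sketch is that no new kind of molecule is needed per message: the same 16 prefabricated components can represent any of 65,536 values simply by being combined differently.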

It's all about storage

The beauty of it all is how much data can be stored in a very compact space. As a demonstration, Catalog used its technology to encode several science fiction books into DNA, among them the entire “Hitchhiker's Guide to the Galaxy” series. But this is a trifle compared with the possibilities that are opening up.

“If we compare like with like, the number of bits you can store using DNA is a million times higher than what solid-state drives offer. Take the size of a regular flash drive, for example. With DNA storage, a device the size of that flash drive could hold a million times more information than the flash drive itself.”

The developers note, however, that the comparison with SSDs is still not entirely accurate. DNA lets you store far more information in a comparable volume, but the technology does not allow instant access to it, as USB drives do. Catalog's technology turns information into solid physical pellets of synthetic polymer.

To access the information, you take the encoded synthetic polymer pellets, rehydrate them with water, and then “read” them with a DNA sequencer. This identifies the DNA base pairs, from which the zeros and ones that make up the information can then be worked out. From start to finish, the process can take at least a few hours.

For this reason, the technology is aimed primarily at the archiving market, where fast access to information is not required. This usually means data that is never or very rarely used after it is recorded but is extremely important to keep. Think of the warranty for your refrigerator, only on a corporate scale.

What benefits will it bring to ordinary users? At the beginning of the article we noted that most of us do not think about where our information is stored or what happens to it. On solid-state drives? It could be on magnetic tape for all we care. We are not interested as long as we have access to it at any time.

Because recovering the information takes so long, we are unlikely ever to reach the point where a Google Cloud or a Yandex.Disk stores our information in giant vats of DNA. If Catalog's technology proves effective, it will most likely find its niche in areas that rely on long-term storage. For short-term storage, where hard disks and solid-state drives are currently used, we will have to rely on other methods.

Imagining the prospects

This tube contains millions of copies of DNA encoded data.

Nevertheless, some almost sci-fi possibilities can be glimpsed here.

“Imagine that a pellet implanted under your skin contains all the information about your health: your magnetic resonance angiography data, your blood type, the X-rays for your dentist,” says Park.

“You probably want all this data to be always available to you, but you do not want to store it somewhere in the ‘cloud’ or on some unprotected hospital server. If you always carry this data with you in the form of DNA, you can control it physically: access it when necessary, keep everyone else out, and open it directly to your doctor.

“Practically every modern hospital has a DNA sequencer. I'm not saying that this is exactly the use we are pursuing for this technology, but in the future all of it may well become possible,” the developer says.

Catalog is currently working on pilot projects aimed at demonstrating the effectiveness of the technology it has developed.

“We are not facing any unsolvable scientific difficulties; what we are dealing with now is more a matter of optimizing mechanical processes,” said Park.

By Park's own admission, he decided to take up the study of DNA data storage simply because it struck him as a very cool and innovative technological approach to an existing big problem. Now, according to the expert, it could become one of the most important technologies of our time.