Everything in computing needs storage: blogs, instant messages, social networks, and personal documents all reside on our own computers or on someone else's, as with email stored in Gmail. As the amount of available data grows, so do storage requirements and the units used to measure them.

Storage units in computing start with the byte (8 bits). Just over a thousand bytes (1,024, to be exact) make up a Kilobyte (KB), 1,024 KB make up a Megabyte (MB), and 1,024 MB make up a Gigabyte (GB), the most common storage unit today. Multiplying by 1,024 again gives the Terabyte, and once more the Petabyte.
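To make the progression concrete, here is a minimal Python sketch (the unit list and loop are purely illustrative) that prints how many bytes each unit contains by repeatedly multiplying by 1,024:

```python
# The 1,024-based progression of storage units described above.
units = ["Byte", "Kilobyte", "Megabyte", "Gigabyte", "Terabyte", "Petabyte"]

size = 1  # start with a single byte
for unit in units:
    print(f"1 {unit} = {size:,} bytes")
    size *= 1024  # each unit is 1,024 times the previous one
```

Running it shows that a Petabyte works out to 1,125,899,906,842,624 bytes, i.e. 1,024 raised to the fifth power.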

The Petabyte is considered such a milestone for the scientific approach that the present period is sometimes called the Petabyte Era. What sets this massive amount of data apart from the limited data previously available is the prediction that, in the Petabyte Era, scientific researchers will no longer need to create hypotheses and models and then test whether they are correct.

For example, instead of hypothesizing that a certain age group is more susceptible to health risks, or that a certain geographic area is likely to be affected by unrest or political uncertainty for a particular reason, and then testing that hypothesis against a sample of data, researchers could rely on advanced data mining. Mining petabytes of data would make it possible to process a virtually limitless stream of information, for instance scanning news from around the world to identify problem areas, along with trends and issues of “great importance or severity”, without needing to identify their underlying causes. This kind of ‘geotagging’ has already begun in projects such as Google Zeitgeist and the Europe Media Monitor (EMM).

So, in the age of petabytes, the old scientific method of hypotheses, models, and tests would give way to whatever the data itself tells us. Inferences drawn from huge volumes of data collected from around the world would not need models to explain them, because the numbers would speak for themselves: fast tracking of epidemics, predicting wars, anticipating voting patterns, and so on. In his article ‘The End of Theory,’ Chris Anderson, editor-in-chief of Wired, writes: ‘Science can advance even without coherent models, unified theories, or really any mechanistic explanations.’ The most radical views have even called the Petabyte Era the end of science, while others have dismissed the idea as too futuristic.

Terminologies beyond the Petabyte have already been defined: the Exabyte, Zettabyte, Yottabyte, and Brontobyte, each obtained by multiplying the previous unit by 1,024, starting from the Petabyte. But only time will tell whether the Petabyte Age, with its ability to process millions of data points and to aggregate information from numerous sources and sensors using processing clouds, will change science or not.
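As a rough sketch of how these larger units relate, the hypothetical helper below (not part of any standard library) expresses an arbitrary byte count in the largest unit it fills, continuing the same factor of 1,024 all the way up to the Brontobyte:

```python
# Illustrative conversion of a raw byte count into the largest fitting unit.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB", "BB"]

def human_readable(num_bytes: int) -> str:
    value = float(num_bytes)
    for unit in UNITS:
        # Stop once the value fits below 1,024, or we run out of names.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024
    return f"{value:.2f} {UNITS[-1]}"

print(human_readable(3 * 1024**5))   # 3.00 PB
print(human_readable(1024**6))       # 1.00 EB (an Exabyte: 1,024 Petabytes)
```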
