Sustainable Big Data Retention

I finally got round to watching the documentary The Corporation (2003) on DVD last night. If you haven’t seen it, it’s quite disturbing: it paints a bleak picture of how large corporations generate great wealth but can also cause great harm.

Environmental impact was of course one of the subjects reviewed in the film. Ray Anderson, founder and chairman of Interface Inc., the world’s largest manufacturer of modular carpet for commercial and residential applications, talked about the non-sustainability of the industrial revolution. He called for a strong drive towards industrial ecology and for leading by example, through a reduction of waste from his company’s manufacturing process.

Certainly since the film was released over 7 years ago (what can I say, I’m backlogged in my movie watching), things have changed quite a bit. Main Street and corporations talk a lot about carbon footprint, and as a society we are very much aware of the ramifications of emissions and of consuming fossil fuels such as gas, oil and coal, along with the electricity generated from them.

With the Big Data deluge driving an increasing need for storage across all industries (Communications, Healthcare, Financial Services, SmartGrid Utilities etc.), vendors such as Hitachi Data Systems have been vocal about, and committed to, eco-friendly storage solutions as part of their corporate strategy. While hardware manufacturers can lead the charge in reducing power consumption and emissions, data retention and management software plays a critical role by ensuring that the physical storage space is logically expanded through compression and data reduction. Beyond increasing effective storage capacity, how the data is retrieved or extracted (write once, retrieve many times) for use on demand also plays an important part in the data retention “carbon footprint” of a corporation.

At RainStor (disclosure: my current company), our repository performs algorithmic value and pattern de-duplication on big structured data, which leads to compression rates of 40 to 1 or greater while allowing on-demand queries that do not require re-inflation. We ensure that queries are executed with minimal disk reads, leading to lower power consumption, more capacity per physical TB and lower cost. We originally named our company to reflect our data retention software, which runs on premise and in the Cloud (in which rain is stored), but one effect of our technology is a decidedly “green” benefit. Compressing and reducing data to a 95% smaller footprint provides significant savings for on-premise data center deployments and is even more economically attractive for cloud deployments.
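
To make the numbers above a little more concrete, here is a minimal Python sketch. It is not RainStor's actual algorithm; it uses plain dictionary encoding as a stand-in for value de-duplication, and every function name in it is hypothetical. It shows two things: how a compression ratio maps to a footprint reduction (40 to 1 works out to 97.5% smaller, while 95% smaller corresponds to 20 to 1, so the 95% figure is the more conservative one), and how a simple query can be answered against the de-duplicated form without re-inflating the data.

```python
def footprint_reduction(ratio):
    """Fraction of the original space saved at a compression ratio of ratio:1."""
    return 1.0 - 1.0 / ratio


def ratio_for_reduction(reduction):
    """Compression ratio (N in N:1) needed to save the given fraction of space."""
    return 1.0 / (1.0 - reduction)


def dedupe_column(values):
    """Store each distinct value once and replace every row with a small code.

    Illustrative stand-in for value de-duplication (plain dictionary encoding).
    """
    dictionary = []   # distinct values, in first-seen order
    index = {}        # value -> position in dictionary
    codes = []        # one integer code per original row
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def count_equal(dictionary, codes, target):
    """Count rows equal to target by scanning codes only (no re-inflation)."""
    if target not in dictionary:
        return 0
    return codes.count(dictionary.index(target))


if __name__ == "__main__":
    print(f"40:1 saves {footprint_reduction(40):.1%} of the space")
    print(f"95% saved needs about {ratio_for_reduction(0.95):.0f}:1")
    states = ["NY", "CA", "NY", "NY", "CA"]
    dictionary, codes = dedupe_column(states)
    print(dictionary, codes, count_equal(dictionary, codes, "NY"))
```

The point of the last function is that the query touches only the compact codes, which is one simple way that fewer bytes read from disk can translate into less work per query.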

So maybe corporations can have both the technology that gives them access to the vast amounts of information they need to generate untold wealth and the ecological sustainability that everyone on the planet wishes and hopes for. If you are an individual, pat yourself on the back next time you recycle a bottle. If you are a corporation running a massive IT datacenter, pat yourself on the back if you have deployed storage and the supporting software that allow you to sustainably retain Big Data.

Originally published at http://www.rainstor.com/news-blog/blog/sustainable-big-data
