$7.5M and 12 Reasons Why RainStor

It is a wonderful day today, sunny, cloudy and definitely RainStormy, all at the same time.

As you may have heard, RainStor (my current company) has received $7.5M in Series B funding from the wonderful teams at Storm Ventures and Informatica Corporation, as well as existing investors Doughty Hanson and Dow Chemical.

I would like to add a note of personal thanks to everyone involved, and also take an opportunity to offer up a list of 12 reasons “Why RainStor?” to add to the 7.5 Million validated by these fine VCs.

Reason #1 Why RainStor: Because I have better things to do with my hardware than keeping zombie apps alive

While every dog and their repository is focused on the highly lucrative high-performance analytics and data warehousing segment, database archiving has gotten relatively little love in comparison. Curt Monash (www.dbms2.com) calls it “information preservation”, and Merv Adrian (mervadrian.wordpress.com) recently coined the term “usable archiving”.

Thousands of legacy apps out there are running on “life support”. These zombie applications might well be queued up for “retirement”, “decommissioning” or “sunsetting”, whichever word you prefer for the process. There is one simple dilemma, however: the end users of these systems may not be ready to turn them off completely.

Rather, for either business or regulatory reasons, they require the data within the systems to remain accessible online. So IT has to keep the apps going just for occasional queries or access.

What options are there? Well, you could turn off the apps, keep the data stored in the databases and continue to pay the license fees for a full-blown RDBMS, which still needs large amounts of storage, hardware and admin care and feeding. Alternatively, you could back the data off to a compressed archive that isn’t queryable without re-inflating the data (which could itself take a significant amount of time, depending on the size of the data set). But all these methods continue to be constrained by the use of 20+ year old RDBMS technologies. Until now …

RainStor provides a better way. You can instantly retire these applications, because your data can be loaded into the RainStor repository and preserved in a significantly compressed footprint (est. 40 to 1 ratio), while remaining accessible via on-demand query using standard SQL-92 or BI tools such as MicroStrategy, Business Objects and others. All this with low touch and next to zero administration.
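To make "still accessible via standard SQL" concrete, here is a minimal sketch of the kind of on-demand query a user might run against data from a retired app. It uses Python's built-in SQLite purely as a stand-in, since this post does not show RainStor's own connection interface; the `legacy_orders` table, its columns and the sample rows are all hypothetical.

```python
import sqlite3

# SQLite in-memory database used only as a stand-in for an archive repository.
# The table name, columns and rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE legacy_orders ("
    "order_id INTEGER, customer TEXT, order_date TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO legacy_orders VALUES (?, ?, ?, ?)",
    [
        (1, "acme", "2004-11-30", 12000),
        (2, "acme", "2005-02-14", 4500),
        (3, "globex", "2005-06-01", 9900),
        (4, "acme", "2005-09-09", 500),
    ],
)

# A plain SQL query of the sort an auditor or BI tool might issue:
# order count and total spend per customer for calendar year 2005.
QUERY = """
SELECT customer, COUNT(*) AS orders, SUM(amount_cents) AS total_cents
FROM legacy_orders
WHERE order_date BETWEEN '2005-01-01' AND '2005-12-31'
GROUP BY customer
ORDER BY customer
"""

result = conn.execute(QUERY).fetchall()
for customer, orders, total_cents in result:
    print(f"{customer}: {orders} orders, {total_cents / 100:.2f} total")
```

The point of the sketch is only that the query itself is ordinary SQL: nothing in it depends on the original application still running.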

So you can free up large amounts of hardware, software licenses and even resources and redirect them to focus on other more business critical IT initiatives. And in this current climate, who wouldn’t want the opportunity to do more with less?

Reason #2 Why RainStor: You are using a SaaS application. What would you do under the following situations?

What if you had only the single copy of your SaaS data held by your app vendor and you encountered …

  • Corruption: Erroneous mass changes made to your SaaS records but discovered only 2 months later
  • Loss: Deliberate deletion of SaaS records by rogue employees again discovered later with no undo available
  • Growth: Your data volumes outstrip quotas permitted by your service or vendor and you are forced to delete your data to reduce the size
  • Outage: Loss of connectivity or service outage and you can’t get to critical data
  • Compliance: You are asked to show historical changes required for audit or produce data exactly as it was 3 years ago
  • Reporting: You want to run analytics that are not available through the default reports of your SaaS vendor

If any of these issues apply, then you need SaaS Data Escrow, which is the ability to have a copy of your data separate to the SaaS vendor. This copy could be on-premise or in alternate clouds.

RainStor’s on-premise and cloud capabilities allow you to have as many copies of your SaaS data as you want, at the frequency of update/refresh you need. Why RainStor? Because you can cost-effectively deploy SaaS Data Escrow.

Reason #3 Why RainStor: When an RDBMS is overkill for data that is immediately historical the moment it is created

RDBMSs are great for storing and managing data that is constantly being updated and modified. For 20+ years database technologies have evolved with a plethora of features and functions such as two-phase commit, referential integrity, and index optimization. In fact, databases are so complex and functional that highly skilled database administrators are at a premium to keep these Ferraris humming at a high rate.

For years RDBMSs were the only game in town, but variants started appearing in the form of object databases and stores. There are even columnar-style databases for more efficient analytics processing.

Curiously, if you look at high-volume apps, you will notice a trend: vast quantities of data need to be captured and stored, yet NEVER EVER get modified or updated once they are created. Text messages, call data records, sensor data from smart devices, security events, logs and more are examples of structured data that is immediately historical the moment it is created. More and more, compliance requires that data to be retained for legal purposes while remaining fully queryable within a given SLA, which can range from seconds to hours.

This is why RainStor is ideally suited as a primary repository for such applications. Without all the overhead of 20+ year old RDBMS design and features that would never get used in these scenarios, RainStor instead is specialized for the long-term and legally compliant retention of such data at the lowest possible cost. Total cost of ownership of preserving such data can be up to 20 times less than with a traditional RDBMS.

RainStor would never replace an RDBMS in instances where an application clearly performs OLTP and updates to stored records, but for historical structured data retention, RainStor shows that RDBMSs are clearly overkill and there is a newer, better and more cost-efficient way to preserve your data.

Reason #4 Why RainStor: Because a main barrier to cloud adoption is the bandwidth needed to upload Big Data

Application data, whether in the form of traditional RDBMS records, log file entries, or Call Detail Records (CDR), can occupy petabytes of space and increase in size by terabytes per day. It is clearly cost-prohibitive, if not physically impossible today, for common network connections to support these data volumes.

Take Amazon’s AWS Import/Export service, which goes so far as to suggest that customers PHYSICALLY ship their data to Amazon rather than upload it. This has been noted, tongue in cheek, as “SneakerNet”: the drives are removed and FedExed to Amazon. According to an Amazon Import/Export calculator, this provides an approximate 50% savings over the standard S3 Data Transfer-in charges. Additionally, Werner Vogels, CTO of Amazon, blogged about the long times and high costs involved in transferring a terabyte of data into S3 over a range of network bandwidths.

Because RainStor is optimized for massive structured data volumes on an ongoing basis, costs are dramatically reduced when the compression from RainStor’s value and pattern de-duplication (e.g. 40 to 1) is applied. A 1 TB transfer would shrink to only 25 GB. That is a far less costly and more practical proposition, and it solves the Big Data cloud upload bandwidth dilemma.
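For anyone who wants to check the arithmetic, here is a back-of-the-envelope sketch in Python. It assumes decimal units (1 TB = 1000 GB), the 40 to 1 ratio quoted above, and an idealized link with no protocol overhead. The link speeds are illustrative picks of mine, not figures from Werner Vogels' post.

```python
def compressed_size_gb(raw_gb: float, ratio: float) -> float:
    """Size after compression at an assumed ratio (40 means 40 to 1)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_gb / ratio


def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Idealized transfer time in hours; ignores overhead and contention."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    megabits = size_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return megabits / link_mbps / 3600


raw_gb = 1000.0                            # 1 TB of raw data
small_gb = compressed_size_gb(raw_gb, 40)  # assumed 40 to 1 ratio

# Hypothetical link speeds, chosen only to show the scale of the difference.
for mbps in (10, 100, 1000):
    raw_h = transfer_hours(raw_gb, mbps)
    small_h = transfer_hours(small_gb, mbps)
    print(f"{mbps:>4} Mbps: raw {raw_h:.1f} h, compressed {small_h:.1f} h")
```

Whatever the link speed, the upload time drops by the same factor as the compression ratio, which is really the whole argument of this section.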

Reason #5 Why RainStor: Go Green! Extreme compression makes for more efficient storage. Cloud storage leverages elasticity, use only what/when you need

The article http://www.storagenewsletter.com/news/miscellaneous/frost-sullivan-green-data-centers highlights the moves towards greener computing. RainStor’s technology happens to hit the sweet spot by supporting application retirement, reduced need for storage space, allowing cloud based storage services to be offered and more.

RainStor’s repository for historical structured data retention allows eco-friendly green data centers to be even more cost and energy efficient.

Reason #6 Why RainStor: Because you would prefer to not have to teach your SEIM (Security Event and Incident Management) appliance to “Rollover”

Most SEIM (Security Event and Incident Management) appliances record the events and incident activity they detect in logs. The capacity of the logs is constrained by the available disk storage on these appliances. Typically a SEIM log could hold approximately 90 days’ worth of activity. Once the log is “full” you need to either back off the data and clear the log, or “rollover” the log and overwrite the previous 90 days of data. In some cases the log data needs to be retained for longer, and 90 days’ worth of capacity just isn’t sufficient.

RainStor’s value and pattern de-duplication is particularly effective with structured data such as logs, events, texts and call data records. With compression rates of 40 to 1 and sometimes higher, a SEIM appliance with RainStor embedded could increase the log capacity of a device 40x, which would mean 90 × 40 = 3600 days … approximately 10 years! Basically you would never have to roll over the appliance … assuming the appliance itself even lasts 10 years!
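The retention math is simple enough to sketch in a few lines of Python. The 90-day baseline and the 40 to 1 ratio are the figures quoted above; treating capacity as scaling linearly with the compression ratio is a simplifying assumption.

```python
def retention_days(baseline_days: float, ratio: float) -> float:
    """Days of log history that fit if the same disk holds `ratio` times more."""
    if baseline_days < 0 or ratio <= 0:
        raise ValueError("baseline_days must be >= 0 and ratio must be > 0")
    return baseline_days * ratio


days = retention_days(90, 40)  # 90 days of raw capacity, assumed 40 to 1 ratio
years = days / 365             # rough conversion, ignoring leap years
print(f"{days:.0f} days, or about {years:.1f} years")
```

Try a more conservative ratio, say 20 to 1, and the same appliance still holds close to five years of logs.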

Reason #7 Why RainStor: Because your Oracle or other production RDBMS systems should run at their best performance without the drag of old historical, less frequently accessed data

If you are running large production applications, your RDBMSs are no doubt getting quite massive. You can choose to back up and delete older data to speed up these systems, OR you could move that data into a repository like RainStor, which continues to allow online query access with little disruption while storing the data in one 40th of its previous size. RainStor’s partner Informatica embeds our repository inside their Data Archiving suite to offer exactly that functionality.

Reason #8 Why RainStor: Because you should Rationalize before you Virtualize

With all the excitement over virtualization and the undoubted success of VMware, including the VCE (Virtual Computing Environment) announced by VMware, Cisco and EMC (also VCE) (see http://www.emc.com/campaign/global/vce/index.htm ), a 2010 resolution for many companies should be to clean out their closets of legacy apps before spending time and effort virtualizing.

RainStor can load all of the structured data from legacy systems while standard BI tools and SQL continue to access that data online. That allows old legacy apps to be retired, rationalizing the portfolio of apps that eventually should be virtualized. Much like taking your old clothes and donation items to Goodwill or the Salvation Army, it’s time to clear out your app closet before you look forward to 2010, the year of cloud computing and virtualizing.

Reason #9 Why RainStor: As a SaaS provider I need to overcome objections about my prospect continuing to maintain their on-premise app

If you are a SaaS vendor hoping to lure a prospect into your new service, chances are that they already have a legacy on-premise system which they are using as their solution today. The first question they will ask you is, “What is the migration process?” You would typically answer that you will bring over the most current and relevant data from their on-premise app. But what of the rest of the data? You would probably prefer not to migrate the large amount of data in their legacy app, as it would clog up your production databases and increase your costs. But if you could assure your prospect that they could migrate ALL of their data, effectively shut down their on-premise app, and access all of their legacy data in the cloud through your interface, then you would really have something. You might even be able to charge a few extra dollars per month per user, since you’ll be saving them a bundle by not having to maintain their existing system.

RainStor can do exactly that for you: help you take the data you don’t want to place in your production environment, compress it to a fraction of its original footprint and still leave it accessible on demand.

Reason #10 Why RainStor: Because Data Warehouses Need Love Too

Whether you are using one of the new high-performance analytical products like Vertica, ParAccel, Aster Data or Greenplum, going Hadoop and MapReduce with the help of Cloudera or Cascading, or using more “traditional” technologies and hardware like Teradata and Netezza, you still have to figure out a strategy for what to do with all that “big data” once you’ve crunched the numbers. The same concept applies as with production transaction apps: these systems should be tuned to focus on the most recent and important data being gathered. Older data (and in some instances that might be merely hours or days old) still has value, and the more data you can keep online and accessible, the more opportunities and trends your business will uncover.

RainStor is complementary to data warehousing initiatives and can provide a highly efficient compressed store in which to hold historical structured data, thereby freeing up your production environments to perform at their best.

Reason #11 Why RainStor: Same data retained, same SLA performance for online access, 1/10th or less of the cost

Let’s take a real-life non-IT example. Imagine you are being charged monthly for all that overflow of stuff you moved out of your garage into Public Storage. If all of a sudden you saw an advertisement for another storage facility that said you could store all of your stuff for a year for what you pay Public Storage for a month, wouldn’t you move your stuff? Or at the very least, you’d put your latest batch of old stuff into the new facility, right?

Let’s now say you are the owner of that storage facility, and you could economically offer your new customers this amazing storage price. Wouldn’t you be at a competitive advantage in your industry? Even if you charged just 50% less, your overhead would be so low that your margins would far outstrip your competition.

RainStor can accomplish this for your IT infrastructure or data center. If you have the need to store and retain large quantities of structured data, you can achieve this either on-premise or in the cloud. With extreme compression, leveraging commodity-based hardware and practically no administration needed, you’ve just answered the question Why RainStor?

Reason #12 Why RainStor: Because Immutable Hardware like EMC Centera Can Now be Used for Structured Data Too

Last July (2009), RainStor (formerly Clearpace NParchive) completed certification on EMC Centera content-addressed storage (CAS). EMC Centera is the world’s most simple, affordable and secure repository for information archiving. It is a purpose-built, software-driven storage platform that provides a myriad of new capabilities for unlocking the business value from unchanging or infrequently changing digital assets.

By integrating RainStor’s specialized repository for structured or semi-structured data with EMC Centera, data sets from any packaged or custom database application (Oracle, DB2, PeopleSoft and more) or critical logs can now be stored on the Centera platform, all while allowing full SQL access through industry-standard reporting tools.

This new capability extends mutual customers’ uses of Centera, using it to deploy structured data archives alongside existing email and file archives.

More posts about RainStor on Cloud ‘N Clear: https://www.ramonchen.com/?tag=rainstor
