Getting Some Clarity On The Google Versus Amazon Archival Storage Products

This article is more than 9 years old.

When Google announced its Nearline archival storage product recently, it was heralded as a seriously disruptive introduction, largely due to its promise of a very quick (only a few seconds) retrieval time. That compares with the hours-long retrieval time of the cloud-based market leader, Amazon Web Services' (AWS) Glacier product. Now that the excitement has died down, it is worth digging a little deeper into both Google Nearline and AWS Glacier to give a neutral assessment of the benefits that each can bring.

After poring over some of the fine print and technical details, it seems that what we have is Google doing an impeccable marketing job with the product entry. Let's take a look at what both Google and Amazon can offer in different areas.

Retrieval Throughput Limit

According to Google, Nearline offers slightly lower availability and slightly higher latency than the company's standard storage product but with a lower cost. The “time to first byte” of approximately 2 – 5 seconds was held up by most commentators, myself included, as a real game changer.

However, digging into the “slightly higher latency” aspects, we promptly discover some significant issues. Google Nearline limits data retrieval to 4 MB/s for every TB stored. This throughput scales linearly with increased storage consumption. For example, storing 3 TB of data would guarantee 12 MB/s of throughput, while storing 100 TB of data would provide users with 400 MB/s of throughput.
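To make the scaling rule concrete, here is a minimal sketch in Python. The function name is my own invention for illustration; the only figure taken from Google is the 4 MB/s per TB stored.

```python
# Sketch of Nearline's retrieval throughput rule as described above.
# The function name is hypothetical; only the 4 MB/s-per-TB figure
# comes from the article.

THROUGHPUT_MB_PER_SEC_PER_TB = 4


def nearline_throughput_mb_per_sec(stored_tb: float) -> float:
    """Return the retrieval throughput for a given amount of stored data."""
    return THROUGHPUT_MB_PER_SEC_PER_TB * stored_tb


for stored_tb in (1, 3, 100):
    rate = nearline_throughput_mb_per_sec(stored_tb)
    print(f"{stored_tb} TB stored -> {rate} MB/s")
```

The point is simply that a small archive gets a small pipe: the less you store, the slower you can read it back.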

So, if a customer stores 1TB of data within Nearline, their download will start within 2 – 5 seconds, and then promptly take 73 hours to complete (assuming they are downloading 1TB at 4 MB/second).

Compare the same 1TB case with Amazon Glacier: AWS will have that object available to customers in approximately 3 – 5 hours. Four hours into their download, a Google Nearline customer would be about 5% complete on downloading their 1TB of data, with approximately 69 hours to completion.
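The arithmetic behind those figures can be checked with a short sketch. It assumes binary units (1 TB = 1,048,576 MB), which is what reproduces the roughly 73-hour total; with decimal units the total comes out nearer 69 hours. The function names are mine, not anything from Google's documentation.

```python
# Rough check of the Nearline download-time arithmetic.
# Assumption: binary units (1 TB = 1024 * 1024 MB), which matches
# the ~73-hour figure quoted in the article.

MB_PER_TB = 1024 * 1024
THROUGHPUT_MB_PER_SEC_PER_TB = 4  # figure quoted in the article


def nearline_download_hours(download_tb: float, stored_tb: float) -> float:
    """Hours needed to download download_tb when stored_tb is held in Nearline."""
    throughput_mb_per_sec = THROUGHPUT_MB_PER_SEC_PER_TB * stored_tb
    seconds = download_tb * MB_PER_TB / throughput_mb_per_sec
    return seconds / 3600


def fraction_done(elapsed_hours: float, download_tb: float, stored_tb: float) -> float:
    """Fraction of the download finished after elapsed_hours, capped at 1.0."""
    total_hours = nearline_download_hours(download_tb, stored_tb)
    return min(1.0, elapsed_hours / total_hours)


total = nearline_download_hours(download_tb=1, stored_tb=1)
done = fraction_done(elapsed_hours=4, download_tb=1, stored_tb=1)
print(f"Total download time: {total:.1f} hours")         # 72.8 hours
print(f"After 4 hours: {done:.1%} complete")             # 5.5% complete
print(f"Remaining after 4 hours: {total - 4:.1f} hours")  # 68.8 hours
```

Because throughput scales with the amount stored, pulling back everything you hold takes about the same time whether you store 1 TB or 100 TB under this rule.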

So it seems that Google has purposely muddied the message here, and AWS, whose retrieval time looks slothful, is, in fact, faster than Google's.

Pricing Surprises

Data Access Costs – Glacier customers can access 5% of their cold data for free, while Google Nearline charges customers for all retrievals. Given that a single archive can be as large as 40 terabytes, that means an Amazon Glacier customer storing one such archive can pull up to 2 TB of data per month for free. With Nearline, however, data retrieval incurs a cost of $0.01/GB. If an object is accessed once a month, its effective price is $0.02/GB ($0.01/GB for storage and $0.01/GB for data retrieval), which is suddenly twice what the customer expected.
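The two calculations in that paragraph can be sketched as follows. The prices and the 5% allowance are the figures quoted above; the function names and the simple "price times number of full reads" model are my own simplifying assumptions, not either vendor's billing formula.

```python
# Sketch of the data-access arithmetic quoted above.
# Assumptions: function names are hypothetical, and retrieval cost is
# modelled simply as price-per-GB times the number of full reads per month.

NEARLINE_STORAGE_PER_GB_MONTH = 0.01   # USD, from the article
NEARLINE_RETRIEVAL_PER_GB = 0.01       # USD, from the article
GLACIER_FREE_RETRIEVAL_FRACTION = 0.05  # 5% free, from the article


def nearline_effective_price_per_gb(full_reads_per_month: float) -> float:
    """Monthly USD cost per GB when the object is read this many times a month."""
    return (NEARLINE_STORAGE_PER_GB_MONTH
            + NEARLINE_RETRIEVAL_PER_GB * full_reads_per_month)


def glacier_free_retrieval_tb(stored_tb: float) -> float:
    """TB that can be retrieved for free given the amount stored."""
    return GLACIER_FREE_RETRIEVAL_FRACTION * stored_tb


print(f"Nearline, never read:      ${nearline_effective_price_per_gb(0):.2f}/GB")
print(f"Nearline, read once/month: ${nearline_effective_price_per_gb(1):.2f}/GB")
print(f"Glacier free allowance on 40 TB: {glacier_free_retrieval_tb(40):.1f} TB")
```

Data that really is cold stays at the headline price; the doubling only bites when "archival" data turns out to be read regularly.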

Network Egress (Data transfer out) Cost – Network egress charges are 25% higher with Google Nearline than with Amazon Glacier. Customers who want to move data out of either product will also need to think about network traffic charges, and AWS wins the price competition in this area.

Cost Management Tools

AWS allows customers to set policies that limit users to retrieving a certain amount of data per day - a customer could set a retrieval policy that stays within the free tier. Nearline doesn't appear to have similar granularity.

Data Lifecycle Management Capabilities

This is important for organizations looking to automate their hot and cold storage strategy. Google doesn’t have lifecycle policy features, so Google customers cannot tier data from Google Standard to Nearline automatically. In contrast, within AWS, S3 customers can use S3 lifecycle policies to tier objects from Standard to Glacier. Customers also need to be careful about using versioning with Nearline, because early deletion charges apply when they overwrite existing objects. For example, if a customer creates an object in a bucket configured for Nearline and overwrites it 10 days later, the overwrite is treated as an early deletion of the original object, and they will be charged for the remaining 20 days of Nearline's 30-day minimum storage period.
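The early deletion example works out as below. This is a sketch under stated assumptions: the 30-day minimum is inferred from the 10-day and 20-day figures in the example, the $0.01/GB monthly storage price is the one quoted earlier, and pro-rating the charge over a 30-day month is my own simplification rather than Google's published formula.

```python
# Sketch of the early deletion charge in the example above.
# Assumptions: a 30-day minimum storage period (inferred from the
# 10-day / 20-day example), $0.01/GB-month storage, and a charge
# pro-rated over a 30-day month. Function names are hypothetical.

MINIMUM_STORAGE_DAYS = 30
STORAGE_PER_GB_MONTH = 0.01  # USD, figure quoted earlier in the article


def early_deletion_days(object_age_days: int) -> int:
    """Days of storage still billed when an object is deleted or overwritten."""
    return max(0, MINIMUM_STORAGE_DAYS - object_age_days)


def early_deletion_charge(object_age_days: int, size_gb: float) -> float:
    """Approximate USD charge for the unused part of the minimum period."""
    billed_days = early_deletion_days(object_age_days)
    return size_gb * STORAGE_PER_GB_MONTH * billed_days / MINIMUM_STORAGE_DAYS


print(early_deletion_days(10))   # 20 days still billed
print(early_deletion_days(45))   # 0, past the minimum period
print(f"${early_deletion_charge(10, 1000):.2f} for a 1,000 GB object")
```

With versioning on a frequently rewritten bucket, every overwrite inside the minimum period triggers this charge, which is why it deserves attention.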

Summary

The old adage "if it sounds too good to be true, it probably is" would seem to hold here. Every commentary I read got carried away with Google's spin and failed to see the bigger picture. It will be interesting to see how AWS responds to this - Google dealt them a blow, and their strategy for fighting back will be fascinating to watch.