There’s been a lot of discussion about tiered storage lately. Most notably, Stephen Foskett has written a series of posts on the topic on his Nirvanix blog, Enterprise Storage Strategies. In his latest post, he essentially argues that tiered storage hasn’t turned out to be cost effective and that cloud storage could be the best option for the lower tier.
We certainly agree with him that unstructured data has become unmanageable due to the proliferation of rich media and other large files. We also agree that, to a large extent, tiered storage hasn’t lived up to its promise. However, let’s not be too quick to throw the baby out with the bathwater. As Hu Yoshida has discussed in a recent post, tiering has come a long way in light of new technologies, particularly virtualization. In our view, by combining virtual tiering at the block level (as described in Hu’s post) with virtual tiering at the file level, you can get the best of both worlds.
Tiered storage used to be about moving data from one physical storage device to another. The premise was that some storage is fast and expensive, other storage is slower but cheaper, and you could save a lot of money by putting each file in the appropriate place.
This was a good idea in theory, but in practice a number of unforeseen problems emerged. First, the tools for moving files were themselves sometimes expensive. There goes your cost savings. On top of that, they were sometimes good at moving files but not at getting them back. And when the fast tier and the cheap tier came from different vendors, it often proved difficult to keep moved files transparently findable by users and applications. As you can guess, these problems often made the whole exercise more trouble than it was worth.
The fact remains, though, that most files sit on storage with more performance, and a higher price tag, than those files actually need. Most storage admins know that 80% of their files could live on a cheaper tier, if moving them there weren’t such a hassle, or so expensive.
One solution with immense potential is to have virtual tiers within a single filer or namespace. Virtual tiers are levels of dedupe and compression applied to a file, making it cheaper to store because it’s taking up less space. In a virtual tier, the file does not have to move anywhere – it can stay right where it is, but you reduce the cost of storing it by shrinking it. With dedupe and compression, there are lots of choices for trading off performance versus space savings.
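To make the shrink-in-place idea concrete, here is a minimal sketch in Python. It is only an illustration: gzip stands in for whatever dedupe or compression a real optimizer would apply, and the .gz sidecar file is a hypothetical on-disk form, not how any shipping product stores optimized data.

```python
# Toy illustration of "shrinking a file in place": the file's logical
# location stays put while the stored bytes get smaller. gzip is a
# stand-in for a real optimizer's dedupe/compression.
import gzip
import os

def shrink_in_place(path: str) -> float:
    """Compress a file next to itself; return the fraction of space saved."""
    original_size = os.path.getsize(path)
    with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
        dst.write(src.read())
    return 1 - os.path.getsize(path + ".gz") / original_size
```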
Sun’s ZFS file system allows this, and cloud storage like Nirvanix can do it too – with the added advantages of using the latest technology and of keeping the technology behind the cloud interface invisible to the user. Either way, let’s look at how you can implement virtual tiers while keeping files in the same place they were created.
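On ZFS, for instance, dedupe and compression are per-dataset properties, so you can approximate virtual tiers by giving each dataset its own optimization level. A sketch, assuming hypothetical pool and dataset names and a ZFS release that supports the dedup property (note that ZFS applies these settings to newly written blocks):

```python
# Sketch: set per-dataset dedup/compression properties on ZFS to create
# "virtual tiers". Pool and dataset names here are hypothetical.
import subprocess

def zfs_set(dataset: str, prop: str, value: str) -> None:
    """Run `zfs set prop=value dataset`."""
    subprocess.run(["zfs", "set", f"{prop}={value}", dataset], check=True)

zfs_set("tank/tier2", "dedup", "on")            # dedupe only
zfs_set("tank/tier3", "dedup", "on")            # dedupe plus
zfs_set("tank/tier3", "compression", "lzjb")    #   light compression
zfs_set("tank/tier4", "dedup", "on")            # dedupe plus
zfs_set("tank/tier4", "compression", "gzip-9")  #   heavy compression
```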
Let’s say Tier 1 is for your fast, hot files – they live on your Tier 1 filer, uncompressed. Virtual Tier 2 might then be defined as all files that have not been modified in 7 days, and Tier 2 would be that same filer, same volume, but with a policy that files meeting the Tier 2 definition get deduped. No compression, just dedupe. Read-back times will be quite fast – maybe not exactly as fast as reading the original un-deduped file, but almost.
A Virtual Tier 3 might be “files that have not been modified in 30 days,” with the tier defined as dedupe plus light compression. Read-back will be a tad slower, but the space savings greater than with dedupe alone. Finally, you might have a Virtual Tier 4: dedupe plus maximum compression. This might invoke more complex compressors that take longer to compress (and decompress) a file, but deliver excellent space savings. Read-back performance for Tier 4 might be quite a bit slower, but file sizes in that tier might shrink by 90% or more.
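Here is a minimal sketch of that policy as code, assuming file age since last modification drives the tier choice. The 7- and 30-day thresholds come from the example above; the 90-day cutoff for Tier 4 is our own placeholder, since the example doesn’t specify one.

```python
# Map a file's age (days since last modification) to a virtual tier,
# following the example policy described above.
import os
import time

DAY = 24 * 60 * 60

def pick_virtual_tier(path: str) -> str:
    age_days = (time.time() - os.path.getmtime(path)) / DAY
    if age_days < 7:
        return "tier1"   # hot: store uncompressed
    if age_days < 30:
        return "tier2"   # dedupe only
    if age_days < 90:    # 90-day Tier 4 cutoff is a placeholder
        return "tier3"   # dedupe + light compression
    return "tier4"       # dedupe + maximum compression
```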
Here’s the kicker: All of this can be done without moving a file off the filer it started on. Users and applications can still find the file right where it always was. If they access the file, the optimization solution will transparently “rehydrate” the file.
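A toy sketch of that read path, continuing the earlier gzip stand-in: the caller asks for the original path and gets the original bytes back, whether or not the file has been optimized behind the scenes.

```python
# Transparent "rehydration" on access: callers use the original path;
# the shim decompresses on the fly if the file was optimized. gzip and
# the .gz sidecar are stand-ins, as in the earlier sketch.
import gzip
import os

def read_file(path: str) -> bytes:
    optimized = path + ".gz"
    if os.path.exists(optimized):
        with gzip.open(optimized, "rb") as f:
            return f.read()    # rehydrate transparently
    with open(path, "rb") as f:
        return f.read()        # hot tier: stored as-is
```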
There are different solutions that can do some or all of these things today. NetApp’s dedupe can only dedupe all of the files in a volume or none, so it can’t be used today to create logical or virtual tiers within a volume. Other solutions, like the Ocarina ECOsystem, are policy-based: they can create multiple logical (or virtual) tiers within a single filer or volume, with multiple dedupe settings (including Ocarina’s patented Object Dedupe) and multiple levels of compression, drawing on a choice of over 100 compressors for different file types.
Ocarina has been tightly integrated with certain types of storage – including cloud solutions like Nirvanix – and the most transparent virtual tiers come from combining Ocarina with one of the filer choices that have that tight integration: BlueArc, EMC, HDS HNAS, HP, Isilon and Nirvanix (in alphabetical order – no vendor preferences implied!).
Of course, virtual tiers can be combined with real physical tiers. That lets you pair levels of storage optimization (dedupe, compression) with storage of different physical characteristics (expensive filers, cheap filers, cloud storage), producing not just a simple two-tiered model but a policy-driven environment of possibly a dozen or more logical tiers, with files tiered-in-place or migrated-and-optimized automatically based on policy, with little or no storage admin involvement.
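One way to picture such an environment is a single policy table that pairs a physical target with an optimization level and an action. A sketch, in which every name, threshold, and action label is hypothetical:

```python
# One policy table covering both physical and virtual tiering. Each rule:
# (min age in days, physical target, dedupe?, compression, action).
POLICY = [
    (0,   "fast-filer",  False, "none",  "leave-as-is"),
    (7,   "fast-filer",  True,  "none",  "optimize-in-place"),
    (30,  "cheap-filer", True,  "light", "migrate-and-optimize"),
    (180, "cloud",       True,  "max",   "migrate-and-optimize"),
]

def rule_for(age_days: float) -> tuple:
    """Return the last rule whose age threshold the file has passed."""
    chosen = POLICY[0]
    for rule in POLICY:
        if age_days >= rule[0]:
            chosen = rule
    return chosen
```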
As you can see, there is vast potential in this new approach to tiering. Even better, it can be achieved in a way that makes storage admins’ jobs easier, rather than harder. Like a lot of things, storage tiering has always been a good idea, but sometimes the technology has to catch up with the idea before implementing it makes sense. Given the growth of storage, and the improvements in physical and virtual tiering, I think doing a better job of tiering must rank close to the top of the list for many customers.