Online Storage Optimization

Exploring Next Generation Storage Solutions

Happy New Year

Posted by Sunshine On December - 29 - 2009

’Tis the week for the “out of office” email messages. But the storage blogo-tweet-osphere waits for no man. Here are a few posts that caught my eye this week.

Bas Raayman sees CPU power hitting the wall: The RAM per CPU wall

Rick Vanover says 2010 could be the year for 10GigE - Will 2010 see 10 Gigabit Ethernet go mainstream?

It being the end of a year–and a decade–predictions abounded. We’re pleased to note that when it came to summarizing the top storage stories of 2009, deduplication for primary storage, the specialty of this blog’s parent Ocarina, made the big lists:

InfoStor: The top 5 storage technologies of 2009 (and 2010?)

“Storage optimization (or data reduction) technologies such as data deduplication and compression can significantly reduce capacity requirements and costs … Consider data reduction for primary storage.”

SearchStorage - Beth Pariseau: Top 10 enterprise data storage news stories of 2009

“10. Data deduplication branches out. As deduplication settled into a comfortable role in backup, data-reduction technology started working its way into other parts of the data storage infrastructure, including primary as well as nearline and archived data … Ocarina and Isilon Clustered NAS help visual effects studio archive images, cut costs.”

For sheer inventiveness, blogger Stephen Foskett wins the prize with his 2009 predictions post, in which he turns the clock back and takes advantage of 20-20 hindsight: My 2009 IT Industry Predictions.

Meanwhile, social media and tech watcher Louis Gray takes himself to task and looks at all of his 2009 predictions to see how well he fared: My 2009 Tech Predictions: Mixed, But Nailed Real-Time.

OK that’s all for now. Here’s wishing all of you a happy, healthy, green and techy new decade.

SNW Revisited

Posted by Sunshine On August - 17 - 2009

InfoStor editor Dave Simpson has a post on his storage blog about how Storage Networking World (SNW) can get back some of its lost luster. As he notes, trade shows in general are feeling the pinch in the current recession, and SNW–which just posted the agenda for its upcoming fall conference–is no exception. For the post, he spoke with Mike Alvarado, a consultant in the storage industry and former Storage Networking Industry Association (SNIA) board member.

Alvarado has some pretty radical new ideas. Mainly, he recommends that SNW shift its focus away from end users and toward channel partners, or so-called VARs, as a way of bringing some energy back to the show.

Says Alvarado: “Resellers and integrators are a vital network storage industry segment. I have seen many great conversations take place between vendors and these partners at different SNWs; those exchanges represent opportunity to drive great value for our industry. I believe if SNW focused on optimizing the interaction between vendors and resellers/integrators, it would pay large dividends. Calling the show something else or founding a new show with different sponsorship would be needed, but whatever it takes the sooner this happens the better.”

Wow. A new name, even? This would be a major change of direction for the show. One question I had in reading this: would it benefit innovation in the industry? As we noted in a post following last spring’s show, one of its more disappointing aspects was the dearth of startups.

As our lead blogger Carter George wrote last April: “(Startups were) … always an exciting part of SNW for me: a peek into the future and a chance to see where the big guys might be headed next. As recently as last year I recall seeing maybe 15 or 20 small vendors taking up floorspace, all of them hoping to not only find end user leads, but also to catch the eye of potential big partners.”

Overall, it is good to see that people are talking about ways to keep the show going–and relevant. I can’t help thinking, however, that it might be worthwhile to wait out the recession and see if the energy just naturally returns to the show before turning it in a whole new direction.

Dedupe for Online - What’s Next?

Posted by Ocarina On July - 10 - 2009

In the wake of the high profile Data Domain acquisition battle, data deduplication is the technology du jour. This has led to wide-ranging discussions on the topic, one of which is the question of where deduplication is being applied.

A recent Wikibon article, “Pitfalls of compressing online storage,” notes that Data Domain represents only one sector of the market–dedupe for backups. But what of dedupe and compression for online storage? (Though technically, of course, even backups could be said to qualify as “online storage” since they are still accessible.)

The authors, analysts Dave Vellante and David Floyer, reach conclusions similar to those many storage analysts have already drawn on this topic: namely, that dedupe for online is a different animal, and that the vendors who service the backup world aren’t equipped to handle it. In fact, a whole new market has emerged in which a different set of players is providing the data reduction solutions for primary/nearline storage. This, they note, has added a whole new layer of complexity to storage that has its own drawbacks. All true from this somewhat limited viewpoint.

InfoStor’s Dave Simpson summarized the Wikibon article in a recent post, boiling online storage optimization down to three market sectors:

–“Data deduplication light” approaches such as those used by NetApp and EMC
–Host-managed data reduction (e.g., Ocarina Networks)
–In-line data compression (e.g., Storwize)

And so, as is so often typical of an emerging technology area, dedupe is currently viewed as a set of point solutions for each tier of storage. Hot storage? Try compression with Storwize. Online, nearline, archive? Try Ocarina. Backup? Try Data Domain. This is where the situation stands at the moment, and for the most part the analysts are correct to summarize it this way. However, we believe that this is just the beginning of a larger trend, to which I referred in an earlier post “The Dedupe (R)evolution.” What will really change the face of dedupe over the course of the next couple of years is the concept of end-to-end optimization.

That is, instead of shrinking a file on a filer, then expanding it to copy it to an archive, where it will be shrunk by a different solution, and then expanding it again to back it up, where it will be deduped by another solution, why not optimize a file early in its lifecycle and then manage it as an optimized object as you move it, back it up, and archive it? That saves not only on storage space, but also on network and SAN bandwidth and on the amount of CPU cycles, power, heat, and cooling used to repeatedly expand and then re-dedupe files over and over.

For now, the differences as sketched out by Simpson and Vellante are reasonably accurate from our point of view. However, some of the advantages and disadvantages may be overstated, for example the performance difference between in-band and out-of-band approaches. Think of it this way. In general, whether for backup or primary storage, an in-band approach is going to support faster throughput than a post-process approach, and a post-process approach, because it has time to do more intelligent things, should always get better data reduction. The question is, how much of a given customer’s data needs the (sometimes slight) performance advantage of an in-line solution? If performance is good enough with post-processing, most customers would prefer to get the maximum space savings.

Most industry data suggests that over 90% of online file data - even data on high-performance primary tier filers - is accessed only infrequently starting three days after it is created.

At the same time, the vast majority of a customer’s terabytes are in files that are well over three days old. So if you have a way to choose which files you optimize, then you can go ahead and use post-processing with performance as good as or better than an in-band process. You would simply do nothing to files as they are being created (so no performance penalty at all) and for their first n days of existence. Then, when the time is right, you optimize the file with the maximum space savings.
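To make the "leave new files alone, optimize older ones" idea concrete, here is a minimal sketch in Python. It is not Ocarina's implementation. The FileRecord type, the function names, and the three-day default are all assumptions made for illustration, and a real system would read file ages from the filer rather than from an in-memory list.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file on a filer."""
    path: str
    size_bytes: int
    created: float  # creation time, in seconds since the epoch


def select_for_optimization(files, now, min_age_days=3):
    """Return the files that have existed for at least min_age_days.

    Younger files are left untouched, so they pay no performance penalty.
    """
    cutoff = now - min_age_days * SECONDS_PER_DAY
    return [f for f in files if f.created <= cutoff]


def bytes_eligible(files, now, min_age_days=3):
    """Total size of the files the policy would hand to the optimizer."""
    return sum(f.size_bytes for f in select_for_optimization(files, now, min_age_days))
```

Passing the current time in as an argument (for example, the value of time.time()) keeps the selection logic easy to test and to rerun with a different n.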

A key to achieving the best balance of space savings and performance is to be able to optimize data by policy, and to be able to tune the level of optimization by file type, owner, modified time, and a wide set of performance and tuning choices. The conventional wisdom is “Storwize for live databases, and Ocarina for files,” and in general we think that is more or less true. But we also think that for most customers, in 9 of 10 situations, the big dollar savings are going to be in policy-based optimization of files. As it happens, Ocarina is the one that has this capability, which in the end translates to notable differences in how well the vendors save customers money.

Overall, it’s good to see these topics being wrestled over among analysts and journalists, and we hope it continues.

Dedupe for Primary - Recent Coverage

Posted by Sunshine On June - 22 - 2009

As we keep noting on this blog, data reduction is becoming the topic du jour as storage budgets are squeezed and deduplication becomes more and more viable and effective. Dave Simpson, Editor-in-Chief of InfoStor, came out with a very thorough article today on primary storage optimization. It’s a practical guide for customers who may be struggling to understand the differences between key vendors’ offerings in this new and exciting data reduction arena. The vendors covered are NetApp, EMC, Ocarina (this blog’s parent), Storwize, Hifn, and greenBytes.

According to Simpson’s article, performance is a key issue to consider when assessing primary storage optimization products. He also quotes Eric Burgener, formerly an analyst with Taneja Group (now with InMage), who notes that the much-touted differences in reduction rates can often be overplayed.

“… a handful of vendors are addressing the performance requirements associated with data compression and de-duplication on primary storage, and … users should understand that there’s not a huge difference between, say, an 8:1 data-reduction ratio and a 20:1 ratio.”

An interesting point, and one that is often overlooked in the race to show results. As we have reported on this blog in the past, the real comparisons should be about the percentage of space saved, not the ratios, which can be misleading. So for example, the Ocarina ECOsystem got twice the reduction of NetApp dedupe on a typical home shares file mix: 54% vs. NetApp’s 27%. These are real numbers that can give you a sense of the amount of storage space you’re likely to reclaim when deploying one of these solutions.
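The arithmetic behind the ratio-versus-percentage point is easy to check. The short Python sketch below converts between an N:1 reduction ratio and the percentage of capacity saved; the function names are ours, chosen for illustration. An 8:1 ratio saves 87.5% of the space and a 20:1 ratio saves 95%, so the jump in ratio that looks like 2.5x on paper reclaims only another 7.5 percentage points of capacity.

```python
def percent_saved(ratio: float) -> float:
    """Convert an N:1 data-reduction ratio into percent of capacity saved."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1 (1:1 means no reduction)")
    return (1 - 1 / ratio) * 100


def ratio_from_percent(percent: float) -> float:
    """Convert percent of capacity saved back into an N:1 ratio."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in the range [0, 100)")
    return 1 / (1 - percent / 100)


if __name__ == "__main__":
    for ratio in (2, 8, 20):
        print(f"{ratio}:1 -> {percent_saved(ratio):.1f}% saved")
    for percent in (27, 54):
        print(f"{percent}% saved -> {ratio_from_percent(percent):.2f}:1")
```

Going the other way, the 54% and 27% figures quoted above correspond to ratios of roughly 2.2:1 and 1.4:1.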

And by the way, Eric Burgener had a really nice post back in February when he was still at Taneja Group.  Called Pulling One Out of the Hat, it gives great advice and details about how to make the best use of your primary storage budget in these times. Definitely worth a read.

Happy Monday everyone!

When You Only Have a Hammer…

Posted by Ocarina On May - 7 - 2009

Nice to see Dave Simpson from InfoStor getting interested in the subject of dedupe for primary. Today, he had a new post on the topic, along with a plug for a related webcast he’s doing with Noemi Greyzdorf of IDC. In the post, Dave makes the point that people are starting to muddy the waters when it comes to terms such as “dedupe for primary”–something that NetApp and others are popularizing at the moment. He notes that quite often, the term covers a lot more than just dedupe, and can include compression, single instancing, and other capacity optimization methods.

Dave makes some great points. For example, there are a lot of possible trade-offs you can make between performance and space savings when you are looking at data reduction technologies. My company Ocarina gives you the choice of superfast lightweight compression, subfile dedupe (fast), object dedupe (medium fast, but better results) and content-aware compression (slow, but great results).

You can turn any of these things on or off, and you can pick the optimizations you want by policy not only by volume, but right down to the individual file or file type.

For example, you could say, “I want wire speed lightweight compression only for my MS Word docs, but use object dedupe and content-aware compression on any PDFs you find in this homeshare volume; don’t do anything to this database volume, and do everything you know how to do to shrink this archive volume.”
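As a rough illustration of what a policy like that could look like when written down, here is a small Python sketch. It is not Ocarina's policy language. The rule format, the volume paths, and the optimization names are all hypothetical, and the first matching rule wins.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical names for the four kinds of optimization.
EVERYTHING = [
    "lightweight_compression",
    "subfile_dedupe",
    "object_dedupe",
    "content_aware_compression",
]

# Ordered rules: (path pattern, file-name pattern, optimizations to apply).
# The volume paths below are made up for this example.
POLICY = [
    ("/database/*", "*", []),            # don't touch the database volume
    ("/archive/*", "*", EVERYTHING),     # shrink the archive as much as possible
    ("/homeshare/*", "*.pdf", ["object_dedupe", "content_aware_compression"]),
    ("*", "*.doc", ["lightweight_compression"]),   # MS Word docs anywhere
    ("*", "*.docx", ["lightweight_compression"]),
]


def optimizations_for(path: str) -> list:
    """Return the optimizations the first matching rule assigns to a file."""
    name = PurePosixPath(path).name.lower()  # match extensions case-insensitively
    for path_pattern, name_pattern, actions in POLICY:
        if fnmatchcase(path, path_pattern) and fnmatchcase(name, name_pattern):
            return list(actions)
    return []  # no rule matched: leave the file alone
```

Rule order matters in this sketch: the database rule comes first so that nothing on that volume is touched, even a Word document that a later rule would otherwise match.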

There’s no one right answer. The hotter your data, the more likely it is to be true primary data with lots of users reading it and writing it in real time. That means you have to be that much more careful about what data reduction algorithms you apply to it. The colder the data, the more aggressive you can be. Some data reduction vendors can only deploy in one tier of storage because they only have one kind of tool. If all you have is wire speed compression, you can do hot database data, but you won’t be able to shrink most kinds of data at all. If you have a heavyweight dedupe and compression solution, you may be stuck in archive or backup, because you’re not fast enough for primary. If all you have is a hammer, then all the world looks like a nail.

At Ocarina, we want to give you the toolbox, and the ability through policies to intelligently decide which tools to use for each file, volume or data set. This means you can match your data reduction strategy to a file’s performance requirements and get the best fit for each part of your storage - primary, nearline, archive, etc. Multiple tools, multiple options, and a far more customized solution. At the end of the day, having a really really great hammer is only marginally useful if what you need to do is cut something in half … like your storage budget.

What We’re Reading - April 28

Posted by Sunshine On April - 28 - 2009

An interesting day today in that there isn’t too much in the way of hard news, but plenty o’ commentary floating around the old blog-o-tweet-o-news-o-sphere regarding storage.

First, lots of buzz about Ocarina this past week:

We earned a nice mention in the keynote address at BIO IT World, where Ocarina is in attendance and was a “best in show” finalist. Things are going extremely well there, so stay tuned!

Dedupe Team Up: A post by George Crump of Storage Switzerland on InformationWeek that shows how Ocarina is working with major storage vendors to help them compete with NetApp on dedupe.

NetApp’s Dr. Dedupe questions Ocarina about whether it is doing dedupe in this post “When is Dedupe Not Dedupe?” Kind of an odd one–as W. Curtis Preston points out in the comments field, the “Technology” pages on the Ocarina web site–particularly this one which breaks down the Ocarina ECOsystem process–answer his question. I’d also suggest he read this recent post which was actually in response to another NetApp blogger, Alex McDonald.

Another mention of Ocarina and its recent BlueArc partnership announcement appears in Wikibon’s Bill Mottram’s summary of the scene at NAB last week. He writes:

“BlueArc: The news on the BlueArc booth was the announcement (April 20th) of their partnership with Ocarina. Availability of the Ocarina Optimizer for BlueArc is scheduled for mid-May.”

And in other news…

InfoStor’s Dave Simpson comments on the Oracle-Sun acquisition with some thoughtful insights on his newly revved up blog.

Devang Panchigar has a nice round-up and some commentary on EMC’s recent V-Max announcement on Gestalt IT/Storagenerve.

And, as widely reported, EMC has asked its employees to take an across the board pay cut. And, with the kind of speed we’ve come to associate with blogging, EMCer Storagezilla posted on the 5% cut on his blog right after the announcement was made public.

Happy Tuesday everyone!

InfoStor Revs up Blogs

Posted by Sunshine On April - 8 - 2009

As everyone in the industry knows, storage publication InfoStor is a great resource for news and views on storage. One area where it hasn’t been all that active is the blogging arena. Well, all that is changing. Dave Simpson, Editor-in-Chief, and Kevin Komiega, senior editor, are both blogging as we speak, on site from SNW, and the word on the street is that they’re planning to ramp up this aspect of their reporting from here on out.

This is a good thing on many levels. I doubt I’m alone when I confess that I read more blogs than full-on pubs these days, and so I’m much more likely to read ongoing blog-sized updates from these guys. They are both extremely knowledgeable about storage.

My one complaint is that there’s no obvious way to get an RSS feed for Kevin’s blog. It is possible to subscribe to Dave’s blog because it also lives on Blogger. Here’s hoping they fix that.

Update April 10: Kevin has informed me that he also has a Blogger version of his blog. In addition, @BlueArc set up a feed on Toluu for Dave Simpson’s blog.

In any case, I look forward to getting news and views from these two on a regular basis, and I encourage you to check their blogs out.