
Online Storage Optimization

Exploring Next Generation Storage Solutions

Dedupealooza

Posted by Sunshine On February - 19 - 2010

So much talk about dedupe these days it’s hard to keep up. The industry is waking up to the reality that dedupe is one of the best ways to reduce data, thus saving on power, cooling, space and other crippling storage costs.

Some of the more thought provoking posts of late:

DCIG - How SSDs can be leveraged to Deliver Inline Deduplication for Primary Storage
Jerome Wendt responds to a comment from someone about Hifn’s Bitwackr inline dedupe. I don’t necessarily agree with Jerome’s take on this. In general, inline solutions are extremely limited, as the original commenter pointed out. But the post provides interesting food for thought.

Storagebod - Where is OnTap 8 with a bit of a rant!
Martin Glassborow isn’t talking specifically about NetApp dedupe here, but the delay on shipping OnTap8 is of interest to anyone who is concerned about data reduction products. As he puts it, the elephant in the room is that A-SIS dedupe as it now stands has limited scalability.

Recovery Monkey - More FUD busting: Deduplication - is variable-block better than fixed-block, and should you care?
This post, by Dimitris Krekoukias, argues that the major distinction some vendors make between variable and fixed block deduplication is a way of distracting customers from the real issues. The post served to defend NetApp against its detractors and competitors who say fixed block dedupe is limiting. The comments field is in some ways the most interesting part, with EMC heavy Chuck Hollis raising questions about his connections with NetApp. Also, our own Mike Davis weighed in, and the numbers he cited were so notable that further commenters questioned how this could be lossless compression. At this point, we’re used to it–the industry at large has become accustomed to less than spectacular results. More on all of this in a later post.

And here’s another interesting trend. The word “dedupe” is starting to creep into the lingo in a more general way. Among storage tweeps there is a greater tendency to throw “dedupe” into their conversations about everything from their record collections to what they eat. It reminds me a little bit of the “hepcat” slang I used to hear when hanging around jazz musicians. If something was ordinary, they’d call it “B Flat,” since that’s the most common key in jazz. For example, “Oh, I just had a B Flat lunch today of a burger and fries.”

The often Twit-witty Greg Schulz recently tweeted: “ I can have dvr record on disk NBC tape delay (thats probably on disk) then dedupe da commercials.” Good plan, Greg.

This post by Steve Gillmor at TechCrunch also uses the term–in a way that I’ve never heard before. In this case, he’s referring to the fact that there is duplication of content across what are now becoming overlapping social networks–FriendFeed, Twitter, and the new Google Buzz.

OK that’s all for now. Keep on deduping friends!

Jingle Bell Storage Rock

Posted by Sunshine On December - 22 - 2009

‘Tis the season for holly hocks, eggnog, and pundits predicting industry trends for the coming year. Apparently, in the storage blogo-tweet-osphere, getting in the holiday spirit means poking fun at one another.

One of the most notable entries is a series of “Letters to Father Christmas” penned by none other than one of our favorite bloggers, Storagebod (Martin Glassborow). They are broken into seven — count ‘em! — installments (Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, The Last One). Storagebod really kept Santa busy this year.

The letters contain such gems as this jab at storage giant EMC, which he has dubbed “Elven Magic Company”: “there was that V-MAX announcement; boy did that go down well with all of the other Father Christmases. Still, it was great to see you get down with the Intel Gnomes; still a pity that the FAST magic wasn’t ready to go quite yet…”

And this for another company he calls Hippie Pixies (try to guess…) “I think you need a tete-a-tete with your Hitachi partners; they’ve been sleeping on the job! They either come up with a refreshed array or you’re going to have to go acquisitive to get yourself a top of the line Tier 1 array.”

And lest you think that the Elven Magic folk are balking, he got props from the big elf himself, Chuck Hollis. He writes: “He tells all of us vendors what we need to hear, and does it in a gentle and funny manner.  I think Martin has emerged as one of the most insightful and reasonable voices in the storage blogging world, and we all follow him closely here at EMC.”

Martin G. also sends us and our fellows a little gifty: “A special shout out to the smaller vendor bloggers; what you do is very important, more so than the bloggers of the big guys…Most of you do it very well and with a keen sense of humour! Keep it up and keep me smiling.” We’re working on it, Martin, we really are.

But wait, Mr. Bod isn’t the only one getting nipped by Jack Frost. Over at his StorageIO blog, Greg Schulz has penned a post for the season: Behind the Scenes, SANta Claus Global Cloud Story. In it, we learn that “The heart or brains of the SANta operation is his global system operations center (SOC) or network operation center (NOC) that rivals those seen at NASA among others with multiple data feeds. The SOC is a 24×365 operations function that covers all aspects from transportation, logistics, distribution, assembly or packaging, financials back office, CRM, IT and communications among other functions…” Wow. For those of us who thought it was just elves turning wooden cranks to pop out Tonkas, this gives us a whole new view of the ops at the North Pole.

And speaking of cranking out wondrous things, Beth Pariseau over at SearchStorage has put together two seasonal posts that provide for highly entertaining reading. What’s great about this is that those of us in the tweetsphere saw her doing at least some of the research in real-time by asking for feedback through Twitter. Here they are:

First, on SearchStorage - What to buy a geek for the holidays

The best quote in this has to be the following from Taneja Group analyst Jeff Boles, “I’ve rejected the Kindle concept. Have you seen the book prices? It’s an entirely disadvantaged arrangement for what is ultimately a compromised experience in the first place. Wait until NEC makes a levitating robot dog that follows you around and holds the Kindle up for you while you do other things…” Quips Pariseau, “We’ll be sure to include this on our “What to Buy a Geek in 2050″ list.” Well, fine but I’m still putting that robot dog on my wish list…

Then this, on Storage Soup: The storage industry’s holiday wish list

In it is the “wish list” of Wikibon’s Dave Vellante:

  • secure cloud
  • A dumpster to haul all my backup tapes that I’ve converted to disk-based backup.
  • A primary storage device that optimizes capacity without sacrificing performance. (Dave, we may just have a surprise for you under the tree over here at Ocarina….)
  • A virtualization performance guru … make it 5 gurus …

Finally, here’s a video showing John Troyer, Community Manager at VMware (and my costar in the Gestalt IT Tech Field Day overview video). In it, we see him open a veritable bonanza of presents from the VMware community. The normally unflappable Troyer cannot help but be blown away by the outpouring of love and truly enviable gifts he received. Those who know John are aware that this is all entirely appropriate to the level of positive energy he brings to that circle of techies. All the way over here at Online Storage Optimization we can feel the glow.

And with that, we wish you all a Merry Merry Christmas, and to all a good night!

Who’s Afraid of the Big, Bad, Dedupe?

Posted by Ocarina On July - 28 - 2009

Martin Glassborow on his Storagebod Blog has written a controversial piece that raises questions about the two hottest technologies in storage at the moment, dedupe and thin provisioning. In his post, entitled “Living on a Prayer,” he suggests that both of these technologies could be the road to a storage nightmare, in which, “you could be many times over-subscribed with de-duped storage.” He gives the example of someone turning on encryption and all the dupes reappearing at once, suddenly requiring all kinds of storage capacity that wasn’t needed until then.

He also sounds the alarm on migration, saying, “migrating deduped primary storage between arrays  … is going to need a lot of planning. Deduping primary storage may well be one of the ultimate vendor lock-ins if we are not careful.”

Here are some of my responses to this thought-provoking post, which will no doubt be getting a lot of attention.

On oversubscription:

I agree with Martin that there is a real risk here. When a bulk operation could cause massive rehydration, it’s essential that you have the proper warning and planning tools. There is also an economic component to this–essentially, you’re weighing paying for disk now or later.

A good dedupe solution will allow you to control the degree of over-subscription. While this does not matter so much for backup dedupe, it does matter for online. So you should be able to say: make a new copy of the data every time the reference count on a duplicate hits 10 (or whatever number you choose). That way, while you limit your space savings to 10:1, you also limit your exposure to some application-level decision that would cause all the duplicates to be rehydrated and returned to primary storage.
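To make the reference-count idea concrete, here is a minimal sketch in Python. It is a toy, not how any shipping product works: the class name `CappedDedupeStore`, the `max_refs` parameter, and the in-memory bookkeeping are all assumptions made up for illustration. Each unique block gets a fresh physical copy once the newest copy already has `max_refs` references.

```python
import hashlib


class CappedDedupeStore:
    """Toy content-addressed block store (hypothetical, for illustration only).

    Each physical copy of a block may be referenced at most `max_refs` times;
    after that, the next duplicate write stores a new physical copy.
    """

    def __init__(self, max_refs=10):
        self.max_refs = max_refs
        # digest -> list of reference counts, one entry per physical copy
        self.copies = {}

    def write(self, block: bytes):
        digest = hashlib.sha256(block).hexdigest()
        counts = self.copies.setdefault(digest, [])
        if counts and counts[-1] < self.max_refs:
            counts[-1] += 1      # newest copy still has room under the cap
        else:
            counts.append(1)     # cap reached (or first write): new physical copy
        return digest, len(counts) - 1

    def logical_blocks(self):
        return sum(sum(counts) for counts in self.copies.values())

    def physical_blocks(self):
        return sum(len(counts) for counts in self.copies.values())


store = CappedDedupeStore(max_refs=10)
for _ in range(25):
    store.write(b"same block")

# 25 logical writes land on 3 physical copies (10 + 10 + 5 references).
print(store.logical_blocks(), store.physical_blocks())
```

With the cap at 10, savings can never exceed 10:1, and the worst-case rehydration is bounded in the same proportion.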

Encryption is a good example - encryption will cause most dedupe solutions to not be able to find duplicates at all if the encryption is done at the application or file level. Increasingly, we’re seeing encryption moving to the drive level, and in that case, it will be transparent to primary dedupe, but that’s not to say that there aren’t other cases where oversubscription could catch you out.
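A small Python sketch shows why file-level encryption hides duplicates from a hash-based dedupe engine. Everything here is an assumption for illustration: the 4 KB fixed chunk size, the helper names, and especially `toy_encrypt`, which is a throwaway XOR keystream and not a real cipher. The only point is that two identical files encrypted with different random nonces no longer share any chunks.

```python
import hashlib
import os

CHUNK = 4096  # assumed fixed chunk size for this sketch


def chunk_digests(data: bytes):
    """Hash each fixed-size chunk, the way a simple dedupe engine might."""
    return [hashlib.sha256(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)]


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Illustrative stand-in for file-level encryption. NOT secure."""
    nonce = os.urandom(16)  # fresh random nonce per file
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], pad))
    return bytes(out)


original = os.urandom(4 * CHUNK)
copy = bytes(original)  # an exact duplicate of the file
key = b"example key"

unique_plain = len(set(chunk_digests(original) + chunk_digests(copy)))
unique_encrypted = len(set(chunk_digests(toy_encrypt(original, key)) +
                           chunk_digests(toy_encrypt(copy, key))))

# Plain: 8 chunks collapse to 4 unique. Encrypted: all 8 chunks are unique.
print(unique_plain, unique_encrypted)
```

Drive-level encryption sits below the dedupe layer, so the engine still sees the plain chunks and the first number is the one that applies.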

The lesson here is clear: Your online or primary storage dedupe tool must be able to give you the tools to manage that risk.

On Migrating Deduped Data

The topic of end-to-end deduplication is the natural next step in the maturation of the deduplication market. Today, you have many vendors, each of whom has built dedupe into its filer as a feature. Every time you move data, you have to rehydrate it. This is often the case even when you are moving deduped data from one filer to another from the same vendor! NetApp dedupe will rehydrate every file any time you move it off the filer - for SnapMirror, for an NDMP backup, etc. There are really two things that the IT user wants to see. First, you want to be able to move optimized data in its most efficient form (deduped, compressed) not only across filers, but across vendors and storage tiers.

For example, why dedupe data on the filer, then rehydrate it, back it up to a VTL target, and then dedupe it again? Why not dedupe it once, and move the already-optimized data to the backup target, to the DR site, to the tier 2 filer? In the backup case, you’ll still get more dedupe benefit from your dedupe appliance. The repetitive nature of backups means that when you back up the same file over and over, even if it was already deduped on the filer, it will still benefit from being deduped again with each backup. But you ought to have less data to move to the backup appliance, and you ought not to have to burn up a bunch of filer CPU cycles rehydrating files that are just headed off to backup.
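Here is a rough Python sketch of the "dedupe once, move it optimized" idea, comparing bytes on the wire when files are rehydrated first versus when only the unique chunks are shipped. The function names, the three sample files, and the dictionary-based chunk store are hypothetical; a real transfer would also need to send the per-file recipes (the lists of chunk digests), which are small compared with the data.

```python
import hashlib


def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def dedupe(files):
    """Return (chunk_store, recipes): unique chunks plus a digest list per file."""
    store, recipes = {}, {}
    for name, chunks in files.items():
        recipes[name] = []
        for chunk in chunks:
            d = digest(chunk)
            store.setdefault(d, chunk)   # keep only the first copy of each chunk
            recipes[name].append(d)
    return store, recipes


def replicate(store, target_store):
    """Send only the chunks the target lacks; return the bytes sent."""
    sent = 0
    for d, chunk in store.items():
        if d not in target_store:
            target_store[d] = chunk
            sent += len(chunk)
    return sent


files = {
    "a.doc": [b"A" * 4096, b"B" * 4096],
    "b.doc": [b"A" * 4096, b"C" * 4096],
    "c.doc": [b"A" * 4096, b"B" * 4096],
}
store, recipes = dedupe(files)

# Rehydrating first means every chunk of every file crosses the wire.
rehydrated_bytes = sum(len(c) for chunks in files.values() for c in chunks)

target = {}
optimized_bytes = replicate(store, target)

# Six chunks rehydrated versus three unique chunks sent in optimized form.
print(rehydrated_bytes, optimized_bytes)
```

Running `replicate` a second time against the same target sends nothing, which is the repeat-backup case in miniature.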

Ideally you want dedupe and compression that is not a lock-in feature of a vendor, but that is a vendor-neutral data reduction solution that the IT shop can deploy across multiple filers (primary, nearline, etc), archive, and backup. And so the lesson again is to take a close look at the dedupe product and be sure that you’re not headed for vendor lock-in.

We look forward to seeing what others are saying about this provocative post.

Blog Review - Storagebod

Posted by Sunshine On June - 16 - 2009

Note: this is the first in a series of posts on the blogs that make up the Online Storage Optimization blogroll. Please look out for future reviews of other storage bloggers.

Every once in a while I find myself enjoying a blog so much that I end up reading several posts in one sitting. Such was the case today with Storagebod’s Blog. Who else, I thought, could integrate references to Winnie-the-Pooh with cloud storage while making subtle points about storage infrastructure costs? This must be a sign I’m becoming a fan.

Storagebod, whose real name is Martin Glassborow, is an independent storage blogger whose posts cover a wide swath of storage and tech-related topics. His bio states that he’s responsible for storage infrastructure for a large UK Media company, which he doesn’t name. He also says in posts that he utilizes both EMC and NetApp storage, which puts him in an interesting position vis-à-vis the two competitors.

I’ve gotten chatting with Martin on Twitter on several occasions (as have some other contributors to this blog), and one thing that stands out about him is that while he has strong opinions about storage products, they always seem to come from a customer perspective — that is, he’s not interested in slamming a vendor for its own sake. Rather, he takes a pragmatic approach that speaks to a larger mission of helping other storage and IT professionals who are also struggling to control costs, keep data safe, and so on.

So, even while mocking IBM’s latest cloud offerings with his Milne-inspired ditty, he gives it the benefit of the doubt, saying, “…I’ve been a bit unfair, it’s not just tin, it comes with a raft of management software as well…”

Another post, about a recent Amazon AWS outage, doesn’t slam the company for losing a data center, but instead argues for better planning for such an eventuality.

“When Amazon lose a data-centre in their cloud, this should not be news! It will happen, it may be a whole data centre, it may be a partial loss. This not a failure of the Cloud as a concept; it is not even a failure of the public Cloud…”

In short, this is a blogger I recommend for anyone who would like to read spirited, opinionated yet fair coverage of storage from the point of view of someone who knows your pain. And while he never seems to quite find the best way to alleviate it, the process he goes through should be enlightening to many, both within and outside the industry.

Storage News and Notes - May 29

Posted by Sunshine On May - 29 - 2009

This has been a very interesting week in the storage blog-o-tweet-osphere, and the hottest topic was, somewhat ironically, an announcement that seemed to fall flat. Wednesday, HDS brought out its High-Availability Manager for USP-V (quickly dubbed “HAM” by bloggers), and several bloggers called it underwhelming and confusing.

Chris Evans, The Storage Architect - Enterprise Computing: USP-V - So Long And Thanks For All The Fish

Stephen Foskett, Gestalt IT -  HDS’ HAM-Fisted Announcement Can’t Be All

Storagebod’s Blog - I Wanted Bacon not Ham

To its credit, HDS immediately fired up a whole boatload of responses. Consultant Tony Asaro can be found arguing each point in all of these blog posts.

He also posted this on his Blog Bytes HDS blog:

Real World Implications and Impact of Hitachi High Availability Manager

HDS’s Hu Yoshida also put out a short post that clarified some of the issues:

Hu’s Blog - High Availability Cluster

In the end, there was this Seussian wrap-up of the whole debacle by Stephen Foskett - A Taste of HAM

I have to admit, this last one made me laugh.

In other news, there was some actual news out there this week! A Massachusetts court has ruled that Dave Donatelli, formerly of EMC, may work at HP, but he isn’t allowed to work in the storage division–the result of a non-compete clause the storage veteran signed with his former employer.

And finally, this blog’s parent Ocarina Networks was profiled in The UK Register this week:

Chris Mellor - Ocarina makes waves with lossless image compression

The article takes a look at the company’s compression technology–the first article that gets into this level of detail about it that I’ve seen. Definitely worth a read for those who are wondering about the magic behind its amazing results with image compression.

Twitter - Not for the Faint of Heart

Posted by Ocarina On April - 24 - 2009

Earlier this week on Twitter, we experienced what it’s like to have real time, public discussions with bloggers, analysts, and journalists about our product. It was a great example of why Twitter is such a powerful tool for talking about technology–not to mention one that can be a little hair-raising.

It started when Martin Glassborow, known to us all as Storagebod, tweeted about his results with the Ocarina Networks Simulator, OcaSim. We were of course aware that Martin was running these tests, but somehow none of us realized he would be releasing his results as they came in, all on this highly public forum. In short, this was one of those rubber-meets-the-road moments for a young company. Either Ocarina’s compression solution was going to knock his socks off, or it wasn’t.

As it turned out, his results were impressive. While he was less than amazed by our results on video files (which we make no claims to compress), he saw a 22% reduction on a fileset that was completely made up of JPEGs. As many people know, JPEGs are notoriously difficult to compress, as they are already compressed files. He noted this in his tweet.

Then things got a little crazy.

SearchStorage reporter Beth Pariseau retweeted his original tweet, while Stephen Foskett reported that he had seen 20% savings using the OcaSim on a set of family photos. At this point, a few analysts jumped in. Steve Duplessie from ESG said he’d be interested in running the OcaSim, while Greg Schulz of Storage IO suggested a challenge. He wrote:

“@pariseauTT @storagebod how about run against some PPT/PDF briefing/sales slide decks c which 1s compresses most, blind results of course ;)”

This raised an issue that we are getting more and more familiar with. One of the biggest challenges we face at Ocarina is getting past the disbelief factor. For so long, there have been rules about what can be compressed or deduped.

Customers will often say, “but everyone knows you can’t compress an already compressed file. It’s impossible.” Or, “I’ve tried dedupe for primary storage, and it hardly makes any difference.”

Much of what we are doing is helping people open their minds to the fact that the impossible is now not only possible, but fully operational and installed with major storage vendors such as BlueArc–which also tweeted during this exchange–and Isilon.

Oh, and Steve and Greg, we’re more than happy to send you the OcaSim. This is getting to be fun!

Happy Earth Day

Posted by Sunshine On April - 22 - 2009

It’s a beautiful, sunny day in San Jose, a fitting reminder of the amazing planet we celebrate today. To me, every day is earth day, because for now at least, this is our home. We can treat it well, or we can treat it with less than total regard. It’s our choice.

On this blog, we talk a great deal about greening the data center through reducing your storage footprint. This, in fact, is the rallying cry of our parent, Ocarina Networks. True, “Green IT” has become a bit of a buzzword these days, but without it, we are heading into a downward spiral of waste and overuse of natural resources.

Some nice pieces today that take a look at the advancements in Green IT:

Earth Day: The Best of Green IT - ChannelWeb

This slideshow covers the innovations in greener data center equipment from just about all the major vendors, including IBM, HDS, HP, NetApp, and many others. It also discusses the important contribution that deduplication is making to the goal of reducing power, cooling, and materials. (Thanks @SamMoulton for that tip.)

Do Something with Nothing - Storagebod’s Blog

While perhaps not intended as a specifically “green” sentiment, Martin Glassborow’s post today on his Storagebod blog gets into a topic that is sometimes overlooked in discussions: deciding if you can get away with buying less storage. Sound and very simple advice.

Cisco Launches Carbon Emissions Map, First in San Francisco - Earth2Tech

Just in time for Earth Day, Cisco announced it will launch an “Urban EcoMap.” This web-based mapping tool will provide up-to-date data on emissions and other environmental factors for the city of San Francisco. To me, this is more symbolic than anything else, but may increase awareness of some of the more hidden types of emissions - i.e. it’s not just cars, guys.

Hope you’re able to get out there and enjoy a little of what this amazing planet has to offer.

Join the Conversation!

Posted by Sunshine On March - 27 - 2009

There’s a lot of chit chat going on about storage these days - some on the mud-slinging level, some civilized, most somewhere in between. One positive trend is that the blogosphere is becoming less about personalities and more about collaboration and discussion — thanks to Twitter and other networking tools. Storage bloggers, being the smart and tech-savvy folk they are, have started creating their own online communities.

Here are some of the newest and most innovative sites that are aggregating content about storage:

Storage Monkeys - This site is serving as a social network and discussion forum for those involved in the storage industry. There are also pages for job listings and blogs.

Gestalt IT - This is a group magazine that aggregates blog entries on storage and IT, and features some of the most informed and interesting bloggers in the storage realm, such as Stephen Foskett (Packrat), Martin Glassborow (Storagebod) and Chris Evans (Storage Architect), to name a few.

BlueArc Shared Items - A quick way to get a tour around the storage blogosphere and find out what topics are getting the most online ink.

If I have missed any here, please feel free to comment.