Online Storage Optimization

Exploring Next Generation Storage Solutions

Archive for the ‘Storage optimization’ Category

InformationWeek Shows Strong Dedupe and Compression Demand

Posted by Mike Davis On July - 22 - 2010

Data deduplication and compression are quickly becoming standards, with customers in a wide variety of markets and business sizes recognizing the value of saving money and reducing management through storage optimization. Earlier this week, InformationWeek released a research report highlighting how IT professionals were working to meet today’s exploding storage capacity demands. The report, unsurprisingly to us, showed nearly 80 percent of survey respondents said they were using compression technologies or had them under evaluation – while more than half reported similar implementations and plans for dedupe.

What was even more impressive about this survey was the respondents themselves. It’s easy to understand that scalability is a major strain for the largest companies with their heavy data sets, but two thirds of these respondents reported having less than 50 terabytes of storage, meaning these challenges have now hit mid-size businesses as well. The <50TB crowd is going to be more risk averse, and reducing their storage from 3 arrays to 1 array isn’t necessarily the “killer” value proposition. The value proposition of compression and dedupe technologies includes things like reduced backup windows, which can only be obtained by a solution that supports end-to-end benefits, proves to customers that it’s trustworthy and doesn’t carry additional administrative burden.

Also intriguing was the comment by many respondents that the growth of storage was in transactional applications (database and e-mail) rather than unstructured data. Many of Ocarina’s clients in multiple market segments are seeing dramatic growth in unstructured data, but in this business tier, more traditional applications are adding to the data glut. Also, there is little doubt that in a short amount of time VM applications will appear as part of that category. This is also in contrast to the “Petabyte vertical markets” which are chock full of unstructured file data, and a good lesson for OEMs to take away: any embedded dedupe implementation needs to deliver results in structured-data applications, and that’s a non-trivial requirement. Common dedupe solutions in the market today would create noticeable performance impact on these apps, which is why Microsoft removed single-instancing as a native feature from Exchange. A dedupe solution in these applications needs to include workflow and application awareness, for example compressing and deduping only those portions of the database (or VMDK) that are inactive. Ocarina calls this Heat Index Management, and it’s a feature included in our ECOsystem for OEMs.

As industry observers have seen in the last several years, backup has been the killer driver for deduplication – the first wave, if you will, because the ROI of D2D backup is tremendous with this feature in place. But in the survey there is strong adoption (>50% of deployments) of dedupe for archival and other non-backup apps. Of course this isn’t unexpected to Ocarina, because we’ve helped drive this shift towards compression on primary storage and the entire data lifecycle.

As data growth continues in all markets, we believe strongly in the need to compress and optimize across the datacenter, from primary storage to backup and archival. InformationWeek has done a fantastic job demonstrating real customer demand for these technologies, which we expect will increase. If you haven’t walked through the presentation, make sure you do at http://www.informationweek.com.

Dedupe Or Compression? Both! Optimization or Performance? Both!

Posted by Carter George On June - 8 - 2010

Interest in and discussion of data deduplication and primary storage compression seems to be at an all-time high right now. In the last few weeks, we have seen new entrants into the market, including Permabit, and a broad overview from Wikibon, focused on storage optimization. At Ocarina, we see industry discussions such as these as proof positive our business is on the right track. As the market innovators and pioneers in this space, we believe in end to end storage optimization, aimed to enable customers to best use their existing equipment and protect their core data.

Some of the discussion seen in articles around the Web (including Storwize) has focused on the speed of compression and deduplication – with concerns around how this impacts primary storage performance. Completely valid issues, as one should not have to substitute performance in exchange for features. From our company launch, we have focused on achieving the best of both worlds – with primary data storage optimization, and high performance, both with our own dedicated devices and those from our OEM partners.

For customers who want choice, Ocarina has you covered. Ocarina has both fast in-band deduplication and advanced compression options. You can either run fast in-band deduplication, with sub-millisecond latency, or you can choose deep content-aware compression, which takes longer, of course, but also gets results that simple deduplication can’t hit. Or… you can do both! Stop the presses!

Ocarina’s deduplication is fast – you can get deduplication results immediately on every file that passes through our systems, and then come back to do a post-processing run that gets advanced results later, at an off-peak time, should you choose. Also, the post-processing engine is driven by policies which you set, letting you compress only files that meet criteria you choose – for example, size, age or type.
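To make the idea of a policy-driven post-processing pass a little more concrete, here is a minimal sketch in Python. It is not Ocarina’s policy engine or its configuration format; the `FileInfo` and `Policy` classes, their field names and the `select_for_post_processing` helper are all hypothetical, made up only to illustrate filtering files by size, age and type.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class FileInfo:
    # Hypothetical description of a stored file.
    path: str
    size_bytes: int
    age_days: float


@dataclass
class Policy:
    # Hypothetical policy fields; None means "no constraint".
    min_size_bytes: Optional[int] = None
    min_age_days: Optional[float] = None
    extensions: Optional[Set[str]] = None  # lowercase, without the dot

    def matches(self, f: FileInfo) -> bool:
        if self.min_size_bytes is not None and f.size_bytes < self.min_size_bytes:
            return False
        if self.min_age_days is not None and f.age_days < self.min_age_days:
            return False
        if self.extensions is not None:
            ext = f.path.rsplit(".", 1)[-1].lower() if "." in f.path else ""
            if ext not in self.extensions:
                return False
        return True


def select_for_post_processing(files: List[FileInfo], policy: Policy) -> List[FileInfo]:
    # Keep only the files that the off-peak compression pass should touch.
    return [f for f in files if policy.matches(f)]


files = [
    FileInfo("a.jpg", 5_000_000, 90),   # big, old, right type
    FileInfo("b.jpg", 100, 90),         # too small
    FileInfo("c.db", 5_000_000, 90),    # wrong type
    FileInfo("d.JPG", 2_000_000, 1),    # too new
]
policy = Policy(min_size_bytes=1_000_000, min_age_days=30, extensions={"jpg"})
selected = [f.path for f in select_for_post_processing(files, policy)]
print(selected)
```

In this toy run only `a.jpg` meets all three criteria; everything else would be left with whatever the in-band deduplication pass already achieved.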

Thanks to Ocarina winning a number of high profile deals at large customers where deduplication alone is not enough, Ocarina has become associated with heavy compression – but we do dedupe as well, and we’re quite good at it! If you look at results we recently got on a corporate data set, we were able to shrink that data set by 92% overall, with 60% coming from deduplication and 32% more coming from files that were compressed in the post-process.

At Ocarina, we believe deduplication will soon become an embedded system feature, and a commodity. It is possible to do in-band deduplication with very little latency and minimal CPU resource demands. Dedupe will become a storage fundamental, and the price point for customers to gain dedupe will trend towards zero (where NetApp is today with their A-SIS offering).

Companies like Permabit will have to win quite a few OEM deals at these kinds of end user prices, minus an OEM discount, to be profitable – but advanced features, including dedupe-aware data movement and advanced compression, will be value-add features customers will pay more for. Unlike dedupe, those features won’t provide big value for every customer, but they will apply as important benefits providing value to a significant percentage, and unlike dedupe, advanced content-aware compression is not going to be a commodity given away for free in every system.

Ocarina is in a good position with our technology, and both customers and OEMs should evaluate not just the technology of a dedupe provider, but their ability to financially survive as well. Once a company has reduced your data, and locked it away in their format, the last thing you want is for that company to go out of business, and compress your chances of getting it back. A company that only has dedupe on the table is going to be priced out of the market by 2012, even if it is successful today selling dedupe.

When something sells to end users for free, you can’t make up the profit margin by selling in volume. In Ocarina’s case, we can be extremely aggressive on price for dedupe, because we bring more to the table and have something else to sell, and every OEM deal we win creates a platform ready for future upgrades. As Wikibon wrote yesterday, “Ocarina provides the highest levels of compression by using the optimum compression techniques.” We are the best in the world at this stuff and we will continue to stay ahead of the competition. We also have a well-rounded business that won’t see us get deduped. So if you are looking for an advanced solution that does much more than dedupe, Ocarina is the answer.

Protecting Compressed Data and Reducing Costs

Posted by Carter George On May - 25 - 2010

As @storagebod (Martin Glassborow) noted today in a blog post, the issue of protecting deduplicated data is an important one for any business.

Deduplication and other data reduction technologies offer the opportunity for increased data protection at a reduced cost.

When you hit a threshold of 50% or greater overall data reduction, it lowers the cost of performing a full mirror of data. For example, if you have 100 terabytes of data, mirroring today would require 200 terabytes of disk space. But if the data were reduced by 75%, you could store the same data in 25 terabytes, and full mirroring would require only 50 terabytes of disk - a significant savings. In other words, it is possible to fully protect your storage, including full mirroring, with less disk than it takes to store the unprotected data today.

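While any amount of space savings lowers the cost of mirroring, once savings reach 50% (a dedupe ratio of 2:1), a full mirror of the reduced data takes no more disk than the unprotected data does today, so the net storage cost of full mirroring is zero. Beyond 50%, you come out ahead.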
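The arithmetic is simple enough to check in a few lines. The sketch below uses the 100 terabyte figure from the example above; the function name `mirrored_capacity_tb` and its parameters are just illustrative.

```python
def mirrored_capacity_tb(raw_tb: float, reduction: float, copies: int = 2) -> float:
    """Disk needed to keep `copies` full copies of data after reduction.

    `reduction` is the fraction of space saved (0.75 means 75% smaller).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return raw_tb * (1.0 - reduction) * copies


raw = 100.0  # terabytes of unreduced data
print(mirrored_capacity_tb(raw, 0.00))  # 200.0: mirroring with no reduction
print(mirrored_capacity_tb(raw, 0.50))  # 100.0: break-even with unprotected data
print(mirrored_capacity_tb(raw, 0.75))  # 50.0: mirrored, yet half the original footprint
```

At 50% reduction the mirrored footprint equals the original unprotected footprint, which is the break-even point; anything better than 2:1 leaves the mirrored copy smaller than the data was to begin with.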

There are a number of options that your deduplication provider should be pursuing to complement your data protection strategy. For example, a vendor should be able to allow you to do two key things with your dedupe configuration:

  1. Allow a minimum level of duplicate blocks to accumulate prior to starting deduplication. For example, you could allow data to be written twice, and then deduplicate all subsequent occurrences. So long as the dedupe solution is aware of both original occurrences, you have a form of mirroring without needing to do full mirroring at the disk level. Call this duplicate mirroring.
  2. Set a threshold for a maximum number of duplicates. To set a maximum level of exposure on the loss of physical disk, you should be able to ask that once you have found ‘n’ instances of the block, to start over. You could set that number at 8, 32, 128 or whatever frequency makes business sense to you, and therefore, the potential loss of the sector on which the duplicate is stored would only affect a certain number of files.
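The two settings above can be sketched as a toy block index. This is only an illustration under stated assumptions, not how any particular vendor implements it: the `DedupeIndex` class, its `min_copies` and `max_refs` parameters, and the reading of “start over” as “begin a fresh set of physical copies once a set has `max_refs` references” are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Generation:
    physical_copies: int = 0  # copies actually written to disk
    references: int = 0       # logical writes that point at this generation


class DedupeIndex:
    def __init__(self, min_copies: int = 2, max_refs: int = 8):
        if min_copies < 1 or max_refs < min_copies:
            raise ValueError("need 1 <= min_copies <= max_refs")
        self.min_copies = min_copies  # setting 1: duplicates kept before deduping
        self.max_refs = max_refs      # setting 2: cap on references per generation
        self.generations: Dict[str, List[Generation]] = {}

    def write(self, block: bytes) -> bool:
        """Record a logical write; return True if a physical copy was stored."""
        key = hashlib.sha256(block).hexdigest()
        gens = self.generations.setdefault(key, [])
        # Start over once the current generation has reached its reference cap.
        if not gens or gens[-1].references >= self.max_refs:
            gens.append(Generation())
        current = gens[-1]
        current.references += 1
        # Keep writing real copies until the minimum has accumulated.
        if current.physical_copies < self.min_copies:
            current.physical_copies += 1
            return True
        return False  # deduplicated: only a reference was added

    def physical_copies(self, block: bytes) -> int:
        key = hashlib.sha256(block).hexdigest()
        return sum(g.physical_copies for g in self.generations.get(key, []))


idx = DedupeIndex(min_copies=2, max_refs=4)
stored = [idx.write(b"same block") for _ in range(6)]
print(stored)                             # which writes hit the disk
print(idx.physical_copies(b"same block"))
```

With `min_copies=2` and `max_refs=4`, the first two writes are stored, the next two are deduplicated, and the fifth write starts a new generation. Losing any one generation then affects at most four references rather than every file that ever contained the block.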

These two examples are not necessarily a complete answer in and of themselves, but they do provide guidance for tools you can work with as part of an overall data protection strategy. As deduplication becomes a storage fundamental, and is in place across multiple tiers of storage and on multiple products in your data center, understanding the impact of dedupe on your data protection strategy will be key and your dedupe vendor should be providing you the right tools to manage that.

I dream of data reduction

Posted by Sunshine On March - 29 - 2010

Data is growing at a dizzying rate. We need only look at our home computers to get a sense of how easy it is to fill our hard drives to overflowing with all manner of flotsam and jetsam. From family photos to LOLcats to videos of our kids, we’re finding it difficult if not impossible to keep down the rising tide of files.

There is a cost to this, as many if not most enterprises are now recognizing. Recently, InfoWorld launched a special section, Data Explosion that guides companies through the myriad problems that arise from having too much data to handle. With headlines like: “The big data addiction,” the new section promises to address the issue with step-by-step guides, white papers, and other instructional pieces.

Infoworld blogger Matt Prigge delves into the topic in a post today, “The high cost of lazy storage.” He says that users need to take responsibility for keeping their data under control. Despite this admonishment, he admits that he himself is an “excellent example of the problem.” He saves all of his email, because he never knows what he might need later. Sound familiar? If someone whose blog is called “Information Overload” can’t get control of his personal data, it’s hard to imagine how anyone else can.

Prigge writes, “The bigger that data gets, the more effort required to put the genie back in the bottle.” He pushes the metaphor even further (and more gruesomely) by suggesting that at some point it’s easier to kill the genie and throw away the bottle. Now, that does strike us here at Online Sto Op as rather extreme. Why not simply put the genie back into that nice, compact bottle where she was living perfectly happily for so many years?

As we all know from 70s TV, those bottles were well-upholstered and downright comfortable living spaces for many a genie. And while it’s true that some genies (or Jeannies) would get so angry they’d stomp their feet when they were magically sent back there, they eventually settled back onto the purple pillows, kicked off their metallic platform heels, dug their toes into the shag carpeting and relaxed. Same goes for data reduction. A combination of approaches seems the most sensible answer. Data needs to be managed. There is something that is known as 100% compression–it’s called “deletion.” But short of that, there are ways to reduce data by as much as 90%. There are solutions for reducing the types of files that are driving the fastest storage growth, such as JPEGs, documents, videos, graphics, and other large files. An intelligent, content-aware approach that includes both deduplication and compression is what this blog’s parent Ocarina provides.

Storage News and Views - March 17

Posted by Sunshine On March - 17 - 2010

Across the storage blog-o-tweet-osphere today folks are donning green scarves, putting four leaf clovers in their lapels, and generally proclaiming the luck of the Irish. Yes, it’s a good life in storageland. And there’s plenty of news to amuse and bemuse.

EMC made a big splash this week with a presentation to analysts by President and COO Pat Gelsinger that outlined a new vision for virtual storage. You can listen to the whole thing here. Chris Mellor at the Register called the plan a sign that EMC has “lost their marbles,” but others think it represents the future of storage.

Here is some of the commentary from both within and outside EMC:

EMC:

Chuck’s Blog - This changes everything

Blog Stu - Virtual Storage, not just another V-word

Commentary:

**New addition thanks to @sfoskett** Burton Group: EMC’s Global Storage Vision

Greg’s StorageIO blog - Virtual Storage and Social Media: What did EMC not Announce?

Chris Mellor, The Register - Gelsinger stuns analysts and colleagues with storage pool plan

Stacey Higginbotham, GigaOM (yes, GigaOM! Welcome to sto-land, Stacey) - EMC’s Crazy Plan to Create a Worldwide Data Cloud

In other news… there’s a really sweet video on the Hitachi Data Systems site that talks about its partnership with this blog’s parent Ocarina Networks, and how this will benefit customers, reducing their data at rest by 10:1. Ocarina CEO Murli Thirumale makes a pixelated, jazz music backed appearance.

Here’s the video in its entirety, or go to the HDS blog site and watch a higher quality version:

Meanwhile, it’s not all sword crossing in the land of the storers.

As we already know, our kind can rally for a good cause. This past week, arch rivals NetApp and EMC raised money for kids with cancer by shaving their locks for St. Baldrick’s. NetApp led the charge, and EMC responded in kind.

Virtual Geek Chad Sakac sums it up here: A little EMC/NetApp Fun - to help cure cancer…

A heartwarming effort.

That’s all for now. Remember, it’s not what you store, it’s how you store it.

Make the right call

Posted by Sunshine On March - 10 - 2010

Four out of five college students agree, this is not the way to deal with data growth. How about this instead?

[Image: stuffed phone booth]


Fast and Effective Dedupe

Posted by Carter George On March - 3 - 2010

I’ve noticed a few blog posts recently about speed of deduplication in the modern data center. I agree that speed is an important factor, but keep in mind that not all dedupe is created equal. That is to say, fast is good, but only if you are also effective. One of the tricky things has been that the easiest data to compress is also usually the most carefully performance tuned. A great example is a database: databases are composed of simple alphanumeric fields and sparse tables, all of which is easy to reduce in size.

However, a company’s core transactional database is the most conservative asset in the data center. Introducing compression would save space, for sure, but you could only use very fast, simple compressors there. At the same time, customers will be hesitant to deploy a new layer of processing in their most sensitive application.

So, where is most data growth? In fact, it’s being driven by unstructured data – Office documents, rich media, email with attachments, PDFs, Flash videos, and so forth. This complex data does not lend itself to fast simple compressors. But perhaps we should back up for a moment and think about how customers have been behaving all along.

Throughout the history of storage, there have always been tradeoffs available between fast expensive storage, and slower but cheaper alternatives. This is not a bad thing. It gives users alternatives based on their priorities and budgets. Back in the old mainframe days, these choices were between very expensive mainframe memory and “offline” storage like drums, cards, and tapes. Today the technology is all much bigger, faster, cheaper and sexier. But really, the tradeoffs are the same.

Data reduction technology adds another layer of choice above and beyond the traditional hardware choices. Now in addition to choosing whether you want fast, expensive solid state disk (SSD) or slower but very cost-effective SATA, you can also choose whether you want to compress and/or deduplicate the data that is stored on those disks.

Just like physical disks, compression and dedupe come in a range of speeds and capabilities. There are simple and very fast compressors that are essentially invisible in terms of their impact on storage performance. There are more complex compressors that get better results, but which may take longer, either to compress or to decompress the data. Deduplication, done well, should always be pretty fast, and streaming dedupe rates of well over 300 MB/sec are now available from many vendors (including Data Domain and Ocarina).
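You can get a feel for the fast-versus-thorough tradeoff with nothing more than Python’s standard `zlib` module, which exposes compression levels from 1 (fastest) to 9 (most thorough). This is a generic illustration, not a benchmark of any vendor’s compressor; the `compare_levels` helper and the sample data are made up, and timings will vary by machine.

```python
import time
import zlib


def compare_levels(data: bytes, levels=(1, 6, 9)):
    """Return {level: (compressed_size, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Verify the data survives a round trip before trusting the numbers.
        if zlib.decompress(compressed) != data:
            raise RuntimeError("round trip failed")
        results[level] = (len(compressed), elapsed)
    return results


# Database-like sample: simple, repetitive alphanumeric rows.
sample = b"order_id,customer,status\n" + b"1001,ACME Corp,shipped\n" * 20000

for level, (size, seconds) in compare_levels(sample).items():
    print(f"level {level}: {size} bytes in {seconds * 1000:.2f} ms")
```

Repetitive, table-like data of this kind shrinks dramatically even at the fastest setting. Try the same function on already-compressed files such as JPEGs and the generic compressor will find very little left to remove.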

The emergence of tools to automatically tier data to its appropriate place helps make the use of all of these technologies more feasible. That applies as much to solid state disks as it does to dedupe and compression. When data tiering can be made invisible to end-users and applications, then implementing multiple physical and logical tiers of storage becomes practical. Good examples would include EMC’s new FAST tools, Compellent’s “Fluid Data Storage”, and HDS’s Data Migrator. When users or administrators have to move data by hand to get it to a compressed tier or a solid state disk, then the operational costs offset the capital savings.

You might want to be wary when someone’s biggest claim to fame is fast dedupe. Just as the old mainframe admin had to decide whether something was important enough to live in RAM, or could be stored on cheaper tapes instead, today’s IT shops have to decide where it is most important to try to get data reduction, and what tool will get the most bang for the buck for that kind of data. You need the whole story, and then you can decide based on your own priorities.

Tagged Gets Shrunk

Posted by Sunshine On January - 29 - 2010

Interesting story from the vault of the Ocarina case study library. Social network Tagged is the third largest social network in the U.S. It has seen traffic increase 10x over the past two years. With its focus on making new friends rather than simply getting to know existing ones, it has carved out a successful niche and is building an international subscriber base of over 80 million members.

The cost of this success? Data growth. Tagged’s storage infrastructure has been doubling every single year. With 1 million new photos uploaded every single day, Tagged needed a way to expand capacity and fast.

Compression with Ocarina meant about 10 TB of additional free space, which in turn meant they could put off buying new NAS equipment by several months. The lower average image size also meant reduced bandwidth and 15%-20% reduced monthly content delivery network (CDN) costs.

The company chose to go with Ocarina’s newest specialized image reduction technique, native format optimization (NFO). This is visually lossless compression of images that nevertheless delivers significant space savings–a technology that’s perfectly suited to the social networking environment.

The other crucial benefit to reducing image size was improvements in site responsiveness. “We’re sure that using Ocarina to reduce image sizes has helped improve our page rendering times,” said company CTO Johann Schleier-Smith. “That’s a big deal because it creates a better user experience, which means improved customer loyalty and higher market share.”

Read the entire case study by clicking here. Or visit the Ocarina resources page and click on the Case Studies tab, where you’ll find several others.

The Year in Images

Posted by Sunshine On December - 30 - 2009

This past year, we at Online Storage Op gathered all manner of images to illustrate our posts. So as a way of looking back at 2009, here are some of the ones we liked the best–and the stories that went with them:

Holodeck fun:

In February, Robin Harris at StorageMojo wrote about a potential breakthrough in storage technology that could change the landscape forever: quantum holographic storage. Online Storage Op was on the scene. It also gave us a chance to upload a pic of a Geordi La Forge doll. Admit it… this is one cool toy.

Squeezing into your Genes:

This blog’s parent Ocarina had quite a year–inking partnerships with a number of major storage vendors and becoming a noted player in the hot dedupe space. It was also the year that genomics labs woke up to the need for better data reduction to deal with the coming onslaught of genetic data. In short, compression can be a matter of life and death. We reported on it here, and our readers got to relive their 10th grade biology class by looking at images like the one above.

Racing for Dedupe

As many pundits are now opining, dedupe really was one of the biggest stories of 2009, not least because of the high profile battle for Data Domain between storage titans EMC and NetApp. In the end, EMC nabbed the dedupe specialist for an eye-popping $2.1 billion.

Booth Babe Mania:

We know our readers are sophisticated types who come here only to absorb information and opinion, and to better themselves for the benefit of all humankind. But for some odd reason we saw a major traffic spike the day we ran our post on the great Booth Babe Controversy. When we asked, everyone quickly told us, “I read the articles.” Mmmhmm!

VMworld a hit

And speaking of images that make storage folks drool, one of the most mesmerizing sights of the year was at VMworld, held in August in San Francisco. Participants descended the escalator to be greeted by a gleaming rack of servers and storage–which we later learned was the result of a plan drawn on a napkin by the VMware GETO team. In any case, this year’s VMworld was a major event–and as we rightly noted, it foretold more economic activity in storage and virtualization.

Industry puts aside differences to try to save a life

This is one of the saddest stories of 2009, and one that demonstrates an activist and caring streak in the storage community. When word got out in May 2009 that EMC employee Nick Glasgow was in need of a bone marrow transplant, folks within the storage industry put aside competitive differences and pulled together to find him a match. Sadly, Nick passed away in October. The degree to which he inspired others will not be forgotten.

And, finally…

We never did have an egg and spoon race, but…

In November, Ocarina participated in the first ever Gestalt IT Tech Field Day, which brought independent bloggers from around the world to Silicon Valley for two days of tech deep dives. Our “bring out your data” challenge started tongues wagging well before the event began. Participants brought us their toughest data sets, and aside from those who used archaic encryption software to stump our algorithms, the results were impressive–an average of about 30% reduction on these tougher-than-tough data sets. Plus, the whole event was just a ton of fun. And it didn’t even require that we slog around the mud clapping coconut shells together.

Happy New Year

Posted by Sunshine On December - 29 - 2009

’Tis the week for the “out of office” email messages. But the storage blogo-tweet-osphere waits for no man. Here are a few posts that caught my eye this week.

Bas Raayman sees CPU power hitting the wall: The RAM per CPU wall

Rick Vanover says 2010 could be the year for 10GigE - Will 2010 see 10 Gigabit Ethernet go mainstream?

It being the end of a year–and a decade–predictions abounded. We’re pleased to note that when it came to summarizing the top storage stories of 2009, deduplication for primary storage, the specialty of this blog’s parent Ocarina, made the big lists:

Infostor: The top 5 storage technologies of 2009 (and 2010?)

“Storage optimization (or data reduction) technologies such as data deduplication and compression can significantly reduce capacity requirements and costs … Consider data reduction for primary storage.”

SearchStorage - Beth Pariseau: Top 10 enterprise data storage news stories of 2009

“10. Data deduplication branches out. As deduplication settled into a comfortable role in backup, data-reduction technology started working its way into other parts of the data storage infrastructure, including primary as well as nearline and archived data … Ocarina and Isilon Clustered NAS help visual effects studio archive images, cut costs.”

For sheer inventiveness, blogger Stephen Foskett wins the prize with his 2009 predictions post, in which he turns the clock back and takes advantage of 20-20 hindsight: My 2009 IT Industry Predictions.

Meanwhile, social media and tech watcher Louis Gray takes himself to task and looks at all of his 2009 predictions to see how well he fared: My 2009 Tech Predictions: Mixed, But Nailed Real-Time.

OK that’s all for now. Here’s wishing all of you a happy, healthy, green and techy new decade.