Isilon Systems had some fun with this video. It’s definitely been making the rounds on Twitter. And so we thought, why not post it on our site and grab some of the fun? So for those who have yet to watch it, here’s a little weekend magic for your enjoyment.
Happy New Year
’Tis the week for the “out of office” email messages. But the storage blogo-tweet-osphere waits for no man. Here are a few posts that caught my eye this week.
Bas Raayman sees CPU power hitting the wall: The RAM per CPU wall
Rick Vanover says 2010 could be the year for 10GigE - Will 2010 see 10 Gigabit Ethernet go mainstream?
It being the end of a year–and a decade–predictions abounded. We’re pleased to note that when it came to summarizing the top storage stories of 2009, deduplication for primary storage, the specialty of this blog’s parent Ocarina, made the big lists:
Infostor: The top 5 storage technologies of 2009 (and 2010?)
“Storage optimization (or data reduction) technologies such as data deduplication and compression can significantly reduce capacity requirements and costs … Consider data reduction for primary storage.”
SearchStorage - Beth Pariseau: Top 10 enterprise data storage news stories of 2009
“10. Data deduplication branches out. As deduplication settled into a comfortable role in backup, data-reduction technology started working its way into other parts of the data storage infrastructure, including primary as well as nearline and archived data … Ocarina and Isilon Clustered NAS help visual effects studio archive images, cut costs.”
For sheer inventiveness, blogger Stephen Foskett wins the prize with his 2009 predictions post, in which he turns the clock back and takes advantage of 20-20 hindsight: My 2009 IT Industry Predictions.
Meanwhile, social media and tech watcher Louis Gray takes himself to task and looks at all of his 2009 predictions to see how well he fared: My 2009 Tech Predictions: Mixed, But Nailed Real-Time.
OK that’s all for now. Here’s wishing all of you a happy, healthy, green and techy new decade.
Dedupe - The Big News in 2009
It’s been a tough year — a worldwide recession, a sluggish housing market, rising unemployment … and on top of all that, the tarnished image of one of sports’ most squeaky clean players. Well, actually, there have been some bright spots. As DCIG blogger and storage analyst Jerome Wendt notes while looking back at the past year, “Deduplication is the Big Success Story of 2009.”
Wendt writes: “Deduplication is arguably one of the most notable trends of 2009 as it has been widely adopted by users after bursting onto the scene just a few years ago and has grown to be included in both software and hardware products.”
Wendt focuses on dedupe for backups, where there has been much publicized activity over the past year. The big storage story of 2009 was of course the battle between storage titans EMC and NetApp over backup dedupe specialist Data Domain. He cites an industry survey from SearchDataBackup that indicates that 41% of enterprises either are using or are seriously considering dedupe to control data growth and costs. He also notes that, despite its predicted demise, dedupe company Quantum remains strong.
Dedupe for backups is one part of the cost reduction puzzle. Another part is to reduce data at the source, in primary storage. This is of course the specialty of this blog’s parent Ocarina, which implements a unique combination of content-aware dedupe and compression to achieve startling results. It focuses on the very types of unstructured data that are driving storage growth today–emails, images, documents, and so on. The company has been partnering with almost every leading storage provider, including HP, EMC, HDS, BlueArc, and Isilon. Another leader in this space is NetApp, which has a strong dedupe for primary offering that has also garnered a great deal of attention.
Here’s the thing: the economy might be slowing down, but data growth continues apace. This is one reason that the storage industry has been thriving this year. But rather than standing still, what it spells is a concerted effort to keep that data under control. As Wendt notes, another of the year’s big trends is cloud storage, which offers companies more flexibility for storing some percentage of their data. I would also add that virtualization has taken a huge leap forward, not only in terms of the technology itself, but also in terms of adoption over the past year. Yet another way to attack the problem.
So if 2009 was all about dedupe for backups, I’m going to guess that 2010 will be very much about data reduction at all points on the data life cycle. What do you predict?
Image: Gizmodo
VMWorld Turnout - A Sign of Recovery?
VMWorld is in full swing in San Francisco this week, and attendance has gone far beyond all predictions. I’ve heard a number of estimates, ranging from 12,000 to 15,000. In any case, the attendance has blown away the original prediction of 8,000 to 10,000. Seems there were a lot of last-minute registrants.
The downside of this is that the events–particularly the labs–are now so overbooked that there are reports of two hour waits for demos. Sessions are hard to get into as well. Even the expo floor has been snarled with pedestrian traffic.
Here are some of the reasons that could account for the unexpected upsurge in attendance at this year’s VMWorld:
Virtualization is a technology that promises to actually reduce costs, particularly around power and space. As Stephen Herrod, CTO of VMWare told attendees in his keynote this morning, the machines that are being used to run VMWorld itself would take up the equivalent of three football fields if it weren’t for virtualization. Instead, they all fit into one end zone. So even in a recession, companies are willing to invest in this technology.
Virtualization has given, but it has also taken away–that is, there is a storage bottleneck as a result of the complexity it has created. End users are forced to educate themselves about potential storage solutions (and other simplifying technology) to reduce the pain they’re experiencing as a result of this new level of complexity.
As an example, I was at a dinner event last night for a new company that has just come out of stealth, EvoStor. VMWare CEO Paul Maritz spoke at the event, a sign that the company is taking this seriously. In a nutshell, EvoStor claims to offer storage designed specifically for VMWare vSphere. It was built from the ground up, they say, to manage the challenges and complexity of virtualization. It’s too early to tell of course whether this will be a better solution than the ones out there now, but its very existence supports this argument.
Possibility number three: the economy is recovering! This is the sunniest view, but given my moniker I should be allowed this one. I do think that many companies are loosening the purse strings for corporate travel, particularly in areas where the expense could be so easily justified.
The mood at this year’s VMWorld is certainly very upbeat. I’ve seen vendor presos–particularly in storage–truly mobbed by attendees. I’ve spent time at the EMC, NetApp, Isilon, and HP booths and there’s a ton of activity. Even some of the smaller concerns, such as Asigra, have a steady stream of traffic. Now, this could all be due to the raffle announcements, attractive “booth babes” in nurse uniforms, magicians and jugglers, but I’d also like to believe that there’s real interest here from potential customers.
All in all, an exciting show. I feel lucky to be here to see and experience the bleeding edge of tech.
Ocarina: The Movie
As many people know, Ocarina Networks has been living up to its name lately. It really is becoming a “network”-oriented company, inking partnerships with just about all of the top storage vendors–HP, BlueArc, Isilon, and so on and so forth. This is great news for storage customers, who can now depend on the very best in data reduction, slashing storage costs.
For those who want to get a quick and entertaining hit on how one of these partnerships works (this one with BlueArc), might we suggest this new animated demo on the Ocarina site? This demo offers a case study in how a world-class CGI animation studio, Rainmaker Entertainment, deployed BlueArc storage with Ocarina to achieve astounding compression results. (For more on this, you might also want to take a look at our Q&A with Shmuel Shottan, CTO of BlueArc, from last February.)
And for more on how Ocarina is joining forces with the top storage vendors to help media and entertainment companies maximize storage capacity, check out these recent news stories:
Beth Pariseau, SearchStorage - Ocarina deduplication and Isilon clustered NAS help visual effects studio archive images, cut costs
Debra Kaufman, Studio Daily - VFX Companies Lower Storage Costs with Ocarina
Bryant Frazer, Studio Daily - Q&A: Carter George, VP of Products, Ocarina Networks
Downsizing Storage Requirements for Post-Production
Happy Friday everyone!
The Talk of the Town

As I mentioned in a blog post yesterday, the media is beginning to swoop down and take note of yesterday’s announcement that HP is now an official reseller of Ocarina. It’s pretty darn big news.
Here are some of the better articles that have appeared. I’ll add more as they come in, so keep rechecking this post. In fact, if you’re worried about missing something, I suggest you subscribe to the Online Storage Optimization news feed by clicking the RSS symbol above.
New addition: Paul Shread, Enterprise Storage Forum - HP Sees Opportunity in Data Deduplication
Chris Mellor, The Register (UK) — HP Makes Ocarina Music
As Chris puts it: “Ocarina has similar partnerships with BlueArc, EMC and Isilon. It looks almost inevitable that every other filer supplier must be looking at the Ocarina product and thinking a reseller deal might be a good idea. Otherwise, it could lose sales to the competition when a lot of image-type data is being stored.”
Raju Shanbhag, TMCnet - Ocarina Networks’ Ocarina ECOsystem to be Resold by HP
Says Raju: “According to the company, this solution has the intelligence to extract and analyze the component parts of virtually any file … As the amount of information companies produce on a daily basis are increasing phenomenally, companies are looking for highly scalable storage solutions that efficiently and cost-effectively manage these volumes.”
Getting Animated at SIGGRAPH
This week, Ocarina sent a contingent to SIGGRAPH 2009, a conference and exhibition that draws computer graphics, animation, post-production, and imaging specialists from around the globe. Lots of toys to look at, and stories to hear. We are there through the end of this week, spending time in the booths of two of our partners, Isilon Systems and BlueArc. We’re so tired we almost feel like we’re getting wall-eyed, but every time we look up we see more folks who have gotten in their cars and driven here. There are so many new talents here, it’s really a group of incredibles. As far as we can tell, the blue sky is the limit to what they can achieve.
If we were ever wondering whether next-generation data reduction technologies are needed in this era of animated, 3-D movies, the stories we’re hearing at this show convinced us that without some type of data reduction we could be heading for an ice age. Data growth for this industry is a clear and present danger.
This has been a great week already for BlueArc, which announced that Starz Animation used its storage for a new animated feature, “9” from Focus Features. For those who are here, we encourage you to come tomorrow, as cinematographer John Hickson, IT Systems Architect at Starz, and the champion of the Studio Sys Admins user group is going to speak in the booth about the installation and the movie itself. *Thanks Julie Herd Goodman for that correction.*
One of the Starz folks, Terry Dale, is quoted in the news release as follows:
“During the production of Focus’ 9, it was critical that we have a storage system in place that would handle the massive demands of the three films we were animating concurrently. 9, in particular, has characters — specifically the machines and the seamstress — that generate enormous data sets during rendering,” said Terry Dale, VP, operations, for Starz Animation Toronto. “The BlueArc Titan storage system gave us the performance we needed, the reliability we required and enabled us to complete our production on 9 without disruption.”
And in case you haven’t seen it, here is the preview for this amazing looking Tim Burton extravaganza:
Why Dedupe? It’s the Economy, Stupid
Beth Pariseau at TechTarget has an interesting article this week about how a combination of storage tiering and dedupe are just right in these recessionary times. It talks in some depth about Ocarina’s deployment at Rainmaker Entertainment on BlueArc and Isilon.
The article is worth a read all the way through, as it gives some real-life examples of two very different ways that tiered storage and data deduplication together added up to storage savings. For example, Clackamas County in Oregon was able to reduce storage costs by utilizing a combination of F5 for migration to lower tiers and Data Domain to dedupe archives.
Christopher Fricke, senior IT administrator for the county is quoted in the article saying: “…It helps us not have to chase capacity while we go through a budget crunch — we can focus on performance rather than capacity…”
The article also delves into how a combination of migration and dedupe/compression can greatly reduce storage costs and simplify life at entertainment studios. Rainmaker, a digital animation studio, deployed Ocarina in order to ensure that they could keep all their files online, rather than having to back up to tape while in the midst of a project.
The article quotes Ron Stinson, Rainmaker’s director of IT and operations, who said: “We’re looking at compressing 6 terabytes down to two, and possibly storing 300 terabytes on the Isilon system in the future.”
A very interesting set of use cases that help highlight the value of dedupe in very practical ways.
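For readers who like to see the numbers, here is a quick back-of-the-envelope sketch of what Stinson's "6 terabytes down to two" works out to. The function name is our own, and the 300-terabyte projection assumes the same ratio would hold on a much larger data set, which the article does not claim.

```python
def reduction(before_tb, after_tb):
    """Return (reduction ratio, percent of capacity saved)."""
    ratio = before_tb / after_tb
    saved_pct = 100 * (1 - after_tb / before_tb)
    return ratio, saved_pct


# Figures quoted by Rainmaker: 6 TB compressed down to 2 TB.
ratio, saved_pct = reduction(6, 2)
print(f"{ratio:.0f}:1 reduction, {saved_pct:.0f}% of capacity saved")

# Assumption for illustration only: if the same ratio held for 300 TB
# of files, this is the disk space they would occupy.
projected_tb = 300 / ratio
print(f"300 TB of files would occupy about {projected_tb:.0f} TB")
```

In other words, a 3:1 ratio means buying roughly a third of the disk you would otherwise need, which is where the budget relief comes from.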
The Dedupe Wars
At Ocarina, we’re having a great deal of success these days with partnerships, and the buzz around this is being seen in the storage press and beyond. What has happened in part is that now that NetApp has made dedupe table stakes, we are the dance partner that many vendors are turning to, as we stand out as the ones who have the best data reduction technology for online data.
Before we go on, we should acknowledge a recent FUD-spreading post by NetApp’s Dr. Dedupe in which he tries to question whether Ocarina is even dedupe technology. Let’s just clarify: Ocarina offers a solution that includes both content-aware compression and a next-generation form of dedupe called object dedupe. What makes him think that Ocarina is just “resizing photos” is frankly a little beyond us, but in any case, that’s not in any way, shape or form what Ocarina does. However, we are pleased to see that NetApp believes the Ocarina technology deserves attention and is pointing their binoculars at us!
W. Curtis Preston makes a good suggestion in the comments field of Dr. Dedupe’s post. The best way to handle this is obviously to run some tests to see which solution offers better results. As it happens, we have already commissioned just such a study, by George Crump at Storage Switzerland. His results will be published soon, but I can reveal that our results are excellent compared to block dedupe in the filer.
Fundamentally, the value proposition of dedupe technology is that it increases storage capacity, resulting in lower CapEx and OpEx. The difference between the more standard, block-level dedupe such as NetApp’s and our technology is that Ocarina can intelligently extract and analyze the natural semantic objects inside virtually any file.
For example, in a PowerPoint file, a slide is a natural object and so are graphics that appear on a slide. Rather than hashing 4k file system blocks, we look for natural objects like the slide or the graphic, and we hash and dedupe those. This is one reason that we are able to achieve such startling results. For more information on this, please see my earlier post in response to another NetApp blogger, Alex McDonald, who also seemed caught up in semantics around the difference between dedupe and compression. We are planning to release updated white papers in May that reflect all of our latest capabilities. But the bottom line always comes down to, how much data can you reduce?
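To make the block-versus-object distinction a little more concrete, here is a toy sketch. It is not Ocarina's implementation or NetApp's; the function names, the SHA-256 hash, and the two made-up "decks" are all assumptions for illustration. Each deck is modeled as a list of objects (a slide plus a shared graphic). Because the graphic sits at a different byte offset in each file, hashing fixed 4k blocks finds nothing in common, while hashing the objects themselves stores the graphic only once.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed 4k blocks, as in the example above


def block_dedupe_size(files, block_size=BLOCK_SIZE):
    """Bytes stored if every unique fixed-size block is kept once."""
    seen = {}
    for data in files:
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            seen[hashlib.sha256(block).digest()] = len(block)
    return sum(seen.values())


def object_dedupe_size(files_as_objects):
    """Bytes stored if every unique object is kept once."""
    seen = {}
    for objects in files_as_objects:
        for obj in objects:
            seen[hashlib.sha256(obj).digest()] = len(obj)
    return sum(seen.values())


# Hypothetical data: a 10,240-byte "graphic" of non-repeating bytes,
# shared by two decks whose slides have different lengths.
graphic = b"".join(
    hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(320)
)
deck_1 = [b"A" * 1000, graphic]
deck_2 = [b"B" * 3000, graphic]

raw = sum(len(obj) for deck in (deck_1, deck_2) for obj in deck)
block_stored = block_dedupe_size([b"".join(deck_1), b"".join(deck_2)])
object_stored = object_dedupe_size([deck_1, deck_2])

print(raw, block_stored, object_stored)
```

Real products are more sophisticated than this on both sides, and the hard part in practice is parsing each file format to find the objects in the first place. The sketch only shows why the unit you hash matters.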
In general, if we are applied to a data set where our content-aware algorithms recognize most of the file types and objects, we’ll get better results than any other approach. Where the data set is something we do not have specific algorithms for, we’ll treat each file as an opaque object, and our results will trend down to about the same as you’d expect from block dedupe. So worst case, we’re the same, best case we’re as much as 50 times better.
A big advantage of a content-aware approach is that you can set policies that define what gets optimized and when. Block dedupe typically processes all of a volume or none. Since block dedupe has no awareness of content, it has no way to decide whether a given block should be deduped, compressed, or left alone. In a content-aware solution, you can say dedupe files like this, compress files like that, dedupe and compress files older than “x,” and leave these other kinds of files alone, because they are very performance sensitive. We see that kind of file and object level policy control being essential to broad adoption.
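As a rough illustration of what file-level policy control could look like, here is a minimal first-match rule list. The rule format, file extensions, age threshold, and action names are all hypothetical and are not taken from any Ocarina product.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FileInfo:
    name: str
    age_days: int


@dataclass
class Rule:
    matches: Callable[[FileInfo], bool]
    action: str  # "dedupe", "compress", "dedupe+compress", or "skip"


def choose_action(f: FileInfo, rules: List[Rule], default: str = "skip") -> str:
    """Return the action of the first rule that matches the file."""
    for rule in rules:
        if rule.matches(f):
            return rule.action
    return default


# Hypothetical policy, evaluated top to bottom.
rules = [
    Rule(lambda f: f.name.endswith(".db"), "skip"),      # performance sensitive
    Rule(lambda f: f.age_days > 90, "dedupe+compress"),  # older than "x" days
    Rule(lambda f: f.name.endswith(".ppt"), "dedupe"),
    Rule(lambda f: f.name.endswith(".log"), "compress"),
]

for f in [FileInfo("orders.db", 400), FileInfo("deck.ppt", 10),
          FileInfo("deck.ppt", 200), FileInfo("app.log", 5)]:
    print(f.name, f.age_days, "->", choose_action(f, rules))
```

The point of the sketch is that the decision is made per file, using attributes such as type and age. A block-level approach never sees those attributes, so it cannot make this kind of decision.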
This is one key reason many of the largest storage vendors are partnering with Ocarina and including its data reduction technology in their overall offerings. We have recently announced partnerships with EMC, HDS, HP, BlueArc, and Isilon, as well as cloud storage provider Nirvanix. This, of course, is good news for customers and the industry in general.
Our success will depend on how well we seamlessly integrate and support their storage hardware. So far, this is going well, and in fact some of the partnerships we’ve already announced are looking as if they may escalate to deeper integrations and levels of partnership.
If we execute well on those, then the data reduction for file storage category may eventually become “NetApp dedupe” versus “everyone-else-with-Ocarina.” That is our goal.
Data reduction for online storage is a hot topic for a couple of reasons. It’s been validated in other parts of the data center. We all know dedupe has become the norm for backups, with disk-based backup targets with dedupe built in rapidly replacing tape. Compression and simple dedupe (called dictionary compression) have also been widely adopted (as WAFS solutions) in the network. The next natural frontier is online file data. Just like Data Domain focused their technology on backups and Riverbed did the same for WAFS, we have optimized our data reduction technology for online data. Each of these use cases has a different design optimum, and if you started out building a solution for online file data, you’d make different design decisions than you would for backups and WAN optimization. In our case, that meant building a whole new kind of dedupe to be able to get the best results on the kinds of files that are driving storage growth.
In my view, we’re really just in the early stages of seeing data reduction make its way to online storage. Over the course of the next year or two, we expect to see dedupe for online data become as widely understood and deployed as dedupe for backups. During that time, we’ll see lots of debate over different approaches, lots of education about how things work, and a market that will gravitate towards the solutions that deliver both the best results and the best performance for end-users and applications.
A Q&A With Brad Winett of Isilon Systems
Ask any system administrator what he or she really wants, and the answer will most likely come down to two things: simplicity and efficiency. The irony, however, is that just about every advance in one of these areas leads to problems in the other. For example, server virtualization has led to inefficiencies on the storage side. This week, I sat down with Brad Winett, senior director of business development at Isilon Systems, to discuss his company’s “next generation NAS” and how this manages both sides of the coin.
Sunshine: What are you seeing are the key issues for your customers?
Brad Winett: One of the things we’re seeing out there is everyone’s talking about storage efficiency. This means a bunch of things, but at the end of day it’s about getting the most bang for your storage buck. In this economy, IT is being really careful about what they buy and making sure they get a fast return on whatever they do choose.
Sunshine: What does Isilon do to address this?
BW: There are opportunities to implement storage efficiently if the technology easily supports it; and there are three things that leap out as being key to making that happen. First, look for the ability to support a truly just-in-time storage model. Pay as you grow. Mr. IT guy should buy only what he needs today and maybe for next month, but no more. If your storage can be upgraded easily — and with Isilon you can truly grow in a matter of minutes — that’s a real benefit. Why pay for disks now when they will sit empty, especially when the cost of those drives have dropped every quarter for as long as anyone can remember? However, make sure that the storage system you first invest in really can scale just-in-time, without the systems administrator needing to worry about lots of volumes, rebalancing load or spreading data around manually. Those are the painful tasks that promote over-provisioning in the first place!
Second, maximize the raw data you can actually place on the disks. We found out our customers on average (verified independently) have data filling 80% of the actual disk capacity they bought from us, which is a lot more than the 25-40% that NetApp’s CEO Dan Warmenhoven recently cited at an Infocomm meeting in Singapore for typical storage utilization in the field. This is because of the native storage efficiency of our architecture. Our customers don’t have to worry about under-allocated volumes and over-allocated volumes; we automatically balance the data across a single large and robust file system that has little overhead associated with it, performs extremely well even when it is filled up, and eliminates any performance hot spots.
The third part is file efficiency. Depending on your biggest pain point, you might want to implement one or another of the data reduction strategies available today. You have Ocarina, which we have as part of our solution for several installations, and others that allow you to store more data in less space, which addresses both CapEx and OpEx — because you’re buying less storage and also reducing the cost of maintaining it. Of course there are other storage tiers that can take advantage of data reduction such as VTL and tape.
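Editor's aside: to put the utilization figures Winett cites in concrete terms, here is a small sketch of how much raw disk you would have to buy to hold the same amount of data at different fill levels. The 100 TB data set is a made-up example; the 80% and 25-40% figures are the ones quoted above.

```python
def raw_capacity_needed(data_tb, utilization_pct):
    """Raw capacity (TB) required to hold data_tb at a given fill level."""
    return data_tb * 100 / utilization_pct


data_tb = 100  # hypothetical amount of data to store

for pct in (80, 40, 25):
    print(f"{pct}% utilization: {raw_capacity_needed(data_tb, pct):.0f} TB raw")
```

At 80% utilization the same data needs half the raw capacity it would at 40%, before any data reduction is applied.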
Sunshine: When you talk about efficiency, are you also talking about complexity?
BW: Yes. It’s important to look at the whole IT picture. For example, many companies are investing in server virtualization environments to make their servers more efficient. On the storage side, there are all sorts of implications about how those virtual machines are going to access the storage and your data. You need a much more flexible and agile storage environment to support that, especially as the size of the environments and number of VMs gets large.
The core part of our scale-out NAS architecture is that it was built from the ground-up to support transparently scaling a single file system in a single namespace. We take the data and spread it around under the covers so all you know is you can access and read/write the data through any of the nodes. There’s no concept of separate RAID, volume management, and operating system environments; it’s a truly holistic architectural approach that Isilon brings. It is next generation NAS.
Historically, if you look at large NAS installations, the same people who loved the first one hate the tenth one. Each NAS might have 10 file systems, so maybe I have 100 in total. The sysadmins are saying, “… now I have data sitting haphazardly all over the place, but to truly utilize all of my volumes I have to manually spread it around to use my disks efficiently and try to balance the load.” The bottom line of this problem is that it makes it hard to utilize the hardware efficiently. That’s the reason NetApp’s volumes are only half full. It’s too time consuming to balance so IT just buys more (and more than they really need). We don’t have that problem, because people get to use the whole thing in a very balanced manner. Plus it’s easy to scale. It’s literally one button. One button and automagically the single file system and namespace has grown and is balanced.
Sunshine: Thanks very much for taking the time to speak with me.
BW: No problem. Thanks for your interest!
Brad Winett is senior director of business development for Isilon and is responsible for the company’s partnerships with complementary technology providers. Most recently, he served as vice president of alliances for IBRIX, Inc, a developer of clustered file system software. Prior to IBRIX, Mr. Winett served in various roles at Exavio, DataDirect Networks, Hewlett-Packard, and Hughes Aircraft Company. He received his B.S. degree in electrical engineering from the University of Illinois.
