
Online Storage Optimization

Exploring Next Generation Storage Solutions

Archive for February, 2009

A Q&A With Shmuel Shottan, CTO, BlueArc

Posted by Sunshine On February - 27 - 2009


The term “deduplication for primary storage” has become the latest industry buzz phrase. Yet, how much do any of us know about it? This week, I sat down with Shmuel Shottan, CTO of BlueArc, and learned a great deal about what makes this emerging technology such a crucial one at this time.

I was impressed by his soft-spokenness and ability to discuss these new innovations in layman’s terms. This is clearly a great year for BlueArc, which was recently awarded the Gold prize in the SearchStorage Product of the Year Award, for the Titan 3200. Our conversation is below.

Sunshine: Who are your customers, and what is the biggest problem that you solve for them?

Shmuel Shottan: BlueArc’s customers are mainly focused on high performance applications and are in data intensive markets. To be more specific, for many of our customers, BlueArc helps in driving value creation and revenue generation–for example, drug discovery, computer generated effects, and design and simulation. These are all customers that appreciate the value created by deploying a BlueArc system, which accelerates their applications. Another example is focused on consolidation. By consolidating many storage islands through the deployment of a BlueArc system, the customers lower their total cost of ownership and simplify their infrastructure.

Sunshine: There’s been a lot of talk about “dedupe for primary.” What are your thoughts about this technology, and where do you see it going?

Shottan: Dedupe for primary lagged dedupe for backup. Dedupe for backup became an embedded feature in every VTL system. Backup lends itself to dedupe because you have so much duplication. Primary was the next wave, because everybody started to talk about OpEx–operational expenses–or how much power, cooling, and space it takes to store all the data, which was increasing by a factor of 10 for many enterprises.

Data has a lifecycle. If you look at it holistically, you can apply the 80/20 rule. Only 20% of data needs to be accessed at the kind of performance level BlueArc realizes, while 80% of data is rarely accessed after a certain time period. Yet, all of it must be available 24/7. We’re solving the problem by adding a virtual tier of storage that is online, but which is compressed, and therefore is 1/3 or even 1/4 the cost.

Dedupe for primary is more challenging than dedupe for backup. It lends itself less to repetition, because it’s all different files. In tape libraries, you can do 30x because it isn’t all backup. The easiest way for a storage appliance is file-based dedupe, but that is not very efficient. Depending on the data set, you might get 80%, or maybe as much as 50% efficiency. The technology most relevant is the one taken from the backup dedupe, variable block. So while the need was there, the ratios were not good enough to justify the compression in many cases.
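
For readers who want to see the difference between file-based and variable-block dedupe, here is a minimal sketch. It is not BlueArc's or Ocarina's method. It assumes a toy content-defined chunker that places a block boundary wherever the CRC-32 of the last 16 bytes ends in six zero bits; the names `WINDOW`, `MASK`, `chunks` and `stored_bytes` are invented for illustration, and production systems typically use a proper rolling hash such as a Rabin fingerprint.

```python
import hashlib
import random
import zlib

WINDOW = 16   # bytes of context used to pick a boundary (assumed value)
MASK = 0x3F   # boundary when the low 6 bits are zero, so blocks average ~64 bytes (assumed)

def chunks(data: bytes):
    """Split data into variable-size blocks at content-defined boundaries."""
    start = 0
    for i in range(WINDOW - 1, len(data)):
        window = data[i - WINDOW + 1 : i + 1]
        if zlib.crc32(window) & MASK == 0:
            yield data[start : i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]  # trailing block after the last boundary

def stored_bytes(blobs):
    """Bytes kept when each distinct blob is stored once, keyed by SHA-256."""
    unique = {hashlib.sha256(b).digest(): len(b) for b in blobs}
    return sum(unique.values())

rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20_000))
edited = b"X" + original  # the same file with one byte inserted at the front

# File-based dedupe: the two files hash differently, so both are stored in full.
file_level = stored_bytes([original, edited])

# Variable-block dedupe: boundaries follow the content, so most blocks match.
block_level = stored_bytes(list(chunks(original)) + list(chunks(edited)))

print("file-based:", file_level, "bytes; variable-block:", block_level, "bytes")
```

In this toy case the one-byte edit defeats file-based dedupe entirely, while the variable-block approach stores the second file at the cost of little more than its first block. That is the case where variable block wins; files that share little content to begin with, the situation described above for primary storage, will not see this kind of saving.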

Sunshine: What types of industry verticals are facing the biggest storage challenges?

Shottan: Some of the industries that need this kind of accessible data storage include: media and entertainment, oil and gas, and bioinformatics. The data that is being collected and the applications that manipulate it are huge. “Huge” is a technical term, of course. (Laughs.)

Our solution includes primary storage compression, for which we have partnered with Ocarina Networks. Ocarina represents the next generation solution, which goes beyond block-level or file-based deduplication, and is far, far more effective at increasing capacity.

Sunshine: What are the specific storage needs of these industries?

Shottan: At a very high level, those are industries that have two requirements. First, they are data-intensive industries, which means they need lots of primary storage. Second, their ability to run their business efficiently depends on how fast their applications run. This is why the two key attributes of the BlueArc system, performance and scalability, apply well.

Let me tie this need with the reason we have partnered with Ocarina. For industries such as oil and gas and bioinformatics, the situation is this: upon completion of a processing run, all the data needs to be kept around. However, for successive processing runs, the data can be compressed. This is where our multi-tiered storage comes in, and beyond that, we’ve been able to seamlessly integrate with Ocarina’s appliance to achieve this compression without having to invest in any new storage. Ocarina is the only offering we found that could successfully compress media rich files such as those created in genomics labs.

Sunshine: What about movies? Why are they so data intensive?

Shmuel Shottan: The biggest storage costs for the media and entertainment industry are in production. We work with studios that do animated films, and while the final product fits on a DVD, the production phase can be tens or even hundreds of terabytes. And it is dynamic data, not static data. The rendering time or the processing of rendering a scene to a movie takes time. And time is money.

For example, it can take almost a week to render hair on top of the head of an animated creature. Say you want to reuse that hair. If you as the animator have to recreate it from scratch because it’s now on tape, that’s weeks of extra work. This is why you want to keep that data online, because once you put something on backup–whether tape or VTL–you no longer can easily access previous scenes.

Sunshine: I’m guessing that for production, OpEx costs are extremely high.

Shottan: Yes, and we realized that this is a perfect situation for Ocarina’s compression. They have designed compression algorithms for industry specific file types, including those used by Hollywood film studios. We’ve been able to integrate our two offerings and significantly reduce their storage footprint.

Sunshine: Thanks for taking the time to speak with me.

Shottan: It was my pleasure.

Shmuel Shottan is an industry veteran with thorough experience in the research and development of hardware and software, and in engineering management for firms ranging from start-ups to Fortune 500 companies. He holds a BSEE degree from the Technion, Israel Institute of Technology. His full bio is here.

BlueArc is a leading provider of primary storage solutions to enterprise markets, as well as such data intensive markets as electronic discovery, entertainment, federal government, higher education, Internet services, oil and gas and life sciences.

Obama’s Healthcare Imaging Plan - More Implications

Posted by Sunshine On February - 26 - 2009


Here is more news and commentary on the electronic healthcare records portion of Obama’s stimulus plan that is of interest to the storage industry. As you may or may not know, $20 billion has been set aside in an effort to fully digitize healthcare records.

As HealthImaging & IT reports, the plan could result in something of a limbo situation in terms of who will oversee it–especially since it’s still not clear who the next Health and Human Services (HHS) secretary will be following Tom Daschle’s withdrawal.

Meanwhile, Jerome Wendt and Howard Haile posted a very detailed piece on the DCIG blog today about the implications of the plan. One area that could be of concern, say Wendt and Haile, is the language in the bill that states that patients will be afforded full transparency:

“This language basically states that a patient will have the right to receive an audit trail of all disclosures of their EHR made through electronic record. This paragraph stunned us as we immediately thought of the many facets of IT this would touch.”

All in all, this is shaping up to be quite an earth-shaking new mandate. It will be interesting to see how it unfolds–who will benefit from these dollars, and whose resources will be stretched.

Image: http://rrig.blogspot.com/

Storage News and Notes 2/25/09

Posted by Sunshine On February - 25 - 2009

Big day today for the storage industry. The top story for us: Ocarina announced $20 million in Series B financing, led by Palo Alto-based Jafco Ventures, with participation from existing investors Kleiner Perkins Caufield & Byers and Highland Capital Partners. Here is the latest roundup of news articles:

Plenty of other chatter in the news and blogosphere of interest to the storage industry today as well:

Iron Mountain announced a new service, Virtual File Store (VFS), reports Beth Pariseau at SearchStorage.

On Storage Soup, Beth also dished on a recent clash between EMC & NetApp, this time around VMware. The post includes a link to a video spoof of 8 Mile that has to be seen to be believed. (I can see an updated version in which MC Hammer leaps in at the last minute and starts talking about content-aware compression, but I digress…)

And speaking of videos, here’s one that Marc Farley over at 3Par put together on his Storage Rap blog that takes some swipes at Hitachi’s dynamic provisioning in what has to be said is a truly creative way. The “Mangatars” used in it are a phenom that’s currently sweeping through Twitterville. (Yes, I’ve got one.)

And speaking of HDS, David Merrill’s blog post today is an explainer on dynamic tiering that many will find useful and enlightening reading (I know I did).

Ocarina Raises $20 Million

Posted by Sunshine On February - 25 - 2009

Today, Ocarina Networks announced that it closed a $20 million Series B funding round. This is obviously great news for the company, as well as a strong validation of what Ocarina set out to accomplish. The funding round was led by Jafco Ventures, with significant participation from Series A investors Kleiner Perkins Caufield & Byers and Highland Capital Partners.

The Mercury News did a very nice piece on the funding today. As reporter Scott Harris put it:

“The incredible shrinking economy may not be creating much in the way of jobs, profits or consumer confidence, but it’s doing a bang-up job of producing data. The Information Age is nothing if not a volcanic profusion of digitized documents, photographs and video — not to mention the data emerging from the genomics industry and other deeply scientific pursuits.”

Ocarina has a solution for these rising storage demands. As VentureBeat wrote this morning, 90 percent compression is an attractive proposition these days, not only to manage costs, but also, as the article states, “for companies concerned with greening their business models, cutting down on storage requirements can help shrink energy footprints.”

This new round of venture funding, raised at a time when the economy is in dire straits, demonstrates the immense need that Ocarina fulfills in the marketplace.

Beauty Really in Eye of Beholder

Posted by Sunshine On February - 24 - 2009


Here’s a completely non-storage related topic that I found fascinating. Science writer Brandon Keim’s story in Wired today reports that beauty is experienced differently by men and women. Summarizing findings from the Proceedings of the National Academy of Sciences, the article states:

“In men, images they consider to be beautiful appear to activate brain regions responsible for locating objects in absolute terms — x- and y-coordinates on a grid. Images considered beautiful by women do the same, but they also activate regions associated with relative location: above and behind, over and under.”

Hard to say whether these preliminary findings will be reinforced by further study, but the potential for unlocking this difference–which seems to match common experience–is very interesting. Thoughts?

The Storage Oscars and Other News of the Week

Posted by Sunshine On February - 24 - 2009

There is a definite feeling out there that storage blogging is growing. It’s a more connected, social world than ever before, and the result is higher-quality content.

Here are some of the tidbits that jumped out at me this week:

Storage Monkeys, relaunched last month (just like this blog!), announced today that it plans to add more social features, with the “goal of being the premier social networking site for storage users, by storage users.”

Beth Pariseau at Storage Soup self-corrected in her discussion with and about Storagezilla’s post over the weekend regarding EMC’s scaled-down version of Networked, Fast Start, which can soon be used as a virtual storage appliance. As Beth mentions in this same post, VSAs are a-popping from numerous vendors, and she links to a very interesting post by Storagebod in which he puts several of these products through their paces. Beth is one of that select group of journalists out there who really gets the value of bloggers and blogging.

Meanwhile, there’s a new group blog/magazine on the storage block–Gestalt IT, which brings together voices from storage and IT. Today’s post, by Stephen Foskett (a very active and positive force in the storage blogo-tweet-osphere community), reacts to what he points out is a bit like the storage Oscars: the announcement of the winners of the Storage Products of the Year awards from Storage Magazine. (Ocarina Networks was named a finalist this year.) Steve has a thoughtful response.

Like Stars in the Digital Sky

Posted by Sunshine On February - 20 - 2009

It has been nearly two years since the now semi-famous IDC report on “The Expanding Digital Universe,” sponsored by EMC, was issued, stating that digital growth would soon surpass capacity. Last year, the report was updated, showing that the earlier estimates were about 10 percent too low, and that the total amount of data would be 1.8 zettabytes by 2011.

I’d read about the report and scanned the executive summary, but a closer read yielded some surprisingly poetic attempts at offering readers some way of grasping this magnitude of data size. A zettabyte, or 1,000 exabytes, is simply too huge a number for most of us to imagine, and so the report comes up with some novel ways of helping us connect the dots.

Here are some of the most striking examples:

  • The number of digital “atoms”–that is, the digital ones and zeros created, captured or replicated–is already bigger than the number of stars in the universe.
  • Because the digital universe is expanding tenfold every five years, it will surpass Avogadro’s number within 15 (or actually 14, now) years. For those who have forgotten everything they learned in physics class, Avogadro’s number is defined in the report. It is the number of atoms in 12 grams of carbon-12, or about 6.022 × 10^23.
  • Continuing the planetary motif, the report shows that the digital universe has its own version of “dark matter”–the signals from sensors, RFID tags and voice packets that make up only 6% of the digital universe by gigabyte, but account for more than 99% of the units or files in it.
  • The flipside of the above problem is in the other 94% of the digital universe in terms of gigabytes–the vast majority of the actual data. What you’ll find here is “unstructured data”–that is, images, video clips, documents, and so on that are pinging around the digital sphere. This is an area where Ocarina has been making immense strides in bringing data under control.
  • Although they account for only 4% of the world’s revenues, the broadcast, media and entertainment industries are responsible for over 50% of the digital universe–a percentage that is expected to grow once digital TV takes off worldwide.
  • The digital universe can be broken down to 45GB for every person on the planet.
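
The Avogadro comparison is easy to sanity-check with a few lines of arithmetic. The sketch below is my own back-of-the-envelope calculation, not the report's. It assumes a decimal zettabyte (10^21 bytes), counts each bit as one digital “atom,” and treats the tenfold-every-five-years growth as perfectly smooth; the function name `years_to_reach` is made up for the example.

```python
import math

AVOGADRO = 6.022e23         # figure quoted above
BYTES_2011 = 1.8e21         # 1.8 zettabytes projected for 2011 (decimal zettabyte assumed)
BITS_2011 = BYTES_2011 * 8  # assumption: one bit counts as one digital "atom"

def years_to_reach(target, start, factor=10.0, period_years=5.0):
    """Years of smooth growth (times `factor` every `period_years`) from start to target."""
    return period_years * math.log(target / start, factor)

years = years_to_reach(AVOGADRO, BITS_2011)
print(f"{BITS_2011:.2e} bits in 2011, about {years:.1f} more years to pass Avogadro's number")
```

Under those assumptions the crossover comes roughly eight years after 2011, which sits comfortably inside the window the report describes.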

Image from United Media

Cloud Humor Hits Blogosphere

Posted by Sunshine On February - 20 - 2009

With so much talk about the cloud on blogs far and wide, it was inevitable, I suppose. Blogger Dave Graham has penned a piece, with tongue firmly in cheek, about the various “cloud personality types” that are out there, based on the DSM.

He is still updating the post. So far, my personal favorites are:

“The ‘Cloud Idiot‘ - (cloudiot) - This is the person who thinks they know more about the cloud than anyone else. They’re constantly on the prowl for the ‘what is …?’ questions on social media platforms and provide blustery responses with vapid data validation.  Oftentimes, these folks are proven wrong in a rather humiliating and public fashion.

The ‘Cloud Antagonist‘ - (cloudagonist) - This is the commiserate cloud ‘hater.’  This person loves DAS storage, SANs, divided fabrics and can be found extoling the virtues of direct server management via commandline and a collection of USB sticks.”

He has asked readers to submit their own, and I look forward to seeing the comments field fill up on this one. So far, there has been one, from Matthew Glidden, who suggests the “‘Cloud Evader’ - (cloudaway) Afraid or unwilling to engage clouds and any related topics, evaders check their virtual watch and leave a vapor trail of mumbled excuses. They only bookmarks blogs and twitters that cover safe topics like ‘cuisine’ and ‘roller derby.’ Evaders create fringe attachments to cloudagonists, but keep a safe distance from any emotional opinions, pro or con, lest they disturb the cloud-free status quo. Stopped watching the Weather Channel in mid-2007.”

I encourage one and all to go to Dave’s blog and add more personalities as they come to you. I’m working on one I think I’ll call the “cloud burst…”

Ocarina Goes To Hollywood

Posted by Sunshine On February - 19 - 2009


Here’s something that might surprise you–one of the sectors that has been hit hard by an upsurge in storage demands is movie studios. And while you’re probably imagining that video is the source of their storage woes, in fact this is not the case.

As it turns out, video is a relatively small part of the movie-making process from a data point of view. Rather, it’s the growing adoption of all-digital HD and film workflows, and the move to 3D (stereoscopic) production that has led to this new level of demand.

While the price of disk is dropping, storage costs for studios have nevertheless skyrocketed as a result of these new digitization technologies. In essence, every movie is now broken down into a series of digital stills, regardless of whether it is animation or live action. This has led to a situation in which the amount of data for just one 18-month shoot is unmanageable without some kind of compression solution in place.

This is a challenge that Ocarina has taken up, and I encourage you to take a look at how its next generation optimization solution has been deployed for this particular sector.

Learn more by visiting this page and downloading the “Digital Intermediate White Paper.”

The New Yorker 2.0?

Posted by Sunshine On February - 18 - 2009


I have to admit that it’s been over a year since I stopped renewing my sub to the venerable New Yorker magazine. I hung in there through the Tina Brown years, and after that it seemed to revert to the Ivy League tone I found so hard to take in the past, and the magazines started gathering dust on my coffee table. No doubt I’m setting myself up for some flame comments, but there you go. The web site, as I recall, was not very functional either. It was really just the magazine transposed into HTML form. There were some articles posted in full, others just excerpted, and that was about it.

Today, out of curiosity I took a look at the site for the first time in … well, it has to have been at least a year. I was amazed at its increased functionality. And the topper was that it actually included animated cartoons. Perhaps others commented on this when it first happened and I missed the uproar, but this definitely seemed like a mad departure from everything the New Yorker once stood for.

There’s a cartoon video on the home page–today’s depicted a rather ruthless doctor giving a kid a shot. You can also click to go to an entire page of these types of animated New Yorker cartoons.

Just one more way that rich media data is starting to pile up online–even in the most surprising places.