
Online Storage Optimization

Exploring Next Generation Storage Solutions

Archive for the ‘Uncategorized’ Category

Make the right call

Posted by Sunshine On March - 10 - 2010

Four out of five college students agree, this is not the way to deal with data growth. How about this instead?

[Image: a stuffed phone booth]


Fast and Effective Dedupe

Posted by Ocarina On March - 3 - 2010

I’ve noticed a few blog posts recently about the speed of deduplication in the modern data center. I agree that speed is an important factor, but keep in mind that not all dedupe is created equal. That is to say, fast is good, but only if you are also effective. One of the tricky things has been that the easiest data to compress is also usually the most carefully performance-tuned. A great example is a database: databases consist of simple alphanumeric fields and sparse tables, all of which is easy to reduce in size.

However, a company’s core transactional database is the most conservative asset in the data center. Introducing compression would save space, for sure, but you could only use very fast, simple compressors there. At the same time, customers will be hesitant to deploy a new layer of processing in their most sensitive application.

So, where is most data growth? In fact, it’s being driven by unstructured data – Office documents, rich media, email with attachments, PDFs, Flash videos, and so forth. This complex data does not lend itself to fast simple compressors. But perhaps we should back up for a moment and think about how customers have been behaving all along.

Throughout the history of storage, there have always been tradeoffs available between fast expensive storage, and slower but cheaper alternatives. This is not a bad thing. It gives users alternatives based on their priorities and budgets. Back in the old mainframe days, these choices were between very expensive mainframe memory and “offline” storage like drums, cards, and tapes. Today the technology is all much bigger, faster, cheaper and sexier. But really, the tradeoffs are the same.

Data reduction technology adds another layer of choice above and beyond the traditional hardware choices. Now in addition to choosing whether you want fast, expensive solid state disk (SSD) or slower but very cost-effective SATA, you can also choose whether you want to compress and/or deduplicate the data that is stored on those disks.

Just like physical disks, compression and dedupe come in a range of speeds and capabilities.
There are simple and very fast compressors that are essentially invisible in terms of their impact on storage performance. There are more complex compressors that get better results, but which may take longer, either to compress or to decompress the data. Deduplication, done well, should always be pretty fast, and streaming dedupe rates of well over 300MB/sec are now available from many vendors (including Data Domain and Ocarina).
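To make the speed-versus-results tradeoff concrete, here is a minimal sketch using only Python's standard library. It times a fast compressor setting (zlib level 1), a slower one (zlib level 9), and a heavier algorithm (lzma) on the same data. The sample data, the `measure` helper, and the choice of these three compressors are assumptions made for illustration; they are not what any vendor mentioned here actually ships, and real numbers depend entirely on your own data and hardware.

```python
import lzma
import time
import zlib


def measure(name, compress, data):
    """Time one compressor on `data` and report its compression ratio."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return {"name": name, "ratio": len(data) / len(packed), "seconds": elapsed}


# Hypothetical, database-like sample: short, repetitive alphanumeric records.
sample = b"".join(b"customer_%06d,active,2010-03-03\n" % i for i in range(20000))

results = [
    measure("zlib level 1 (fast)", lambda d: zlib.compress(d, 1), sample),
    measure("zlib level 9 (slower)", lambda d: zlib.compress(d, 9), sample),
    measure("lzma (heavier)", lzma.compress, sample),
]

for r in results:
    print(f"{r['name']:<24} ratio {r['ratio']:6.1f}x   {r['seconds'] * 1000:8.2f} ms")
```

Run it against a sample of your own files rather than trusting the toy data: unstructured content such as already-compressed media will typically show much smaller ratios than the record-style text used here.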

The emergence of tools to automatically tier data to its appropriate place helps make the use of all of these technologies more feasible. That applies as much to solid state disks as it does to dedupe and compression. When data tiering can be made invisible to end-users and applications, then implementing multiple physical and logical tiers of storage becomes practical. Good examples would include EMC’s new FAST tools, Compellent’s “Fluid Data Storage”, and HDS’s Data Migrator. When users or administrators have to move data by hand to get it to a compressed tier or a solid state disk, then the operational costs offset the capital savings.

You might want to be wary when someone’s biggest claim to fame is fast dedupe. Just as the old mainframe admin had to decide whether something was important enough to live in RAM, or could be stored on cheaper tapes instead, today’s IT shops have to decide where it is most important to try to get data reduction, and what tool will get the most bang for the buck for that kind of data. You need the whole story, and then you can decide based on your own priorities.

Dare to Be… Anyone You Choose!

Posted by Sunshine On February - 24 - 2010

[Image: Dare2BDigital logo]

This Saturday I’m participating in an event that aims to bridge the gender gap in computer science and engineering. It’s the first annual Dare2BDigital, a conference for young women in the 7th-10th grades that exposes them to the new and exciting career options that now exist in computer science and engineering.

Why such a young group? Studies suggest this is the time when we begin the decision-making process about our career path. These young women are beginning to make pictures in their minds about how they’ll be spending their days when they enter the workforce. They might well be gifted in math or logic. But computer science still suffers from an image problem. Most people–girls in particular–see it as the realm of geeky guys who make endless Star Trek references, drink too much soda and have questionable grooming habits.

What many don’t know is how far this field has come in the last decade. If you’re creatively inclined, now is one of the best times to enter the vast computing field and start poking around for an interest area. For example, one of the first workshops at Dare2BDigital to fill up was one taught by Pixar technical directors on “Computers, Art, and Animation: How opposing specialties come together to create feature films.” What a treat for a middle- or high school-aged girl to be able to dip her toes into the exciting field of computer animation. Other popular choices were programming with Python, making a Facebook game (with folks from Zynga), and my workshop on being a tech reporter. For the full list of workshops to share with your daughter, go to the workshops page.

The event is sponsored by SAP, along with many other top names in technology, including HP, Microsoft, Cisco, IBM, Symantec, and others. What do you think? Is this the way to bring more women into the fold? What else can be done to open up the world of computing to more potentially qualified and creative people?

Full disclosure: I personally am receiving a small stipend from the event presenters for my consulting work on this conference. This blog’s parent Ocarina Networks is in no way involved, other than to be supportive of the concept.

The Environment Still Matters

Posted by Sunshine On February - 22 - 2010

With all the talk about the data inconsistencies around climate change theory, one issue that I’d hate to see lost in the shuffle is the actual environment. While I personally have been skeptical for some time about the alarmist tone many scientists took regarding global warming, it would be a shame if there were such a backlash that people forgot about the much more crucial, larger issue at stake. That is, we need to look at all the ways–on macro- and micro-scales–that we can reduce the overall pollution we generate through our daily habits.

One of the persistent myths about the Internet is that it is clean and green. We overestimate the value of going “paperless” while lowballing the effect on the environment of data centers. One need only look at an online pub like Data Center Knowledge to see that one of the most talked about issues in data centers today is how to reduce rack space, cooling and other energy costs associated with storage. (Another great resource is Greg Schulz’s StorageI/O blog.) This is particularly true of the data being generated through our new Web 2.0 sharing habits. Jon Toigo can laugh about the exploding digital universe all he likes, but it’s still the case that data growth is going like gangbusters in this socially networked era. Recession or no recession, there is a growing demand for ways to make storage more efficient.

Large players in this space are all too aware of the environmental and financial costs of such rapid data growth. Every time you share a photo or video, you’re contributing to it. And who among us doesn’t do this nowadays? In response, companies are experimenting with all kinds of techniques, including new building designs that make use of outside air, data reduction that cuts overall rack space usage (such as that offered by this blog’s parent Ocarina), cloud adoption, and so on. Companies like Google, Yahoo and Facebook are also creating next generation storage architectures that are more efficient for handling the realities of today’s internet. In short, let’s be sure, as we discuss the fallout from the latest global warming debate, that we don’t start acting too lax about the effect of our actions on the planet.

Dedupealooza

Posted by Sunshine On February - 19 - 2010

There is so much talk about dedupe these days that it’s hard to keep up. The industry is waking up to the reality that dedupe is one of the best ways to reduce data, thus saving on power, cooling, space and other crippling storage costs.

Some of the more thought-provoking posts of late:

DCIG - How SSDs can be leveraged to Deliver Inline Deduplication for Primary Storage
Jerome Wendt responds to a comment from someone about Hifn’s Bitwackr inline dedupe. I don’t necessarily agree with Jerome’s take on this. In general, inline solutions are extremely limited, as the original commenter pointed out. But the post provides interesting food for thought.

Storagebod - Where is OnTap 8 with a bit of a rant!
Martin Glassborow isn’t talking specifically about NetApp dedupe here, but the delay in shipping OnTap 8 is of interest to anyone who is concerned about data reduction products. As he puts it, the elephant in the room is that A-SIS dedupe as it now stands has limited scalability.

Recovery Monkey - More FUD busting: Deduplication - is variable-block better than fixed-block, and should you care?
This post, by Dimitris Krekoukias, argues that the major distinction some vendors make between variable and fixed block deduplication is a way of distracting customers from the real issues. The post serves to defend NetApp against its detractors and competitors who say fixed block dedupe is limiting. The comments field is in some ways the most interesting part, with EMC heavy Chuck Hollis raising questions about the author’s connections with NetApp. Also, our own Mike Davis weighed in, and the numbers he cited were so notable that further commenters questioned how this could be lossless compression. At this point, we’re used to it–the industry at large has become accustomed to less than spectacular results. More on all of this in a later post.
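For readers wondering what the fixed-block criticism is actually about, here is a minimal Python sketch. It hashes data in fixed 4 KB blocks and counts how many distinct blocks would need to be stored. The `count_blocks` helper, the 4 KB block size, and the randomly generated test data are all assumptions for illustration only; this is not NetApp’s, Ocarina’s, or anyone else’s actual implementation, and it does not implement variable-block chunking at all.

```python
import hashlib
import random

BLOCK = 4096  # assumed block size, chosen only for this illustration


def count_blocks(streams, block_size=BLOCK):
    """Return (total blocks seen, unique blocks that would need storing)."""
    seen = set()
    total = 0
    for data in streams:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


# Hypothetical "file": 32 blocks of seeded pseudo-random bytes.
rng = random.Random(2010)
original = bytes(rng.getrandbits(8) for _ in range(32 * BLOCK))

exact_copy = original               # a byte-for-byte duplicate
shifted_copy = b"\x00" + original   # same content with one byte inserted up front

print("original + exact copy:  ", count_blocks([original, exact_copy]))
print("original + shifted copy:", count_blocks([original, shifted_copy]))
```

The exact copy dedupes completely, while the one-byte insertion moves every block boundary so that nothing matches. That is the case variable-block advocates point to; whether it matters for your data is precisely what the post and its comments argue about.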

And here’s another interesting trend. The word “dedupe” is starting to creep into the lingo in a more general way. Among storage tweeps there is a greater tendency to throw “dedupe” into their conversations about everything from their record collections to what they eat. It reminds me a little bit of the “hepcat” slang I used to hear when hanging around jazz musicians. If something was ordinary, they’d call it “B Flat,” since that’s the most common key in jazz. For example, “Oh, I just had a B Flat lunch today of a burger and fries.”

The often Twit-witty Greg Schulz recently tweeted: “ I can have dvr record on disk NBC tape delay (thats probably on disk) then dedupe da commercials.” Good plan, Greg.

This post by Steve Gillmor at TechCrunch also uses the term–in a way that I’ve never heard before, anyway. In this case, he’s referring to the fact that there is duplication of content across what are now becoming overlapping social networks–FriendFeed, Twitter, and the new Google Buzz.

OK, that’s all for now. Keep on deduping, friends!

News from the Holodeck

Posted by Sunshine On February - 16 - 2010

[Image: what happens in the holodeck]

As regular readers of this blog know, we’re obsessed with out-there tech. Anything that smacks of Star Trekkian futurism gets our blood pumping. This week, Deep Storage’s Howard Marks reports on something we’ve been watching for some time: holographic storage.

The news is sad. The company that was developing it, InPhase, is out of business. Their web site is still up, but according to the article, the company, a Bell Labs spin-off, was shuttered in early February and the Colorado Dept. of Revenue is now seizing its assets. As he points out, for now, technologies like deduplication make it hard to justify spending $10K on a holographic drive.

Despite this terrible setback, I for one don’t want to believe this idea will die out entirely. It promises a new generation in storage at a time when data growth is spiraling out of control, threatening to overtake data centers worldwide. And who says we can’t add compression and deduplication on top of that? Howard and I both predict that sooner or later someone else will follow the holographic storage clarion call. As he so succinctly put it: “It’s just so cool.”

Image from: Geek Stuff

Fun Break - Playing D&D on the Microsoft Surface

Posted by Sunshine On February - 11 - 2010

Today, Matt Hickey over at CNET reported on something that should gladden the heart of many a Dungeons and Dragons fan. A new version of the game is being designed to work on Microsoft’s tabletop device, known as the Surface. A group of geniuses over at Carnegie Mellon have been developing it, and it looks like hella fun.

Here is a video demonstration of the thing. Too bad the Microsoft Surface retails at $12,500+ and is currently only for “commercial” usage such as retail venues.

With a name like Ocarina…

Posted by Sunshine On February - 4 - 2010

This blog’s parent Ocarina and I have something in common. Can you guess? That’s right. We both have names that are cause for frequent comment. The first question anyone ever asks me when they meet me is, “is that your real name?” And the first question that anyone asks a person from Ocarina is “where did you get the name?” In essence, this is the same question in two different forms.

I almost always answer in the same way when asked about my name. I confirm the fact that my parents were indeed hippies (as the questioner already suspected) and that yes, I do live in California and always felt I belonged here. My full name actually means “Sunshine from the West,” something I have noted in one of my many blog entries on the concept known as “aptonyms.”

For the folks at Ocarina, the response is to explain that the name came from the founders, who decided that it sounded good. They also liked that it is a real word, rather than some kind of computer-generated mashup of random “Xs” and “Zs.” Most people are satisfied with this answer, though it does tend to be a bit of a conversation stopper. You want them to draw some connection between the clay flute that is the actual ocarina and the company mission, which is to reduce data through content aware compression and deduplication. But truth be told, there isn’t one.

For me, as someone who spends a good deal of time tracking what’s being said on blogs, Twitter and other social media platforms, the name Ocarina poses a bit of a problem. The Ocarina of Time is a game within the extremely popular Nintendo series known as the “Legend of Zelda.” Hundreds of tweets about this game are posted almost every day, along with numerous blog posts–all of which clog up my RSS feeds and Twitter search data. There’s even a video celebrating the music of the game, which has had over 65,000 views on YouTube for some unfathomable reason.

Here it is for all you Zelda freaks:

There is also an iPhone app called the “Ocarina” that allows you to play your iPhone like an ocarina and share your music with others. Again, very popular–which leads me to wonder if there’s something about the name Ocarina that naturally resonates with people. What do you think? Are there any other associations you have with the name you’d like to share?

One small bit for mankind…

Posted by Sunshine On January - 22 - 2010

Thanks to Data Center Knowledge for picking up this science-geeky piece of news–turns out that the SNAFU-prone CERN Large Hadron Collider is quite the data beast. The detectors built into the giant science experiment are coughing out gigabytes of data every second. One detector has 100 million readout channels. The video below is a mind-blowing journey into the data center that is powering this experiment. As someone points out in the comments field, despite the multiple petabytes of data being generated there, the collider experiment all really comes down to one bit of data: whether or not it produces the Higgs boson that the gigantic experiment is designed to find.

Of course, some of us are still wondering if our grandchildren are trying to stop us discovering it.

Meanwhile, enjoy this video.

The BD Event - Are you going?

Posted by Sunshine On January - 8 - 2010

Once in a great while someone comes up with an idea that makes you slap your palm to your forehead and ask, “Why didn’t I think of that?” Such is the case with The Business Development Networking Event, or “BD Event.” Organized by storage industry veterans Greg and VaNessa Duplessie, this conference fills a clear need that has arisen for industry insiders to meet and network among themselves. The costs are reasonable, and there are no sponsors or exhibitors to clutter up the place.

The description on the home page of the site sums it up: “Our mission is to create a compelling and inexpensive business development and networking event for industry insiders – one that focuses on networking and building relationships and that does not require exhibiting or catering to end-users.”

The next event will take place in Palo Alto, California, January 26-28. The Ocarina crew will be there. The panels all look like they’re designed with a “need to know” agenda in mind. (Full disclosure: this blogger is on one of them–a panel on social media.) They cover such topics as IT sales strategies, M&A and growth strategies for storage companies, channel partners and so on. The main event is networking. There is plenty of time set aside for it, in both structured and unstructured forms.

Hope to see you there!