In today’s UK Register, Chris Mellor talked with Brian Biles of Data Domain about the company’s plans for global dedupe. In the article, Brian says that Ocarina is not “synergistic” with Data Domain. Writes Chris: “Data Domain set out to solve a data protection problem whereas Ocarina set out to solve a media management problem.” He then quotes Brian, “‘I think it [Ocarina] is in a different market that’s not that synergistic. It’s a different choice from how to optimise data protection.’”
Chris’s final comment? Even if Ocarina offered an OEM deal, Data Domain wouldn’t be “enthusiastic.” Well, that remains to be seen, and actually, it isn’t the important question. Ocarina agrees that, for now, the right place for its functionality is not in the backup tier where Data Domain lives. There is no reason to believe that Data Domain’s acquisition by EMC in any way diminishes the strength of the technology partnership that already exists between Ocarina and EMC.
Ocarina is the Rolls-Royce solution for online data reduction, and in that sense, we compete with NetApp Dedupe, not Data Domain. The reality is that right now, as a member of the EMC Celerra Velocity program, Ocarina has been a point of synergy for EMC, and we don’t see that ending any time soon. The synergy is that if you do online dedupe right on your NAS platform, including EMC’s Celerra, then it plays right into the strengths of Data Domain when it comes time to back up.
With Data Domain, you have a product that’s optimized for the backup world – fast sequential throughput in support of backup windows driven by standard backup applications. With NetApp, you have an OK implementation of simple block dedupe, designed to give some data reduction results without sacrificing too much performance in support of random I/O by end users.
There is no right or wrong answer here – both products take the correct approach for the problem that they solve. What’s misleading is the positioning of Ocarina as a solution for media accounts. While Ocarina does have many successful installs in rich media accounts, our core dedupe engine is intended to give multiple storage vendors the same kind of fast, embedded dedupe solution that NetApp has for all online file types. Just to clear up any misconceptions, Ocarina has a diverse – and fast-growing – customer base, with existing customers in publishing, semiconductor, bio-informatics, energy, film-making, eDiscovery, and Web 2.0.
Because Ocarina’s solution combines dedupe with content-aware compression, Ocarina can address a much broader set of data types and customers than any dedupe-only product, including NetApp. With Ocarina, you can use policies to configure the product for simple dedupe only, giving Ocarina storage partners like BlueArc, EMC, HDS, HP and Isilon data reduction and primary storage performance equivalent to NetApp dedupe.
Alternatively, you can set the policies to be more aggressive, to use all the content-aware compressors, and get much better data reduction than NetApp while still supporting reasonably fast random I/O for end users. Since dedupe in general does not get good results on already-compressed files – especially images, video, Zip and other compressed data – having content-aware compressors allows Ocarina to address all those files in addition to providing great dedupe performance for corporate and enterprise file types. Finally, Ocarina works across multiple types of storage, so a customer can have a single dedupe “language” across all their NAS and primary storage vendors.
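To make the point about already-compressed files concrete, here is a minimal Python sketch of generic fixed-block dedupe. It is not Ocarina's or NetApp's implementation. The 4 KB block size, the helper names and the use of zlib as a stand-in for "already-compressed" formats are all assumptions made for illustration. Two files share most of their content; the sketch counts duplicate blocks before and after each file is compressed on its own.

```python
import hashlib
import random
import zlib

BLOCK = 4096  # fixed block size; an assumption for this sketch


def split_blocks(data):
    """Cut a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def dedupe_savings(files):
    """Fraction of blocks a fixed-block dedupe pass would eliminate."""
    all_blocks = [b for f in files for b in split_blocks(f)]
    unique = {hashlib.sha256(b).digest() for b in all_blocks}
    return 1 - len(unique) / len(all_blocks)


def make_text(rng, n_bytes):
    """Generate compressible filler text from a small vocabulary."""
    words = [b"backup", b"dedupe", b"filer", b"volume",
             b"restore", b"window", b"tier", b"block"]
    out = bytearray()
    while len(out) < n_bytes:
        out += rng.choice(words) + b" "
    return bytes(out[:n_bytes])


rng = random.Random(0)  # fixed seed so the run is repeatable
shared = make_text(rng, 256 * BLOCK)  # content both files have in common
header = make_text(rng, 16 * BLOCK)   # block-aligned extra content in file B

file_a = shared
file_b = header + shared

raw_savings = dedupe_savings([file_a, file_b])
compressed_savings = dedupe_savings(
    [zlib.compress(file_a), zlib.compress(file_b)]
)

print(f"raw files:        {raw_savings:.1%} of blocks are duplicates")
print(f"compressed files: {compressed_savings:.1%} of blocks are duplicates")
```

In the raw case nearly half of all blocks are duplicates, because every block of the first file reappears in the second. Once each file is compressed separately, the shared content is encoded in a different context and at a different offset, so block-level dedupe typically finds few or no matches.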
Ocarina is, therefore, a better technology than NetApp dedupe that also has the advantage of being vendor agnostic. At the same time, it’s complementary to Data Domain. That synergy comes from a fundamental difference in how a customer backs up data that has been deduplicated by Ocarina versus data that has been deduplicated by NetApp. With NetApp, when you go to back up a deduped volume, NetApp will rehydrate that volume, expanding the data back to its original full size. With Ocarina, we have a dedupe-aware implementation of NDMP – the backup protocol standard – that allows us to keep data in its deduplicated and compressed state as it is backed up, while still allowing single file restores.
This actually raises an interesting question: Do you still need Data Domain in that case? After all, aren’t you backing up already deduped data?
Well, yes, actually. Backups are repetitive. So even if you perfectly dedupe the live online volume, if you back it up every day, that process is going to create more dupes in the backup target. Data Domain will find those and eliminate them. The data reduction is additive. The combination of Ocarina for live volumes and Data Domain as a backup target has a big advantage for backups, because it shrinks the backup window. Because the first pass of dedupe has already been done on the filer, there is less data that has to move from storage to backup. If you have 100TB on a set of NAS filers, and Ocarina shrinks that to 40TB, then you’ve reduced the amount of data that needs to be sent across a network to the Data Domain by 60%, which shortens your backup window. Data Domain, in turn, will shrink that data further with every subsequent backup.
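The arithmetic above can be sketched in a few lines of Python. The 100TB and 40TB figures come from the example; the network throughput, the daily change rate and the function names are hypothetical values chosen only to show how the two stages of reduction add up, not measurements from any product.

```python
def backup_transfer(raw_tb, reduced_tb, throughput_tb_per_hour):
    """Return (fraction of data saved, hours before, hours after)."""
    saved_fraction = 1 - reduced_tb / raw_tb
    hours_before = raw_tb / throughput_tb_per_hour
    hours_after = reduced_tb / throughput_tb_per_hour
    return saved_fraction, hours_before, hours_after


def target_storage(full_tb, daily_change, n_backups):
    """Rough model of a deduplicating backup target.

    The first full backup is stored whole; each later backup adds only
    the changed fraction. Returns (logical TB sent, TB actually stored).
    """
    logical = full_tb * n_backups
    stored = full_tb + (n_backups - 1) * full_tb * daily_change
    return logical, stored


# Figures from the example: 100TB on the filers, reduced to 40TB.
# The 2 TB/hour throughput is an assumed value.
saved, before, after = backup_transfer(100, 40, 2.0)
print(f"{saved:.0%} less data to move; window {before:.0f} h -> {after:.0f} h")

# Assumed: 30 daily backups of the 40TB volume with 2% daily change.
logical, stored = target_storage(40, 0.02, 30)
print(f"backup target receives {logical:.0f} TB, stores about {stored:.1f} TB")
```

Under these assumptions the first function reproduces the 60% reduction in data sent over the network, and the second shows why the backup target still has plenty of duplicates to remove even though the source volume was already deduped.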

When can we expect native CIFS support on the Ocarina platforms? The current implementation is outright clunky. So until you have a working CIFS implementation, I don’t think you can compete with NetApp. You may get better compression results, but it works only for NFS data.