
There’s an interesting story in today’s San Jose Merc about the new home for the “Wayback Machine” run by the nonprofit Internet Archive. For those who haven’t explored it, the site is a great way to take a trip down memory lane, à la Proust’s madeleine, through Web pages gone by. For example, I was able to track down this article I wrote in 2000 for a dot-com called Pro2Net. (Note to younger self on that one: brevity is the soul of wit.) There are plenty of other uses.
The problem is that there is a massive and growing amount of data in this archive. Their novel solution? The system was moved from a data center running standard Linux servers to a specially equipped shipping container on the campus of Sun Microsystems in Santa Clara.
On the Sun site there’s a video tour of the shipping container/data center. It has all kinds of green elements, including a recirculating chilled-water cooling system. The compact footprint is impressive, too. The storage itself is very high-density disk in a small form factor, with up to 48 disks in each server, which they say requires far less power. In all, there are eight racks filled with 63 Sun Fire X4500 servers running Solaris 10 with ZFS.
In any case, this is an instructive example of what is happening with online data growth. The archive, which takes occasional snapshots of pages across the entire WWW, must now grapple with an increase in video, audio and graphics. In fact, this already multi-petabyte system is now growing at 100 TB a month, reports Lucas Mearian at Computerworld. One wonders how long this solution will hold before the Internet Archive has to go searching for yet another home for its nostalgia machine.
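For the curious, here is a rough back-of-the-envelope sketch in Python of that "how long will it hold" question. Only the server count, the disks-per-server figure and the 100 TB/month growth rate come from the reporting above. The per-disk capacity and the amount of data already stored are purely hypothetical placeholders, and the sketch ignores ZFS redundancy overhead, so treat the result as an illustration of the arithmetic rather than a real forecast.

```python
# Figures taken from the post.
SERVERS = 63                 # Sun Fire X4500 servers in the container
DISKS_PER_SERVER = 48        # "up to 48 disks in each"
GROWTH_TB_PER_MONTH = 100    # growth rate reported by Computerworld


def total_disks(servers=SERVERS, disks_per_server=DISKS_PER_SERVER):
    """Maximum number of disks if every server is fully populated."""
    return servers * disks_per_server


def months_of_headroom(raw_capacity_tb, used_tb,
                       growth_tb_per_month=GROWTH_TB_PER_MONTH):
    """Months until raw capacity is exhausted at a constant growth rate."""
    free_tb = max(raw_capacity_tb - used_tb, 0)
    return free_tb / growth_tb_per_month


# Hypothetical inputs, NOT from the article: swap in real numbers if known.
ASSUMED_TB_PER_DISK = 1.0    # assumed disk size
ASSUMED_USED_TB = 2000.0     # assumed data already stored

disks = total_disks()
raw_capacity_tb = disks * ASSUMED_TB_PER_DISK
months = months_of_headroom(raw_capacity_tb, ASSUMED_USED_TB)

print(f"{disks} disks, {raw_capacity_tb:.0f} TB raw (assumed disk size)")
print(f"Roughly {months:.1f} months of headroom under these assumptions")
```

Whatever the real disk size turns out to be, a fixed set of 3,024 drive bays against steady 100 TB/month growth gives a horizon measured in months or a few years, not decades, which is the point of the question above.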





