
Anybody who has used The Internet Archive knows that it stores a wealth of information and data on a scale that is hard for the average person to fathom. Essentially, it acts as a sort of Time Machine for the web, allowing you to view just about any page on the internet from around 1996 to now. And even with a few pages missing here and there, the infrastructure needed to hold The Internet Archive together must be pretty impressive.

To show off its robust technologies, Sun Microsystems put together an interactive tour of how it keeps The Internet Archive running smoothly and consistently. By using Sun’s server technology in its Modular Datacenters packed into shipping containers, The Internet Archive is able to continually update its database of the web while making sure the past is preserved without any failure or downtime.

A couple of interesting facts:

  • The Internet Archive grows at a rate of about 100TB every month.
  • The Internet Archive currently fits into one 20-foot shipping container, but additional containers can be added on the fly.
  • The Internet Archive database is currently about 3 petabytes in size.
  • The Internet Archive is one of the largest (if not the largest) digital archives in the world.
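
To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It takes the 100TB-per-month and 3-petabyte figures from the list above and assumes two things the tour does not state: that growth stays constant, and that 1 PB equals 1,000 TB. The function names are made up for illustration.

```python
# Back-of-the-envelope growth estimate for the figures quoted above.
# Assumptions (not from the source): growth stays linear, and 1 PB = 1000 TB.
TB_PER_PB = 1000


def annual_growth_pb(growth_tb_per_month=100.0):
    """Petabytes added per year at a constant monthly growth rate."""
    return growth_tb_per_month * 12 / TB_PER_PB


def months_to_reach(target_pb, current_pb=3.0, growth_tb_per_month=100.0):
    """Months of constant growth needed to go from current_pb to target_pb."""
    if target_pb <= current_pb:
        return 0.0
    return (target_pb - current_pb) * TB_PER_PB / growth_tb_per_month


if __name__ == "__main__":
    print("PB added per year:", annual_growth_pb())
    print("Months until the archive doubles to 6 PB:", months_to_reach(6.0))
```

Under those assumptions the archive adds a bit more than a petabyte a year and would double in size in roughly two and a half years. Real growth is unlikely to be this tidy, so treat it as a rough feel for the scale rather than a forecast.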

It’s a bunch of nerdy stuff I guess, but you have to admire how crazy and awesome these people must be to want to archive ALL digital data on the internet.