Last Resort Data Recovery: The Wayback Machine
Monday, October 1st, 2007
Yesterday, I discussed a technique to recover data if your database has gone <skrewed>poof</skrewed> using Google’s cached content. There’s another technique that has saved my bacon more than once: The Internet Archive’s Wayback Machine.
I once ran into a situation where the data on a new client’s server was corrupted, and there were no backups available. The graphics were jumbled or missing altogether, and the original artist was missing in action. Without the original source files, or any data backup to speak of, my client was pretty much pooched. And because the site had been in this state for quite some time, Google had cached the corrupted content. What to do? Visit the Wayback Machine, that’s what.
If you aren’t familiar with the Internet Archive, it is a non-profit that has been quietly taking periodic snapshots of the Internet–yes, the entire Internet–since 1996. Unless you have been one of the idiots who have requested that the IA stop “stealing” your content, images of your site are probably in the Archive.

To use this technique, simply type your URL into the Wayback Machine’s search tool. You’ll get a list of snapshots in return that will, in most cases, date back to the birth of your site. Clicking on one of these snapshots should reveal a navigable (and, hence, content-extractable) version of your site as it appeared on that day.
In my case, I was able to find most of the images in an uncorrupted state. Simply by saving the graphics directly from the browser into my client’s site, I was able to reconstruct the site and get my client back on the web.
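If you would rather script the lookup than click through the search tool, here is a minimal sketch in Python. It assumes the Internet Archive's public availability endpoint (https://archive.org/wayback/available), which returns JSON describing the snapshot closest to a requested date. The function names (build_query, closest_snapshot, find_snapshot) are my own, made up for illustration, and the response layout should be checked against the Archive's current documentation before you rely on it.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the Internet Archive's availability lookup.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(site_url, timestamp=None):
    """Build the lookup URL. timestamp is an optional YYYYMMDD string."""
    params = {"url": site_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response_text):
    """Return the snapshot URL from a JSON response, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"]


def find_snapshot(site_url, timestamp=None):
    """Ask the Archive for the snapshot closest to the given date (needs network)."""
    query = build_query(site_url, timestamp)
    with urllib.request.urlopen(query, timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling find_snapshot("example.com", "20060101") should hand back the address of the archived copy nearest that date, or None if the Archive has nothing. From there you can open the snapshot in a browser and save the graphics as described above.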


I recently had a need to create a ton of fake users to test a system I’m building for a client. This required more than your average “tuser1” type one-offs. I needed to stress the system, so I dusted off an old post over on the most excellent