Posts Tagged ‘ the wayback machine ’

Last Resort Data Recovery: The Wayback Machine

Monday, October 1st, 2007

Yesterday, I discussed a technique that uses Google’s cached content to recover data if your database has gone <skrewed>poof</skrewed>. There’s another technique that has saved my bacon more than once: The Internet Archive’s Wayback Machine.

I once ran into a situation where the data on a new client’s server was corrupted, and there were no backups available. The graphics were jumbled or missing altogether, and the original artist was missing in action. Without the original source files, or any data backup to speak of, my client was pretty much pooched. And because the site had been in this state for quite some time, Google had cached the corrupted content. What to do? Visit the Wayback Machine, that’s what.

If you aren’t familiar with the Internet Archive, it is a non-profit that has been quietly taking periodic snapshots of the Internet (yes, the entire Internet) since 1996. Unless you have been one of the idiots who have requested that the IA stop “stealing” your content, images of your site are probably in the Archive.

The Wayback Machine's Search Results

To use this technique, simply type your URL into the Wayback Machine’s search tool. You’ll get a list of snapshots in return that will, in most cases, date back to the birth of your site. Clicking on one of these snapshots should reveal a navigable (and, hence, content-extractable) version of your site as it appeared on that day.

In my case, I was able to find most of the images in an uncorrupted state. Simply by saving the graphics directly from the browser into my client’s site, I was able to reconstruct the site and get my client back on the web.
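
If you have more than a handful of files to pull back, the lookup can be scripted. Here is a minimal sketch, assuming the Archive's public snapshot URL pattern (web.archive.org/web/TIMESTAMP/URL) and its availability endpoint behave as documented. The function names and the example.com addresses are my own placeholders, not anything from the client project.

```php
<?php
// Build a Wayback Machine snapshot URL for a page or image.
// $timestamp is YYYYMMDDhhmmss or a prefix of it (e.g. "2006");
// assumption: the Archive redirects to the closest capture it holds.
function waybackSnapshotUrl($url, $timestamp)
{
    return 'https://web.archive.org/web/' . $timestamp . '/' . $url;
}

// Build a query URL for the Archive's availability endpoint, which
// answers with JSON describing the capture closest to $timestamp.
function waybackAvailabilityUrl($url, $timestamp)
{
    $query = http_build_query(array('url' => $url, 'timestamp' => $timestamp));

    return 'https://archive.org/wayback/available?' . $query;
}

// Hypothetical example: a logo that was intact sometime in 2006.
echo waybackSnapshotUrl('http://example.com/images/logo.gif', '2006'), "\n";
echo waybackAvailabilityUrl('example.com', '20060101'), "\n";
```

The sketch only builds the URLs. Fetching them (and being polite about how fast you do it) is left to whatever HTTP tool you already use.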


Tip: Creating Test Users

Friday, June 22nd, 2007

I recently had a need to create a ton of fake users to test a system I’m building for a client. This required more than your average “tuser1” type one-offs. I needed to stress the system, so I went looking for an old post on the most excellent Signal vs. Noise blog, written by the 37 Signals guys, who had a similar need when building their Highrise contact manager. A Google search and some good, old-fashioned hunting revealed the post I was looking for.

For demo purposes, we’ve had to populate Highrise with a bunch of fake people. Here are some of the sites we used to save time and increase randomness while creating these make-believe contacts:

The Random Name Generator pulls first and last names from a couple of genealogy sites. Some fun ones that turned up: Garfield Morland, Juniper Pinney, Keaton Dimsdale, and Seymour Zeal.

A search for “John Smith” at whitepages.com provides addresses and phone numbers (we change the street and phone numbers by a couple of digits).

Plambeck.org has a company name generator that serves up choices like Sems Research, Cadridium, Nated Design, etc. 2robots.com also offers a Random Business Name Generator.

For job titles, The Economic Research Institute has a huge list. And there’s also GigantaMegaCorp’s Job Title Generator which spits out random ones like Inter Purchasing Planner, Senior Engineering Associate, and Foreign Information Processor.

[via Making Random Contacts by Matt Linderman]

I found the White Pages method unusable for bulk applications, so I turned up an ancient app for The Windows called RandomData 2.3 by Geoff Phillips. The trial version cranked out fake addresses like nobody’s business and I was off to the races.

I’ll also throw in some custom PHP functions for good measure:

function generateFakeIP()
{

    return rand(12,97).'.'.rand(1,255).'.'.rand(0,255).'.'.rand(1,255);
}

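// Build a fake address such as "jsmith@acmecorp.com" from a username and company.
function genFakeEmail($username,$company)
{
    // Lowercase everything and strip spaces so the result looks like an address.
    $retStr = strtolower($username.'@'.$company);
    $retStr = str_replace(' ','',$retStr);

    // 'com' is repeated so it gets picked more often than the other TLDs.
    $domain = array('com','com','com','com','net','edu','org');

    $thisDomain = array_rand($domain);

    $retStr .= '.'.$domain[$thisDomain];

    return $retStr;
}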

// Return array(phone, cell): two numbers that share an area code, exchange,
// and punctuation style but differ in the last four digits.
function genPhoneNums()
{
    // Each entry is array(prefix, after-area-code, after-exchange).
    // Styles are repeated to weight how often each one is chosen.
    $parenSet = array(array('(',') ','-'),
                      array('(',') ','-'),
                      array('(',') ','-'),
                      array('(',') ','-'),
                      array('',' ','-'),
                      array('',' ','-'),
                      array('','',''),
                      array('','.','.'),
                      array('','.','.'),
                      array('','.','.'));

    $whichArr = array_rand($parenSet);
    $punctSet = $parenSet[$whichArr];

    $thisPhoneNum  = $punctSet[0];
    $thisPhoneNum .= getRandNums(3, false);
    $thisPhoneNum .= $punctSet[1];
    $thisPhoneNum .= getRandNums(3, true);
    $thisPhoneNum .= $punctSet[2];

    $thisCellNum   = $thisPhoneNum;

    $thisPhoneNum .= getRandNums(4, true);
    $thisCellNum  .= getRandNums(4, true);

    return array($thisPhoneNum, $thisCellNum);
}

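// Return a string of $charCount random digits.
// When $canBeZero is false, the first digit is never 0 (e.g. for area codes).
function getRandNums($charCount, $canBeZero=true)
{
    $retStr = '';
    for($i=0; $i<$charCount; $i++)
    {
        if(!$canBeZero && $i==0)
        {
            $retStr .= rand(1,9);
        } else {
            $retStr .= rand(0,9);
        }
    }

    return $retStr;
}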
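
The functions above cover IPs, emails, and phone numbers, but not names. As a rough sketch in the same style, here is a hypothetical genFakeName() helper (my own addition for illustration, not one of the functions above). It assumes you have pasted output from a name generator into two arrays. The sample pools just reuse the names quoted in the Signal vs. Noise post.

```php
<?php
// Hypothetical helper: builds a "First Last" name from two small pools.
// Swap in longer lists for more variety.
function genFakeName()
{
    $first = array('Garfield', 'Juniper', 'Keaton', 'Seymour');
    $last  = array('Morland', 'Pinney', 'Dimsdale', 'Zeal');

    // array_rand() returns a random key, which indexes back into the pool.
    return $first[array_rand($first)] . ' ' . $last[array_rand($last)];
}

// Print ten candidate names, one per line.
for ($i = 0; $i < 10; $i++) {
    echo genFakeName(), "\n";
}
```

With only four first and four last names there are just sixteen combinations, so expect repeats until the pools grow.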


Enjoy!
