Remote Backup Doublechecker

My Remote Backup Script is working nicely. After the backup, it writes some status to a file, "rsync_completed.txt."

But I noticed a few days ago that one of my backups didn't run. That's probably because the computer wasn't on at the designated time, or possibly because nobody was logged in, or because the system was too busy to run that particular task.

In any case, I wrote a Remote Backup Occurred Doublechecker. It runs every time I log in.  Because I'm crazy. I thought about making it a DOS batch script, but went with Python because it'd be faster for me that way.

And just to reveal my craziness, here it is (mostly):

import datetime
import os
import sys
import time
import traceback

# my_root is defined elsewhere in the full script.
scriptname = os.path.basename(sys.argv[0])
try:
    rsync_file = os.path.join(my_root, "rsync_completed.txt")
    ask_user = False
    if not os.path.exists(rsync_file):
        # No status file at all: warn and offer to back up.
        import win32con
        ask_user = True
        msg = "Could not verify backup with file %s. Backup now?" % rsync_file
        flags = win32con.MB_ICONWARNING | win32con.MB_YESNO
    else:
        # The status file's modification time is when the last backup finished.
        mtime = os.path.getmtime(rsync_file)
        dur = datetime.timedelta(seconds=time.time() - mtime)
        if dur.days > 2:
            import win32con
            ask_user = True
            msg = "It's been %d days since the last backup. Backup now?" % dur.days
            flags = win32con.MB_ICONQUESTION | win32con.MB_YESNO
    if ask_user:
        import win32ui
        response = win32ui.MessageBox(msg, scriptname, flags)
        if response == win32con.IDYES:  # IDYES is 6
            import subprocess
            cmd = os.path.join(my_root, "backup_to_dreamhost.bat")
            subprocess.Popen(cmd)
except Exception as e:
    # Record the failure in a text file on the Desktop.
    fail_path = os.path.join(my_root, "Desktop", "%s Fail.txt" % scriptname)
    with open(fail_path, 'w') as f:
        f.write("An exception occurred: %s %s\n" % (e.__class__, e))
        traceback.print_exc(file=f)
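
The age check is the part most worth testing on its own, without popping up a message box. Here's a minimal sketch of how it could be pulled out. The names backup_age and needs_backup are made up for illustration and aren't part of the script above, and the two-day threshold is simply the one the script uses.

```python
import datetime
import os
import time


def backup_age(status_file, now=None):
    """Return the time since status_file was last modified, or None if it is missing."""
    if not os.path.exists(status_file):
        return None
    if now is None:
        now = time.time()
    return datetime.timedelta(seconds=now - os.path.getmtime(status_file))


def needs_backup(status_file, max_age_days=2, now=None):
    """True when the status file is missing or more than max_age_days whole days old."""
    age = backup_age(status_file, now)
    return age is None or age.days > max_age_days
```

Passing now explicitly makes it possible to check the threshold without waiting three days.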

Using Dreamhost personal backup from a Windows Computer

Dreamhost offers a personal backup service.  Here are some notes on what I did to get it working from my Microsoft Windows Vista system.

Install All Required Cygwin Modules


First, I installed any missing required modules for cygwin.  At first, I had trouble with ssh, because it complained it couldn't find cygssp-0.  Using cygcheck confirmed the missing library.

~$ cygcheck ssh
Found: C:\cygwin\bin\ssh.exe
Found: C:\cygwin\bin\ssh.exe
Found: C:\cygwin\bin\ssh.exe
C:\cygwin\bin\ssh.exe
  C:\cygwin\bin\cygcrypto-0.9.8.dll
    C:\cygwin\bin\cygwin1.dll
      C:\Windows\system32\ADVAPI32.DLL
        C:\Windows\system32\ntdll.dll
        C:\Windows\system32\KERNEL32.dll
        C:\Windows\system32\RPCRT4.dll
    C:\cygwin\bin\cygz.dll
      C:\cygwin\bin\cyggcc_s-1.dll
cygcheck: track_down: could not find cygssp-0.dll

Once I installed libssp0, rsync was working correctly, but required a password to be entered.

Setup Passwordless Login


Setting up passwordless login was really easy. Over an sftp connection, I made a .ssh directory in my home directory on the backup server, copied my Cygwin .ssh/id_dsa.pub into the new .ssh directory, and renamed the file to authorized_keys.

Determining Directories and Files to Copy


I made an exclusion list of files not to back up and named that file excl.txt. Here's what the file contains:

*.obj
*.tmp
*.sbr
*.ilk
*.pch
*.pdb
*.idb
*.ncb
*.opt
*.plg
*.aps
*.dsw
*.pyc
*.pyd
*.bsc
*.pdb
*.projdata
*.projdata1
.svn/
.git/
.bzr/
.hg/

Then I made a shell script called backup_to_dreamhost.sh to back up only certain directories:

#!/bin/bash
rsync -e ssh -avz --exclude-from=excl.txt /cygdrive/c/Users/David/Documents user@b.dh.com:~/David
rsync -e ssh -avz --exclude-from=excl.txt /cygdrive/c/Users/David/Downloads user@b.dh.com:~/David
rsync -e ssh -avz --exclude-from=excl.txt /cygdrive/c/Users/David/Pictures user@b.dh.com:~/David
rsync -e ssh -avz --exclude-from=excl.txt /cygdrive/c/Users/David/Music user@b.dh.com:~/David
rsync -e ssh -avz --exclude-from=excl.txt --exclude=Pinnacle/ /cygdrive/c/Users/Public/Documents user@b.dh.com:~/Public
...

Note that occasionally I have to exclude some directories that contain massive amounts of video or temporary files. I can't copy my pictures and videos without taking up more than the free 50GB space allocated for backups.
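
Since every rsync line has the same shape, the list of directories and per-directory exclusions could also be kept as data. Here's a rough sketch of that idea in Python. The function build_rsync_commands and its argument layout are hypothetical. It only builds the argument lists and doesn't run anything.

```python
def build_rsync_commands(sources, dest_host, exclude_file="excl.txt"):
    """Build one rsync argument list per (source, dest_dir, extra_excludes) entry."""
    commands = []
    for src, dest_dir, extra_excludes in sources:
        cmd = ["rsync", "-e", "ssh", "-avz", "--exclude-from=" + exclude_file]
        # Per-directory exclusions, such as a folder full of video files.
        cmd += ["--exclude=" + pattern for pattern in extra_excludes]
        cmd += [src, "%s:~/%s" % (dest_host, dest_dir)]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    example = [
        ("/cygdrive/c/Users/David/Documents", "David", []),
        ("/cygdrive/c/Users/Public/Documents", "Public", ["Pinnacle/"]),
    ]
    for command in build_rsync_commands(example, "user@b.dh.com"):
        print(" ".join(command))
```

Each list could be handed to subprocess.call from a Cygwin Python, but the shell script does the same job with less ceremony.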

Making Automatic Backups


I ran the shell script from within cygwin, and it worked. But now, how to make Vista run it? I created a DOS Batch script called backup_to_dreamhost.bat. It contains only one line:

C: && chdir C:\cygwin\home\username && C:\cygwin\bin\bash --login ~/backup_to_dreamhost.sh

And from the Windows Task Scheduler, I created a recurring task that runs backup_to_dreamhost.bat.

TechCrunch, NetFlix, Digg and Twitter: You Have Not Beaten Me

Four services I use changed their APIs on me last week.  Four.  What the hey, Internet?

TechCrunch

They migrated to the Disqus commenting system.  In the process of doing so, they broke a feature of their RSS feed.  Their feed has the <slash:comments> element for each item, and it used to contain the correct number of comments.

I'm too busy to have to read each TechCrunch article's title to evaluate whether I should read the article. So I wrote a recommendation engine. The number of comments each article accrues is one of the criteria my cron job uses to evaluate TechCrunch articles.  

TechCrunch broke their <slash:comments> element.  It's still there, but it always evaluates to zero.  I fixed my cron job to go get the comment count directly from Disqus instead.
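
For reference, reading <slash:comments> out of a feed takes only a few lines. This is a minimal sketch, not the cron job's actual code. The sample feed and the function name comment_counts are invented for illustration. The namespace URI is the standard one for the RSS slash module.

```python
import xml.etree.ElementTree as ET

SLASH_NS = "http://purl.org/rss/1.0/modules/slash/"

# A made-up two-item feed, just to exercise the parser.
SAMPLE_FEED = """<rss version="2.0" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
  <channel>
    <item><link>http://example.com/a</link><slash:comments>12</slash:comments></item>
    <item><link>http://example.com/b</link><slash:comments>0</slash:comments></item>
  </channel>
</rss>"""


def comment_counts(rss_text):
    """Map each item's link to its slash:comments value (None if absent or not a number)."""
    counts = {}
    for item in ET.fromstring(rss_text).iter("item"):
        raw = item.findtext("{%s}comments" % SLASH_NS)
        is_number = raw is not None and raw.strip().isdigit()
        counts[item.findtext("link")] = int(raw) if is_number else None
    return counts
```

A feed where every item reports zero is a pretty good hint that the element has stopped being maintained.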

TechCrunch should fix their broken feed anyway.  It's not cool to lie in your RSS feed.


NetFlix


Netflix changed the format of their movie URLs.  In some places.  In their new releases RSS feed, the movie URLs separate words in the title with hyphens, like so:

http://www.netflix.com/Movie/Harry-Brown/70117310

But their actual API, like api.netflix.com/catalog/titles, returns movie URLs with underscores separating the words in the title.

http://www.netflix.com/Movie/Harry_Brown/70117310

I'm too busy to have to look up a bunch of movies to decide which to rent, so I have a cron job evaluate each week's new releases with NetFlix's personal predicted rating for me.  By changing the format of their URLs in one service, but not the other, they broke my cron job that matches movies in the feed to their corresponding IMDB ratings.  It took me a while to figure out exactly what it was that broke my service!

I fixed that by having my cron job do a fuzzy match that treats words separated by hyphens and words separated by underscores as the same.
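
One simple way to do that kind of match is to normalize both URLs before comparing them. This is only a sketch under that assumption, with made-up function names. The real cron job may well do it differently.

```python
import re


def normalize_movie_url(url):
    """Lowercase the URL and collapse any run of hyphens or underscores to one underscore."""
    return re.sub(r"[-_]+", "_", url.strip().lower())


def same_movie(url_a, url_b):
    """True when two URLs differ only in case or in hyphen-versus-underscore separators."""
    return normalize_movie_url(url_a) == normalize_movie_url(url_b)
```

With this, the Harry-Brown and Harry_Brown URLs above compare equal, while URLs with different numeric IDs still don't.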

Digg

They did a major overhaul when they released V4.  I'm not really interested in the debate over what's better and what's worse.  I'm interested in what they broke.

They broke their user history feeds.  They used to support personal feeds for their users, so that you could easily see what your friends dugg, like so:

http://digg.com/users/dblume/history.rss

Around August 25th, they changed the nature of the feed to also include everything from the people that that user follows.  So instead of being a concise personal history, it became a huge mess.  The next day, they turned off the service altogether.

By changing the nature of the feed, not to mention turning it off altogether, they broke the digg component of my personal lifestream.

Digg should restore the history feeds.  They were useful.  And it's bad form to break services that you used to provide.

Twitter

Twitter turned off basic authentication and left OAuth as the only alternative.  They announced the transition, and gave developers a long time to prepare for it.  It's a good thing.

Sadly, I was using basic authentication to munge together two of their feeds into one, for inclusion into my feed reader.

http://user:password@twitter.com/statuses/home_timeline.rss
http://user:password@twitter.com/statuses/mentions.rss

So for me, all Twitter activity suddenly disappeared one day.  It took me a while to realize that I'd forgotten to migrate my feed collator's authentication from basic to OAuth.  So I went ahead and made the fix.

Oh, the awesome thing about making a certified OAuth App for Twitter?  I can integrate it into my dead man's switch.  Maybe I'll tweet from beyond the grave.

Phew!

In one week, four external services broke four of my personal services.  It felt like so much household maintenance: The toilet broke, or the grass needs mowing. The upside is that in fixing each of these personal services, I added to my skill set.

She Rolls Brains (Father's Day Edition)

The way Tycho Brahe rolls twenties, my wife rolls brains.  It's unbelievable.  She's undefeated in Zombie Dice, so far.  Zombie Dice?  Yeah, that's one of the things my family gave me for Father's Day.

They really outdid themselves this Father's Day.  I'm so spoiled!



Father's Day Loot



If you click on the picture above, it'll take you to the Flickr page where there are notes, but that's not the whole story. Sure, I got loot that's uniquely suited for me. (And I love it so much!) They also gave me tons of time doing the stuff I enjoy best. I played games with everybody: Super Mario Galaxy 2 with Aaron, Rock Band 2 with Madison, Zombie Dice with Lillian and Madison. I went to Starbucks and read a book for an hour or so (again) while nursing an Iced Mocha.

I feel so loved.  And it's by people who know what I really like, even if it's silly or geeky.  I appreciate them so much.  (And boy, do I owe my wife something special.  This is going to be hard to beat.)

Wonderful Early Father's Day

We've got some neighborhood commitments on Father's Day, so my family decided to celebrate an early Father's Day today, just for me.

My one concession to the family was to go with them to The Karate Kid, and it turned out to be better than I thought it would be.  (I didn't have any high hopes.  That Smith family is pretty good.  It's a little annoying.)

I spent some of the afternoon reading at home, then went to a Starbucks, got an ice mocha, and nursed it while reading a dead-tree book, Daemon.  I was the only person there without a laptop.

When I got home, the family was waiting for me to play Rock Band with them.  (We don't have a good drummer, but you can't have everything, I guess.  We mostly kept drums on easy.)

It was pretty much my ideal Father's Day.  Gonna go wrap it up with an episode of Breaking Bad or the Moribito anime.

Ichi

I loved Ichi.

I don't think this film got the attention it deserved.  And strangely, I think it might be for the things it did right.  I'll explain below.


Ichi is a twist on the Zatoichi movies, where instead of being a man, the blind protagonist is a woman. If you're familiar with the Zatoichi series, you'll know what to expect. You'll see people wronged, and deadly, bloody vengeance.  It's not high art, but it fills a niche, and does so nicely.  There's pathos, poignancy, and a certain serene beauty.  There's humor, too, but the movie manages to stay this side of campy.

Here's what I think Ichi did right, that it could so easily have screwed up.

The beautiful Haruka Ayase was cast as Ichi. The movie usually relies on long shots that take in the surroundings and the peasants dressed in their rags, but it'll occasionally linger on a close shot of Haruka's face, as the blind swordswoman senses her surroundings. When I looked up Haruka Ayase online, I was surprised to learn that she was a shapely actress, model and singer. They could easily have made her character flash a little skin and bare a little cleavage to draw in the boys. She's got it. But there was really none of that in the movie. Ichi remained a tragic and sympathetic character throughout, never a sex object for the audience.

Then there was the blood. In this era of Sin City, 300 and Spartacus: Blood and Sand, we've grown to expect buckets of digital blood flying off of every sword stroke. Oh, and plenty of sword strokes. And there was digital blood in Ichi, too. But it was, dare I say it, almost subtle!  Quite a few fights ended in just one decisive deadly stroke. And maybe there'd be some blood splatter, but not geysers.

It turns out that the movie would have been so easy to screw up for ratings. They could have sexed up the actress and slathered on more digital blood and further removed it from its Zatoichi heritage. But they didn't. And the movie's better for it.

Wanna see some screen captures?  Go see why Lynaeina calls it her favorite date movie.

The Six-Limbed Bat

 I had a really vivid dream this morning.  I was inside this big old dilapidated building, on around the third floor. It was evening. I was standing next to a big, dirty window.  There were thick cobwebs on the inside of the window, in the corner.  They almost obscured what was just on the outside of that corner of the window.  Something mouse-sized was crawling around.

I looked as closely as I dared.  And it turned out to be a group of bats.  But they didn't have normal wings.  As I studied them, I realized that they had six arms.  And instead of wings formed from the skin between long fingers, they had flaps of skin that stretched between the wrists of their arms like the skin of a flying squirrel.


In my dream, I fished out my phone and tried to take photos of the six-limbed bats.  It was hard because I didn't dare clean the window: it was too disgusting, and I couldn't feel sure that it would be safe to do so.  So I stretched my arms out with the camera, as close and as steady as I could, and took a few shots.  The skin around their arms was a little translucent because there was street (or moon) light behind them.  It was fascinating.

After I woke up, I had to try and sketch them, to remember the dream.

How To Efficiently Waste Time

Step One: Stop wasting time all willy-nilly.  Decide when to do it, and stick to your decision.
Step Two: Mute your friends who post, tweet or plurk without any substance (or update too frequently), but are still awesome. Save those streams for time-wasting time.  Don't add their feeds to your feed reader.  Make it so you have to force yourself to type in their account's URL.

Here are a couple of my favorite sites:

Here's a new site that has potential: Trending items on Facebook without actually having to go to Facebook.

  • It's Trending (Not sure yet how frequently the content updates, though.)

California's Education Funding Over Time

California is expecting to fund over 50% more students than in 1980 while diverting funds from universities to prisons. That can't possibly work. A major adjustment of expectations is needed to address this issue which is only worsening.

California's population has more than doubled from 15 million in 1960 to 32 million in 2000. (It has increased by more than 50% from 1980 to 2008, data from Wolfram Alpha.)


(Graphic and data from censusscope.org.)


Undergraduates in 2010 have to pay over 4.5 times what 1980 students did, even in inflation-adjusted dollars. At the same time, the state government has cut its funding to less than a third of what it was, in inflation-adjusted dollars.


(Graphic and data from UC Pay)


To add insult to injury, California state prison funding has risen as a percentage of the state budget, while university funding has fallen.


(Graphic from Professor Bainbridge)


The article from which I got the above graphic suggests that the Governor is attempting to change the way higher education is funded, but is doing so in a way that will pit the very powerful California Correctional Peace Officers Association against it. (An anecdote: While searching Swivel for statistical data on "California" and "Prison", an ad came up for a petition against the Governor's proposal to reduce prison costs.)

Sigh.

"Random" Notes

I carpool with a friend in a nearby building to the climbing gym. To decide who drives, we play email roshambo. The HTML code for the Rock/Paper/Scissors choice is presented below:

<SELECT name="p_throw">
<OPTION value="r">Rock</OPTION>
<OPTION value="p">Paper</OPTION>
<OPTION value="s">Scissors</OPTION>
</SELECT>

We've done this for years, and we each end up driving about 50% of the time, with a few short streaks breaking up the routine. I decided it was too much effort to have to actually choose rock, paper or scissors if I didn't want to. It'd be nice for the website to randomly suggest one for me.

So I added the following PHP code:

<? $suggest = rand(0, 2); ?>
<SELECT name="p_throw">
<OPTION <? if ($suggest == 0) echo "SELECTED "; ?>value="r">Rock</OPTION>
<OPTION <? if ($suggest == 1) echo "SELECTED "; ?>value="p">Paper</OPTION>
<OPTION <? if ($suggest == 2) echo "SELECTED "; ?>value="s">Scissors</OPTION>
</SELECT>

There, now the web page will suggest a random throw when it loads. How convenient!

Except that my friend started destroying me in our challenges. From December 2009 through March 2010, he began winning over 75% of our challenges, and I had to keep driving the carpool. This was costing me money!

It turns out that while I was using the suggested random throw, he had a different strategy. He considered his suggested throw, and made the throw that would beat it.

So there was a correlation between the random throw suggested for me, and the one suggested for him! Even though our suggestions are generated from different pages (mine from index.php because I start the challenge, and his from throw.php because he responds to a challenge), the random number generator runs on the same server.

It's just not random enough, and it exposed an exploit that was costing me money!
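
To see why his strategy works so well, here's a small simulation. It takes the extreme case where both suggestions are identical, which is an assumption for illustration only. The real correlation was evidently weaker, since he won about 75% of the time rather than all of it. All names in it are made up.

```python
import random

THROWS = ["r", "p", "s"]
BEATS = {"r": "p", "p": "s", "s": "r"}  # BEATS[x] is the throw that beats x


def winner(mine, his):
    """Return "me", "him" or "tie" for one round."""
    if mine == his:
        return "tie"
    return "him" if BEATS[mine] == his else "me"


def play(rounds, correlated, seed=0):
    """I throw my suggestion; he throws whatever beats his own suggestion."""
    rng = random.Random(seed)
    tally = {"me": 0, "him": 0, "tie": 0}
    for _ in range(rounds):
        my_suggestion = rng.choice(THROWS)
        # Correlated: he is shown the same suggestion I was. Otherwise an independent one.
        his_suggestion = my_suggestion if correlated else rng.choice(THROWS)
        tally[winner(my_suggestion, BEATS[his_suggestion])] += 1
    return tally


if __name__ == "__main__":
    print(play(1000, correlated=True))
    print(play(1000, correlated=False))
```

With identical suggestions he wins every round. With independent suggestions his counter-throw is just another random throw, and the advantage disappears.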

There's a simple fix for this: use mt_rand() instead of rand(). So I made the following change:

$suggest = mt_rand(0, 2);

Although I made the change to mt_rand(), the fact that the pseudo-random numbers were being made by the same physical generator bothered me. I decided that instead of generating the suggested throw on the server, it'd be best to generate the suggestion at the client computer, in JavaScript. So I wrote the following code and deployed that:

<script type="text/javascript">
function Set_suggested_throw()
{
    var randomnumber = Math.floor( Math.random() * 3 );
    document.rpsform.p_throw.selectedIndex = randomnumber;
}
</script>
</head>
<body onLoad="Set_suggested_throw()">

That's better. Now his random suggestion is generated on an entirely different machine than mine...

Except it bothered me that both random numbers were seeded in a similar fashion, and there'd often be a constant offset between the localtime and uptimes of both machines. This wouldn't do. I needed better randomness. Luckily, there's a site for that.

Great! I'll have my Javascript make a quick call to random.org to make the suggestion.

function Set_suggested_throw()
{
    var xmlhttp = null;
    if (window.XMLHttpRequest) {
        xmlhttp = new XMLHttpRequest();
        if (typeof xmlhttp.overrideMimeType != 'undefined') {
            // The reply is requested with format=plain, so treat it as text.
            xmlhttp.overrideMimeType('text/plain');
        }
    } else if (window.ActiveXObject) {
        xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
    }
    if (!xmlhttp) {
        return;  // No XMLHttpRequest support; keep the default selection.
    }
    // Only if quota not exceeded. See http://www.random.org/clients/
    xmlhttp.open('GET',
        '/rnd_org/integers/?num=1&min=0&max=2&col=1&base=10&format=plain&rnd=new',
        true);
    // Attach the handler before sending the request.
    xmlhttp.onreadystatechange = function() {
        if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
            // The body is text such as "2\n"; convert it to a number first.
            var index = parseInt(xmlhttp.responseText, 10);
            if (!isNaN(index)) {
                document.contactform.p_throw.selectedIndex = index;
            }
        }
    };
    xmlhttp.send(null);
}

Making a call directly to random.org from a page generated at rps.dlma.com runs afoul of the Same Origin Policy, which is why the code requests /rnd_org on my own server. My server is a Linux server running Apache, so all I have to do is add a ProxyPass to my httpd.conf and restart Apache.

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

ProxyPass /rnd_org http://www.random.org

A downside is that the proxy content can be cached, so I'd have to be sure to disable that.

Even worse is that my server is a shared server running at DreamHost, and I don't have access to httpd.conf. And one can't specify a ProxyPass in a .htaccess file either.

So the work-around is to have the PHP code make the call to random.org. Great! Just code that sucker up, ensure that we don't spam random.org by checking our quotas, and fall back on the Javascript implementation if we do go over quota. I won't show you all that, just the PHP snippet.

$ctx = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) );
$url = "http://www.random.org/integers/?num=1&min=0&max=2&col=1&base=10&format=plain";
$result = file_get_contents( $url, false, $ctx );
// file_get_contents() returns false on failure or timeout.
if ( $result !== false ) {
    // set $suggest
}
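
The part the snippet leaves out is turning the reply into a usable throw. Here's a minimal sketch of that step in Python rather than PHP, with a hypothetical function name. It falls back to a local random pick as a stand-in for the Javascript fallback described above.

```python
import random


def suggestion_from_response(body, fallback_rng=random):
    """Parse a plain-text integer reply; use a local pick if the reply is unusable."""
    try:
        value = int(body.strip())
    except (AttributeError, ValueError):
        # body was None (request failed) or held something other than a number.
        value = None
    if value in (0, 1, 2):
        return value
    return fallback_rng.randrange(3)
```

Checking the range matters because an error page or quota message would otherwise end up being used as the selection.
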
Great! The next time we played Email Roshambo, I won! Phew! As he drove me to the gym, I asked him how he'd been since I last saw him.

He said, "Oh, I've had better Fridays. I was RIFfed. We won't be carpooling anymore."