D'oh. I've been CAPTCHA'D.
Sitting here in Texspresso after work. I decided to let my program run at top speed. I wasn't sure whether it would take a fixed amount of time to catch me, or whether it was mainly based on the number of page hits. I reduced my sleep time so that I fetch a new web page every two seconds. It took them only twenty minutes to make me stop, so the speed at which I hit them is definitely a big factor.
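For anyone wondering what "sleep time" means here, this is a minimal Python sketch of a throttled fetch loop. It is not the actual program: the function name `crawl_politely` and the `fetch` and `sleep` parameters are made up for illustration, and the two-second default just mirrors the number above.

```python
import time

def crawl_politely(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Fetch each URL in order, pausing delay_seconds between requests.

    `fetch` is any callable that takes a URL and returns the page body.
    `sleep` is injectable so the pause can be replaced in tests.
    """
    pages = []
    for index, url in enumerate(urls):
        # No need to wait before the very first request.
        if index > 0:
            sleep(delay_seconds)
        pages.append(fetch(url))
    return pages

# Hypothetical usage with the standard library:
#   import urllib.request
#   def fetch(url):
#       with urllib.request.urlopen(url, timeout=30) as response:
#           return response.read()
#   pages = crawl_politely(list_of_urls, fetch, delay_seconds=2.0)
```

Raising `delay_seconds` is the only knob in this sketch. A longer pause means fewer hits per minute, which seems to be what the server is watching.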
Oh well. In that time I managed to collect 1100 new clusters, which finishes off the month of September 2006 (the month that Paris Hilton got arrested, which makes for some entertaining analysis). But I only managed to pick up 100 stories, so I've got more to do.
Nephlm mentioned a program called Tor that hides your IP address, so maybe I'll try that and see if it works.
Update: Tor works! It works like a charm! Nephlm, I owe you a beer. Come to Austin sometime and I'll pay up.
Tor is a project supported by the Electronic Frontier Foundation, and what it does is route your web requests through various remote servers so that the Google server can't tell where you're really coming from.
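Here is a rough Python sketch of pointing a script's web requests at a local proxy. It rests on assumptions: Tor itself listens as a SOCKS proxy, which Python's standard `urllib` does not speak, so this assumes an HTTP proxy such as Privoxy is running in front of Tor on its default address, 127.0.0.1:8118. The function name `build_proxied_opener` is made up for illustration.

```python
import urllib.request

# Assumption: Privoxy (or a similar HTTP proxy) is running locally on its
# default port 8118 and forwarding traffic into Tor.
TOR_HTTP_PROXY = "http://127.0.0.1:8118"

def build_proxied_opener(proxy_url=TOR_HTTP_PROXY):
    """Return a urllib opener that sends HTTP and HTTPS requests via proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)

# Hypothetical usage (needs the proxy and Tor actually running):
#   opener = build_proxied_opener()
#   with opener.open("http://example.com/", timeout=30) as response:
#       body = response.read()
```

Every request made through the returned opener goes to the local proxy first, so the remote site sees the address of whichever Tor exit node handles it rather than yours.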
But an amusing side effect is: When I logged in to Blogger, everything was in German. I must be sending requests through a host in Germany somewhere, and now Blogger sees where my requests appear to come from and thinks I want the German version of Google.
Oh well, who cares, as long as I'm getting my data. :) "Post veröffentlichen" means "publish this post," right?