Thanks to everyone who responded to my request for Master's Thesis ideas. As I mentioned in the comments section, I'm planning to do some news analysis using sites like Digg.com, reddit.com, and perhaps del.icio.us.
I like to say that this topic is partly inspired by Anna Nicole Smith, since around the time I thought of it, Smith died and for some reason completely monopolized cable news for several weeks. I kept wondering: Why in the world do they think people care about her? People die all the time. As celebrities go, she wasn't particularly interesting. Do people actually read this stuff?
Web 2.0 can give us sort of a handle on this question. At Digg.com and similar sites, people actually rate the news by voting it up or down. A given news item gets an overall "score" based on how many people voted for and against it.
Now suppose you take the average rating of news stories on a given subject -- let's stick with Anna Nicole Smith as the example -- and compare it to the number of times that subject appeared in the news, across all news sites. The first number would tell you what people want to read about. The second number would tell you what is being presented most often as news. We could probably weight the second number by how prominently each story is presented -- for example, a story that appears on the front page counts for more than one that doesn't, and a long story may count for more than a short one.
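To make the comparison a bit more concrete, here is a rough Python sketch of the two numbers. Everything in it is made up for illustration: the sample stories, the function names `average_rating` and `weighted_coverage`, and especially the weighting scheme (a front page story counts double, and length is measured in units of 500 words). I haven't settled on any of this; it's just one way the idea could be wired up.

```python
from collections import defaultdict

# Hypothetical data: (subject, net votes) for stories seen on a voting site.
rated_stories = [
    ("anna nicole smith", 12),
    ("anna nicole smith", 8),
    ("iraq war", 150),
    ("iraq war", 90),
]

# Hypothetical data: (subject, on front page?, word count) for published stories.
published_stories = [
    ("anna nicole smith", True, 1200),
    ("anna nicole smith", False, 400),
    ("anna nicole smith", True, 800),
    ("iraq war", False, 600),
]


def average_rating(stories):
    """Mean net votes per subject: a stand-in for what readers want."""
    totals = defaultdict(lambda: [0, 0])  # subject -> [vote sum, story count]
    for subject, votes in stories:
        totals[subject][0] += votes
        totals[subject][1] += 1
    return {subject: total / count for subject, (total, count) in totals.items()}


def weighted_coverage(stories, front_page_bonus=2.0, words_per_unit=500):
    """Coverage per subject, weighted by length and front page placement.

    The bonus and the word unit are arbitrary assumptions, not measured values.
    """
    coverage = defaultdict(float)
    for subject, on_front_page, word_count in stories:
        weight = word_count / words_per_unit
        if on_front_page:
            weight *= front_page_bonus
        coverage[subject] += weight
    return dict(coverage)


if __name__ == "__main__":
    ratings = average_rating(rated_stories)
    coverage = weighted_coverage(published_stories)
    for subject in sorted(ratings):
        print(subject, ratings[subject], round(coverage.get(subject, 0.0), 2))
```

A subject with a low average rating but heavy weighted coverage would be a candidate for "news that nobody asked for"; the reverse would suggest an underserved topic.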
So the question at hand is: how successful are news sources at generating information that people want? Are readers really treating their news as entertainment, or do they recommend hard-hitting investigative reporting much more heavily? And what about media bias, either liberal or conservative?
In theory, it may be possible to quickly identify stories as leaning towards a liberal or conservative position, perhaps by cross-referencing them with the people who recommend them. Then what? Well, suppose it turns out that there are more liberal stories than conservative ones in the media... but suppose also that the liberal stories tend to be rated higher and read by more people than the conservative ones. That might indicate that, for instance, the idea of what "liberal" means is out of sync with the political center. Of course, it could go either way, and I'll be interested to try to come up with a measurement that doesn't bias the results.
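Here is a minimal sketch of what that cross-referencing might look like, assuming (and it's a big assumption) that some users already have a known lean score between -1 for liberal and +1 for conservative. The user names, the scores, the `story_lean` and `label` functions, and the 0.25 cutoff are all hypothetical. Picking that cutoff is exactly the kind of choice that could bias the results.

```python
# Hypothetical lean scores: -1.0 is liberal, +1.0 is conservative.
user_lean = {"alice": -1.0, "bob": -0.5, "carol": 1.0, "dave": 0.5}

# Hypothetical record of who recommended which story.
recommenders = {
    "story_a": ["alice", "bob"],
    "story_b": ["carol", "dave", "erin"],
    "story_c": ["alice", "carol"],
    "story_d": ["erin"],
}


def story_lean(users, lean_by_user):
    """Average lean of the recommenders we have a score for, or None."""
    known = [lean_by_user[user] for user in users if user in lean_by_user]
    if not known:
        return None
    return sum(known) / len(known)


def label(score, threshold=0.25):
    """Turn a lean score into a rough label; the threshold is arbitrary."""
    if score is None:
        return "unknown"
    if score <= -threshold:
        return "liberal"
    if score >= threshold:
        return "conservative"
    return "mixed"


if __name__ == "__main__":
    for story, users in recommenders.items():
        print(story, label(story_lean(users, user_lean)))
```

Users with no known lean (like "erin" here) are simply ignored, so a story recommended only by such users stays unlabeled rather than being guessed at.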
There are tons of flaws with this topic, and I'll acknowledge some of them up front. For starters, those who subscribe to Digg almost certainly do not constitute a representative sample of all people in the country who read the news. So there's no way I can think of to justify any claims about all people nationwide. However, just investigating this cross section of people, and seeing what they like, could be useful and interesting in various ways that I haven't thought of yet.
When I talked about this topic with Dr. Ghosh, who will be my adviser, he said I shouldn't get sidetracked by that kind of problem, because it's not unusual for a research paper to be limited in scope. In fact, he recommended that I deliberately limit the scope to around five news sources, so that I have interesting things to say about just the articles from those sites. I was thinking of picking three somewhat "mainstream" media sites (for example, NY Times, Washington Post, and CNN), then picking a liberal feed (perhaps Daily Kos) and a conservative feed (Fox News? Washington Times? WorldNet Daily?) to compare against.