To that end, I've written a big Java program around an online MySQL database. In the last few days I've cataloged about 22,000 news pages, although only a small number of them will ultimately turn out to be important to the study. I've labeled roughly a dozen web sites and a dozen news topics as "interesting." The sites are:
- www.washingtonpost.com
- www.nytimes.com
- www.foxnews.com
- www.guardian.co.uk
- online.wsj.com
- www.usatoday.com
- www.cnn.com
- www.townhall.com
- www.washingtontimes.com
The topics are:
- Rudolph Giuliani
- Anna Nicole Smith
- Harry Potter
- Tiger Woods
- Rupert Murdoch
- Barack Obama
- Gulf Coast
- Mitt Romney
- New Orleans
- Hillary Clinton
- Britney Spears
- Blackwater
- Ron Paul
With the web sites, the idea is to have a variety of media sources. Some are considered serious news sites; some are "fluff" news (I picked USA Today specifically for that reason, and it's possible that CNN will tend to fall in that category as well); and several are explicitly right wing rags. To be fair, I really would like to have included left wing rags, but the only ones I can identify are blogs, which are not treated much as news sources. The news is all pulled off of news.google.com. I search for the topics of interest, then read the resulting stories more or less indiscriminately and identify which site each one comes from.
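Here is a minimal sketch of that last step, identifying which site a story comes from. It is not the actual program: the class name `StorySource`, the method names, and the example URLs are all made up for illustration, and it assumes each search result carries a full story URL.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

public class StorySource {
    // The sites labeled "interesting" in the list above.
    public static final Set<String> INTERESTING = Set.of(
        "www.washingtonpost.com", "www.nytimes.com", "www.foxnews.com",
        "www.guardian.co.uk", "online.wsj.com", "www.usatoday.com",
        "www.cnn.com", "www.townhall.com", "www.washingtontimes.com");

    /** Returns the lower-cased host of a story URL, or null if none can be parsed. */
    public static String hostOf(String url) {
        try {
            String host = URI.create(url.trim()).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (IllegalArgumentException e) {
            return null; // malformed URL: treat the source as unknown
        }
    }

    /** True when the story's host is one of the interesting sites. */
    public static boolean isInteresting(String url) {
        String host = hostOf(url);
        return host != null && INTERESTING.contains(host);
    }

    public static void main(String[] args) {
        // Hypothetical story URLs, used only to exercise the methods.
        System.out.println(hostOf("http://www.townhall.com/columnists/example"));
        System.out.println(isInteresting("http://www.dailykos.com/story/example"));
    }
}
```

Matching on the exact host keeps the sketch simple, but it would treat something like `nytimes.com` and `www.nytimes.com` as different sources, so a real version would probably want to normalize those.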
Based on this, I have a total of nearly 2000 "news" sources, ordered by the number of stories found in searches since I started collecting data. In the stories I've pulled so far, after about three days of serious searches on the 13 topics, the New York Times and the Washington Post (my main "serious news" sites) each account for 104 stories. But dailykos.com has shown up zero times, so I guess there's a master list that they're clearly not on. TPM Muckraker and TPM Cafe both show up, and those are both explicitly liberal sites, but there are only 8 stories from them. "The Nation": 9 stories. So, liberal sites = small sample size. No use.
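The ordering of sources by story count can be sketched like this. Again, this is an illustration under assumptions rather than the real code: `SourceRanking` and `rank` are hypothetical names, the input is assumed to be one host string per cataloged story, and the hosts in `main` are placeholder values, not data from the study.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class SourceRanking {
    /**
     * Counts stories per host and returns the hosts ordered by count,
     * highest first. Ties are broken alphabetically by host name.
     */
    public static List<Map.Entry<String, Long>> rank(List<String> storyHosts) {
        Map<String, Long> counts = storyHosts.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        return counts.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Long>comparingByKey()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Placeholder hosts, one entry per story; not real results.
        List<String> hosts = List.of(
            "b.example", "a.example", "b.example", "c.example", "b.example", "a.example");
        for (Map.Entry<String, Long> e : rank(hosts)) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

With the stories sitting in MySQL, the same tally could just as well be done in the database with a grouped count; doing it in Java here only keeps the example self-contained.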
By contrast, townhall.com, whose "about" page proudly announces that they were founded as a "conservative web community," accounts for 123 stories. Yes, you read that right: for the topics I picked, townhall is treated as "news" more often than either the New York Times or the Washington Post. So, bottom line, I get to pick on right wing news sources more than left wing news sources, simply because left wing news isn't "news."
Almost time for the Daily Show now, so I've managed to procrastinate this long. Go me!
If anyone would like to make further contributions, feel free to suggest other story topics that are in the news. Anna Nicole Smith and Harry Potter aren't actually generating very many headlines these days, so I need more unserious topics that the media uses as padding. Suggestions? And if you have more right-wing, left-wing, or "mainstream" news sources that I should be looking at, let me know. I'll check my database and see if there are enough stories represented to get something useful out of them.