Do the numbers, Updated

Via Instapundit, some plotting of data from the notoriously inaccurate exit polls. That aside, the data was summarized into graphs that seem hardly ideal for the message they’re trying to get across.

Pretty hard to read, isn’t it? It’s trying to show how much the youth support for the Republican Party dropped in this election year. But the funny construction makes it hard to read, and the axis starting at 30 makes the drop seem much more drastic than it is. The latter charge may not be strictly in violation of any rules but it certainly makes the graph initially deceptive. The data, rephrased in Google Docs (since OO.o 3’s graphs are so jaggy):

Youth Vote vs Everyone Else

Man, look at that blue line dive! Now, I’m officially picking statistical nits with a statistics professor, but I really don’t get why he made his graphs this way. I mean, aren’t line graphs supposed to show change over time, where possible, not change over age?
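
For a rough sense of how much the starting point of the y-axis matters, here’s a quick Python sketch. The numbers are made up for illustration (they are not the exit-poll figures), and `apparent_drop` is just a helper name picked for this example: it computes what fraction of the plot’s height a given drop covers for a given y-axis range.

```python
def apparent_drop(before, after, axis_min, axis_max):
    """Fraction of the plot's height covered by a drop from `before` to `after`."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return (before - after) / (axis_max - axis_min)


# Hypothetical values, NOT the exit-poll data: a share falling from 45% to 35%.
before, after = 45.0, 35.0

# Same drop, drawn once on an axis starting at 30 and once on an axis starting at 0.
truncated = apparent_drop(before, after, axis_min=30.0, axis_max=60.0)
full = apparent_drop(before, after, axis_min=0.0, axis_max=60.0)

print(f"axis from 30: drop spans {truncated:.0%} of the plot height")
print(f"axis from 0:  drop spans {full:.0%} of the plot height")
print(f"visual exaggeration: {truncated / full:.1f}x")
```

With these assumed values, the same ten-point drop takes up twice as much vertical space when the axis starts at 30. Whether that counts as deceptive or just as zooming in on the interesting range is a judgment call.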

Update: In comments, Hazel asks if a bar graph wouldn’t be more appropriate. Well, let’s see:

Youth Vote vs Everybody Else, bar graph remix

Bar graphs waste a lot of ink expressing the same information, and I’m not a general fan of them, but they do have a purpose. Where the lines in the above graph were useful for comparing the trends of one group across the years, this bar graph can show the difference between groups in each of the years. It’s a subtle, borderline-pedantic difference, I think.

Where the first graph pretty clearly shows the decrease among the youth voters, the second graph more accurately shows the disparity between the youth vote and the rest of the vote. Both have interesting stories to tell, and, it seems to me, both present the information more clearly than the original.

Still more: A fellow Stats prof over at Columbia points out my work:

I have to say I prefer a college freshman’s plot to yours, Andrew. Although, you did hack it together at 3am after strolling around Grant Park. And drawing the y axis from 0 is a mistake which you didn’t make, too.

Ack. My sidebar did still say I was a freshman… Anyways, to this, the original author of the plots replied:

I agree–his plot is better! Age on the x-axis is good, but time is even better. I should remake my plot with all his plot’s good features plus a few improvements (fixing the y-axis, removing the horizontal lines, and labeling the lines directly rather than with a color code). Also it would be good to use NES to take the lines back to earlier years.

P.S. He also posts a bar graph, which, as he notes, is horrible.

Heh.

  1. Hazel
    November 6, 2008 at 5:56 pm

    Correct me if I’m wrong, but shouldn’t it actually be a bar graph? I know it’s change over time, but since there are only 3 samples of time, four years apart, with no elections in-between (that is, presidential ones), it’s really not a constant change–or, at least, not constantly measured. I haven’t taken my Statistics classes yet, but I think that’s how it’s supposed to be most clear.

  2. November 7, 2008 at 12:00 am

    Hazel:

    Strictly speaking, you’re right. However, the line graph is a standard graphical idiom for time series–EVEN THOUGH THERE’S NO DATA EXCEPT AT THE VERTICES–so that’s the way we like to do it. “TRADITION!”

    We may attach ourselves to trendy topics like exit polling and environmental sciences, but deep down, statisticians are unbelievably SQUARE.

  3. Hazel
    November 7, 2008 at 8:53 am

    Well, like I said, I haven’t taken Statistics yet. Glad it’s cleared up, then (though neither Hober nor you seem to be implying that it’s “horrible” as the original graph-maker said).

  4. jonpeltier
    November 7, 2008 at 9:54 am

    Nice analysis, and no, a column chart is not necessarily better. Depends if you want to emphasize the trends or focus on the differences in values at the dates represented by the data.

  5. parviziyi
    November 7, 2008 at 5:14 pm

    Hobershort says the data is from “the notoriously inaccurate exit polls”. Yesterday in the Wall Street Journal, online.wsj.com, written by the well-connected and probably well-informed Karl Rove: “for the third election in a row the exit polls were trash. The raw numbers forecast an 18-point Obama win, news organizations who underwrote the poll arbitrarily dialed it down to a 10-point Obama edge, and the actual margin was six.” Habershorts’s charts above are using “cooked” numbers that downweight Obama-voting respondents so that Obama gets a six-point win in the poll data, which only superficially hides the fact that the underlying data is highly unreliable, even downright trash.
