Monday, September 17, 2007

A better way to show data?

We all know that measured data comes with some uncertainty. Perhaps it's measurement error; perhaps it's sampling error. We even expect to see it mentioned explicitly in political polls, but I rarely see it published in business and financial reports. There are likely many reasons for that omission, only one of which is the difficulty of presenting the uncertainty concisely and informatively.

Thomas Louis and Scott Zeger recently published Effective Communication of Standard Errors and Confidence Intervals, which proposes an approach to indicating such uncertainties. On the one hand, I like it. It makes a neat, compact display of the two, three, or five numbers that describe a statistic, and it's relatively easy to understand and to incorporate into your reports (the authors give three lines of LaTeX code you can use directly). On the other hand, as Andrew Gelman notes, graphs are still better (thanks to Andrew for the pointer).
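To give a rough idea of the compact style, here is a minimal Python sketch that builds LaTeX math strings. It assumes, from my reading of the paper, that a standard error is set as a trailing subscript on the estimate and that confidence limits are set as leading and trailing subscripts around it. The function names with_se and with_ci and the numbers are made up for illustration; this is not the authors' three lines of LaTeX.

```python
def with_se(estimate: float, se: float, digits: int = 1) -> str:
    """Return LaTeX for an estimate with its standard error as a subscript.

    Assumed layout (hypothetical helper, not the authors' macro):
    estimate_{se}
    """
    return f"${estimate:.{digits}f}_{{{se:.{digits}f}}}$"


def with_ci(estimate: float, lower: float, upper: float, digits: int = 1) -> str:
    """Return LaTeX for an estimate flanked by its confidence limits.

    Assumed layout (hypothetical helper, not the authors' macro):
    {}_{lower} estimate _{upper}
    The empty group {} gives the leading subscript something to attach to.
    """
    return (
        f"${{}}_{{{lower:.{digits}f}}}"
        f"{estimate:.{digits}f}"
        f"_{{{upper:.{digits}f}}}$"
    )


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any report.
    print(with_se(24.3, 2.1))
    print(with_ci(24.3, 20.2, 28.4))
```

Pasting the returned strings into a LaTeX document should typeset the estimate at full size with the uncertainty in small type beside it, which is what lets the numbers sit inside a sentence.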

What's a person to do? I have some suggestions:

  • If you've got data in tables, strongly consider replacing the table with a graphic that conveys your information clearly and effectively, however easy the table might seem. In addition to the ideas in the paper Andrew references, consider boxplots among the potential candidates.

    If you don't have time to create useful graphics, consider whether your audience has the time to make sense of your tables. There are usually more of them than there are of you; taking 10 extra minutes to save 20 other people 2 minutes each (40 minutes in total) sounds like a good trade-off (plug in your own numbers).

    Of course, if you're doing a balance sheet or income statement, you probably need the numbers at least once, although graphics may still help to convey your message.

  • If you've got isolated bits of data that you're using in flowing text, consider using Louis and Zeger's approach.

    It's easy to do in LaTeX and pretty easy in's Write. It seems a bit harder in Word (I have Word 2000) because I don't see a way to have subscripts on subscripts, but you can select a smaller font on some numbers.

    Unless and until this becomes a standard idiom, you'll probably need an explanatory note somewhere in your report.

  • Consider sparklines as a way to convey graphical data (time series graphs or histograms) in flowing text. Several tools can generate sparklines for a variety of document formats; if you'd rather, you can generate them online.
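If you just want to see what a sparkline feels like in running text, here is a minimal Python sketch that fakes one with Unicode block characters. It is a crude, text-only approximation under my own assumptions (eight height levels, linear scaling between the smallest and largest value), not one of the sparkline tools mentioned above; the name sparkline and the sample values are hypothetical.

```python
# Eight block characters of increasing height (Unicode "Block Elements").
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map a sequence of numbers to a string of block characters.

    Values are scaled linearly so the minimum gets the shortest bar
    and the maximum gets the tallest one.
    """
    values = list(values)
    if not values:
        return ""

    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no shape to show; use the shortest bar throughout.
        return BARS[0] * len(values)

    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


if __name__ == "__main__":
    # Made-up monthly figures, for illustration only.
    monthly = [12, 15, 14, 18, 25, 31, 28, 22, 19, 16, 13, 12]
    print(f"Sales this year {sparkline(monthly)} peaked in June.")
```

The result is one character per data point, so it drops into a sentence without disturbing the line spacing. It shows only the shape of the series, so any number that matters still needs to be stated in the text.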

Perhaps the real answer is that we now have yet another way to portray data, from which we can pick and choose to fit our current needs.
