Showing posts with label rants. Show all posts

Sunday, January 27, 2008

Books that make you dumb? I don't think so.

Time for another statistics lesson. I'm not the world's greatest statistics whiz (not like the super-geek on TV's "NUMB3RS" show), but that's part of my point: you don't have to be a super-geek to detect major mistakes in statistics that come your way.

This isn't a rant - it's just an interesting example of what to be careful about, with a little entertainment along the way.

I just got an email with something that, on the face of it, is fascinating:
Some one matched up the most popular books in Facebook college groups with average SAT scores at colleges to see what people commonly read at different intelligence levels. http://booksthatmakeyoudumb.virgil.gr/
Nice graphics, and a decent explanation about his method. On the face of it, pretty interesting.

But there's this thing about statistics: you've got to be careful about (at least) three things:
  1. When you see a pattern, are you really seeing a pattern you can count on, or is it just a momentary coincidence? (If the first two people to walk into your office are men, does that mean only men will walk in today?)

  2. Even when you do see a pretty reliable pattern, can you be reasonably sure it means what you think it means? (A relationship between the behaviors of two variables is called a correlation, but that doesn't mean you can say one caused the other. A famous example: for some years there was a correlation between wolverine population and the number of sunspots. Did either cause the other? Not likely, and besides, who could tell? The lesson: Similar behavior of two figures could just be a coincidence.)

  3. Finally, you've got to be really careful about whom you actually measured. (If you interview people who are hanging out in skid row bars at 2 a.m., you may reach some interesting conclusions about the opinions of people in skid row bars at 2 a.m., but you can't say they're conclusions about people in general.)
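
To put a rough number on point 1, here's a minimal sketch. It assumes, purely for illustration, that each visitor is equally likely to be a man or a woman and that visitors arrive independently of each other. The function name `chance_of_streak` is mine, not anything standard.

```python
def chance_of_streak(n, p=0.5):
    """Probability that the first n visitors are all men.

    Assumes each visitor is a man with probability p, independently
    of the others. Both assumptions are illustrative, not measured.
    """
    return p ** n


two = chance_of_streak(2)    # 0.25
ten = chance_of_streak(10)   # 0.0009765625

print(f"first 2 visitors all men:  {two:.0%}")
print(f"first 10 visitors all men: {ten:.2%}")
```

Under those assumptions, two men in a row happens a quarter of the time by chance alone, so it tells you almost nothing. Ten in a row would be worth a second look.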
Returning to the email: this guy saw patterns in which books were favorites at colleges with different average SAT scores. Addressing #1, he correctly didn't count colleges with very little data. But he blew it on #2, when he titled the page "books that make you dumb," revealing a pretty massive fixation on one aspect of the whole picture, and flying in the face of his assurance that "I know correlation doesn't equal causation."

And besides, on #3, he doesn't even mention the gross sampling error of making an assertion about the book, based on data from Facebook readers who read it AND who participate in listing their favorites. Example 1: the Jesuit scholars at Boston College are highly intellectual, and I imagine that if they ranked their favorite books, the Holy Bible would rank high; but I doubt the Jesuits are ranking books on Facebook, and the Bible ranks among the lowest on this guy's charts.

Example 2: if some book actually made many people so brilliant they ditched Facebook, those people would disappear from this ranking entirely, and all that would remain would be the people who completely didn't get it. And, that book would show up as "making people dumb."
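
Example 2 can be sketched with made-up numbers. Everything here is hypothetical: the scores, the size of the `boost`, and the `leave_above` cutoff past which a reader quits Facebook. It isn't the site's method, just a toy model of how dropping people from a sample can flip the picture.

```python
from statistics import mean


def true_and_observed(scores, boost, leave_above):
    """Return (true average, observed average) after reading the book.

    Toy model: reading adds `boost` points to every reader's score,
    and anyone whose new score exceeds `leave_above` quits the site
    and vanishes from the ranking.
    """
    after = [s + boost for s in scores]
    still_listed = [s for s in after if s <= leave_above]
    return mean(after), mean(still_listed)


scores = [900, 1000, 1100, 1200, 1300, 1400]   # made-up readers
before_avg = mean(scores)                       # 1150
true_avg, seen_avg = true_and_observed(scores, boost=100, leave_above=1200)

print("before reading:         ", before_avg)
print("after reading (everyone):", true_avg)    # 1250
print("after reading (on site): ", seen_avg)    # 1100
```

In this toy model every single reader got smarter, yet the average you can actually observe on the site went down, because the biggest gainers are no longer there to be counted.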

Besides, there's the whole issue of whether SATs are any indication of smartness, not to mention which type of smartness (Gardner's Multiple Intelligences).

He would have been better off titling it BooksThatLowAndHighSATSchoolFacebookMembersLove.

This isn't just an academic issue - these errors can lead us to drive off a cliff. When we think we see something, and we don't, then with the best of intentions we can make serious mistakes in our conclusions, our policy decisions and our life choices.

Saturday, December 8, 2007

For prettier statistics, omit inconvenient people.

Occasionally I’ll use this bully pulpit for a rant. The two top rantables on my agenda right now are statistics and silos. This time it’s statistics.

I’m irked because I keep seeing a mistake that blows the kneecaps off any well-intentioned effort to improve policy by looking at statistics. People need to be aware of it, spot it, and cry “BS!” when it rears its head.

Earlier this week I got into a discussion in the comments on a post at Paul Levy’s blog. Frequent and knowledgeable contributor Barry Carol had wondered if high health care spending around here might be caused in part by a large local supply of hospital beds and specialists. I said, in part:

I'm intrigued with Barry's observation. (I don’t have an opinion – I don’t know the data he cites; I’m just intrigued.) Is it accurate to say the *cause* is too many beds? Or is it that more are available, so it's possible to give someone the care they need? [I then recounted a story of my father’s care in his final decade, where the hospital staff only seemed to become competent when it was time to kick him out.]

If motorists were spending lots of money on fixing flats, would we say the problem is that we have so many tire repair shops? It's not a perfect analogy, but it's worth looking at. Some cultures think women are the cause of rape, because if there weren't all those women, there wouldn't be all those rapes.

I feel strongly that any statistics about costs and outcomes in a system should have an accountant's note specifying what proportion of the population goes without coverage in that system, so they don’t even have an outcome. Until we get honest about that, all we're doing is chasing a bubble under the blanket.

There’s the rub, the itchy spot. In cases like this, the goal of statistical analysis is to better understand things, particularly to know what a batch of data does or doesn’t represent so we can predict the best way to approach future situations.

And if we don't know what those statistics left out, we don't know what we'd be getting ourselves into by relying on them. We cannot rely on findings until we know what cases were and weren't included.
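
Here's a minimal sketch of what that "accountant's note" could look like in practice. The numbers are invented, and `summarize` is a hypothetical helper: the idea is only that a summary statistic should never travel without the share of cases it left out.

```python
def summarize(costs):
    """Average cost among people with a recorded outcome, plus the
    share of people who were left out.

    `costs` holds one entry per person; None means the person has no
    recorded outcome (for example, no coverage). Hypothetical format.
    """
    included = [c for c in costs if c is not None]
    excluded = len(costs) - len(included)
    return {
        "mean_cost": sum(included) / len(included),
        "excluded_share": excluded / len(costs),
    }


# Made-up data: three covered people, two with no outcome at all.
report = summarize([4000, 6000, 5000, None, None])
print(f"mean cost: ${report['mean_cost']:,.0f} "
      f"(excludes {report['excluded_share']:.0%} of the population)")
```

The mean alone looks tidy. The note in parentheses is what tells you whether to trust it.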

Increasingly, what might be getting omitted is you. Or someone you love.

As the boomers age, and their decades of productivity and home buying convert to decades of home selling and health costs (who, me?), this is gonna be a big skull-knocking issue. There will be claims about which system works better, with all kinds of statistics being flung around like monkey dung. (Sorry, but monkeys do fling dung when they’re fighting, and when policymakers start fighting, they fling statistics, claiming they're proving reality.)

For health policy, all kinds of claims can be made with good statistical support – but you damn well better ask who got left out, making the picture prettier, whether it was intentional or not.

Personal story: in Massachusetts insurers must price all group policies the same, without considering who’s in the group; New Hampshire has no such law. My wife and I started 2007 with insurance at her job in NH. Without warning, in June her (small) employer’s group rate went up 60%. Why? Because she had turned 60. Young people generally incur lower health costs, so in most states a company can choose to be competitive by selectively offering lower rates to more attractive groups. But when she turned 60, the entire company’s rates went up 60%.

I work in Mass., and it turns out we could get equivalent coverage from my employer (from the same insurer! See my next post) for 40% less.

Now here’s the killer: in NH the disenfranchised can find themselves in real trouble, as policies evolve and unattractive individuals are increasingly isolated. Next personal story: I know a healthy, athletic 20-something whose coverage was costing $2,300 per year (for one person) because she has a minor murmur that’s never caused a symptom and she wasn’t in a big group. Now she works for a big company, so she’s swallowed up into a big group and gets group rates.

What is the justification for this???

I also know two young families who simply go without coverage because there’s no room for it in their budget. Statistically they are of course counted in the 46 million uninsured – but I say they should also be factored somehow into the total cost of health care, including what it WOULD cost to provide the care they don’t get but would if they could. (Which brings us back to Barry's point about how many hospital beds we have.)

Worse, while excluding those cases, you can bet that the insurance companies (all of them) talk about how good their rates are, and they mean it. (I would - I'm in marketing, and when I believe my company is doing a super job, you bet I say so.) But again, I say you can’t talk about costs and outcomes without specifying whom you’ve excluded.

Final first-hand story: some years ago, when self-employed in NH, I myself found that I couldn’t afford health insurance, because at the time things had evolved to where almost all the AIDS patients in the state were in the category “not a member of any group” – same as me. So any statistics about insurance prices in that state at that time would have been a fat load of crap – flingable crap.

Overlooking the inconvenient people isn’t limited to health care costs. Consider the following, from the US Dept of Labor’s Bureau of Labor Statistics (BLS):

  • Unemployment statistics don’t include everyone who wants a job but can’t find one. Once your unemployment benefits run out, they simply stop counting you. You don’t even exist as a problem anymore, as far as the BLS is concerned. I cannot figure out a legitimate reason for this.

  • There are no statistics for people who eventually gave up on their previous career and are now working for half their previous pay. People in that situation are, again, simply not counted as a concern.

  • Nor are there statistics for the loss of benefits. Employers certainly pay less for no-benefit or feeble-benefit jobs, but if you or I change to a job with no benefits, it doesn’t even make a dent in the pretty statistics.

  • Worst of all, the “jobs created” statistics are a cruel joke. When a full-time job with benefits is carved up into three part-time jobs with no benefits, the BLS counts it as job growth. (I called my Senator’s office and had them check it out; a senior BLS statistician got back to me and confirmed it.)

This is insane. It's as if King Solomon chopped up 1,000 babies and declared a population explosion.
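
The arithmetic behind that last bullet is easy to sketch. The hours below are assumptions of mine (a 40-hour job split into three 13-hour jobs), and the `Job` and `tally` names are hypothetical; this is not how the BLS computes anything, only a picture of what a bare job count can hide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    hours_per_week: int
    has_benefits: bool


def tally(jobs):
    """Count jobs three ways: headcount, total hours, jobs with benefits."""
    return {
        "job_count": len(jobs),
        "total_hours": sum(j.hours_per_week for j in jobs),
        "jobs_with_benefits": sum(j.has_benefits for j in jobs),
    }


before = tally([Job(40, True)])        # one full-time job with benefits
after = tally([Job(13, False)] * 3)    # carved into three part-time jobs

jobs_created = after["job_count"] - before["job_count"]       # 2
hours_change = after["total_hours"] - before["total_hours"]   # -1

print("jobs 'created':", jobs_created)
print("change in weekly hours:", hours_change)
print("jobs with benefits:", before["jobs_with_benefits"],
      "->", after["jobs_with_benefits"])
```

Counted by headcount, that's two new jobs. Counted by hours or by benefits, nothing grew and something was lost.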

What is wrong with these people?? In May of 2006 an erudite observer in the New York Times remarked with surprise about the 200,000 “new jobs” that had been created in April: “employment [is] doing well, yet core inflation has remained remarkably subdued.” Remarkable indeed, until you know what they're calling “job creation.”

As I say, until we get honest about this, all we’re doing is chasing a bubble around under the blanket. With the best intentions, we'll make misguided policy decisions. And believe you me, policy has impact at the personal level. The time will come when you (or a loved one) are the bubble everyone wants to chase away. Do whatever you can to stop this crap. Now. Wake up! And wake others up.