## Saturday, February 19, 2011

### Regression to the Mean

My wife and I bought a Wii this past Christmas. We only have the few games it came with, and the Wii Fit. For those of you unfamiliar with it, it's a set of games and exercise routines - most of which take place on a balance board, which measures your center of balance, weight, etc. It also includes a "Wii Fit Age," which, by some calculus of body mass index combined with your reaction time, accuracy, stability, etc., indicates your overall body/mind age.

Doing well on the tests (by keeping still and balanced on one foot, or moving the little ball with your center of balance to a specific point, or whatever) results in a lower age score. I have no idea how these things are calculated - although you can generally figure out how to do well on each of the tests through feedback during the test and iteratively over the course of several tests. It's important to realize that these tests are not found anywhere else in the Wii Fit programs. The only time you see them is when calculating your Wii Fit Age. So you can't go and practice all the tests outside of class.

The graph below shows the results for my "Wii Age" over the past two months. The vertical axis represents the difference between the calculated age and my real age. The horizontal axis represents each test from the first to the most recent (roughly comparable to time).

There are a few things you can see in the above data. First, the result of any one test is quite different from another - especially at the beginning. The variability in the results decreases as time goes on. The other observation to make is that the maximum calculated age decreases, but the minimum age doesn't change much. Also, most of the scores are a little less than my calendar age - I don't know how the programmers decided what a person of my age should do on these tests, but whatever that is, I'm doing a little better. (Go me, eh?)

This is a good, everyday example of what's referred to as "Regression to the Mean." That is, there's some average, or mean, score that represents what I will most likely get, and an unusually good or bad result is likely to be followed by one closer to that mean. Each trial uses a different set of tests, so there is a chance that any particular trial will have tasks that I am good at or ones that, for whatever reason, I have trouble with (tests do reappear, but which tests show up in a given trial appears to be more or less random).
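A quick simulation makes the idea concrete. This is only a sketch under assumed numbers: the "true" mean of -2 years, the scatter of 5 years, and the names `TRUE_MEAN`, `NOISE_SD`, and `simulate_scores` are all made up for illustration and have nothing to do with how the Wii actually computes anything. Every score is drawn independently around the same mean, so there is no learning and no trend, yet the score that follows an unusually bad one still looks like an improvement.

```python
import random
from statistics import mean

# Assumed values for illustration only; not taken from my actual results.
TRUE_MEAN = -2.0  # "true" average score: calculated age minus real age
NOISE_SD = 5.0    # trial-to-trial scatter from which tests happen to come up


def simulate_scores(n_tests, seed=0):
    """Draw independent scores scattered around a fixed true mean."""
    rng = random.Random(seed)
    return [rng.gauss(TRUE_MEAN, NOISE_SD) for _ in range(n_tests)]


scores = simulate_scores(10_000)
overall = mean(scores)

# Pair each unusually bad score (age well above average) with the next score.
pairs = [(a, b) for a, b in zip(scores, scores[1:]) if a > overall + NOISE_SD]
bad_avg = mean(a for a, _ in pairs)
next_avg = mean(b for _, b in pairs)

print(f"overall average score:        {overall:6.2f}")
print(f"average of the bad scores:    {bad_avg:6.2f}")
print(f"average of the scores after:  {next_avg:6.2f}")
```

The scores that follow the bad ones land back near the overall average, even though nothing about my (simulated) ability changed between trials.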

Early on, I would find that some tests were very easy and I scored very well (low age), while other tests were very hard to figure out and I scored very poorly (high age). But over time, the poor performances improved because of practice (practice also comes from doing the other games and exercises, to be sure). As a result of improving my "worst" performances, the "mean age" also improves - but with such a small sample size it's hard to say if you can estimate my "true" mean age from these results. There are several variables at work here: the difficulty of any individual test, my level of proficiency with any particular aspect of each test, and the fact that I have been learning the skills required to "pass" each test (balance, timing, etc.). I would be wrong to say that any one point reflects my true abilities. A poor result does not imply I am getting worse. A really good result does not imply that I am dramatically improving.

To create a hypothetical analogy to life, suppose I was told that eating saltine crackers right before the test could boost my performance. After a bad test, I may want to eat crackers before the next one. And upon doing better, I would feel that the crackers helped. After a good score, I wouldn't have need for crackers - but doing poorly again might convince me to eat crackers again before the next test. And so on. Replace "Wii Fit" with illness (your choice) and "crackers" with remedy (again, your choice) and you can begin to see the difficulty with ascertaining the efficacy of a medical treatment. Sick people have good times and bad. If they only take a remedy when feeling bad, they might start believing that the treatment works, when in actuality regression toward the mean might be the only effect at work.
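The cracker story can be played out in a few lines. Again, this is a sketch with invented numbers: `cracker_experiment`, the mean of 0, and the scatter of 5 years are assumptions for illustration. The crackers in this simulation do nothing at all; the only rule is that I eat them after a worse-than-average score.

```python
import random
from statistics import mean


def cracker_experiment(n_tests=10_000, true_mean=0.0, noise_sd=5.0, seed=1):
    """Eat crackers only after a bad score; the crackers have no effect.

    Returns the average change in score (next minus previous) for trials
    taken with crackers and for trials taken without them. A negative
    change means a lower age, i.e. an apparent improvement.
    """
    rng = random.Random(seed)
    scores = [rng.gauss(true_mean, noise_sd) for _ in range(n_tests)]

    with_crackers, without_crackers = [], []
    for prev, nxt in zip(scores, scores[1:]):
        ate_crackers = prev > true_mean  # previous score was worse than average
        change = nxt - prev
        (with_crackers if ate_crackers else without_crackers).append(change)

    return mean(with_crackers), mean(without_crackers)


after_crackers, after_nothing = cracker_experiment()
print(f"average change after eating crackers: {after_crackers:+.2f}")
print(f"average change after no crackers:     {after_nothing:+.2f}")
```

On average the score drops (improves) after crackers and rises (worsens) without them, which is exactly the pattern that would convince me the crackers work, even though they were never part of the calculation.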

Apply this to sports. A terrific rookie season will be hard to follow up. The game changes, the rules change, and the player may not have as strong a sophomore year (a "Sophomore Slump," as it were). This is probably the reason behind the so-called "Sports Illustrated Jinx." An athlete usually gets on the cover as a result of an outstanding performance. Having reached that peak, there is little room for improvement - regression toward the athlete's mean ability is likely.
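The same thing shows up when you pick the best performers out of a crowd. The sketch below assumes a toy model that is purely my invention for illustration: each player's season is a fixed "talent" plus independent "luck," both drawn from a normal distribution, and `rookie_seasons` is a hypothetical helper. The top 20 players from season one are the ones who would make the cover.

```python
import random
from statistics import mean


def rookie_seasons(n_players=1000, talent_sd=1.0, luck_sd=1.0, seed=2):
    """Return (season1, season2) performance for each simulated player.

    Talent is the same in both seasons; luck is redrawn each season.
    """
    rng = random.Random(seed)
    players = []
    for _ in range(n_players):
        talent = rng.gauss(0.0, talent_sd)
        season1 = talent + rng.gauss(0.0, luck_sd)
        season2 = talent + rng.gauss(0.0, luck_sd)
        players.append((season1, season2))
    return players


players = rookie_seasons()

# The "cover athletes": the 20 best performers in season one.
cover = sorted(players, key=lambda p: p[0], reverse=True)[:20]
avg_first = mean(p[0] for p in cover)
avg_second = mean(p[1] for p in cover)

print(f"cover athletes, season one: {avg_first:.2f}")
print(f"cover athletes, season two: {avg_second:.2f}")
print("league average:             0.00 (by construction)")
```

In this toy model the cover athletes are still better than the league average in their second season, because they really are talented, but they fall short of their own first season, because part of what put them on the cover was luck that doesn't repeat.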

Politics, too. Any political "mandate" cited by an elected official may be completely wrong. These mandates may be nothing more than swings from one side of the political spectrum to the other. There is no mandate - just people wanting something different from what was. With a two-party, either-or system it's not hard to get that result. The true political zeitgeist may lie somewhere closer to the center than the politician does. The politician may, after facing opposition, cite some "silent majority" that supports the extreme positions and created said mandate. But this is just imagineering.