Can a Horse Do Math? The Story Of Clever Hans and a Statistics Problem He Inspired
In 1904, a panel convened by the German Board of Education decided that a horse was capable of doing math. Read that again if you need to. I wouldn’t blame you if you thought I was making this up.
The horse in question was known as Clever Hans, and he was owned by Wilhelm Von Osten, a math teacher, horse trainer and, for good measure, a phrenologist and part-time mystic. Von Osten would ask the horse a question that involved some sort of numerical answer like ‘If Monday is the 3rd of the month, what date is Saturday?’ and Clever Hans would tap his hoof the appropriate number of times (in this case eight). The horse was taken on tour across Germany for all to see his mathematical talents, and was even featured in a report in the New York Times.
The investigation into Clever Hans
The attention that was generated by Clever Hans led the German Board of Education to commission a research panel tasked with looking into whether Von Osten’s claims about his trusty steed were genuine. The panel of thirteen people — led by a renowned psychologist and including a circus manager and several schoolteachers — concluded that the horse was indeed understanding and correctly answering his owner’s questions.
Luckily, the Board of Education decided to pressure-test the panel’s conclusions by passing them on to biologist and psychologist Oskar Pfungst. Pfungst, who apparently had a far better mind for research methods than anyone on the panel, immediately put some (fairly obvious) testing methods in place. He separated Hans and his owner from spectators, used other people to ask the questions, blocked the horse from seeing the questioner, and used questions that the questioner did not know the answer to. You can probably see where this is going.
Pfungst demonstrated through repeated trials that the horse’s likelihood of answering correctly depended heavily on whether he could see the questioner and whether the questioner already knew the answer. When both conditions held, Hans had a success rate of 89%, versus 6% otherwise. Pfungst’s conclusion was that Hans was not doing math. Instead, like any well-trained animal, Hans was very good at picking up subtle cues from the body language of the humans he observed.
The Clever Hans Effect is a term used today for many analogous phenomena where an answer is correct, but not for the reasons we might hope. In Artificial Intelligence, for instance, it describes cases where an algorithm makes an accurate prediction for reasons unrelated to what it was designed to learn (for example, when an image detection algorithm correctly describes an image, but only because text on the image gave the answer away).
A statistics problem inspired by Clever Hans
Cambridge University set a problem in their 2002 mathematics entrance examination that was inspired by Clever Hans. The problem went as follows:
Harry the Calculating Horse will do any mathematical problem I set him, providing the answer is 1, 2, 3 or 4. When I set him a problem, he places a hoof on a large grid consisting of unit squares and his answer is the number of squares partly covered by his hoof. Harry has circular hoofs, of radius 1/4 unit.
After many years of collaboration, I suspect that Harry no longer bothers to do the calculations, instead merely placing his hoof on the grid completely at random. I often ask him to divide 4 by 4, but only about 1/4 of his answers are right; I often ask him to add 2 and 2, but disappointingly only about π/16 of his answers are right. Is this consistent with my suspicions?
I decide to investigate further by setting Harry many problems, the answers to which are 1, 2, 3, or 4 with equal frequency. If Harry is placing his hoof at random, find the expected value of his answers. The average of Harry’s answers turns out to be 2. Should I get a new horse?
Determining if we should be suspicious of Harry
The following simple diagram is key to how we can answer this question. It represents one of the unit squares of the very large grid. Note that we assume here that the grid is so large that we don’t need to worry about the extreme edges of it.
The different shaded regions of this diagram explain the possibilities for where Harry’s hoof might land:
If the center of Harry’s hoof lands anywhere in the central square of this diagram, which has a side length of 1/2 and hence a total area of 1/4, then his whole hoof will land within the square.
If the center of Harry’s hoof lands in one of the quarter circles in each corner, his hoof will partially cover four squares. These add up to a circle of radius 1/4, and hence of area π/16.
If the center of Harry’s hoof lands in one of the four shaded rectangles, his hoof will partially cover two squares. Each rectangle measures 1/2 by 1/4, so the total area covered by these rectangles is 4 × 1/8 = 1/2.
In all other situations, Harry’s hoof will partially cover three squares. This is represented by the white areas in the diagram, which have a total area of 1 − 1/4 − π/16 − 1/2 = 1/4 − π/16.
We can regard the unit square above as a total probability space, and therefore each area we have calculated represents the probability that Harry would randomly select a given answer to a question.
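The four areas above can be sanity-checked with a quick simulation. The sketch below is not part of the original exam solution: it assumes the hoof center is uniformly distributed over one unit square, and the function names (`squares_covered`, `simulate`) are made up for illustration. It drops a disc of radius 1/4 at random, counts how many grid squares the disc overlaps, and compares the observed frequencies with the areas worked out from the diagram.

```python
import math
import random

RADIUS = 0.25  # hoof radius, from the problem statement


def squares_covered(x, y, r=RADIUS):
    """Count the unit grid squares overlapped by a disc of radius r
    centered at (x, y), where (x, y) lies in the square [0, 1) x [0, 1)."""
    count = 0
    # Only the home square and its eight neighbours can be reached when r < 1.
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            # Closest point of the square [i, i+1] x [j, j+1] to the center.
            nearest_x = min(max(x, i), i + 1)
            nearest_y = min(max(y, j), j + 1)
            if (x - nearest_x) ** 2 + (y - nearest_y) ** 2 < r * r:
                count += 1
    return count


def simulate(trials=100_000, seed=0):
    """Estimate the probability of each answer under random hoof placement."""
    rng = random.Random(seed)
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for _ in range(trials):
        counts[squares_covered(rng.random(), rng.random())] += 1
    return {answer: n / trials for answer, n in counts.items()}


# Areas read off the diagram: central square, rectangles, white regions, quarter circles.
exact = {1: 1 / 4, 2: 1 / 2, 3: 1 / 4 - math.pi / 16, 4: math.pi / 16}

if __name__ == "__main__":
    estimate = simulate()
    for answer in (1, 2, 3, 4):
        print(answer, round(estimate[answer], 4), round(exact[answer], 4))
```

With enough trials, the simulated frequencies should settle close to the areas from the diagram, which is a useful check that no region has been miscounted.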
Now the question states that, when asked to divide 4 by 4, Harry put his hoof in one square 1/4 of the time, which is consistent with a random outcome as calculated above. Also, when asked to add 2 and 2, Harry put his hoof on four squares π/16 of the time, again consistent with a random outcome. So yes, we are right to be suspicious about Harry.
Should we send Harry out to pasture?
Now we can do a proper test to decide whether to retire Harry. If we ask a set of questions that are equally likely to have 1, 2, 3, or 4 as the answer, we would expect a mean answer of 1/4 + 2/4 + 3/4 + 4/4 = 5/2 if Harry were answering correctly.
Now, if Harry were answering randomly, we would weight each of the answers 1, 2, 3 and 4 by the probabilities above, which gives an expected answer of 1×(1/4) + 2×(1/2) + 3×(1 − 1/4 − π/16 − 1/2) + 4×(π/16) = 2 + π/16, or roughly 2.2. So the fact that Harry’s mean answer is 2 indicates that he is even further from the 5/2 we would expect of an accurate horse than he would be if he randomly stomped his hoof.
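For anyone who wants to check the arithmetic, here is a minimal sketch of the comparison in Python. The variable names are made up for illustration, and the probabilities are simply the areas derived from the diagram, restated so the snippet runs on its own.

```python
import math

# Probability of each answer if the hoof lands uniformly at random
# (the areas from the unit-square diagram).
p_random = {
    1: 1 / 4,                  # central square
    2: 1 / 2,                  # four edge rectangles
    3: 1 / 4 - math.pi / 16,   # white corner regions
    4: math.pi / 16,           # four quarter circles
}

# A horse that answers correctly, with answers 1 to 4 equally frequent.
expected_correct = sum(answer * 0.25 for answer in (1, 2, 3, 4))

# A horse that places his hoof at random.
expected_random = sum(answer * p for answer, p in p_random.items())

observed = 2.0  # Harry's actual average, from the problem statement

print("correct horse:", expected_correct)
print("random horse: ", expected_random)
print("gap, random vs correct: ", abs(expected_random - expected_correct))
print("gap, Harry vs correct:  ", abs(observed - expected_correct))
```

The last two lines make the verdict concrete: Harry's observed gap from 5/2 is larger than the gap a purely random horse would show.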
Poor Harry! Time for him to enjoy his retirement!
What did you think of the story of Clever Hans and the math problem he inspired? Feel free to comment!