The Philosophical Muser: 09/09/18

Sunday, 9 September 2018

Simpson's Paradox: When Things Seem Like Unfair Bias But Are Actually Not.

I remember once reading about a supposed bias at the University of California, Berkeley, which faced allegations of bias against women who had applied for admission to its graduate school. The admission figures showed that men applying were more likely than women to be admitted, and the difference was thought to be large enough to infer unfair discrimination:

         Applicants   Admitted
Men      8442         44%
Women    4321         35%

But the strange part was that when the individual departments were examined, it transpired that no department was significantly biased against women; in fact, most departments had a small but statistically significant bias in favour of women. Here is the data from the six largest departments:

Department   Men                       Women
             Applicants   Admitted     Applicants   Admitted
A            825          62%          108          82%
B            560          63%          25           68%
C            325          37%          593          34%
D            417          33%          375          35%
E            191          28%          393          24%
F            373          6%           341          7%

Given the foregoing statistics, how can it be the case that women tended to do better than men in individual departments but worse overall? What was discovered was that women tended to apply to competitive departments with low rates of admission even among qualified applicants, whereas men tended to apply to less-competitive departments with high rates of admission among the qualified applicants. In the table above, departments A and B, which admitted more than 60% of applicants, received 1,385 applications from men but only 133 from women, while the four harder departments C to F received 1,702 from women and 1,306 from men. This can skew the overall picture to look like discrimination when, in fact, it is nothing of the sort.
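For anyone who wants to check the arithmetic, here is a rough Python sketch that pools the six departments in the table above. The names (`DEPARTMENTS`, `pooled_rate` and so on) are my own, and because the admission rates in the table are rounded percentages, the pooled figures are only approximate.

```python
# Six largest departments, copied from the table above:
# department -> ((men applicants, men admit rate), (women applicants, women admit rate))
DEPARTMENTS = {
    "A": ((825, 0.62), (108, 0.82)),
    "B": ((560, 0.63), (25, 0.68)),
    "C": ((325, 0.37), (593, 0.34)),
    "D": ((417, 0.33), (375, 0.35)),
    "E": ((191, 0.28), (393, 0.24)),
    "F": ((373, 0.06), (341, 0.07)),
}


def pooled_rate(groups):
    """Applicant-weighted admission rate across departments."""
    applicants = sum(n for n, _ in groups)
    admitted = sum(n * rate for n, rate in groups)  # approximate, rates are rounded
    return admitted / applicants


men = [m for m, _ in DEPARTMENTS.values()]
women = [w for _, w in DEPARTMENTS.values()]

men_pooled = pooled_rate(men)
women_pooled = pooled_rate(women)

# Departments in which the women's admission rate beats the men's
women_ahead = [d for d, (m, w) in DEPARTMENTS.items() if w[1] > m[1]]

print(f"Men pooled:   {men_pooled:.1%}")
print(f"Women pooled: {women_pooled:.1%}")
print("Departments where women's rate is higher:", women_ahead)
```

Women have the higher rate in four of the six departments, yet the pooled rate for men comes out well ahead, simply because so many more of the men's applications went to the two departments that admit most applicants.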

This is what is referred to in statistics as Simpson’s Paradox (which isn’t really a paradox, as I’ll show), after the statistician Edward H. Simpson. It comes down to the misleading impressions that percentages can give when the numbers behind them are left out. Suppose Jack and Jill are each applying to colleges over a two-week period. In the first week Jill gets accepted into 0 of 3 colleges and Jack gets accepted into 1 of 7. In the second week Jill gets accepted into 5 of 7 colleges and Jack gets accepted into 3 of 3. Here are their results:

        Week 1   Week 2   Total
Jill    0/3      5/7      5/10
Jack    1/7      3/3      4/10

Both times Jack achieved a higher percentage of college acceptances than Jill, but the number of colleges each applied to was not the same each week. Over the same total of ten applications, Jill was accepted more often (5 against 4) and, therefore, her overall percentage is higher. It only appears like a paradox when the percentages are provided in isolation from the sample sizes behind them: Jack's perfect week covered only 3 applications, whereas Jill's good week covered 7. Based only on percentages, Jack’s is higher than Jill’s in both weeks (14.3% and 100% compared with Jill’s 0% and 71.4%) even though over the two weeks Jill’s proportion of college successes is higher (50% against 40%). The fact that Jack can be better in each week but worse over the two weeks illustrates a principle that underlies many of the bogus claims of unfair discrimination we see, especially when important causal relations are omitted.
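The Jack and Jill figures can be checked in a few lines of Python. This is only an illustrative sketch, with names of my own choosing (`RESULTS`, `weekly_rates`, `overall_rate`); it uses exact fractions so that no rounding creeps in.

```python
from fractions import Fraction

# (accepted, applied) for week 1 and week 2, copied from the table above
RESULTS = {
    "Jill": [(0, 3), (5, 7)],
    "Jack": [(1, 7), (3, 3)],
}


def weekly_rates(weeks):
    """Acceptance rate for each week taken on its own."""
    return [Fraction(accepted, applied) for accepted, applied in weeks]


def overall_rate(weeks):
    """Pool the raw counts; averaging weekly percentages would ignore sample sizes."""
    accepted = sum(a for a, _ in weeks)
    applied = sum(n for _, n in weeks)
    return Fraction(accepted, applied)


jill_weekly = weekly_rates(RESULTS["Jill"])
jack_weekly = weekly_rates(RESULTS["Jack"])

# Jack is ahead in every individual week...
jack_wins_each_week = all(jk > jl for jk, jl in zip(jack_weekly, jill_weekly))

# ...but Jill is ahead once the two weeks are pooled.
jill_overall = overall_rate(RESULTS["Jill"])
jack_overall = overall_rate(RESULTS["Jack"])

print("Jack ahead in each week:", jack_wins_each_week)
print("Jill overall:", float(jill_overall), "Jack overall:", float(jack_overall))
```

The point of pooling counts rather than averaging the weekly percentages is that each week's percentage rests on a different number of applications, which is exactly the information the percentages hide.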