Schoenfield: Stats point to admissions bias

The ratio of women to men applying to Yale has been rising steadily over the past six years. However, the number of accepted males has consistently remained greater than the number of accepted females, hovering at about 51 percent to 49 percent. Yale’s dean of admissions, Jeff Brenzel, says that despite the trend towards more female applicants (55 percent female for the class of 2012), the most qualified male and female candidates must be applying in equal numbers, since Yale’s admission process is gender-blind.

For years now, I have been impressed at how close to 50-50 the makeup of the incoming class always was. But recently I began to wonder if this was too good to be true. So I did some statistical analysis of whether the observed numbers could have come from a truly gender-neutral admissions process, that is, one in which neither the gender of previous applicants nor any factor heavily correlated with gender affects the admission of any given applicant. I have ignored, for the moment, the question of systematic biases against women, focusing solely on the expected standard deviations of the admit percentages — that is, how spread out we would expect the ratio of men and women admitted to be over several years. Allow me an analogy:

Suppose a professor asked students to measure the acceleration of gravity (accepted to be 9.81 meters per second per second) several times by using blocks, rubber bands and a handheld stopwatch. With such a crude setup, the standard deviation would likely be about 0.5 meters per second per second. Even if the experiment were done correctly, one could expect results like 9.83, 10.34, 9.37 and 9.65. Then imagine that one of the students comes back with results of 9.81, 9.80, 9.82 and 9.81. It would be clear that the student’s numbers are much too precise for the experiment he was supposed to do. He must be doctoring his final product.

This is precisely Yale’s situation. If we assume that there is a “qualified” applicant pool made of 51 percent men and 49 percent women every year, by sheer chance we would expect a standard deviation in the ratio of men to women of about 1.1 percent. However, for the entering classes of 2007-2012, Yale’s average standard deviation has been 0.44 percent (roughly 39 students either way). The chance of getting a standard deviation this low in a gender-neutral process is less than 2 percent.
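For readers who want to check the arithmetic, here is a minimal sketch in Python. It uses the formula sqrt(p(1-p)/n) and the chi-squared comparison that the author describes in the comments below. The class size n = 2000 is an assumption (the column does not state it), and the helper name chi2_cdf is made up for this illustration. It also assumes the observed 0.44 percent is a sample standard deviation over six years, which the column does not specify.

```python
import math

def chi2_cdf(x, df, terms=200):
    """Chi-squared CDF, computed from the power series for the
    regularized lower incomplete gamma function (fine for small x)."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 0.0
    term = total = 1.0
    for k in range(1, terms):
        term *= z / (a + k)
        total += term
    return math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total

p = 0.51              # share of men in the "qualified" pool (from the column)
n = 2000              # ASSUMED class size; not stated in the column
years = 6             # entering classes of 2007-2012
observed_sd = 0.0044  # observed spread (from the column)

# Spread expected from chance alone if every spot is an independent draw.
expected_sd = math.sqrt(p * (1 - p) / n)

# Under that model, (years - 1) * s^2 / sigma^2 is approximately chi-squared
# with years - 1 degrees of freedom, so the lower tail gives the chance of
# seeing a spread at least this small.
stat = (years - 1) * observed_sd ** 2 / expected_sd ** 2
tail = chi2_cdf(stat, years - 1)

print(f"expected SD: {expected_sd:.2%}")
print(f"chance of an SD this low: {tail:.1%}")
```

The exact tail probability shifts with the assumed class size and with how the observed standard deviation was computed, so this should be read as a plausibility check rather than a reproduction of the published figure.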

In short, not only is the gender makeup of the accepted class constant — it is too constant. And that was assuming the “qualified” applicant pool had the same makeup every year. If the “qualified” applicant pool changes each year (as I believe it does), Yale’s tiny standard deviation becomes even more remarkable.

I should note that this phenomenon could not simply be the result of giving preference to men. Doing this could keep the average male-to-female ratio at 51 to 49, but it would not pinch the standard deviation.

On the other hand, Yale’s consistent admit rate is not necessarily the result of explicit quotas either. Yale could be counting something else that is heavily correlated with gender. For example, they could be counting the number of prospective engineering majors or the number of football players.

If half of Yale’s admitted class had a hand-picked makeup each year, the likelihood of getting a standard deviation this low increases to a little less than 9 percent. Or if the entire accepted class were split into fixed-size groups, such as engineering majors or the football team, and each group had a gender makeup of 85 percent to 15 percent, Yale’s chance becomes a little more than 8 percent.
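One possible reading of these two scenarios can be sketched as follows. The point is only to show why each one shrinks the expected spread; the class size n = 2000 is an assumption, and the variable names are invented for the illustration.

```python
import math

n = 2000   # ASSUMED class size; not given in the column
p = 0.51   # overall share of men (from the column)

# Fully random class: every spot is an independent draw.
sd_neutral = math.sqrt(p * (1 - p) / n)

# Half the class hand-picked with a fixed makeup: only n/2 spots vary,
# so the variance of the number of men is cut in half.
sd_half_fixed = math.sqrt((n / 2) * p * (1 - p)) / n

# Fixed-size groups that are each 85/15 one way or the other: every spot
# is still random, but its variance is 0.85 * 0.15 instead of p * (1 - p).
sd_groups = math.sqrt(0.85 * 0.15 / n)

for label, sd in [("neutral", sd_neutral),
                  ("half hand-picked", sd_half_fixed),
                  ("85/15 groups", sd_groups)]:
    print(f"{label}: expected SD {sd:.2%}")
```

In both scenarios the expected spread falls well below the gender-neutral figure, which is why a standard deviation as low as the observed one becomes less surprising.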

Alternatively, Yale’s admissions officers could effect a subconscious balancing act when reading applications by preferring men, for instance, when they have just given women the last two or three spots. For example, my own simulations show that if every time Yale gave two spots in a row to women they automatically gave the next spot to a man, and vice versa, the expected standard deviation would be almost exactly what has been observed over the past six years. Some form of this is, in my opinion, the most likely scenario.
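The balancing rule described above is simple enough to simulate. The following is a sketch, not the author's own code: it assumes a class of n = 2000 (not stated in the column), fills the spots one at a time, and forces a switch whenever the last two spots went to the same gender. The function name admit_year is hypothetical.

```python
import random
import statistics

def admit_year(n, p, rng):
    """Fill n spots one by one and return the share that went to men."""
    men = 0
    last = None   # gender of the previous admit (True means a man)
    streak = 0    # how many admits in a row share that gender
    for _ in range(n):
        if streak >= 2:
            is_man = not last            # forced switch after two in a row
        else:
            is_man = rng.random() < p    # otherwise an independent draw
        if is_man == last:
            streak += 1
        else:
            last, streak = is_man, 1
        men += is_man
    return men / n

rng = random.Random(0)
n, p, years, trials = 2000, 0.51, 6, 200   # n and trials are ASSUMED

# For each trial, simulate six entering classes and record their spread.
sds = []
for _ in range(trials):
    shares = [admit_year(n, p, rng) for _ in range(years)]
    sds.append(statistics.stdev(shares))

mean_sd = statistics.mean(sds)
print(f"average six-year SD under the balancing rule: {mean_sd:.2%}")
```

The resulting figure can be compared with the 0.44 percent quoted earlier; the exact value depends on the assumed class size and on the random seed.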

Common to all these hypothetical mechanisms, however, is a lack of a gender-neutral selection process. For some women, the chance of being admitted to Yale simply depends on how many women have been admitted previously, or something heavily correlated with that number.

For Yale to avoid these charges, the whole thing would have to be some sort of statistical blip. But this becomes less likely when we examine the numbers at other schools. A similar analysis of the U.S. News and World Report’s top 20 universities and top 10 liberal arts colleges for the entering classes of 2004 through 2010 reveals that most of them have relative standard deviations greater than Yale’s.

More significant still, many have standard deviations far greater than we would expect if we assumed the “qualified” applicant pool each year remained constant. As mentioned above, if this assumption is challenged, the chance of Yale’s super-precise admit ratios being a random error becomes even smaller.

Maintaining gender diversity across Yale is not a bad thing. But we value openness as well. If the process is not truly gender-neutral, applicants have a right to know. And if this phenomenon is somehow unintentional, any official investigation into the gender gap must be charged with explaining it.

Joshua Schoenfield is a junior in Timothy Dwight College.


  • fascinating article

    where did you get your stats? can you make your calculations available online?

  • Joshua Schoenfield

    At the most basic level, the calculation is just a chi-squared comparison between the observed standard deviation in admit ratios and the theoretical one given by sqrt(p(1-p)/n) for the last six years, where p is the proportion of males in the “acceptable” pool and n is the average number of spots given out.
    Also, I did some simulations that basically repeated the last six years many times by assigning each bed to either a man or a woman, and counted how many times the resulting standard deviation was less than the observed one.
    All the numbers from Yale came from the office of institutional research. The others came from the National Center for Education Statistics.
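The simulation described in the comment above might look something like the sketch below. It is not the commenter's actual code. The class size n = 2000 and the number of trials are assumptions, and share_of_men is a name invented for the example.

```python
import random
import statistics

def share_of_men(n, p, rng):
    """Assign each of n beds independently; return the share given to men."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(0)
n, p, years = 2000, 0.51, 6   # n is ASSUMED; p and years come from the column
observed_sd = 0.0044          # observed spread (from the column)
trials = 1000                 # ASSUMED; more trials give a steadier estimate

# Replay the six years many times under a gender-neutral process and count
# how often the spread comes out smaller than the observed one.
hits = 0
for _ in range(trials):
    shares = [share_of_men(n, p, rng) for _ in range(years)]
    if statistics.stdev(shares) < observed_sd:
        hits += 1

fraction = hits / trials
print(f"share of runs with an SD below {observed_sd:.2%}: {fraction:.1%}")
```

With only a thousand trials the estimate is noisy, so it is best treated as a rough cross-check on the chi-squared calculation rather than a precise probability.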