D’AMBROSIO: Race, class and statistics

Justin D'Ambrosio 11:07 pm, Apr 20, 2017

Guest Columnist

You’ve all heard it. Most of you have probably even said some variant of it. “That area is sketchy.” Or, “that area is dangerous.” Or perhaps, “that’s a bad neighborhood.” What do we base these judgments on? Are they justified? Are they true? Or are they merely expressions of our biases and fears?

Inspired by the work of my friend and colleague in the philosophy department, Jessie Munton, I’m going to try to convince you that even if these kinds of claims are true, they’re not in general justified. Rather, they often encode, and are based on, our biases against and fears of poor people and underrepresented groups in our city. And yet we hear them all the time, and hardly bat an eye.

When you moved to New Haven, you probably heard that New Haven was dangerous. You probably saw a news article that ranked cities by their violent crimes per capita, and had New Haven listed somewhere in the top 10. Some of you probably took it almost as a point of pride, bragging about New Haven’s — and by extension your — edginess.

By now, you’ve taken the advice implicit in that ranking to heart. You avoid New Haven’s urban neighborhoods, preferring instead to sit comfortably inside Yale’s walls and gates. Perhaps you justify your lack of integration, or even exploration, on the basis of a few city-wide statistics.

Let’s suppose that you actually did hear about some such statistics, and that those statistics were roughly correct. After all, certain cities are indeed more dangerous than others, at least if we allow that the statistical prevalence of violent crime is an indicator of danger. But here is a general problem with statistical reasoning: Statistics presuppose a domain of which they are supposed to be representative, and the statistic in question may not be representative of every subdomain. For instance, suppose that, as is in fact the case, lower socioeconomic status in New Haven correlates with the likelihood of committing a violent crime. Even so, crime statistics are rarely given on a neighborhood-by-neighborhood basis, much less on a block-by-block basis, and neighborhoods can vary in important ways for which the statistics in question do not account. We could, of course, try to find out more information about particular neighborhoods, but very few of us read the weekly New Haven crime reports.

So how do we deploy these statistics in our reasoning? Well, first, we ignore domain variation. We presume that if a statistic holds for the whole city, it will hold for this particular block on which we’re walking, or for this particular person walking toward us on the sidewalk. In so doing, we flatten that person (or neighborhood) out into a single feature: the one that is relevant to the statistical inference. We presume, on the basis of a statistic that holds for the city, that this particular person is more likely to commit a crime because they have one particular property.
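
To see how much a city-wide figure can hide, consider a small sketch in Python. Every number in it is invented for illustration: the three neighborhoods, their populations and their incident counts are hypothetical, and none of it is New Haven data. It only shows the arithmetic of averaging over a domain that varies.

```python
# Hypothetical figures, invented purely for illustration; not real crime data.
neighborhoods = {
    "A": {"residents": 50_000, "incidents": 100},
    "B": {"residents": 30_000, "incidents": 90},
    "C": {"residents": 20_000, "incidents": 410},
}


def rate_per_1000(incidents, residents):
    """Incidents per 1,000 residents."""
    return 1000 * incidents / residents


total_residents = sum(n["residents"] for n in neighborhoods.values())
total_incidents = sum(n["incidents"] for n in neighborhoods.values())

# The single number a city ranking would report.
citywide = rate_per_1000(total_incidents, total_residents)

# The same data, broken down by subdomain.
by_neighborhood = {
    name: rate_per_1000(n["incidents"], n["residents"])
    for name, n in neighborhoods.items()
}

# Share of residents living somewhere with a rate below the city-wide figure.
share_below = sum(
    n["residents"]
    for name, n in neighborhoods.items()
    if by_neighborhood[name] < citywide
) / total_residents

print(f"city-wide: {citywide} per 1,000")
for name, rate in by_neighborhood.items():
    print(f"neighborhood {name}: {rate} per 1,000")
print(f"share of residents below the city-wide rate: {share_below:.0%}")
```

In this made-up city, the headline rate describes none of the three neighborhoods, and most residents live somewhere well below it. Applying the city-wide number to a particular block is exactly the step of ignoring domain variation.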

But this is absolutely unjustifiable. People, and the neighborhoods in which they live, have lots of features, many of which confound any large-scale statistical measure. Just think: you, in virtue of your status as a Yale student, are way more likely than the average American to be involved in a corporate embezzlement scam. But you might be an English major, and never plan to go into business at all, much less embezzle money. So it turns out you aren’t more likely to embezzle, all things considered. We just presumed that you were because we focused on one of your properties: being a Yale student. The conclusion is defeasible! Statistical inference is a form of what logicians call non-monotonic inference: learning new information can invalidate what we thought was a good inference. In other words, the types of inferences I’m talking about are only rational if we keep our blinders on.

This problem for statistical reasoning is a variant of what is known as “The Problem of the Reference Class” in statistics. People have many properties, and whenever we try to make an inference from a general statistical claim to a more specific conclusion, we are forced to pick out one particular property as the relevant one, and ignore the others. But it often turns out that focusing on those countless other features will yield different conclusions about the person and their prospective behavior.

But now ask yourself: How many times have you called an area in New Haven “dangerous” because you’ve seen the signs of urban poverty there? And is there any way of justifying your doing so? I am sure that the answer is “no.”

Justin D’Ambrosio is a graduate student in the Philosophy Department. Contact him at justin.dambrosio@yale.edu.
