With the Ivy League football season set to start on Saturday, Matt Robinson of the Yale Undergraduate Sports Analytics Group offers an in-depth, computational look at the 2016 Ancient Eight landscape.

With the first weekend of Ivy League matchups approaching, there is much speculation as to who will claim the league title come late November.

Not surprisingly, Harvard is the favorite in most preseason polls. However, many believe that Penn, co-champion a year ago, will repeat its 2015 performance and challenge Harvard for the top spot. Others think that Dartmouth, which also earned a share of the championship, similarly has a chance to repeat.

We at the Yale Undergraduate Sports Analytics Group thought it might prove useful to supplement these predictions with some numbers. With these statistics, we have a better idea of what to expect on average and of where Yale stands in the Ivy League.

Our predictions involve two different methods of analysis. First, we gathered data from Jeff Sagarin’s 2016 college football ratings. Sagarin’s ratings are widely used and are complete for all FBS and FCS football teams, including every Ivy League team. The ratings are a combination of three different score-based methods, meaning that they rely heavily on margin of victory as an indicator of a team’s strength. Once each team has a rating, predicting the outcome of a matchup is straightforward: the expected difference in score is the difference between the two teams’ ratings. The only modification is that the home team also receives an advantage of 2.7 points.
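To make the arithmetic concrete, here is a minimal sketch of that spread calculation in Python. The 2.7-point home advantage comes from the description above; the function name and the two ratings in the example are hypothetical and are not actual Sagarin values.

```python
HOME_ADVANTAGE = 2.7  # points added for the home team, as described above


def predicted_spread(home_rating: float, away_rating: float) -> float:
    """Expected final margin in points, from the home team's point of view.

    A positive value means the home team is favored.
    """
    return home_rating - away_rating + HOME_ADVANTAGE


# Hypothetical ratings, for illustration only.
spread = predicted_spread(home_rating=60.0, away_rating=55.0)
print(f"Home team favored by {spread:.1f} points")
```

Note that the same two teams produce a different spread depending on who hosts, which is why the schedule matters as much as the ratings themselves.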

For our analysis, we took the predicted spread of each game and converted it to a win probability. For this step we used a popular converter, which relies on past data to estimate win probabilities from spreads for college football games. With probabilities in hand, we were able to run Monte Carlo simulations of the upcoming season, which means that we played out the season thousands of times on a computer and observed the average results.
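The two steps can be sketched roughly as follows. This is not the converter we used: as a stand-in, the sketch assumes the final margin is normally distributed around the spread with a standard deviation of 16 points, which is a hypothetical value. The three-team schedule and its probabilities are likewise made up for illustration.

```python
import random
from statistics import NormalDist

SIGMA = 16.0  # ASSUMPTION: std. dev. of final margin around the spread


def win_probability(spread: float, sigma: float = SIGMA) -> float:
    """Chance that a team favored by `spread` points wins (normal model)."""
    return NormalDist(0.0, sigma).cdf(spread)


def simulate_titles(games, n_seasons=10_000, seed=0):
    """Play out the schedule many times and return each team's title share.

    `games` is a list of (home, away, p_home_win) tuples. Teams tied for
    the most wins in a simulated season are all credited with a title.
    """
    rng = random.Random(seed)
    teams = {team for home, away, _ in games for team in (home, away)}
    titles = dict.fromkeys(teams, 0)
    for _ in range(n_seasons):
        wins = dict.fromkeys(teams, 0)
        for home, away, p_home in games:
            winner = home if rng.random() < p_home else away
            wins[winner] += 1
        best = max(wins.values())
        for team in teams:
            if wins[team] == best:
                titles[team] += 1  # co-champions each get credit
    return {team: count / n_seasons for team, count in titles.items()}


# Hypothetical round robin among three made-up teams.
schedule = [
    ("A", "B", win_probability(7.0)),
    ("B", "C", win_probability(3.0)),
    ("C", "A", win_probability(-4.0)),
]
shares = simulate_titles(schedule)
for team in sorted(shares):
    print(team, round(shares[team], 3))
```

Because tied teams share the championship, the title shares in a simulation like this can sum to more than 1.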

The last thing to notice is that Sagarin’s ratings update very quickly. For example, Yale’s rating was hurt severely by its lopsided loss to Colgate last Saturday. Similarly, Penn’s rating suffered after its loss to Lehigh by a large margin. In case these losses represent first-game anomalies, we have included results from both pre- and post-week 1 simulations.


A few things stand out in the graphs. In particular, with one lopsided loss, Yale dropped from fourth to sixth in the league projections using the Sagarin ratings. Dartmouth, meanwhile, improved its rating significantly after an unexpected win over New Hampshire. Even with such surprises happening elsewhere in the league, Harvard’s chances to win the Ancient Eight remained relatively constant.

To supplement our use of the Sagarin ratings, we decided to create a ratings system of our own. Following the model of Nate Silver’s site FiveThirtyEight, we decided to use Elo ratings. Elo is a popular ratings system that first came into use for ranking chess players. It has now been applied to a diverse range of fields including NFL football, NBA basketball, FIFA soccer and even the dating app Tinder.

Essentially, Elo works by taking into account only a few simple factors: wins and losses, home-field advantage and margin of victory.

Initially, each team is given an Elo rating of 1500, which roughly corresponds to the average rating. Then, after each game, the ratings are updated based on the final score, the location and the strength of each team. For example, a team gains more points if it wins by a large margin. It also gains more points if it beats a team rated much higher than itself. But the opposite also occurs; if the winning team gains x points, the losing team must lose x points. Finally, home-field advantage corresponds to 65 Elo points in this model.
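A rough sketch of one such update is shown below. The 1500 starting rating and the 65-point home-field bonus come from the description above, and the logistic expected-score curve on a 400-point scale is the standard Elo formula. The update speed `K` and the logarithmic margin-of-victory multiplier are illustrative assumptions, not the exact constants of our model.

```python
import math

START_ELO = 1500.0  # initial rating for every team, as described above
HOME_FIELD = 65.0   # Elo points credited to the home team, as described above
K = 20.0            # ASSUMPTION: update speed, chosen only for illustration


def expected_home_win(home_elo: float, away_elo: float) -> float:
    """Standard Elo win expectancy for the home team."""
    diff = home_elo + HOME_FIELD - away_elo
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def update(home_elo, away_elo, home_pts, away_pts):
    """Return the new (home_elo, away_elo) after one game.

    Whatever one team gains, the other loses. College games cannot end
    in a tie, so the result is treated as a win or a loss.
    """
    expected = expected_home_win(home_elo, away_elo)
    actual = 1.0 if home_pts > away_pts else 0.0
    # ASSUMPTION: larger margins move ratings more, with diminishing returns.
    margin_mult = math.log(abs(home_pts - away_pts) + 1.0)
    shift = K * margin_mult * (actual - expected)
    return home_elo + shift, away_elo - shift


# Two evenly rated teams; the visitors win by 10.
home, away = update(START_ELO, START_ELO, home_pts=17, away_pts=27)
print(round(home, 1), round(away, 1))
```

In this example the road team gains more than it would for the same win at home, since the 65-point bonus made the home side the favorite.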

The powerful simplicity of Elo makes it tractable, but also naturally leads to a few problems.

First, Elo ratings do not have access to any knowledge beyond game results; they simply take into account past performance. Predictions using Elo, for example, will not change if a team’s star player has just been hurt. And the Elo ratings at the start of this season do not consider that Dartmouth just lost 16 seniors.

Furthermore, the system does not account for specific matchups. As an example, the ratings cannot discern styles of play that are specifically suited for beating another team. Despite these issues, Elo still performs consistently well in a wide range of sports.

We present preliminary Elo ratings for the Ivy League, based on a model very similar to the one FiveThirtyEight uses for the NFL. We suggest you visit FiveThirtyEight.com for more on specific formulas. We are still working out some quirks ourselves, since college football is quite different from the NFL. Just imagine if an NFL team played only 10 games a year and saw 25 percent of its roster change after every season. We have therefore changed our system to update slightly more quickly and to regress to the mean slightly more after each season, though the general framework remains almost entirely the same.
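The between-season regression can be sketched in a few lines. The 1500 mean matches the starting rating described earlier; the carryover fraction of 0.6 is a hypothetical value chosen for illustration, not the figure used in our model.

```python
MEAN_ELO = 1500.0
CARRYOVER = 0.6  # ASSUMPTION: share of a team's edge kept into the next season


def new_season_rating(end_of_season_elo: float,
                      carryover: float = CARRYOVER) -> float:
    """Pull a rating part of the way back toward the league mean."""
    return MEAN_ELO + carryover * (end_of_season_elo - MEAN_ELO)


# A hypothetical team that finished the year well above average.
print(round(new_season_rating(1600.0), 1))
```

A smaller carryover means more regression, which suits a league where rosters turn over quickly and each season has only a handful of games.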





Note that the Elo projections did not change after week one because they only take into account Ivy League results. It is also worth mentioning that the Ivy title percentages need not add up to 100 percent because two or more teams can share the title.

(Graphics by Ellie Handler)

The Yale Undergraduate Sports Analytics Group is made up of undergraduates passionate about the field of sports analytics. Members discuss the current state of advanced statistics in a variety of sports and attempt to contribute to the field through original research. Check out more great content at our website, sports.sites.yale.edu!