Entering the 2016 season, the Yale football team had the top freshman recruiting class in the Ivy League and the second overall freshman class in the entire Football Championship Subdivision, per 247Sports Composite Team Rankings. Yet Yale currently sits at fourth in the Ivy League.

The Yale Undergraduate Sports Analytics Group wanted to examine how well recruiting success translates into success on the field. Using data for all of Division I football, including both the Football Bowl Subdivision (FBS) and the Football Championship Subdivision (FCS), we tried to determine how well the strength of a team correlates with the caliber of its roster. Specifically, we looked at how all the Ivy League teams performed relative to their recruited talent.

To perform our analysis, we gathered data on both recruiting and overall team strength. For our recruiting data, we used the 247Sports Composite Team Ratings. In college recruiting, each player is given a number of stars out of five, with five-star recruits being the highest-rated high school players. 247Sports then converts the star rating into a numerical grade. Five-star prospects are rated 98–100, four-star prospects are 90–97, three-star prospects are 80–89 and two-star recruits and unrated players fall between 0 and 79. Yale’s recruiting class of 2020 comprised six three-star recruits and five two-star recruits, averaging a rating of 76.87.
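
As a rough illustration, the grade bands described above can be written as a small Python lookup. The function name is hypothetical, and treating each band's lower bound as a hard cutoff for fractional grades is an assumption, since the bands are only quoted as whole numbers.

```python
def stars_from_rating(rating: float) -> int:
    """Map a numerical grade to a star count using the bands quoted in the article.

    Assumption: each band's lower bound is the cutoff, so 97.5 counts as four stars.
    """
    if not 0 <= rating <= 100:
        raise ValueError("rating must be between 0 and 100")
    if rating >= 98:
        return 5
    if rating >= 90:
        return 4
    if rating >= 80:
        return 3
    # Two-star recruits and unrated players share the 0-79 band.
    return 2
```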

Then, 247Sports generates a Composite Team Rating that determines the overall strength of a recruiting class by taking the ratings of all committed players into account. According to an explanation of their methodology on their website, “each recruit is weighted in the rankings according to a Gaussian distribution formula (a bell curve), where a team’s best recruit is worth the most points. You can think of a team’s point score as being the sum of ratings of all the team’s commits where the best recruit is worth 100 percent of his rating value, the second best recruit is worth nearly 100 percent of his rating value, down to the last recruit who is worth a small fraction of his rating value. This formula ensures that all commits contribute at least some value to the team’s score without heavily rewarding teams that have several more commitments than others.”
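
To make the quoted description concrete, here is a minimal Python sketch of a rank-weighted sum with bell-curve weights. It is not 247Sports' actual formula: the function name, the exact weight expression and the spread parameter `sigma` are all assumptions chosen only to reproduce the behavior described, where the best recruit counts fully and each later recruit counts a little less.

```python
import math

def composite_team_score(ratings, sigma=9.0):
    """Sum player ratings with Gaussian weights by rank (rank 0 = best recruit).

    sigma is a hypothetical spread parameter, not a value published by 247Sports.
    """
    ranked = sorted(ratings, reverse=True)
    return sum(
        rating * math.exp(-(rank ** 2) / (2 * sigma ** 2))
        for rank, rating in enumerate(ranked)
    )
```

Under this sketch every additional rated commit adds something to the total, but a 25th commit adds far less than a fifth one, which is the property the quoted explanation emphasizes.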

Since each team’s current roster is made up of the recruiting classes from the last four years, our calculations use the average rating of those four classes as an indication of the current roster’s talent.

For overall team strength, we used Jeff Sagarin’s 2016 college football ratings. Unlike other rating systems, Sagarin’s ratings incorporate both FBS and FCS teams, which is important since our recruiting ratings also cover both FBS and FCS teams. The ratings combine three score-based methods, and margin of victory is a major contributor to a team’s overall rating. Each team is assigned a rating, and the difference between two teams’ ratings is the expected margin of a game between them. The only adjustment made is an advantage of 2.7 points awarded to the home team.
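
The arithmetic behind that expected margin is simple enough to sketch in Python. The function name is hypothetical and the ratings in the usage line are made up for illustration; only the 2.7-point home advantage comes from the description above.

```python
HOME_ADVANTAGE = 2.7  # points added to the home team, as described in the article

def expected_margin(home_rating: float, away_rating: float) -> float:
    """Expected scoring margin from the home team's point of view.

    A positive result means the home team is expected to win by that many points.
    """
    return home_rating - away_rating + HOME_ADVANTAGE

# Made-up ratings: a home team rated 60.0 hosting a team rated 55.0.
margin = expected_margin(60.0, 55.0)
```

With these invented ratings the home team would be favored by a little under eight points: five from the rating gap plus the home-field adjustment.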

Sagarin’s ratings are highly responsive and update week to week, meaning individual games can heavily impact a team’s rating. For example, Yale’s 55–13 loss to Colgate earlier in the season dropped the Bulldogs’ Sagarin rating by a large margin, while Princeton’s recent 56–7 victory at Cornell greatly increased its rating.

With both recruiting and overall strength ratings in hand, we plotted the two against one another to examine the correlation.

The average of each team’s 247Sports recruiting ratings is on the horizontal axis, and its Sagarin rating is on the vertical axis. The line running through the middle of the graph is the trend line that describes the relationship between recruiting and team strength. Essentially, it predicts what a team’s strength would be given the average rating of its last four recruiting classes. The overall trend line had an R-squared value of 0.714, which indicates a relatively strong correlation between the two variables.
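
For readers who want to reproduce this kind of trend line, the following Python sketch fits a line by ordinary least squares and reports R squared. The article does not say which fitting method was used, so least squares is an assumption, and the function name is hypothetical.

```python
def fit_trend_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept, returning R squared as well."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variance in y explained by the fitted line.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot
```

Here `xs` would hold each team's four-class recruiting average and `ys` its Sagarin rating. For a one-variable least-squares fit, R squared is the square of the correlation coefficient, so a value of 0.714 would correspond to a correlation of roughly 0.845.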

If a team is above the trend line, its players have outperformed our model’s expectations. Conversely, if a team is below the line, it has underperformed relative to expectations based on the talent of its roster. The vertical distance of a team from the trend line is the magnitude by which it has over- or underperformed. These figures for all the Ivy League teams can be found in the accompanying table. As the table shows, Yale underperforms its predicted strength more drastically than any other Ivy League team.
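
The over- or underperformance figure is the gap between a team's actual rating and the rating the trend line predicts for it. A minimal Python sketch follows; the function name is hypothetical, and the slope and intercept in the usage line are invented placeholders, not the fitted values from the graph.

```python
def performance_vs_expectation(sagarin_rating, recruiting_avg, slope, intercept):
    """Actual rating minus the trend line's prediction.

    Positive values mean the team sits above the line (overperforming);
    negative values mean it sits below the line (underperforming).
    """
    predicted = slope * recruiting_avg + intercept
    return sagarin_rating - predicted

# Invented numbers: a line with slope 0.5 and intercept 30.0 predicts 40.0
# for a recruiting average of 20.0, so a team rated 50.0 is 10.0 above the line.
gap = performance_vs_expectation(50.0, 20.0, slope=0.5, intercept=30.0)
```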

One thing to note on the graph is the large gap from approximately 50 to 100 on the horizontal axis. The gap represents a difference in caliber of recruits between FBS and FCS teams, as FCS teams are bunched at the lower end. This is due to the 247Sports rating system, which assigns unrated players a rating of zero. The majority of FBS recruiting classes are composed of rated players, which would give any unrated players little weighting as a result of using a Gaussian distribution. However, FCS recruiting classes will have many unrated recruits, which means players with ratings of zero are factored into the overall ranking of a school’s recruiting class. This difference causes the divide seen in the graph. Furthermore, it is harder to fully differentiate the talent of FCS rosters, since little data exists on many of the players.

Another possible source of error is the equal weighting of the separate classes. Because we use an average of the previous four recruiting classes, freshmen are weighted the same as seniors in our formula. For this preliminary model, we decided on the simplest breakdown, which is to weight all four classes equally. Starting rosters will mostly be made up of older, more experienced players; however, some freshmen are able to contribute early on, as evidenced by standout running back Alan Lamar ’20 for Yale this season.

On the other end of the spectrum, some of the best would-be senior players declare early for the NFL Draft because they are already prepared to play at a professional level. Especially with a Gaussian weighting that is already skewed toward the best players in a class, we did not want to weight superstar seniors even higher if they were no longer playing for their respective teams. However, in future models we may test whether weighting each recruiting class differently provides further insight.
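
One way such a variation could be set up is a weighted average over the four classes, with equal weights reproducing the current approach. The Python sketch below is only an illustration: the function name, the freshman-to-senior ordering and any unequal weights a reader might try are assumptions, not part of the model described here.

```python
def roster_talent(class_ratings, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted average of four class ratings, ordered freshman to senior.

    The default equal weights match the simple four-class average;
    any other weights are hypothetical alternatives.
    """
    if len(class_ratings) != 4 or len(weights) != 4:
        raise ValueError("expected four class ratings and four weights")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(r * w for r, w in zip(class_ratings, weights))
    return weighted_sum / total_weight
```

Calling `roster_talent(ratings)` gives the plain average, while passing something like `weights=(1, 2, 3, 4)` would tilt the estimate toward upperclassmen.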

Additionally, it appears that teams in the FCS do not follow this linear model as clearly, which is shown by the heavy concentration of data points on the left side of the graph. Again, this concentration is perhaps due to the Gaussian weighting system discussed above. Nevertheless, the model still gives an idea of how a team performs relative to the talent on its roster.

In this way, our model shows that Yale appears not to be maximizing the talent of its players, while both the University of Pennsylvania and Princeton seem to have teams that are greater than the sum of their parts. Harvard, meanwhile, has a rating that is only slightly higher than its predicted rating; however, the strength of Harvard’s roster is such that its predicted Sagarin rating is already quite high.

But with Lamar and fellow freshman Kurt Rawlings ’20 confirmed to be starting this weekend, Yale has the chance to realize the potential of its highly rated freshman recruiting class.

Contact the Yale Undergraduate Sports Analytics Group at yalesportsanalytics@gmail.com.