Monday, August 24, 2015

2015 College Football!

It's baaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaack! College football is back! I can't even fully articulate why, but college football is just my favorite. I grew up in Portland as a Blazers fan, I live in Seattle and am a Seahawks fan, and I like the Mariners because they're something to do on a warm summer evening.

But nothing is quite like college football. It's just different! I'm still working on getting the week 1 excitement index up, and the team dashboards live, but in the meantime I have plenty to share! I'll also be tweeting out team dashboards for every team over the next few days @actuarygambler.


Who are the best and worst teams?

To the surprise of no one, Ohio State is #1. On one hand they have three of the best quarterbacks in the country, and on the other hand their team is stacked basically everywhere. On the other end of the scale, FBS newcomer UNC Charlotte starts the season on the bottom. Welcome to the show, Charlotte!




Who plays the toughest schedule?

Bama. They have a brutal stretch mid-season of @Georgia, Arkansas, @Texas A&M, Tennessee, Bye, LSU, @Miss St. Yikes.


We can dispel that idea about the SEC playing cupcake schedules. The SEC plays the toughest overall schedule (followed by the Pac-12), and they play a non-conference schedule on par with every other conference.

If you're looking for teams to mock for playing a pansy non-conference schedule, here's that list: Mississippi State, Baylor, Arizona, Oklahoma State, NC State.



What's new this year in the model?

Uncertainty in team ratings, and a more developed CFP model! And I'm changing the name of the Watchability index to Excitement index. Watchability never sat well with me as a word.

Uncertainty
After much consultation with my colleagues, I've added components to the model to account for the idea that, while a team's rating does represent an average estimation of that team's strength, there's uncertainty around that estimation. Most people think UCLA will be pretty good, and they probably will be! But they might not be, and we have no way of knowing that. To reflect that, when calculating outputs that involve simulating the season, the model samples a team's rating in each simulation from a given distribution. In some simulations UCLA is exactly what we thought, while in some simulations they're a little bit better, and in some simulations, they're a little bit worse.

What does this look like? Let's look at the distribution for UW and UCLA:


The model has UCLA rated as probably between 0.800 and around 0.920, and UW as probably between around 0.400 and 0.800. So every time it runs a simulation of the season, instead of using one value for each of UCLA and UW, it picks a random value from these distributions. For UW it's likely to pick a number around .550-.600, but sometimes the model runs a simulation with UW rated at 0.800! We have to recognize that there's a chance UW IS actually better than UCLA this year, and simulate some seasons that reflect this.
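Here's a rough sketch of that sampling idea in Python. To be clear, this is not the model's actual code: the normal distribution, the means and standard deviations in RATING_PRIORS, and the function names are all assumptions I picked so the ranges roughly match the ones described above.

```python
import random

# Assumed (mean, standard deviation) for each team's rating; illustrative only.
RATING_PRIORS = {"UCLA": (0.86, 0.03), "UW": (0.58, 0.10)}

def sample_rating(team, rng):
    """Draw one plausible rating for a team, clipped to the 0-1 rating scale."""
    mean, sd = RATING_PRIORS[team]
    return min(1.0, max(0.0, rng.gauss(mean, sd)))

def share_of_sims_uw_is_better(n_sims=10_000, seed=2015):
    """Fraction of simulated seasons in which UW's sampled rating beats UCLA's."""
    rng = random.Random(seed)
    uw_better = 0
    for _ in range(n_sims):
        # Each simulated season gets its own fresh draw for every team.
        ratings = {team: sample_rating(team, rng) for team in RATING_PRIORS}
        if ratings["UW"] > ratings["UCLA"]:
            uw_better += 1
    return uw_better / n_sims

print(f"UW rated above UCLA in {share_of_sims_uw_is_better():.1%} of simulations")
```

With these made-up inputs the share is small but not zero, which is the whole point: a few simulated seasons treat UW as the better team.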

So that's the first new thing. It figures into some parts of the model's output, like CFP likelihoods and conference championships, and not into others, like single game odds. The reason it doesn't figure into single game odds is they already have that uncertainty baked in. If UW and UCLA were to meet, the model would give UW a 14% chance to win that game. Inherent in that 14% are scenarios where UW is actually the better team this year.


CFP Model

The CFP committee are a crafty bunch. Last year when the rankings started, I watched all the talks by Jeff Long, tried to get an idea of what it was they cared about, and built an ad-hoc model to try and predict their rankings. This year I took it up a notch. Based on their rankings, and their defense of those rankings, I identified these 6 things the committee cares about:


  • How a team has performed on the field (who they've beaten, and by how much)
  • How a team is predicted to perform in the future
  • Number of losses (losses to good teams are discounted)
  • Wins (of any kind) over good teams
  • General strength of schedule
  • Recency

Each of these things independently improves the predictive power of my CFP model. I used Stata and Excel to figure out how much to weight each element and generated "predictions" fit to last year's data. The scatter plot below shows how the model's predicted CFP ranking compared to the actual ranking, in each of the committee's weekly rankings. For example, in week 9, Mississippi State was the predicted #1 and the actual #1; this is indicated by a dot at 1,1 on the chart. There are dots of different sizes because some dots have multiple observations. The CFP model correctly guessed the #1 team in each of the 8 rankings, so the dot at 1,1 is big.

You can see it all on the graph below; ultimately the correlation between Predicted CFP Ranking and Actual CFP ranking was 0.878! Not bad when you're trying to use math to predict what a committee of people using a completely opaque process will do.
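For anyone who wants to do the same check on their own predictions, here's a minimal sketch of the two calculations behind that chart: counting how many observations share a dot, and the correlation between predicted and actual ranking. The observations list is made-up data, not last year's rankings, and the function names are mine.

```python
from collections import Counter
from math import sqrt

# Made-up (predicted rank, actual rank) pairs, one per team per weekly ranking.
observations = [(1, 1), (1, 1), (1, 1), (2, 3), (3, 2),
                (4, 4), (5, 7), (6, 5), (7, 6)]

def bubble_sizes(pairs):
    """Count how many observations land on each (predicted, actual) point."""
    return Counter(pairs)

def pearson(pairs):
    """Pearson correlation between predicted and actual ranks."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

print(bubble_sizes(observations).most_common(1))  # the biggest dot on the chart
print(round(pearson(observations), 3))
```

The biggest dot is simply the pair that shows up most often, which is why a model that nails #1 every week gets a big dot at 1,1.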










Sunday, August 9, 2015

2015 College Football is coming!

We're under a month to College Football kickoff! Who's excited!

The model is coming along nicely. The betting lines the model would set for week 1 games are aligning well with those Vegas has already set, and I've made a bunch of technical improvements to the model this year, smoothing out the rough edges:
  • Bowl logic is tighter. It more comprehensively addresses teams with specific bowl tie-ins and conferences with bowl tiers (vs. bowl rankings)
  • Added logic to address the group of five rule
  • Removed Sagarin from the suite of data I used to seed pre-season ratings
  • Inter-conference strength will be calibrated (in part) using games from weeks 1-3
I still have some pre-work (i.e. updated dashboards and expanded dashboards) but I now think I'll also have time for some enhancements.
  1. New and improved CFP Committee Model
  2. Expand "If they win/If they lose" concept to more teams, more weeks, more scenarios (e.g. this post)




Friday, December 19, 2014

College Football Math: Bowls!

Bowl season starts tomorrow. 76 teams and 39 games and Christmas and the inaugural College Football Playoff! Can you stand it???

But before we get to the bowl games, a quick note. This is likely my last big post on College Football for the year, and I want to get a little sappy. I feel lucky to be able to write this blog and do math on College Football and other things; it's such a rewarding hobby. Thanks to my beautiful wife, who has little interest in football math, for proofreading each post, often while she was falling over with sleepiness. What a great wife she is! Thanks to everyone for reading all season. I hope it was as fun and interesting for you as it was for me. Stay tuned for the next big topic!

Now back to business.

Who's going to win?

America already won. Having 4 teams in an inaugural playoff adds so much fun to the college football regular season. So many teams were in contention late, so many regular season games had CFP implications, I loved it! Hope you did too.


OK, which team is most likely to win the CFP?

Oregon


Maybe some math?

Oregon and Alabama are nearly equally likely to win their semifinals, but the model rates Oregon slightly higher than Bama so Oregon is the model favorite to win the CFP.



OK, what CFP final matchups are we likely to see?

With both teams being 9-10 point favorites, Oregon vs. Bama is the most likely finals match-up. But there's a 50/50 chance we get something else!
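The arithmetic behind that 50/50 is just multiplying the two semifinal win probabilities. Here's a small sketch; the 0.71 figures are my stand-in for a 9-10 point favorite (picked so the product lands near 50%), not the model's actual numbers, and it treats the two semifinals as independent.

```python
from itertools import product

# Assumed semifinal win probabilities; 0.71 is an illustrative stand-in.
SEMIFINALS = [
    {"Oregon": 0.71, "Florida State": 0.29},
    {"Alabama": 0.71, "Ohio State": 0.29},
]

def final_matchup_odds(semifinals):
    """Probability of each title-game pairing, assuming independent semifinals."""
    first, second = semifinals
    return {(a, b): first[a] * second[b] for a, b in product(first, second)}

odds = final_matchup_odds(SEMIFINALS)
for (a, b), p in sorted(odds.items(), key=lambda item: -item[1]):
    print(f"{a} vs. {b}: {p:.0%}")
```

Two heavy favorites still leave about half the probability for the other three pairings, because each favorite has to win separately.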




What's the actual most likely outcome of this whole shebang?


Oregon beating Alabama by 3 is the single most likely outcome for the college football playoff.



You know there are other bowls right? Not just the CFP?

Yes.


What about them?

Well, these are the most watchable games of the bowl season. Notice that the CFP semifinals aren't the top two. Remember, Watchability is a context neutral statistic that doesn't know about bowl games or CFPs or ducks. Also, the team in darker green is the modeled favorite.



Here's the full schedule, click to enlarge.

Last new topic: How's the bowl season shaping up by conference?

In terms of bowl representation, SEC forever.
  • The SEC has 14 teams
  • 12 of those teams are playing in bowl games
  • The model has the conference favored to win in all 12 games
Next comes the ACC with 79% (11/14) of its teams in bowls, then Independents, then Big Ten (what? Big Ten??? It surprised me too, but wait for it, the Big Ten's reputation will be redeemed).



Going to a bowl is great, but winning a bowl is what you want to do!

Of course. The SEC is expected to win the most bowl games (8.1) but the Pac-12 is expected to have a higher winning %.

Here's where the Big Ten resumes its familiar spot.




A last bit of bowl game math: here's a chart showing how likely each number of wins is for each conference. For example, the SEC has a 6% chance of winning 11 of its bowl games, and a 1% chance of going 12/12.
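If you have a win probability for each of a conference's bowl games, a chart like this can be built by adding the games in one at a time. This is a sketch under the assumption that the games are independent; the probabilities in example_probs are made up and are not the model's SEC numbers.

```python
def win_count_distribution(win_probs):
    """Return dist where dist[k] is the chance of exactly k wins."""
    dist = [1.0]  # before any games: 100% chance of zero wins
    for p in win_probs:
        nxt = [0.0] * (len(dist) + 1)
        for wins, chance in enumerate(dist):
            nxt[wins] += chance * (1 - p)  # lose this game
            nxt[wins + 1] += chance * p    # win this game
        dist = nxt
    return dist

def expected_wins(win_probs):
    """Expected number of wins is just the sum of the per-game probabilities."""
    return sum(win_probs)

# Made-up win probabilities for a conference favored in all 12 of its bowls.
example_probs = [0.80, 0.75, 0.72, 0.70, 0.68, 0.66,
                 0.65, 0.64, 0.62, 0.60, 0.58, 0.55]
dist = win_count_distribution(example_probs)
print(f"expected wins: {expected_wins(example_probs):.1f}")
print(f"chance of going 12/12: {dist[12]:.1%}")
```

Being favored in every game is very different from being likely to win every game: the chance of a sweep is the product of all twelve probabilities, which shrinks fast.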



That's a wrap on the major college football math. Thanks again for reading and go Huskies!

Friday, December 5, 2014

College Football Math: Last Week of the Regular Season

It comes down to this. At last, on Sunday we'll find out which are the four initial CFP teams. But there's a lot of football to be played between now and then. According to the model, if Oregon and Bama (and FSU, kind of) lose they still have a chance to back into the CFP. For everyone else, it's lose and you're out.

Enjoy!

Also both dashboards have been updated.



Wednesday, November 26, 2014

College Football Math: Week 14

edit: both the dashboard and extended dashboard are up to date

For this week I refreshed my chart from week 12 showing how teams' CFP chances are likely to look after this weekend. For most teams, this weekend is do or die; lose and you are functionally out. The 3 top CFP teams (Oregon, FSU, Bama) are the only teams with even a little wiggle room.



Most Watchable Games


  • Most watchable games from this week are listed below
  • CFP Leverage is the difference winning makes (vs. losing) in the team's CFP chances. For example, Mississippi State has a 61% better chance to make the CFP if they win than if they lose (more about this)
  • Watchability is a measure of how good the teams are, and how close the game is likely to be
  • The box up and to the right of this post has more background information on all the math and links to both dashboards
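CFP Leverage as defined above is a difference of two conditional probabilities, and it falls out of simulated seasons pretty directly. Here's a minimal sketch; the input format (one pair of booleans per simulated season) and the function name are assumptions for illustration, not how the model stores its results.

```python
def cfp_leverage(sim_results):
    """CFP chance given a win this week minus CFP chance given a loss.

    sim_results: list of (won_this_week, made_cfp) booleans,
    one pair per simulated season for a single team.
    """
    made_after_win = [made for won, made in sim_results if won]
    made_after_loss = [made for won, made in sim_results if not won]
    if not made_after_win or not made_after_loss:
        raise ValueError("need at least one simulated win and one simulated loss")
    p_if_win = sum(made_after_win) / len(made_after_win)
    p_if_loss = sum(made_after_loss) / len(made_after_loss)
    return p_if_win - p_if_loss

# Made-up simulations: 4 seasons with a win (3 reach the CFP), 4 with a loss (none do).
sims = [(True, True)] * 3 + [(True, False)] + [(False, False)] * 4
print(f"CFP leverage: {cfp_leverage(sims):.0%}")
```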


Full schedule 


Wednesday, November 19, 2014

College Football Math: Week 13


Do you watch college football solely for the national championship? If yes, you're doing it wrong. This week will break you of that. While plenty of teams with CFP aspirations are playing this week, they're mostly playing cupcakes like Indiana and Western Carolina. The top games look like fun to me but they have no relevance for the College Football Playoff. Just need to forget the CFP exists and enjoy football.

Plus there's always the chance Boston College upsets FSU and we're spared the farce of FSU in the CFP.



  • Full Schedule is below
  • Watchability is a measure of how good the two teams are, and how close the game is likely to be



    • This post has way more detail on everything Watchability. Quick refresh:
      • Watchability is a combination measure of how good the two teams playing are and how likely the game is to be close
      • Teams are shaded by their chance to win, the greener the better
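To make the "how good" and "how close" combination concrete, here's one hypothetical way to blend them. The formula below is purely my illustration for this sketch and is not the actual Watchability calculation: it assumes ratings on a 0-1 scale, averages them for quality, and scores closeness as 1 for a coin flip down to 0 for a sure thing.

```python
def watchability(rating_a, rating_b, p_a_wins):
    """Hypothetical score: average team quality times closeness of the game."""
    quality = (rating_a + rating_b) / 2      # how good the two teams are
    closeness = 1 - abs(2 * p_a_wins - 1)    # 1.0 at 50/50, 0.0 at a lock
    return quality * closeness

# Made-up inputs: two strong teams in a toss-up vs. a strong team hosting a cupcake.
print(watchability(0.90, 0.88, 0.52))
print(watchability(0.90, 0.30, 0.97))
```

Under this assumed formula a game needs both ingredients: two great teams in a blowout and two bad teams in a toss-up both score low.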

College Football Math: Dashboard Updates

Dashboards are updated through week 12.