Monday, September 28, 2015

T-Rank 2016 Preview: Nuts and Bolts

Update for 2017-18:

Been making some pretty big improvements—well, changes—to the preseason projections. Maybe I'll do a separate post on it at some point, but here's the gist:

I've completely revamped the offensive projections, moving away from the prior "team based" model to an almost completely "player based" model. The old model, explained in detail below, was to look at past team performance, then adjust up or down based on the characteristics of the returning players. Now the model starts with projected player performance, and builds a team projection from that. One fun aspect of this is that I'm actually projecting core offensive player stats—offensive rating, usage, and minutes—and you can go look at those on the team pages at the T-Rank site.
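To make the roll-up concrete, here is a minimal sketch of one way player projections could be combined into a team offensive number. The post doesn't spell out the actual aggregation, so the weighting below (minutes share times usage) is an assumption, and every name and number is made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class PlayerProjection:
    name: str
    ortg: float         # projected offensive rating (points per 100 possessions)
    usage_pct: float    # projected usage, in percent
    minutes_pct: float  # projected share of available minutes, in percent

def team_offense(players):
    """Weighted average of projected offensive ratings.

    The weight (usage times minutes share) is an assumed stand-in for
    how many possessions each player is expected to use.
    """
    weights = [p.usage_pct * p.minutes_pct for p in players]
    total = sum(weights)
    if total == 0:
        raise ValueError("no projected possessions on this roster")
    return sum(p.ortg * w for p, w in zip(players, weights)) / total

# Hypothetical three-man roster, just to exercise the function.
roster = [
    PlayerProjection("A", 115.0, 24.0, 80.0),
    PlayerProjection("B", 105.0, 20.0, 70.0),
    PlayerProjection("C", 98.0, 16.0, 50.0),
]
print(round(team_offense(roster), 1))
```

The point of the sketch is only the direction of the calculation: players first, team second, rather than the other way around.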

Just a couple of team-based effects remain: (1) I've kept the "momentum" modifier for now, and (2) There is a coaching adjustment for teams who've hired a new coach in the last three years. Both of these are pretty ... stupid, I guess is the word—but I like them.

As with all of my T-Rank stuff, this is not to be taken seriously, and it's far from clear this is a "better" way to do things, given my fundamental incompetence. But I've always been convinced that this was the better way to project offensive performance, and Dan Hanner laid out the road map for this model long ago. This new model is basically my attempt to follow the steps he laid out then. Indeed, the compulsion mainly arose after realizing I now have the data and programming ability to do it.

So that's offense. As for defense, I think a team-based model is actually pretty good. In the blog post linked above, Dan Hanner says as much—though he links performance to coach rather than team. So for now I'm continuing with the exact same model as before, with one significant change: an effect based on the projected effective height (that is, height at the center and power forward positions), which is well-correlated with adjusted defensive efficiency.

Update for 2017:

A couple changes to note:

1) I tweaked the recruiting points to make them even more top heavy, but also discounted them less. What this means is that top-10ish recruits are more important, recruits ranked 10-20 count about the same, and those below that are less important.

2) Added some coaching effects. This only comes into play for teams with coaches in their 3rd year or less. A coaching change is now expected to induce a reversion to the overall program mean (based on 2002-present). On top of that, each coach has their own ratings on offense and defense, which are based on their past performance compared to their school's overall program mean. Overall, it's a minor thing, but something fun to throw in the mix.
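The coaching effect in item 2 could be sketched roughly as follows. The reversion strength and the way the coach rating is added are assumptions (the post gives no numbers), and the function and parameter names are hypothetical.

```python
def coaching_adjusted(projected, program_mean, coach_rating, coach_year,
                      reversion=0.5, max_year=3):
    """Pull a projection toward the long-run program mean, then add a coach rating.

    Only applies to coaches in their third year or less, as in the post.
    `reversion` (how far to move toward the program mean) is a made-up value.
    """
    if coach_year > max_year:
        return projected
    reverted = projected + reversion * (program_mean - projected)
    return reverted + coach_rating

# A first-year coach taking over a team projected above its program mean.
print(coaching_adjusted(110.0, 100.0, 1.5, coach_year=1))
# An established coach: no adjustment at all.
print(coaching_adjusted(110.0, 100.0, 1.5, coach_year=5))
```

The same function would be called once for offense and once for defense, each with its own program mean and coach rating.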


******Original post:

This will be the second year of the full 351-team preseason T-Rank, third year overall.

It's worth emphasizing that this is a hobby of mine. Other preseason ratings are better: more scientific, more well thought out (i.e., thought out at all), etc. But this one is mine.

I don't think I've ever really explained all the inputs to the preseason T-Rank. There's a good reason for this: they're arbitrary. Not completely arbitrary, but pretty arbitrary. And when you look at each individual component too closely, the whole thing seems kind of stupid.

The whole thing is kind of stupid, I guess.

But it's fun.

At this point, I'll be honest, the main function of the preseason T-Rank is to produce an input into what I call the preseason T-Rank+ (brand manager needed), which is a conglomeration of a few similar preseason ratings to create a Voltron-like super rating. Specifically, I intend to combine the preseason T-Rank with the Kenpom preseason ratings and Dan Hanner's preseason ratings (assuming he does them again) to provide the base ratings for the in-season T-Rank that you all love so much (I'm talking to you, mom). This year I think I will probably give them unequal weights, and T-Rank won't be the most hefty. 
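For what it's worth, the T-Rank+ blend amounts to a weighted average of the component ratings. A minimal sketch, assuming all three ratings are on the same scale; the weights and ratings below are invented, since the post only says the weights will be unequal and that T-Rank won't get the largest one.

```python
def blend_ratings(ratings, weights):
    """Weighted average of several preseason ratings for one team."""
    total_weight = sum(weights[source] for source in ratings)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(ratings[source] * weights[source] for source in ratings) / total_weight

# Hypothetical ratings for one team and hypothetical weights.
ratings = {"trank": 20.0, "kenpom": 22.0, "hanner": 24.0}
weights = {"trank": 0.2, "kenpom": 0.4, "hanner": 0.4}
print(blend_ratings(ratings, weights))
```

Dividing by the total weight means a missing source (say, if one set of ratings isn't published) just drops out instead of dragging the blend toward zero.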

Anyhow, here's the basic formula.

T-Rank starts with a "program rating" of every school. This is based on the efficiency ratings of the last three seasons. Previously I used Kenpom historical efficiency data for this. Starting this year, this is pure in-house mojo, because I have the data. I went back and calculated the T-Rank power ratings for 2013 and 2014, and of course there are the 2015 ratings my mom loved so much in real time.

The idea here is that it doesn't take a rocket scientist to figure out that teams that were good last year, and the year before, and the year before that, are probably going to be pretty good this year. Same thing with perennially crappy teams.
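A bare-bones version of the program rating might look like this. The post says only that it is based on the last three seasons, so the equal weighting here is an assumption, and the sample ratings are made up.

```python
from statistics import mean

def program_rating(ratings_by_season, upcoming_season, lookback=3):
    """Average a team's rating over the `lookback` seasons before `upcoming_season`.

    Equal weighting of the seasons is an assumption, not the post's stated method.
    """
    seasons = range(upcoming_season - lookback, upcoming_season)
    return mean(ratings_by_season[season] for season in seasons)

# Hypothetical efficiency ratings for one team.
history = {2013: 12.0, 2014: 15.0, 2015: 18.0}
print(program_rating(history, 2016))
```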

The two other main inputs are: returning minutes and returning players. Returning minutes is the main driver of variance of expected defensive efficiency (according to T-Rank). The philosophical idea here is that defense is a lot about team play, smarts, positioning, etc., and players just get better at it -- particularly if a bunch of players are playing together for a while. So higher returning minutes means better expected defensive efficiency, lower returning minutes means worse.
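Here is a small sketch of the returning-minutes idea. The linear adjustment, the 50% baseline, and the slope are all assumptions chosen for illustration; the post only gives the direction of the effect.

```python
def returning_minutes_pct(last_season_minutes, returning_players):
    """Percent of last season's minutes played by players who are coming back."""
    total = sum(last_season_minutes.values())
    if total == 0:
        raise ValueError("no minutes recorded")
    back = sum(minutes for name, minutes in last_season_minutes.items()
               if name in returning_players)
    return 100.0 * back / total

def projected_defense(program_def, ret_pct, baseline_pct=50.0, points_per_pct=0.05):
    """Adjust defensive efficiency (points allowed per 100 possessions).

    Lower is better on defense, so returning more minutes than the
    baseline subtracts points. Baseline and slope are made-up values.
    """
    return program_def - points_per_pct * (ret_pct - baseline_pct)

minutes = {"A": 1000, "B": 500, "C": 500}
pct = returning_minutes_pct(minutes, {"A", "B"})
print(pct, projected_defense(100.0, pct))
```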

The returning players analysis is for the offensive side, and is a little more "granular." There I look at each returning player's offensive rating and usage rating (which measures how often they actually do something on offense), discount it by percentage of team minutes played, and calculate an expected figure I call "Opts" (offensive points). This figure is then adjusted for class. Specifically, sophomores get a 50% bump, juniors get a 30% bump, and seniors and grad transfers get a 10% bump. [For 2017 I've altered the bumps somewhat, more like 40%, 15%, 10%.] This reflects the well-known pattern that college players generally take their biggest leap as sophomores, with lesser improvements thereafter.

I add up the returning Opts for each team and then adjust the team's "program" offensive rating based on whether that total is above or below average.
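The Opts calculation and the adjustment that follows could be sketched like this. The class bumps come straight from the post; everything else is an assumption: the way offensive rating, usage, and minutes are multiplied together, the reading of "class" as the class a player will be in the upcoming season, and the size of the adjustment per Opt.

```python
# Class bumps from the post, keyed by the player's class in the upcoming season.
CLASS_BUMP_2016 = {"So": 1.50, "Jr": 1.30, "Sr": 1.10, "Grad": 1.10}
# Approximate 2017 bumps ("more like 40%, 15%, 10%").
CLASS_BUMP_2017 = {"So": 1.40, "Jr": 1.15, "Sr": 1.10, "Grad": 1.10}

def player_opts(ortg, usage_pct, minutes_pct, upcoming_class, bumps=CLASS_BUMP_2016):
    """Returning 'offensive points' for one player (assumed functional form)."""
    base = ortg * (usage_pct / 100.0) * (minutes_pct / 100.0)
    return base * bumps[upcoming_class]

def adjusted_offense(program_off, team_opts, average_opts, points_per_opt=0.1):
    """Move the program offensive rating up or down with returning Opts.

    `points_per_opt` is a made-up scale factor.
    """
    return program_off + points_per_opt * (team_opts - average_opts)

# Two hypothetical returners: (offensive rating, usage %, minutes %, class).
returning = [(110.0, 20.0, 50.0, "So"), (100.0, 25.0, 80.0, "Sr")]
team_opts = sum(player_opts(*player) for player in returning)
print(team_opts, adjusted_offense(105.0, team_opts, average_opts=30.0))
```

Passing `bumps=CLASS_BUMP_2017` swaps in the later, flatter class adjustments without touching the rest of the calculation.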

I also make an effort to track every transfer in and out, although I'm sure I miss a bunch. [Edit: for 2017 I've more or less automated this, which is one of the great accomplishments of my life.] Transfers in get their Opts added, but somewhat discounted, and there's a separate calculation for their expected effect on defense. This is all pretty guessy-bessy, to coin a phrase.

Similarly, top 100 recruits are tracked and accounted for. The number one recruit has an effect similar to a returning all-American level player. But the effect tapers out pretty quickly. The idea here is that most of the effect of good recruiting is probably already reflected in the "program rating." At this point, we don't really need to know much about the specifics of Duke's recruiting class to know that they're going to be pretty good. 
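One way to picture an effect that "tapers out pretty quickly" is an exponential decay by recruiting rank. This is purely illustrative: the post doesn't give the curve, and the top value and decay rate below are invented.

```python
import math

def recruit_points(rank, top_value=20.0, decay=0.05, max_rank=100):
    """Hypothetical value of a recruit by national rank.

    Rank 1 gets `top_value`; the value decays exponentially after that,
    and anyone outside the top `max_rank` counts for nothing.
    """
    if not 1 <= rank <= max_rank:
        return 0.0
    return top_value * math.exp(-decay * (rank - 1))

for rank in (1, 10, 50, 100):
    print(rank, round(recruit_points(rank), 2))
```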

Finally, there is a "momentum" algorithm, which rewards (or penalizes) teams that are on the way up (or down). These are teams that significantly over- or under-performed their "program rating" last year, with signs (in terms of returning players) that the trend may continue. This year, for example, Vanderbilt and Utah are programs notably on the rise; on the flip side, Creighton and St. Louis seem to be sinking. 
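A toy version of the momentum idea, with loud caveats: the threshold for "significantly," the returning-minutes test standing in for "signs that the trend may continue," and the fraction of the gap carried forward are all assumptions, and treating both directions the same way is a simplification.

```python
def momentum_bonus(last_year, program, ret_pct,
                   threshold=3.0, min_ret_pct=60.0, carry=0.25):
    """Carry forward part of last year's gap versus the program rating.

    Returns 0.0 unless the gap is at least `threshold` and enough minutes
    return to suggest the trend continues. All three knobs are made up.
    """
    gap = last_year - program
    if abs(gap) < threshold or ret_pct < min_ret_pct:
        return 0.0
    return carry * gap

print(momentum_bonus(last_year=18.0, program=10.0, ret_pct=75.0))   # rising team
print(momentum_bonus(last_year=4.0, program=10.0, ret_pct=70.0))    # sinking team
print(momentum_bonus(last_year=11.0, program=10.0, ret_pct=90.0))   # no real trend
```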

Put it all together, and -- voila! -- the preseason T-Rank. I'll unveil the output in the coming weeks, in dribs and drabs to maximize drama and advertising revenue. Play Draft Kings Daily Fantasy!

4 comments:

  1. Hey Bart, I'm wondering if you still publish your T-Rank Projections for previous seasons. For example, I would like your Projections for the 2020-21 season for each team.

    Thanks!

    1. Yes, the prior preseason projections are preserved at e.g. trankpure21.php -- change the number to get prior years back to 2016.

  2. Hey Bart. Can't thank you enough for all of this info. I noticed that the trankpureYEAR.php pages for 2020 and earlier are missing Talent numbers (as well as experience and transfer points). Did those numbers not exist back then or is it a bug? Thanks in advance!

    1. Correct, those pages are archives of the pages as they existed and those figures were not calculated back then.
