Monday, September 28, 2015

T-Rank 2016 Preview: Nuts and Bolts

Update for 2017-18:

Been making some pretty big improvements—well, changes—to the preseason projections. Maybe I'll do a separate post on it at some point, but here's the gist:

I've completely revamped the offensive projections, moving away from the prior "team based" model to an almost completely "player based" model. The old model, explained in detail below, was to look at past team performance, then adjust up or down based on the characteristics of the returning players. Now the model starts with projected player performance, and builds a team projection from that. One fun aspect of this is that I'm actually projecting core offensive player stats—offensive rating, usage, and minutes—and you can go look at those on the team pages at the T-Rank site.
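To make the "player based" idea concrete, here is a minimal sketch in Python. It is not the actual T-Rank code: the weighting (usage times minutes) and every name in it are assumptions for illustration. It builds a team offensive number from projected player offensive rating, usage, and minutes.

```python
from dataclasses import dataclass


@dataclass
class PlayerProjection:
    name: str
    ortg: float     # projected offensive rating
    usage: float    # projected share of possessions used while on court (0-1)
    minutes: float  # projected share of team minutes (0-1)


def team_offense(players):
    """Possession-weighted average of projected player offensive ratings.

    Assumption: a player's weight is usage * minutes, a rough proxy for
    how many team possessions he is expected to use.
    """
    weights = [p.usage * p.minutes for p in players]
    total = sum(weights)
    if total == 0:
        raise ValueError("no projected possessions")
    return sum(w * p.ortg for w, p in zip(weights, players)) / total


# Hypothetical two-man roster, just to show the call.
roster = [
    PlayerProjection("Star Guard", ortg=120.0, usage=0.25, minutes=0.8),
    PlayerProjection("Role Player", ortg=100.0, usage=0.20, minutes=0.5),
]
team_ortg = team_offense(roster)
```

The star's rating counts twice as much as the role player's here, because he is projected to use twice as many possessions.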

Just a couple of team-based effects remain: (1) I've kept the "momentum" modifier for now, and (2) there is a coaching adjustment for teams that have hired a new coach in the last three years. Both of these are pretty ... stupid, I guess is the word, but I like them.

As with all of my T-Rank stuff, this is not to be taken seriously, and it's far from clear this is a "better" way to do things, given my fundamental incompetence. But I've always been convinced that this was the better way to project offensive performance, and Dan Hanner laid out the road map for this model long ago. This new model is basically my attempt to follow the steps he laid out then. Indeed, the compulsion mainly arose once I realized I now have the data and programming ability to do it.

So that's offense. As for defense, I think a team-based model is actually pretty good. In the blog post linked above, Dan Hanner says as much—though he links performance to coach rather than team. So for now I'm continuing with the exact same model as before, with one significant change: an effect based on the projected effective height (that is, height at the center and power forward positions), which is well-correlated with adjusted defensive efficiency.

Update for 2017:

A couple changes to note:

1) I tweaked the recruiting points to make them even more top heavy, but also discounted them less. What this means is that top 10ish recruits are more important, recruits ranked 10-20 count about the same, and recruits below that are less important.

2) Added some coaching effects. This only comes into play for teams with coaches in their 3rd year or less. A coaching change is now expected to induce a reversion to the overall program mean (based on 2002-present). On top of that, each coach has their own ratings on offense and defense, which are based on their past performance compared to their school's overall program mean. Overall, it's a minor thing, but something fun to throw in the mix.
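A rough sketch of how a coaching adjustment like this could work, in Python. The reversion strength and the function name are invented for illustration. The only parts taken from the description above are the three-year cutoff, the pull toward the program mean, and the per-coach rating.

```python
def coaching_adjusted(rating, program_mean, coach_effect,
                      coach_years, reversion=0.25):
    """Apply a coaching adjustment to a projected rating.

    rating        -- projection before the coaching adjustment
    program_mean  -- long-run program average (2002-present in the post)
    coach_effect  -- coach's own rating relative to his schools' means
    coach_years   -- which season this is for the coach at the school
    reversion     -- hypothetical share of the gap to the program mean
                     that a coaching change is assumed to close
    """
    if coach_years > 3:
        # Only coaches in their 3rd year or less get the adjustment.
        return rating
    return rating + reversion * (program_mean - rating) + coach_effect


new_coach = coaching_adjusted(110.0, 102.0, 1.0, coach_years=1)
veteran = coaching_adjusted(110.0, 102.0, 1.0, coach_years=5)
```

In this example the new coach's team gets pulled a quarter of the way back toward the program mean, then bumped by the coach's own rating. The veteran's team is left alone.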

Original post:

This will be the second year of the full 351-team preseason T-Rank, third year overall.

It's worth emphasizing that this is a hobby of mine. Other preseason ratings are better: more scientific, more well thought out (i.e., thought out at all), etc. But this one is mine.

I don't think I've ever really explained all the inputs to the preseason T-Rank. There's a good reason for this: they're arbitrary. Not completely arbitrary, but pretty arbitrary. And when you look at each individual component too closely, the whole thing seems kind of stupid.

The whole thing is kind of stupid, I guess.

But it's fun.

At this point, I'll be honest, the main function of the preseason T-Rank is to produce an input into what I call the preseason T-Rank+ (brand manager needed), which is a conglomeration of a few similar preseason ratings to create a Voltron-like super rating. Specifically, I intend to combine the preseason T-Rank with the Kenpom preseason ratings and Dan Hanner's preseason ratings (assuming he does them again) to provide the base ratings for the in-season T-Rank that you all love so much (I'm talking to you, mom). This year I think I will probably give them unequal weights, and T-Rank won't be the most hefty. 
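For the curious, combining ratings with unequal weights is just a weighted average. Here is a small Python sketch. The weights and the sample ratings are placeholders I made up for the example, not the ones actually used.

```python
def blend_ratings(ratings, weights):
    """Weighted average of several preseason ratings for one team.

    ratings -- dict of source name -> rating for the team
    weights -- dict of source name -> weight (need not sum to 1)
    """
    total_weight = sum(weights[src] for src in ratings)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(ratings[src] * weights[src] for src in ratings) / total_weight


# Hypothetical numbers: T-Rank gets the smallest weight.
team_ratings = {"trank": 0.90, "kenpom": 0.80, "hanner": 0.85}
source_weights = {"trank": 0.2, "kenpom": 0.4, "hanner": 0.4}
trank_plus = blend_ratings(team_ratings, source_weights)
```

Dividing by the total weight means a source can be dropped (say, if one set of ratings isn't published) without rescaling the others by hand.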

Anyhow, here's the basic formula.

T-Rank starts with a "program rating" of every school. This is based on the efficiency ratings of the last three seasons. Previously I used Kenpom historical efficiency data for this. Starting this year, this is pure in-house mojo, because I have the data. I went back and calculated the T-Rank power ratings for 2013 and 2014, and of course there are the 2015 ratings my mom loved so much in real time.

The idea here is that it doesn't take a rocket scientist to figure out that teams that were good last year, and the year before, and the year before that, are probably going to be pretty good this year. Same thing with perennially crappy teams.
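A minimal sketch of a three-season "program rating" in Python. The post only says the rating is based on the last three seasons. The recency weights below (3, 2, 1) are my assumption for the example, not the real formula.

```python
def program_rating(season_ratings, weights=(3, 2, 1)):
    """Weighted average of a team's efficiency ratings.

    season_ratings -- ratings ordered most recent season first
    weights        -- hypothetical recency weights, same order
    """
    if len(season_ratings) != len(weights):
        raise ValueError("need one rating per weight")
    weighted = sum(r * w for r, w in zip(season_ratings, weights))
    return weighted / sum(weights)


# Hypothetical team: 2015, 2014, 2013 ratings.
rating = program_rating([110.0, 104.0, 98.0])
```

With equal weights this collapses to a plain three-year average. Weighting recent seasons more heavily just makes last year count most.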

The two other main inputs are returning minutes and returning players. Returning minutes is the main driver of variance in expected defensive efficiency (according to T-Rank). The philosophical idea here is that defense is a lot about team play, smarts, positioning, etc., and players just get better at it -- particularly if a bunch of players have been playing together for a while. So higher returning minutes means better expected defensive efficiency, lower returning minutes means worse.
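Here is one way the returning-minutes effect on defense could be sketched in Python. The linear form, the average returning-minutes figure, and the slope are all assumptions for illustration. The one thing to remember is that a lower defensive efficiency (points allowed per 100 possessions) is better.

```python
def expected_defense(program_def, returning_minutes_pct,
                     avg_returning=0.55, slope=8.0):
    """Adjust a program's defensive efficiency for returning minutes.

    program_def           -- baseline defensive efficiency (lower is better)
    returning_minutes_pct -- share of last year's minutes returning (0-1)
    avg_returning         -- hypothetical national average share
    slope                 -- hypothetical points per 100 possessions gained
                             for returning 100% more minutes than average
    """
    return program_def - slope * (returning_minutes_pct - avg_returning)


veteran_team = expected_defense(100.0, 0.75)  # returns most of its minutes
gutted_team = expected_defense(100.0, 0.35)   # lost most of its minutes
```

The veteran team's number comes out below 100 (better defense) and the gutted team's above it (worse), which is the whole idea.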

The returning players analysis is for the offensive side, and is a little more "granular." There I look at each returning player's offensive rating and usage rating (which measures how often they actually do something on offense), discount it by percentage of team minutes played, and calculate an expected figure I call "Opts" (offensive points). This figure is then adjusted for class. Specifically, sophomores get a 50% bump, juniors get a 30% bump, and seniors and grad transfers get a 10% bump. [For 2017 I've altered the bumps somewhat, more like 40%, 15%, 10%] This reflects the well-known pattern that college players generally take their biggest leap as sophomores, with lesser improvements thereafter.

I add up the returning Opts for each team and then adjust the team's "program" offensive rating up or down based on whether that total is above or below average.
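A sketch of the Opts idea in Python. The class bumps are the ones listed above. Everything else is a guess at the shape of the calculation: the product formula for a player's Opts, the `scale` factor, and all the function names are placeholders, not the real thing.

```python
# Bumps from the post, keyed by the class the player will be this season.
CLASS_BUMP = {
    "sophomore": 0.50,
    "junior": 0.30,
    "senior": 0.10,
    "grad transfer": 0.10,
}


def player_opts(ortg, usage, minutes_pct, next_class):
    """Assumed formula: rating * usage * minutes share, then the class bump."""
    base = ortg * usage * minutes_pct
    return base * (1 + CLASS_BUMP[next_class])


def team_returning_opts(players):
    """Sum Opts over (ortg, usage, minutes_pct, next_class) tuples."""
    return sum(player_opts(*p) for p in players)


def adjusted_offense(program_off, team_opts, avg_opts, scale=0.2):
    """Move the program offensive rating up or down by a hypothetical
    `scale` points for each Opt above or below the national average."""
    return program_off + scale * (team_opts - avg_opts)


# Hypothetical returning players.
returners = [
    (110.0, 0.20, 0.70, "sophomore"),
    (100.0, 0.25, 0.50, "senior"),
]
opts = team_returning_opts(returners)
offense = adjusted_offense(105.0, opts, avg_opts=30.0)
```

Note how the rising sophomore ends up worth a lot more than the senior, even though their raw production is similar. That is the class bump at work.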

I also make an effort to track every transfer in and out, although I'm sure I miss a bunch. [Edit: for 2017 I've more or less automated this, which is one of the great accomplishments of my life.] Transfers in get their Opts added, but somewhat discounted, and there's a separate calculation for their expected effect on defense. This is all pretty guessy-bessy, to coin a phrase.

Similarly, top 100 recruits are tracked and accounted for. The number one recruit has an effect similar to a returning all-American level player. But the effect tapers out pretty quickly. The idea here is that most of the effect of good recruiting is probably already reflected in the "program rating." At this point, we don't really need to know much about the specifics of Duke's recruiting class to know that they're going to be pretty good. 
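To illustrate an effect that "tapers out pretty quickly," here is a Python sketch using exponential decay on recruit rank. The decay curve, the top value, and the decay rate are all invented for the example. The post doesn't say what the real curve looks like.

```python
import math


def recruit_value(rank, top_value=10.0, decay=8.0, cutoff=100):
    """Hypothetical value of a recruit by national rank.

    rank      -- 1 is the top recruit
    top_value -- value of the number one recruit (arbitrary units)
    decay     -- how many rank spots it takes for the value to fall
                 by a factor of e
    cutoff    -- recruits ranked below this are not tracked
    """
    if rank < 1:
        raise ValueError("rank must be 1 or higher")
    if rank > cutoff:
        return 0.0
    return top_value * math.exp(-(rank - 1) / decay)


values = [recruit_value(r) for r in (1, 9, 50, 101)]
```

With these made-up settings, the ninth-ranked recruit is worth a bit over a third of the top recruit, and the 50th is worth almost nothing.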

Finally, there is a "momentum" algorithm, which rewards (or penalizes) teams that are on the way up (or down). These are teams that significantly over- or under-performed their "program rating" last year, with signs (in terms of returning players) that the trend may continue. This year, for example, Vanderbilt and Utah are programs notably on the rise; on the flip side, Creighton and St. Louis seem to be sinking. 
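A toy version of a momentum adjustment in Python. The threshold, the carryover factor, and the use of returning minutes as the "sign the trend may continue" are all assumptions I picked to show the general shape.

```python
def momentum(program_rating, last_season_rating, returning_minutes_pct,
             threshold=3.0, carryover=0.5):
    """Hypothetical momentum bonus (or penalty) for a team.

    Only teams that beat or missed their program rating by at least
    `threshold` get anything. The adjustment scales with how much of
    last year's team is coming back.
    """
    gap = last_season_rating - program_rating
    if abs(gap) < threshold:
        return 0.0
    return carryover * gap * returning_minutes_pct


rising = momentum(100.0, 108.0, 0.6)   # big overperformance, most players back
steady = momentum(100.0, 101.0, 0.6)   # within the threshold: no adjustment
sinking = momentum(100.0, 92.0, 0.5)   # big underperformance
```

The sign of the result follows the direction of the surprise, so rising teams get a bonus and sinking teams get a penalty.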

Put it all together, and -- voila! -- the preseason T-Rank. I'll unveil the output in the coming weeks, in dribs and drabs to maximize drama and advertising revenue. Play Draft Kings Daily Fantasy!

Wednesday, September 23, 2015

The Case Against Maryland

The case for Maryland as Big Ten favorites is straightforward and compelling. They are coming off a surprising 27-8 (14-4) season; they return stars Melo Trimble and Jake Layman; and they add center Diamond Stone, a very highly touted recruit, and Robert Carter, an impact transfer at forward. They were good last year, and they should be significantly better this year.

This is good analysis, and in my opinion Maryland has to be considered at least a contender in the Big Ten for these reasons, maybe even the frontrunner.

But there’s some overhyping going on with this team. The early previews are comparing this year’s Terrapins to last year’s Wisconsin team, which was more or less unanimously anointed the team to beat before last year began, and which carried the burden of Final Four buzz from the very start.

The comparison of this year’s Maryland team to last year’s Wisconsin team has no basis in fact.

Last year’s Badgers were coming off a 30-win season and a trip to the Final Four. They returned four starters, and 7 of their 8 rotation players. They finished 6th in the 2014 Kenpom ratings after taking Kentucky to the final seconds in the national semifinal.

This year’s Terrapins are coming off a 27-win season that ended in the second round of the NCAA tournament. They lose Dez Wells, an extremely high-usage player whom they counted on to create offense. They also lose two other seniors, Richaud Pack and Evan Smotrycz, who contributed significantly. They also finished just 32nd in the Kenpom ratings last year.

It’s this last fact – Maryland’s per-possession profile – that makes me most skeptical about them. Put simply, they overachieved last year. That doesn’t take anything away from what they achieved: all 27 of their wins counted. But it should make us a little skeptical about extrapolating last year’s success into this year.

Finally, Mark Turgeon has been coaching for a lot of years now and his record just isn’t that great. In 17 years as a head coach, he’s won just one conference title (2006, at Wichita St.). In his eight seasons as a major conference coach, his teams have averaged a 5th place finish. He’s never had a particularly good offensive team (last year’s Terps finished 58th in adjusted offensive efficiency) which makes me doubt he’ll deftly adjust to this coming season’s rule changes.

All told, if you give me an even-money bet on Maryland versus the field for the Big Ten title, I'm reaching for my wallet and betting on the field.