Thursday, March 17, 2016

Retrofitting T-Ranketology

I had fun with my hobby project T-Ranketology this year. The results are over at, and I think they're acceptable -- better than major algorithmic projections like KPI and Team Rankings, but worse than most human bracketologists. This makes sense to me, because the tournament selection is a very human affair, and it's hard for a simple model like T-Ranketology to encompass all the vagaries of that process with any real fidelity. As I put it shortly after the bracket was announced: you can't model Madness.

But I will continue to try. To that end, I ran a bunch of experiments to see if I could retrofit T-Ranketology to produce a more accurate bracket. Here are the inputs to T-Ranketology:

WAB (wins against bubble)
Elo (my E-Rank, a simple Elo rating seeded by T-Rank)
RPI
Resume (a points rating for quality wins and bad losses, described below)

Each team was ranked in each of these categories, then their ranks were added up to get a total score (with lower being better). That was T-Ranketology this year.
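
For the curious, the rank-and-add step can be sketched in a few lines of Python. This is only an illustration, not the actual T-Ranketology code: the team names and ratings are made up, the function names are hypothetical, and it assumes a higher rating is better in every category and does nothing special about ties.

```python
def rank_positions(ratings):
    """Map team -> rank (1 = best), assuming a higher rating is better."""
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    return {team: i + 1 for i, team in enumerate(ordered)}


def rank_sum(categories):
    """Add up each team's rank across every category; lower total is better."""
    totals = {}
    for ratings in categories.values():
        for team, rank in rank_positions(ratings).items():
            totals[team] = totals.get(team, 0) + rank
    return totals


# Made-up ratings for three hypothetical teams in two categories.
categories = {
    "wab": {"A": 4.1, "B": 2.0, "C": -0.5},
    "elo": {"A": 1650, "B": 1700, "C": 1500},
}

totals = rank_sum(categories)
field = sorted(totals, key=totals.get)  # best total first
```

Any number of categories can be passed in the same way; the total is just the sum of the per-category ranks.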

The "resume" rating needs a better name, and I've got my branding people looking into it. But it was clear to me that bracketology is impossible if you aren't paying attention to "top 50 wins" and the like. So to come up with the Resume rating I used the following point values:

Top 50 win: 10 points
Other top 100 win: 3 points
Sub-100 loss: -3 points
Sub-200 loss: -6 points

Obviously these point values were assigned rather arbitrarily, though I did some experimentation to get a bracket that passed the eye test.
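
Here is a rough sketch of how those point values might be tallied. It is an illustration under assumptions, not the real code: the function name is hypothetical, the opponent ranks are made up, and it assumes the tiers are exclusive (a sub-200 loss costs 6 points, not 3 + 6) and that wins over teams outside the top 100 score nothing.

```python
def resume_points(win_ranks, loss_ranks):
    """Score a resume from the ranks of opponents beaten and lost to.

    Assumption: tiers are exclusive, so a sub-200 loss counts only as -6.
    """
    points = 0
    for rank in win_ranks:
        if rank <= 50:
            points += 10   # top 50 win
        elif rank <= 100:
            points += 3    # other top 100 win
    for rank in loss_ranks:
        if rank > 200:
            points -= 6    # sub-200 loss
        elif rank > 100:
            points -= 3    # sub-100 loss
    return points


# Made-up schedule: wins over teams ranked 12, 48, 77 and 150,
# losses to teams ranked 5, 130 and 240.
example = resume_points(win_ranks=[12, 48, 77, 150], loss_ranks=[5, 130, 240])
```

With these made-up opponents the wins are worth 23 points and the losses cost 9, for a resume score of 14.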

One other possible addition to the algorithm would be T-Rank itself. It's pretty clear that efficiency rating does come into the selection process, at least at the margins. For example, efficiency rating must have been the determining cause of Vanderbilt's inclusion. But it's also ignored a lot, particularly when it comes to seeding.

Anyhow, I've now done a bunch of experiments -- running thousands and thousands of brackets with different weights for each of the T-Ranketology inputs -- to see what the ideal weight of the factors would be. Here are the results:

Resume x 3.5
RPI x 1.5
T-Rank x 1.5
Elo x 0.25
WAB x 0.2

With Resume being calculated as follows:

Top 25 wins: 16 points
other top 50 wins: 13 points
other top 100 wins: 5 points
sub 100 losses: -1 point
sub 200 losses: -5 points
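
To make the weighting concrete, here is a minimal sketch. It assumes each weight multiplies the team's rank in that category and that the weighted ranks are then added up, with lower still being better; that reading, the function names, and the two example teams are my illustration rather than the actual implementation. The Resume tiers are again treated as exclusive.

```python
# Weights and revised Resume tiers as listed above.
WEIGHTS = {"resume": 3.5, "rpi": 1.5, "t_rank": 1.5, "elo": 0.25, "wab": 0.2}
WIN_TIERS = [(25, 16), (50, 13), (100, 5)]  # (worst rank in tier, points)


def win_points(opponent_rank):
    """Points for beating an opponent of the given rank (0 outside top 100)."""
    for cutoff, points in WIN_TIERS:
        if opponent_rank <= cutoff:
            return points
    return 0


def loss_points(opponent_rank):
    """Penalty for losing to an opponent of the given rank (assumed exclusive)."""
    if opponent_rank > 200:
        return -5
    if opponent_rank > 100:
        return -1
    return 0


def weighted_score(ranks, weights=WEIGHTS):
    """Weighted sum of one team's category ranks; lower is better (assumed)."""
    return sum(weights[cat] * ranks[cat] for cat in weights)


# Two hypothetical teams: A has the better Resume rank, B is better elsewhere.
team_a = {"resume": 10, "rpi": 30, "t_rank": 20, "elo": 40, "wab": 50}
team_b = {"resume": 30, "rpi": 10, "t_rank": 20, "elo": 10, "wab": 10}
a_score = weighted_score(team_a)
b_score = weighted_score(team_b)
```

In this made-up example team A comes out ahead (a lower total) even though team B ranks better in RPI, Elo and WAB, which is what a 3.5 weight on Resume does.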

The original T-Ranketology got a score of 312 on bracketmatrix, by getting 65 teams right, nailing the seed on 31 and within 1 seed on 24 others.

This version of T-Ranketology gets a score of 351 (which would tie for first place this year), by getting 66 teams right, nailing the seed on 44 teams, and within 1 seed on 21 others. This algorithm gets Tulsa and Vanderbilt into the tournament, but leaves Providence and Wichita State as the 2nd and 3rd teams out, respectively. (St. Bonaventure is the first team out.) Saint Mary's remains in the field, though as a 10-seed instead of an 8-seed. Florida also sneaks into the tournament.
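
As a sanity check on those two scores, here is the arithmetic in code. The scoring rule is my assumption (it isn't spelled out in this post): 3 points for each team correctly in the field, 3 more for an exact seed, and 1 more for a seed within one line. It does reproduce both totals quoted here.

```python
def matrix_score(teams_right, exact_seed, within_one):
    """Assumed rule: 3 per correct team, +3 per exact seed, +1 per near miss."""
    return 3 * teams_right + 3 * exact_seed + 1 * within_one


original = matrix_score(teams_right=65, exact_seed=31, within_one=24)
retrofit = matrix_score(teams_right=66, exact_seed=44, within_one=21)
improvement = retrofit - original
```

Under that rule the original comes to 312 and the retrofit to 351, a gain of 39 points, almost all of it from the 13 additional exact seeds.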

If you want to see the bracket produced by this algorithm, it's here.

Clearly, this version of the algorithm is "over-fit" to this year's results. But I think this exercise does provide some insights. Most obviously, the "resume" rank is extremely important. This is how you get Tulsa into the tournament. You have to really value those wins against "top 25" and "top 50" teams. Bad losses don't matter too much, at least not much more than they already matter for the other ranks. The Elo and WAB ratings add a little, but not much, to the analysis.

So, this is the algorithm I'll go with next year. The committee will probably do something entirely different and prove, once again, that you can't model Madness.
