## How is T-Rank calculated?

The core of T-Rank is calculating offensive and defensive efficiency: points scored and points allowed per possession ("PPP" = points per possession, often rendered as points per 100 possessions). Although coaches like Dean Smith and Bo Ryan have long relied on PPP, it really hit the big time when Ken Pomeroy popularized it about a decade ago.

Kenpom's innovation was to separate out the offensive and defensive PPP and then adjust them for opponent quality and venue. Although Kenpom has made some changes over the years, that's still the core of his ratings, and that is also the core of T-Rank.

Calculating adjusted efficiency for a given game is fairly straightforward:

Game Adj. OE = PPPo / (Opponent's Adj. DE / Average PPP)

Game Adj. DE = PPPd / (Opponent's Adj. OE / Average PPP)

For example, assume the average PPP league wide is 100, and Team A scores 110 PPP against Team B, which has an Adj. DE of 90.0. Team A's Adj. OE for that game will be:

110 / (90 / 100) =

**122.2**

This is for a game on a neutral court. If it's a home or road game, each team's Adj. OE and DE are adjusted by 1.4%. So if Team A was on the road it would be:

110 / (90 * .986 / 100) =

**124.0**

And if Team A was at home it would be:

110 / (90 * 1.014 / 100) =

**120.5**
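The three worked examples above can be reproduced with a short Python sketch. The function name, the `venue` argument, and the `home_edge` parameter are illustrative choices for this example, not the site's actual code; only the formula and the 1.4% figure come from the text.

```python
def game_adj_oe(ppp, opp_adj_de, avg_ppp=100.0, venue="neutral", home_edge=0.014):
    """Adjusted offensive efficiency for one game (multiplicative method).

    venue is from the offense's point of view: "neutral", "road", or "home".
    """
    # Scale the opponent's defense by the venue factor used in the text:
    # 0.986 when the offense is on the road, 1.014 when it is at home.
    venue_factor = {
        "neutral": 1.0,
        "road": 1.0 - home_edge,
        "home": 1.0 + home_edge,
    }[venue]
    return ppp / (opp_adj_de * venue_factor / avg_ppp)


print(round(game_adj_oe(110, 90), 1))                # neutral court
print(round(game_adj_oe(110, 90, venue="road"), 1))  # Team A on the road
print(round(game_adj_oe(110, 90, venue="home"), 1))  # Team A at home
```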

The tricky part of calculating the efficiencies is that every result affects its own inputs. If a team comes into a game with an Adj. DE of 90.0 and it gives up more points than expected, its Adj. DE will go up, and then you have to put that new Adj. DE in as the source for calculating the game's efficiency. Fortunately, computers can do all that stuff relatively quickly: you just keep doing it and doing it until the numbers stop changing.
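Here is a rough Python sketch of that "keep doing it until the numbers stop changing" loop. It is an illustration under simplifying assumptions, not the actual T-Rank code: every game is treated as neutral-court, the league average PPP is held fixed, and a `damping` step (my addition, to keep the toy iteration from oscillating) blends old and new values on each pass. All names are hypothetical.

```python
def solve_adjusted_efficiencies(games, avg_ppp=100.0, damping=0.5,
                                tol=1e-9, max_iter=10_000):
    """games: list of (team_a, team_b, a_ppp, b_ppp), neutral court assumed."""
    teams = {team for game in games for team in game[:2]}
    # Start every team at the league average.
    oe = {t: avg_ppp for t in teams}
    de = {t: avg_ppp for t in teams}

    for _ in range(max_iter):
        game_oe = {t: [] for t in teams}
        game_de = {t: [] for t in teams}
        for a, b, a_ppp, b_ppp in games:
            # Game-level adjusted values, using the current ratings as inputs.
            game_oe[a].append(a_ppp / (de[b] / avg_ppp))
            game_de[a].append(b_ppp / (oe[b] / avg_ppp))
            game_oe[b].append(b_ppp / (de[a] / avg_ppp))
            game_de[b].append(a_ppp / (oe[a] / avg_ppp))

        # New rating = damped move toward the average of the game values.
        new_oe = {t: (1 - damping) * oe[t] + damping * sum(v) / len(v)
                  for t, v in game_oe.items()}
        new_de = {t: (1 - damping) * de[t] + damping * sum(v) / len(v)
                  for t, v in game_de.items()}

        change = max(max(abs(new_oe[t] - oe[t]), abs(new_de[t] - de[t]))
                     for t in teams)
        oe, de = new_oe, new_de
        if change < tol:  # the numbers have stopped changing
            break
    return oe, de


# Toy input: one game in which A scores 110 and B scores 95 (per 100 possessions).
games = [("A", "B", 110.0, 95.0)]
oe, de = solve_adjusted_efficiencies(games)
print(oe["A"], de["B"])
```

With a single game the data only pin down the product of A's offense and B's defense, so the sketch settles on the symmetric split; a real schedule with many opponents is what gives the ratings meaning.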

Once the numbers have stopped changing, for each team you average their Adj. OE and Adj. DE from each game to get their overall adjusted efficiencies. From the adjusted efficiencies, I use Bill James' "pythagorean expectation" formula to calculate the actual rating, which I jokingly call its "Barthag" (a play on "pythag," which is the correct term). The Barthag is an estimate of what a team's chance of winning would be against the average DI team. So it is between 0 and 1, and higher is better.

There is a constant, called the exponent, used in calculating the Barthag. For my system, I have found that an exponent of 11.5 gives the best results from a predictive standpoint.

From each team's Barthag, we can use another Bill James creation, the log5 formula, to calculate their expected chance of winning against any other team. This allows me to do fun stuff like project records, and run simulations, etc.
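For readers who want to see the two Bill James formulas side by side, here is a minimal Python sketch. The function names are illustrative; the pythagorean form and log5 are the standard published versions, and 11.5 is the exponent mentioned above.

```python
def barthag(adj_oe, adj_de, exponent=11.5):
    """Pythagorean expectation: estimated win chance vs. an average team."""
    return adj_oe ** exponent / (adj_oe ** exponent + adj_de ** exponent)


def log5(p_a, p_b):
    """Chance that a team with rating p_a beats a team with rating p_b."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


# A team that is exactly average on both ends rates 0.5.
print(barthag(100.0, 100.0))
# Against an average (0.5) opponent, log5 returns the team's own rating.
print(log5(0.8, 0.5))
# A .8000 team against a .2000 team.
print(round(log5(0.8, 0.2), 3))
```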

UPDATE:

I've made a small but significant change / addition to the ratings, which is that I now incorporate a metric I call "GameScript +/-" which is derived from play-by-play data and measures a team's average lead / deficit during a game. Also, for these purposes I lock this metric when the game is no longer in question. This adds a measure of "game control" and potentially weeds out some "garbage time" effects. More explanation here.

## How is T-Rank different from Kenpom?

The short answer is that T-Rank is very similar to Kenpom, which is no surprise given that T-Rank is basically an offshoot of Kenpom. But there are three main sources of difference:

#### GameScript and Garbage Time

The incorporation of the GameScript stat, with its discounting of garbage time, gives T-Rank a somewhat distinctive aspect. Whether it's a good aspect is another question.

#### Pythags versus Efficiency Margins

Prior to the 2017 season, Kenpom switched away from the pythagorean expectancy / log5 method, to a still very similar system that uses adjusted "efficiency margins" (EMs) instead. The main difference is that instead of being multiplicative, the new Kenpom system is additive. So the basic formula is:

Game Adj. OE = (PPP - Average PPP) - (Opponent's Adj. DE - Average PPP) + Average PPP

For our neutral court example above that would be:

(110 - 100) - (90 - 100) + 100 =

**120**
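A small Python comparison of the two approaches on the same neutral-court example (function names are illustrative, not taken from either site):

```python
def game_adj_oe_multiplicative(ppp, opp_adj_de, avg_ppp=100.0):
    # Pythag-era method: divide by the opponent's defense relative to average.
    return ppp / (opp_adj_de / avg_ppp)


def game_adj_oe_additive(ppp, opp_adj_de, avg_ppp=100.0):
    # Efficiency-margin method: add back the opponent's distance from average.
    return (ppp - avg_ppp) - (opp_adj_de - avg_ppp) + avg_ppp


print(round(game_adj_oe_multiplicative(110, 90), 1))
print(game_adj_oe_additive(110, 90))
```

The gap between the two grows as the opponent's efficiency moves further from the average, and it vanishes when the opponent is exactly average.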

So, similar, but a little different. When Kenpom decided to go to adjusted EMs, I decided to stick with the Barthag, for old time's sake.

#### Secret Sauce

Here are the additional adjustments I make:

- There's a recency bias—all games in the last 40 days count 100%, then degrade 1% per day until they're 80 days old, after which all games count 60%.

- An adjustment that discounts blowouts in mismatches: if the margin of victory (MOV) is more than 10 points and the difference in Barthags is above a threshold, the game starts getting discounted. If the MOV is 20 points or higher, the discount is (Higher Barthag - Lower Barthag - .5) * 2. So if a team with a Barthag of .8000 is playing a team with a Barthag of .2000, and it wins by 20 points, the game value will be 1 - (.8 - .2 - .5) * 2, or 80%.

- As with Kenpom, there is also a preseason component that is phased out once a team has played 13 adjusted games (since not all games count for 100% of a game, it typically sticks around for 15 or 16 games).
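The first two adjustments in the list above can be sketched in Python as follows. This is only an illustration: the function names are made up, the blowout function covers only the stated 20-plus-point case (the ramp between 10 and 20 points and the exact threshold are not spelled out here, so they are left out), and flooring the discount at zero is my assumption.

```python
def recency_weight(days_ago):
    """Weight of a game by age: 100% up to 40 days, then -1%/day, floor 60%."""
    if days_ago <= 40:
        return 1.0
    if days_ago >= 80:
        return 0.6
    return 1.0 - 0.01 * (days_ago - 40)


def blowout_game_value_20plus(barthag_a, barthag_b):
    """Game value for a win by 20+ points, per the formula in the list above."""
    higher, lower = max(barthag_a, barthag_b), min(barthag_a, barthag_b)
    discount = (higher - lower - 0.5) * 2
    # Assumption: a Barthag gap of .5 or less means no discount at all.
    discount = max(0.0, discount)
    return 1.0 - discount


print(recency_weight(30), recency_weight(60), recency_weight(120))
print(round(blowout_game_value_20plus(0.8, 0.2), 2))
```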

One other adjustment Kenpom makes that I do not is that later in the year he gooses the average efficiency and depresses the average tempo. He does this, I presume, because it is a consistent pattern that efficiency rises and tempo falls as the year goes on. Though sound in theory, it turns out to be kind of unnecessary since the two adjustments pretty much cancel each other out. So I've never bothered with it, but it is a main reason why Kenpom's adjusted efficiencies are higher than T-Rank's and Kenpom's adjusted tempo numbers are lower.

Ultimately, because of these differences, the final numbers are similar but different. Notably, T-Rank has a wider "spread" between top and bottom teams, probably because Kenpom has a much more significant cap on margin of victory.

## What is T-Rank For?

I don't envision T-Rank as a competitor to or potential replacement for the Kenpom ratings. People should pay for a Kenpom subscription. Those ratings are deservedly the "industry standard," and I have no ambitions of displacing them. My work started by using the published Kenpom ratings to fill some gaps, specifically the fact that he doesn't publish adjusted efficiency margins for conference-only play. He could easily do so, which means he probably has a good reason (probably that there are fewer games, and the schedule mostly evens out in the end) for not doing so. But that didn't stop me!

Eventually, I figured out how to make a similar set of ratings, and making my own ratings from scratch allows me to fill more gaps and make more interesting tools for looking at college basketball. So the purpose of T-Rank is mainly to be the foundation for those tools. It's not an attempt to create a better or truer ranking of teams.

Why do teams ranked 340-351 have Barthags higher than any other team below 330? Coppin St. .767?

Thanks for pointing that out -- it's a display error as the leading zero is being dropped. Will fix. E.g., Coppin St. is actually .0767

As a freestanding analytic tool (now not tied to Kenpom) your T-Rankings will provide an excellent comparison tool (set). Thanks much.

Can you explain WAB? I read the definition and then I see the higher numbers get greener. If a bubble quality team would win more games against the team's schedule, why would that be a good thing?

The WAB number isn't how many games a bubble team would win against that team's schedule, it's how many MORE (or fewer) games a team has won against its schedule than a bubble-quality team would be expected to win. So say a team has a schedule that a bubble quality team would be expected to go 10-10 against. If the team is actually 15-5, that's a WAB of +5.0. If they were 5-15, the team's WAB would be -5.0. If they are 10-10, it's par, 0.
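As a rough illustration of that arithmetic in Python: the function name and the per-game bubble win probabilities are hypothetical inputs chosen to reproduce the 10-10 example, not how the site computes them.

```python
def wins_above_bubble(results, bubble_win_probs):
    """results: 1 for a win, 0 for a loss, one entry per game played.
    bubble_win_probs: chance a bubble-quality team wins each of those games.
    """
    actual_wins = sum(results)
    expected_bubble_wins = sum(bubble_win_probs)
    return actual_wins - expected_bubble_wins


# A 20-game schedule on which a bubble team would be expected to go 10-10.
bubble_probs = [0.5] * 20
print(wins_above_bubble([1] * 15 + [0] * 5, bubble_probs))   # team is 15-5
print(wins_above_bubble([1] * 5 + [0] * 15, bubble_probs))   # team is 5-15
print(wins_above_bubble([1] * 10 + [0] * 10, bubble_probs))  # team is 10-10
```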

Can you add sortability for the team names column and maybe filters to view 1 or 2 teams at a time? Great site btw. I like that customized filter where you can limit the time frame and see not only the ranks, but the adjusted offense and defense during the time selected against top teams. Also, the team pages are great.

Thanks! I can add a teamname sort to the main page at last. As for 1 or 2 teams, will have to think about that from an interface perspective. Can filter by conference to narrow things down to a more manageable viewing experience.

Can now look at & compare two teams on the main page at a time by clicking on a matchup on the schedule or team pages. You can also choose any two teams by manipulating the URL parameters (t1l and t2l -- those are ELLs on the end, short for "limit")

Thanks for the explanation and for making T-rank. I really like the tools you have, like the ability to select games from a certain time period and the ability to compare tournament performance to expected wins.

Thanks!

I'm really impressed with your T-Ranketology algorithm. It's the most accurate near real-time bracketology I know of, which should make for a great resource to follow during the conference tournaments. I have noticed that your live scores sometimes don't acknowledge that a game has ended for quite a while. For instance today's Louisville/Florida St game ended at 1:00 CT today, but it still shows the game being in progress as of now. It does however already acknowledge that the subsequent Boston College/NC St game has ended. Is there a possible fix for that? It would be awesome to see how some of these games affect your seed list in real time.

Thanks! I'm pleased with the performance of the T-ranketology algorithm, though I'll probably try to improve it some more this offseason. Basically, the idea is to give a general idea of how a given game's outcome will affect things, and I think it does a reasonable enough job of that. Ultimately, you can't model madness, but it's fun to try.

As for the live scores, etc ... I update the site data every 15 minutes, but sometimes it takes quite a while for box scores to go officially final (which is when I pull the data). This seems to be especially common during tournaments, so that's probably what was going on with the Louisville/Florida St. game.

All in all the live scoring feature is sort of a beta thing. If you like it, you can still see the live scores by putting live=1 in the URL, even though I took the checkbox away.

Hey Bart big Mean Green fan here- can you please update your site and replace Tony with Grant McCasland please. Big fan of your work. GMG

Done - thanks for pointing out the error.

Hi. Is it better to have a lower rank in the FUN? I assume the lower rank means less lucky because those are shaded green and green is good usually. Lucky would not be a good thing because it makes the team look better than they are.

As Socrates said, better lucky than good.

This is incredible, and I love everything on your website. It has helped me a lot with a project I'm working on. The only thing I would want to see is the inclusion of RPI and BPI, especially with the ability to filter by date. I can only seem to find current BPI rankings on other websites, but nothing with the ability to see BPI rankings, say, one week before the tournament started.

Thanks Adam! If you want to look at RPI as of certain dates, you can find that buried on the NCAA website in its archive of the "team sheets." For last year, the BPI is listed on those. For example, here are the team sheets as of March 4th last year: https://extra.ncaa.org/solutions/rpi/Stats%20Library/March%204,%202018%20Team%20Sheets.pdf

Found it. This is everything I could have dreamed of. Thanks Bart!

Hey Bart, I'm a big fan of your work! I was looking at your 2019 player finder to compare some freshmen PRPG! projections w/ returning players and noticed freshmen have not yet been included in the 2019 player finder. Are you planning on adding them? Thanks!

Thanks, and sorry for the late reply. It would be sort of apples to oranges to put them in the player finder because for 2019 I've got a player's "returning" stats, and for freshmen of course I can only do projections. I do have all the projections, including for freshmen, here: http://barttorvik.com/allrosters19.php?conlimit=&yvalue=Fr&type=All&s=15

What does the percentage in parentheses mean next to the T-Rank line on the Today's Games tab?

Estimated chance of winning.

In the team shooting stats section, for the "share" of the different categories (dunks, at the rim, other twos and threes), is the "share" a team's share of its attempts or a share of its points?

It's the share of attempts.

Thanks! I love your site. You're doing fantastic work!

Hey man, your site is incredible. Great work with all the stats, easy to understand, and fun to find any stat you're looking for. Thanks for the hard work you've put in.

Thanks!

It's December 17th and Purdue is 6-5. Yet you have them ranked 20th. It's pretty clear they aren't playing like the 20th best team in the nation. IIRC it takes 13 games for the T-Rank preseason predictions to "wear off," but in this case I think it greatly diminishes the quality of the rankings. Curious what you think.

Thanks!

It really isn't the preseason projection that's causing Purdue's rating: 1) the preseason projection had Purdue at about 41st, and 2) as you mention, the preseason projections are largely phased out at this point anyhow.

What is causing it is that T-Rank is about projecting future performance, so it cares about quality of performance, not just the result. By my metrics, Purdue has played the 3rd hardest schedule in DI, so you'd expect them to have taken some bumps. All of Purdue's losses have been away from home against quality opponents, and except for the Michigan game they've been hard fought contests. That's what T-Rank and other similar systems look at. (Notably, Kenpom also has Purdue at 20th right now.)

The best guess is that they're unlucky to be 6-5 and will perform like a top 25 team going forward. But that could of course be wrong. It could be that their losses were lucky to be close, and they're a harbinger of bad losses to come. But looking beyond wins and losses to my mind is exactly what this kind of ratings system is supposed to do.

You’ve got the 1/5/19 Duke-Clemson game at Clemson. It’s at Cameron.

Thanks, will fix.

Hi Bart - How are injuries factored into the equation?

They're not!

Can you explain how the Talent stat is calculated?

It is based on composite recruiting ranks weighted for minutes played.

http://barttorvik.com/cgi-bin/ncaat.cgi?type=coach&sort=1&yrlow=2000&yrhigh=2018

Are you aware that if one clicks on FAQ and then clicks on About Me, one gets "Never Gonna Give You Up" by Rick Astley?

That used to be called Rickrolling. Now, you may just be pranking your fans or be a big Astley fan......

Hmm, not sure what's going on with that. Should be fixed now:

About Me

😂 😂 😂

What a great reply

It's still not OK. I'd much rather be learning more about your methodology. Also, I'd much rather this be discussed offline. When I (or anyone else) want to find out more about you and instead get a video that even Rick Astley has called "cheesy", it tends to reflect badly on what you've done. Rickrolling has been a thing on and off since 2007 or so. Are you positive you're not being pranked, i.e., that someone put up the video to play with you? I can't see the code...else I might be able to help you more. I hope you get it fixed.

OK, are you going to the Final Four? If so, let's meet up and discuss.

Love your stuff. Can you briefly explain how Tempo is adjusted from the raw possession formula? Thanks!

It's the same basic process as adjusting the offensive and defensive efficiency.

Just curious if there's a spot on your site to view your prediction accuracy on the year.

Great content, thanks!

The schedule page has some info on that at the bottom of the table for each completed day. For the entire year so far, the mean absolute error for scoring margins is 9.0, for totals it's 13.8, and favorites are 3875–1399 (73.5%).

That 73.5 number is exactly what I was looking for, thanks!

I love the site! I've looked into at the rim, other 2 point, and 3 point attempts as well. I noticed a lot of home scorekeeper consistency issues in what they define as shots at the rim as opposed to jump shots. I would compare shot distribution at home to road/neutral site games and there were some significant differences. For the 17-18 season Indiana and Rutgers were a couple of extreme cases where the home scorekeeper seemed to label more at the rim shots than they should. You don't happen to try to account for that in your share by chance? Or do you have a way to sort home/road? Thanks

There are definitely scorekeeper issues with the play by play data, so it's kind of "use at your own risk" stuff. I don't believe I currently have a way (on the site) to sort that data by home/road, though it's definitely something I could look into on the back end. Interesting idea for something to look at over the summer.

Would you be willing to share (I will gladly pay or donate) your projected score formula? I've been plugging in barthag ratings (2/15/19-current) into the Log5 formula, and then comparing that result to the full season Log5 result to make some minor adjustments to your projected point spread. I also convert the Log5 result into a money line:

if Log5% > 50%

Log5% / (100% - Log5%) x (-100) = money line

I've been manually collecting/entering data to find the relationship between the moneyline/total and the spread, but I've started to realize that it's going to be near impossible to get a large enough sample size. Anyways, your site has reignited my interest in college basketball! Best of luck to Wisconsin next year (;

To calculate scores I use each team's adjusted efficiency numbers (modified for home/road as necessary) to calculate an expected points per possession for each team, and then take each team's adjusted tempo numbers to calculate an expected number of possessions, then multiply the expected points per possession by the expected number of possessions.

Calculating projected points per possession:

t1_proj_ppp = ((t1oe / avgeff) * (t2de / avgeff) * avgeff) / 100

t2_proj_ppp = ((t2oe / avgeff) * (t1de / avgeff) * avgeff) / 100

Calculating projected number of possessions:

tpro = (t1t/avgt)*(t2t/avgt)*avgt

t1_proj_pts = t1_proj_ppp * tpro

t2_proj_pts = t2_proj_ppp * tpro
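Wrapped into a runnable Python function, the formulas above look like this. The variable names follow the reply (t1oe, t2de, avgeff, avgt, tpro); the function name and the sample ratings are made up for illustration, and no home/road modification is applied.

```python
def project_score(t1oe, t1de, t1t, t2oe, t2de, t2t, avgeff, avgt):
    """Projected points for team 1 and team 2 from adjusted ratings.

    Efficiencies are per 100 possessions; tempos are possessions per game.
    """
    # Expected points per possession for each offense vs. the other defense.
    t1_proj_ppp = ((t1oe / avgeff) * (t2de / avgeff) * avgeff) / 100
    t2_proj_ppp = ((t2oe / avgeff) * (t1de / avgeff) * avgeff) / 100
    # Expected number of possessions from the two adjusted tempos.
    tpro = (t1t / avgt) * (t2t / avgt) * avgt
    return t1_proj_ppp * tpro, t2_proj_ppp * tpro


# Hypothetical ratings: team 1 is 110 OE / 95 DE at tempo 70,
# team 2 is 105 OE / 90 DE at tempo 68, league averages 100 and 68.
t1_pts, t2_pts = project_score(110, 95, 70, 105, 90, 68, avgeff=100, avgt=68)
print(round(t1_pts, 1), round(t2_pts, 1))
```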

Thanks torv!

Two Questions:

What is the difference between Effective and Average Height?

Also, what is the PAKE Stat or the PASE Stat?

Thanks!

Hi Dylan,

"Effective Height" is an attempt to calculate minute weighted height of the 4s and 5s. So it's basically the average height of the tallest 40% of minutes.

"Average Height" includes all minutes, not just the bigs.

PAKE is "performance against kenpom expectations" -- tourney wins versus the amount of tourney wins expected based on team's adjusted efficiency rating (used Kenpom 2.0 for before 2017, T-Rank since).

PASE is the same thing except using "seed expectations" as the baseline -- so wins versus the amount expected for a given seed.
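Going back to the two height stats, one way to read "the average height of the tallest 40% of minutes" is sketched below in Python. This is an interpretation for illustration only: the function names, the (height, minutes) input format, and the roster are hypothetical, and the site's actual calculation may differ.

```python
def average_height(players):
    """Minute-weighted height over all minutes. players: (height_in, minutes)."""
    total_minutes = sum(minutes for _, minutes in players)
    return sum(height * minutes for height, minutes in players) / total_minutes


def effective_height(players, share=0.40):
    """Minute-weighted height over the tallest `share` of total minutes."""
    total_minutes = sum(minutes for _, minutes in players)
    budget = share * total_minutes
    remaining = budget
    weighted = 0.0
    # Walk from tallest to shortest, taking minutes until the budget is used.
    for height, minutes in sorted(players, reverse=True):
        take = min(minutes, remaining)
        weighted += height * take
        remaining -= take
        if remaining <= 0:
            break
    return weighted / budget


# Hypothetical roster: heights in inches, minutes played.
roster = [(80, 100), (78, 100), (75, 300)]
print(effective_height(roster), average_height(roster))
```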