T-Rank FAQ

As I've added more interesting features to the T-Rank website, it's gotten some more attention. As a result, I've gotten a fair number of good questions about what T-Rank is supposed to be. The easy answer is that it's supposed to be fun, but this doesn't seem to satisfy people. Beyond that, I've explained much of it in old blog posts here, but even I have a hard time finding them. So I decided to change this page (which used to be a mirror of the T-Rank site for some stupid reason) into a FAQ. This is a work in progress.

How is T-Rank calculated?


The core of T-Rank is calculating offensive and defensive efficiency: points scored and points allowed per possession ("PPP" = points per possession, often rendered as points per 100 possessions). Although coaches like Dean Smith and Bo Ryan have long relied on PPP, it really hit the big time when Ken Pomeroy popularized it about a decade ago.

Kenpom's innovation (one of them, at least) was to separate out the offensive and defensive PPP and then adjust them for opponent quality and venue. Although Kenpom has made some changes over the years, that's still the core of his ratings, and that is also the core of T-Rank.

Calculating adjusted efficiency for a given game is fairly straightforward:

Game Adj. OE = PPPo / (Opponent's Adj. DE / Average PPP)
Game Adj. DE = PPPd / (Opponent's Adj. OE / Average PPP)

For example, assume the average PPP league-wide is 100, and Team A scores 110 PPP against Team B, which has an Adj. DE of 90.0. Team A's Adj. OE for that game will be:

110 / (90 / 100) = 122.2 

This is for a game on a neutral court. If the game is not on a neutral court, each team's Adj. OE and DE are adjusted by 1.4%. So if Team A was on the road it would be:

110 / (90 * .986 / 100) = 124.0

And if Team A was at home it would be:

110 / (90 * 1.014 / 100) = 120.5
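The three worked examples above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: the function name and the `venue` argument are hypothetical, and applying the 1.4% factor to the opponent's Adj. DE simply mirrors how the examples are written.

```python
HOME_EDGE = 0.014  # the 1.4% venue adjustment quoted above


def game_adj_oe(ppp, opp_adj_de, avg_ppp=100.0, venue="neutral"):
    """Single-game adjusted offensive efficiency, in points per 100 possessions.

    `venue` is from the point of view of the team whose offense is being
    rated: "neutral", "road", or "home" (hypothetical argument name).
    """
    factor = {
        "neutral": 1.0,
        "road": 1.0 - HOME_EDGE,
        "home": 1.0 + HOME_EDGE,
    }[venue]
    return ppp / (opp_adj_de * factor / avg_ppp)


for venue in ("neutral", "road", "home"):
    print(venue, round(game_adj_oe(110, 90, venue=venue), 1))
# neutral 122.2
# road 124.0
# home 120.5
```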

The tricky part of calculating the efficiencies is that every result affects its own inputs. If a team comes into a game with an Adj. DE of 90.0 and it gives up more points than expected, its Adj. DE will go up, and then you have to put that new Adj. DE in as the source for calculating the game's efficiency. Fortunately, computers can do all that stuff relatively quickly: you just keep doing it and doing it until the numbers stop changing.
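One way to sketch the "keep doing it until the numbers stop changing" loop is below. It is not the site's actual code. It assumes neutral courts, no preseason prior and no game weights, and it adds a damping step (moving each rating only halfway toward its newly computed value), which is an assumption introduced here because plugging the new values straight back in can bounce between two sets of numbers instead of settling. The function names and the game tuple layout are hypothetical.

```python
from collections import defaultdict


def game_means(games, oe, de, avg):
    """Average each team's single-game adjusted OE and DE over its games."""
    off, dfn = defaultdict(list), defaultdict(list)
    for a, b, ppp_a, ppp_b in games:
        # Team a's offense faced team b's defense, and vice versa.
        off[a].append(ppp_a / (de[b] / avg))
        dfn[a].append(ppp_b / (oe[b] / avg))
        off[b].append(ppp_b / (de[a] / avg))
        dfn[b].append(ppp_a / (oe[a] / avg))
    mean_oe = {t: sum(v) / len(v) for t, v in off.items()}
    mean_de = {t: sum(v) / len(v) for t, v in dfn.items()}
    return mean_oe, mean_de


def solve_ratings(games, avg=100.0, damping=0.5, tol=1e-9, max_iter=10_000):
    """Repeat the adjustment until no rating moves by more than `tol`."""
    teams = {t for g in games for t in g[:2]}
    oe = {t: avg for t in teams}  # start every team at league average
    de = {t: avg for t in teams}
    for _ in range(max_iter):
        target_oe, target_de = game_means(games, oe, de, avg)
        # Damped update (assumption): move only part of the way each pass.
        new_oe = {t: oe[t] + damping * (target_oe[t] - oe[t]) for t in teams}
        new_de = {t: de[t] + damping * (target_de[t] - de[t]) for t in teams}
        change = max(
            max(abs(new_oe[t] - oe[t]), abs(new_de[t] - de[t])) for t in teams
        )
        oe, de = new_oe, new_de
        if change < tol:
            break
    return oe, de


# Made-up results: (team, opponent, team PPP, opponent PPP), per 100 possessions.
games = [("A", "B", 110.0, 95.0), ("B", "C", 102.0, 99.0), ("A", "C", 105.0, 100.0)]
adj_oe, adj_de = solve_ratings(games)
for team in sorted(adj_oe):
    print(team, round(adj_oe[team], 1), round(adj_de[team], 1))
```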

Once the numbers have stopped changing, for each team you average their Adj. OE and Adj. DE from each game to get their overall adjusted efficiencies. From the adjusted efficiencies, I use Bill James' "pythagorean expectation" formula to calculate the actual rating, which I jokingly call its "Barthag" (a play on "pythag," which is the correct term). The Barthag is an estimate of what a team's chance of winning would be against the average DI team. So it is between 0 and 1, and higher is better.

There is a constant, called the exponent, used in calculating the Barthag. For my system, I have found that an exponent of 11.5 gives the best results from a predictive standpoint.
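The FAQ does not spell the formula out, but the usual Pythagorean form applied to adjusted efficiencies would look like the sketch below. Treat the exact form as an assumption; only the 11.5 exponent comes from the text above.

```python
def barthag(adj_oe, adj_de, exponent=11.5):
    """Pythagorean expectation from adjusted efficiencies (assumed form).

    Returns a value between 0 and 1: the estimated chance of beating
    an average team.
    """
    oe_pow = adj_oe ** exponent
    de_pow = adj_de ** exponent
    return oe_pow / (oe_pow + de_pow)


print(barthag(100.0, 100.0))  # an exactly average team comes out at 0.5
print(round(barthag(110.0, 95.0), 3))
```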

From each team's Barthag, we can use another Bill James creation, the log5 formula, to calculate their expected chance of winning against any other team. This allows me to do fun stuff like project records, and run simulations, etc.
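For reference, the log5 formula itself is short. A minimal sketch, where `a` and `b` are the two teams' Barthags and the result is team A's chance of beating team B on a neutral court (no venue adjustment is modeled here):

```python
def log5(a, b):
    """Chance that a team with rating `a` beats a team with rating `b`.

    Both ratings are win probabilities against an average opponent.
    Undefined when both ratings are 0 or both are 1.
    """
    return (a - a * b) / (a + b - 2 * a * b)


print(round(log5(0.8, 0.5), 3))  # against an average team you get your own rating back
print(round(log5(0.8, 0.2), 3))
```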

UPDATE:

I've made a small but significant change / addition to the ratings, which is that I now incorporate a metric I call "GameScript +/-" which is derived from play-by-play data and measures a team's average lead / deficit during a game. Also, for these purposes I lock this metric when the game is no longer in question. This adds a measure of "game control" and potentially weeds out some "garbage time" effects. More explanation here.

How is T-Rank different from Kenpom?


The short answer is that T-Rank is very similar to Kenpom, which is no surprise given that T-Rank is basically an offshoot of Kenpom. But there are three main sources of difference:

GameScript and Garbage Time

The incorporation of the GameScript stat, and its omission of garbage time, gives T-Rank a slightly unique aspect. Whether it's a good aspect is another question.

Pythags versus Efficiency Margins

Prior to the 2017 season, Kenpom switched away from the pythagorean expectancy / log5 method, to a still very similar system that uses adjusted "efficiency margins" (EMs) instead. The main difference is that instead of being multiplicative, the new Kenpom system is additive. So the basic formula is:

Game Adj. OE = (PPP - Average PPP) - (Opponent's Adj. DE - Average PPP) + Average PPP

For our neutral court example above that would be:

(110 - 100) - (90 - 100) + 100 = 120
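To see the two approaches side by side, here is a small sketch that runs the same neutral-court example through both formulas as written in this FAQ. The function names are hypothetical.

```python
def game_adj_oe_multiplicative(ppp, opp_adj_de, avg_ppp=100.0):
    """The ratio-based adjustment used by T-Rank."""
    return ppp / (opp_adj_de / avg_ppp)


def game_adj_oe_additive(ppp, opp_adj_de, avg_ppp=100.0):
    """The additive, efficiency-margin style adjustment."""
    return (ppp - avg_ppp) - (opp_adj_de - avg_ppp) + avg_ppp


print(round(game_adj_oe_multiplicative(110, 90), 1))  # 122.2
print(game_adj_oe_additive(110, 90))  # 120.0
```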

So, similar, but a little different. When Kenpom decided to go to adjusted EMs, I decided to stick with the Barthag, for old times' sake.

Secret Sauce

Here are the additional adjustments I make:
  • There's a recency bias—all games in the last 40 days count 100%, then degrade 1% per day until they're 80 days old, after which all games count 60%.
  • An adjustment that discounts blowouts in mismatches: if the margin of victory (MOV) is more than 10 points and the difference in Barthags is above a threshold, the game starts getting discounted. If the MOV is 20 points or higher, the discount is (Higher Barthag - Lower Barthag - .5) * 2. So if a team with a Barthag of .8000 is playing a team with a Barthag of .2000, and it wins by 20 points, the game value will be 1 - (.8 - .2 - .5) * 2, or 80%.
  • As with Kenpom, there is also a preseason component that is phased out once a team has played 13 adjusted games (since not all games count for 100% of a game, it typically sticks around for 15 or 16 games).
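The first two adjustments in the list above can be sketched as simple weight functions. Two assumptions are made here and should be read as such: the blowout function only covers the stated case of a margin of 20 or more (the ramp between 10 and 20 points is not specified, so it is not modeled), and the discount is floored at zero, reading the ".5" in the formula as the Barthag-difference threshold. The function names are hypothetical.

```python
def recency_weight(days_old):
    """Weight of a game by age: 100% up to 40 days, 60% from 80 days on."""
    if days_old <= 40:
        return 1.0
    if days_old >= 80:
        return 0.6
    return 1.0 - 0.01 * (days_old - 40)  # lose 1% per day in between


def blowout_value_mov20(barthag_a, barthag_b):
    """Game value when the margin of victory is 20 points or more."""
    gap = max(barthag_a, barthag_b) - min(barthag_a, barthag_b)
    discount = max(0.0, (gap - 0.5) * 2)  # floor at zero is an assumption
    return 1.0 - discount


print(round(recency_weight(60), 2))  # 0.8
print(round(blowout_value_mov20(0.8, 0.2), 2))  # 0.8
```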
Ultimately, because of these differences, the final numbers are similar but different. Notably, T-Rank has a wider "spread" between top and bottom teams, probably because Kenpom has a much more significant cap on margin of victory.

What is T-Rank For?


I don't envision T-Rank as a competitor to or potential replacement for the Kenpom ratings. People should pay for a Kenpom subscription. Those ratings are deservedly the "industry standard," and I have no ambitions of displacing them. My work started by using the published Kenpom ratings to fill some gaps, specifically the fact that he doesn't publish adjusted efficiency margins for conference-only play. He could easily do so, which means he probably has a good reason (probably that there are fewer games, and the schedule mostly evens out in the end) for not doing so. But that didn't stop me!

Eventually, I figured out how to make a similar set of ratings, and making my own ratings from scratch allows me to fill more gaps and make more interesting tools for looking at college basketball. So the purpose of T-Rank is mainly to be the foundation for those tools; it's not an attempt to create a better or truer ranking of teams.

287 comments:

  1. Why do teams ranked 340-351 have Barthags higher than any other team below 330? Coppin St. .767?

    1. Thanks for pointing that out -- it's a display error as the leading zero is being dropped. Will fix. E.g., Coppin St. is actually .0767.

  2. As a freestanding analytic tool (now not tied to Kenpom) your T-Rankings will provide an excellent comparison tool (set). Thanks much.

  3. Can you explain WAB? I read the definition and then I see the higher numbers get greener. If a bubble quality team would win more games against the team's schedule, why would that be a good thing?

    1. The WAB number isn't how many games a bubble team would win against that team's schedule, it's how many MORE (or fewer) games a team has won against its schedule than a bubble-quality team would be expected to win. So say a team has a schedule that a bubble quality team would be expected to go 10-10 against. If the team is actually 15-5, that's a WAB of +5.0. If they were 5-15, the team's WAB would be -5.0. If they are 10-10, it's par, 0.
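The arithmetic in that reply can be written as a one-line function. This is a sketch only: how the bubble-quality team's per-game win probabilities are obtained is not covered here, so they are passed in as a hypothetical input list.

```python
def wab(actual_wins, bubble_win_probs):
    """Wins above bubble: actual wins minus a bubble team's expected wins.

    `bubble_win_probs` holds, for each game already played, the chance that
    a bubble-quality team would have won it (hypothetical input).
    """
    return actual_wins - sum(bubble_win_probs)


# A schedule a bubble team would be expected to go 10-10 against:
schedule = [0.5] * 20
print(wab(15, schedule))  # 5.0
print(wab(5, schedule))  # -5.0
```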

  4. Can you add sortability for the team names column and maybe filters to view 1 or 2 teams at a time? Great site btw. I like that customized filter where you can limit the time frame and see not only the ranks, but the adjusted offense and defense during the time selected against top teams. Also, the team pages are great.

    1. Thanks! I can add a teamname sort to the main page at last. As for 1 or 2 teams, will have to think about that from an interface perspective. Can filter by conference to narrow things down to a more manageable viewing experience.

    2. Can now look at & compare two teams on the main page at a time by clicking on a matchup on the schedule or team pages. You can also choose any two teams by manipulating the URL parameters (t1l and t2l -- those are ELLs on the end, short for "limit").

  5. Thanks for the explanation and for making T-rank. I really like the tools you have, like the ability to select games from a certain time period and the ability to compare tournament performance to expected wins.

  6. I'm really impressed with your T-Ranketology algorithm. It's the most accurate near real-time bracketology I know of, which should make for a great resource to follow during the conference tournaments. I have noticed that your live scores sometimes don't acknowledge that a game has ended for quite a while. For instance, today's Louisville/Florida St game ended at 1:00 CT, but it still shows the game as being in progress as of now. It does, however, already acknowledge that the subsequent Boston College/NC St game has ended. Is there a possible fix for that? It would be awesome to see how some of these games affect your seed list in real time.

    1. Thanks! I'm pleased with the performance of the T-ranketology algorithm, though I'll probably try to improve it some more this offseason. Basically, the idea is to give a general idea of how a given outcome will affect things, and I think it does a reasonable enough job of that. Ultimately, you can't model madness, but it's fun to try.

      As for the live scores, etc ... I update the site data every 15 minutes, but sometimes it takes quite a while for box scores to go officially final (which is when I pull the data). This seems to be especially common during tournaments, so that's probably what was going on with the Louisville/Florida St. game.

      All in all the live scoring feature is sort of a beta thing. If you like it, you can still see the live scores by putting live=1 in the URL, even though I took the checkbox away.

  7. Hey Bart big Mean Green fan here- can you please update your site and replace Tony with Grant McCasland please. Big fan of your work. GMG

  8. Hi. Is it better to have a lower rank in the FUN? I assume the lower rank means less lucky because those are shaded green and green is good usually. Lucky would not be a good thing because it makes the team look better than they are.

  9. Adam Swindlehurst, May 8, 2018 at 2:12 PM

    This is incredible, and I love everything on your website. It has helped me a lot with a project I'm working on. The only thing I would want to see is the inclusion of RPI and BPI, especially with the ability to filter by date. I can only seem to find current BPI rankings on other websites, but nothing with the ability to see BPI rankings, say, one week before the tournament started.

    1. Thanks Adam! If you want to look at RPI as of certain dates, you can find that buried on the NCAA website in its archive of the "team sheets." For last year, the BPI is listed on those. For example, here are the team sheets as of March 4th last year: https://extra.ncaa.org/solutions/rpi/Stats%20Library/March%204,%202018%20Team%20Sheets.pdf

    2. Adam Swindlehurst, May 10, 2018 at 4:24 PM

      Found it. This is everything I could have dreamed of. Thanks Bart!

  10. Hey Bart, I'm a big fan of your work! I was looking at your 2019 player finder to compare some freshmen PRPG! projections w/ returning players and noticed freshmen have not yet been included in the 2019 player finder. Are you planning on adding them? Thanks!

    1. Thanks, and sorry for the late reply. It would be sort of apples to oranges to put them in the player finder because for 2019 I've got a player's "returning" stats, and for Freshmen of course I can only do projections. I do have all the projections, including for freshmen, here: http://barttorvik.com/allrosters19.php?conlimit=&yvalue=Fr&type=All&s=15

  11. What does the percentage in parentheses mean next to the T-Rank line on the Today's Games tab?

    1. A friend recommended your site yesterday, wow, great work, very detailed info.
      Regarding win %, is that like ML win or against the spread?

  12. In the team shooting stats section, for the "share" of the different categories (dunks, at the rim, other twos and threes), is the "share" a team's share of its attempts or a share of its points?

    1. Thanks! I love your site. You're doing fantastic work!

  13. Hey man, your site is incredible. Great work with all the stats, easy to understand, and fun to find any stat you're looking for. Thanks for the hard work you've put in.

  14. It's December 17th and Purdue is 6-5. Yet you have them ranked 20th. It's pretty clear they aren't playing like the 20th best team in the nation. IIRC it takes 13 games for the T-Rank preseason predictions to "wear off," but in this case I think it greatly diminishes the quality of the rankings. Curious what you think.
    Thanks!

    1. It really isn't the preseason projection that's causing Purdue's rating: 1) the preseason projection had Purdue at about 41st, and 2) as you mention, the preseason projections are largely phased out at this point anyhow.

      What is causing it is that T-Rank is about projecting future performance, so it cares about quality of performance not just the result. By my metrics, Purdue has played the 3rd hardest schedule in DI, so you'd expect them to have taken some bumps. All of Purdue's losses have been away from home against quality opponents, and except for the Michigan game they've been hard fought contests. That's what T-Rank and other similar systems look at. (Notably, Kenpom also has Purdue at 20th right now.)

      The best guess is that they're unlucky to be 6-5 and will perform like a top 25 team going forward. But that could of course be wrong. It could be that their losses were lucky to be close, and they're a harbinger of bad losses to come. But looking beyond wins and losses to my mind is exactly what this kind of ratings system is supposed to do.

    2. Always interesting to read through comment sections and see what sticks out as a future reader - and this was spot on. If not for a freak finish to the Purdue/Virginia Elite 8 game of 2019, Purdue finishes as a Final 4 team at minimum (and considering they lost to the eventual champion, there's a high chance Purdue could've advanced past the Final 4), which you and your rankings seem to have called!

  15. You’ve got the 1/5/19 Duke-Clemson game at Clemson. It’s at Cameron.

  16. Hi Bart - How are injuries factored into the equation?

  17. Can you explain how the Talent stat is calculated?

    1. It is based on composite recruiting ranks weighted for minutes played.

  18. http://barttorvik.com/cgi-bin/ncaat.cgi?type=coach&sort=1&yrlow=2000&yrhigh=2018
    Are you aware that if one clicks on FAQ and then clicks on About Me, one gets "Never Gonna Give You Up" by Rick Astley.
    That used to be called Rickrolling. Now, you may just be pranking your fans or be a big Astley fan......

    1. Hmm, not sure what's going on with that. Should be fixed now:

      About Me

    2. 😂 😂 😂
      what a great reply

  19. It's still not OK. I'd much rather be learning more about your methodology. Also, I'd much rather this be discussed offline. When I (or anyone else) want to find out more about you and instead get a video that even Rick Astley has called "cheesy", it tends to reflect badly on what you've done. Rickrolling has been a thing on and off since 2007 or so. Are you positive you're not being pranked, i.e., that someone put up the video to play with you? I can't see the code...else I might be able to help you more. I hope you get it fixed.

    1. OK, are you going to the Final Four? If so, let's meet up and discuss.

  20. Love your stuff. Can you briefly explain how Tempo is adjusted from the raw possession formula? Thanks!

    1. It's the same basic process as adjusting the offensive and defensive efficiency.

  21. This comment has been removed by the author.

  22. Just curious if there's a spot on your site to view your prediction accuracy on the year.
    Great content, thanks!

    1. The schedule page has some info on that at the bottom of the table for each completed day. For the entire year so far, the mean absolute error for scoring margins is 9.0, for totals it's 13.8, and favorites are 3875–1399 (73.5%).

    2. That 73.5 number is exactly what I was looking for, thanks!

  23. I love the site! I've looked into at the rim, other 2 point, and 3 point attempts as well. I noticed a lot of home scorekeeper consistency issues in what they define as shots at the rim as opposed to jump shots. I would compare shot distribution at home to road/neutral site games and there were some significant differences. For the 17-18 season Indiana and Rutgers were a couple of extreme cases where the home scorekeeper seemed to label more at the rim shots than they should. You don't happen to try to account for that in your share by chance? Or do you have a way to sort home/road? Thanks

    1. There are definitely scorekeeper issues with the play by play data, so it's kind of "use at your own risk" stuff. I don't believe I currently have a way (on the site) to sort that data by home/road, though it's definitely something I could look into on the back end. Interesting idea for something to look at over the summer.

  24. Would you be willing to share (I will gladly pay or donate) your projected score formula? I've been plugging in barthag ratings (2/15/19-current) into the Log5 formula, and then comparing that result to the full season Log5 result to make some minor adjustments to your projected point spread. I also convert the Log5 result into a money line:
    if Log5% > 50%
    Log5% / (100% - Log5%) x (-100) = money line
    I've been manually collecting/entering data to find the relationship between the moneyline/total and the spread, but I've started to realize that it's going to be near impossible to get a large enough sample size. Anyways, your site has reignited my interest in college basketball! Best of luck to Wisconsin next year (;
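The favorite-side conversion in this comment, together with the standard underdog-side counterpart (added here as an assumption, since the comment only gives the favorite case), can be sketched as follows. The function name is hypothetical and no bookmaker margin is applied.

```python
def moneyline(win_prob):
    """Convert a win probability (0 to 1) into a no-vig American money line."""
    if not 0.0 < win_prob < 1.0:
        raise ValueError("win_prob must be strictly between 0 and 1")
    if win_prob > 0.5:
        # Favorite: the formula from the comment above.
        return -100.0 * win_prob / (1.0 - win_prob)
    # Underdog (or a pick'em at exactly 0.5): standard American odds convention.
    return 100.0 * (1.0 - win_prob) / win_prob


print(round(moneyline(0.75)))  # -300
print(round(moneyline(0.25)))  # 300
```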

    1. To calculate scores I use each team's adjusted efficiency numbers (modified for home/road as necessary) to calculate an expected points per possession for each team, and then take each team's adjusted tempo numbers to calculate an expected number of possessions, then multiply the expected points per possession by the expected number of possessions.


      Calculating projected points per possession:
      t1_proj_ppp = ((t1oe / avgeff) * (t2de / avgeff) * avgeff) / 100
      t2_proj_ppp = ((t2oe / avgeff) * (t1de / avgeff) * avgeff) / 100


      Calculating projected number of possessions:
      tpro = (t1t/avgt)*(t2t/avgt)*avgt

      t1_proj_pts = t1_proj_ppp * tpro
      t2_proj_pts = t2_proj_ppp * tpro
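Wrapped into a single Python function, the formulas above look like this. The variable names follow the reply; the function name and the sample inputs are made up for illustration, and any home/road modification of the efficiencies is assumed to have been applied before calling it.

```python
def project_score(t1oe, t1de, t1t, t2oe, t2de, t2t, avgeff, avgt):
    """Projected points for both teams from adjusted efficiency and tempo."""
    # Expected points per possession for each offense against the other defense.
    t1_proj_ppp = ((t1oe / avgeff) * (t2de / avgeff) * avgeff) / 100
    t2_proj_ppp = ((t2oe / avgeff) * (t1de / avgeff) * avgeff) / 100
    # Expected number of possessions from the two adjusted tempos.
    tpro = (t1t / avgt) * (t2t / avgt) * avgt
    return t1_proj_ppp * tpro, t2_proj_ppp * tpro


# Illustrative inputs only.
t1_pts, t2_pts = project_score(
    t1oe=110.0, t1de=95.0, t1t=70.0,
    t2oe=105.0, t2de=100.0, t2t=66.0,
    avgeff=100.0, avgt=68.0,
)
print(round(t1_pts, 1), round(t2_pts, 1))
```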

    2. Thanks torv!

  25. Two Questions:

    What is the difference between Effective and Average Height?

    Also, what is the PAKE Stat or the PASE Stat?

    Thanks!

    1. Hi Dylan,

      "Effective Height" is an attempt to calculate minute weighted height of the 4s and 5s. So it's basically the average height of the tallest 40% of minutes.

      "Average Height" includes all minutes, not just the bigs.

      PAKE is "performance against kenpom expectations" -- tourney wins versus the amount of tourney wins expected based on team's adjusted efficiency rating (used Kenpom 2.0 for before 2017, T-Rank since).

      PASE is the same thing except using "seed expectations" as the baseline -- so wins versus the amount expected for a given seed.

    2. What are your preseason rankings based on? Thanks.

    3. Some info on that here: http://adamcwisports.blogspot.com/2015/09/t-rank-2016-preview-nuts-and-bolts.html?m=1

  26. Under the "Today's Games" tab, what does the TTQ (Torvik Thrill Quotient) measure?

    1. Basically takes into account how good the teams are, how close the game is projected to be, and how fast the tempo is projected to be.

  27. Bart! Looks like you put a great deal of effort into your content...awesome job!

    Question: In the today's game tab, there is a line under the table of games that report a few statistics. Is the stat for Record of Favorites based on a straight win, not ATS under your T-Rank Line?

    Thanks!

    1. Correct. It's mainly a way to see how well-calibrated the projected win percentages are. If there are two games where both favorites have a 51% chance to win, their expected record will be 1-1.

  28. Hey Bart! Love the shooting data on the site. What site do you parse the shooting data from? Thanks!

    1. I pay for a feed of play by play stats, but they are also available at NCAA's stats website.

  29. Hello, Do you know of any sites in which you can save players and quickly pull up comparative stats. Let's just say, for example, I wanted to save a set of college point guards(7-10)as a list and be able to quickly pull up YTD stats as a table for comparison throughout the year.

  30. Why is the DRB rate of a powerful team lower than that of an average team?

  31. Hey Bart,

    Any chance you could explain the "Net" statistic that shows up in the box scores for each game? And also, related to that, what does it mean if a player has a "threshold win"?

    Thanks! Love the site.

    1. Hi Joey, the "Net" is a conversion of the box-plus minus stat (which is per-possession) grossed out for the minutes actually played to give an indication of the player's net points added / subtracted for the game. A "threshold win" is when that Net figure exceeds the team's margin of victory, implying that his performance was the difference between winning and losing.

  32. Question on these formulae when it's the first game of the season:

    Calculating adjusted efficiency for a given game is fairly straightforward:
    Game Adj. OE = PPPo / (Opponent's Adj. DE / Average PPP)
    Game Adj. DE = PPPd / (Opponent's Adj. OE / Average PPP)

    For the first game of the year to get the team's initial "Game Adj. OE", would it be as simple as:
    Game Adj. OE = PPPo ?

  33. Hi Bart,

    One more question. I am trying to get the part of the OE/DE formula where the result affects its own inputs. I was testing it in Excel first but the numbers are not stabilizing to one number like you mentioned. I have a simple screenshot here if you have time to look: https://imgur.com/a/VNKcU97. I am in fact feeding the opponent's new AdjDE into the formula, but the numbers in each row go back and forth from the original Adj. OE to another float.

    Thanks for your time! You rock.


    1. Hi Justin, I've always used a preseason prior as part of my ratings, so I've never had to deal with the "game one" problem directly. But I think you are correct that there would be no adjustments in game one.

      Doing the iterations in a spreadsheet is pretty complicated. I did it that way for the first year of T-Rank (the site was actually just a single page that was created directly from Excel). The way I did it was to have a macro that essentially cut and paste everything over a bunch of times. I did have issues with things stabilizing that took a lot of tinkering to get working, and I frankly don't remember how I did it. That is pretty tricky stuff.

  34. Hi Bart,

    I see that for Maryland, freshman Chol Marial is not included on their roster, but he has played some minutes for them since his debut on 12/29/2019. I expect he will make a significant impact as the season progresses, if nothing else, due to his size. Is it possible to add him to their roster? Any insight as to how or why he was omitted? Is this a special case since his long term injury status was uncertain at the start of the season? Could there be similar omissions for other teams?

    Thanks, and as others have said, this is fantastic work you have done!

    1. Correction: He is listed on the roster for Maryland, but does not show up on the T-rank line page for their upcoming game against Wisconsin. I suppose then the question becomes, is there a threshold for including a player in any particular game projection?

    2. Hi, I don't really show "rosters" per se -- just stats. So guys who haven't accumulated any stats won't show up. On the team pages, I also exclude guys who've played less than 1% of available minutes (just to keep superscrubs off the page).

  35. Hi, what is ADJ T.? I hover over it and nothing pops up. Trying to figure out what this is. It is the column right before WAB on this page: http://barttorvik.com/#.

  36. Do the rankings ever get adjusted due to injuries of a team's top player(s)?

  37. Hello!

    First of all your site and your work on it is pretty impressive!

    I would like to ask that under the "Today's Games" tab, what does the "MOV Mean absolute error", "Totals MAE" and "Score bias" mean?

    Thanks in advance and sorry if it's a lame question.

    A.

    1. MOV Mean Absolute Error is how far off the day's predictions were. So if it's 8, that means the actual result of the average game differed from my prediction by 8 points.

      Totals MAE is the same, except for totals (predicted total score)

      Score bias shows how predicted totals compared to actual total scores. So if it's minus 1, that means the average game actually had 1 more point than predicted.
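Those three numbers can be computed from a day's predictions like this. A sketch only: the tuple layout and function name are hypothetical, and the sign of the bias (predicted total minus actual total) is chosen to match the "minus 1 means one more point than predicted" description above.

```python
def daily_accuracy(games):
    """Return (MOV MAE, totals MAE, score bias) for a list of games.

    Each game is (pred_margin, pred_total, actual_margin, actual_total).
    """
    n = len(games)
    mov_mae = sum(abs(pm - am) for pm, _, am, _ in games) / n
    totals_mae = sum(abs(pt - at) for _, pt, _, at in games) / n
    # Negative bias: games had more points than predicted, on average.
    score_bias = sum(pt - at for _, pt, _, at in games) / n
    return mov_mae, totals_mae, score_bias


# Two made-up games.
day = [(6, 140, 10, 145), (-3, 150, 1, 147)]
print(daily_accuracy(day))  # (4.0, 4.0, -1.0)
```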

  38. Hi,
    This result should be red, shouldn't it? Kansas did not cover the spread. I thought they would as well. http://barttorvik.com/schedule.php?sort=time&date=20200304&conlimit=

    1. Nevermind my question. Your results being red or not is based on flat out win or SU win. Sorry about that. It is fine.

    2. Right, results are red when the favorite loses (upsets).

  39. It's not clear to me how the iterative update of the adjusted ratings is performed. How do you update Adj. OE after you calculate Game Adj. OE at each iteration before convergence? Do you replace Adj. OE with Game Adj. OE, average both, or perform a weighted average with less weight given to Game Adj. OE?

    1. Honestly I’m not sure how to explain it. You do it once, and then you do it again and again until the changes are infinitesimal.

    2. I think I got it, you "solve" all the equations and update the adjusted ratings at every iteration with the mean of game ratings.

      Your rankings and your blog are very good, great work!

    3. Hi Gustavo,

      Could you explain how you figured out what to input on the 2nd,3rd, etc iterations? I am having the same trouble you did. I am calculating the first iteration of "Game Adj. OE" and "Game Adj. DE" no problem. The problem comes when I re-input them for the 2nd iteration, and 3rd iteration, but the numbers never converge to one specific number.

      What did you mean (average) together? I assume "Game Adj. OE" is one, but what are you averaging it with before you put it back into the main formula for the 2nd iteration?

      Thanks in advance!
      Justin

      Game Adj. OE = PPPo / (Opponent's Adj. DE / Average PPP)
      -- Game Adj. DE

  40. When will Chaundee Brown be added to the Michigan roster on your site?

  41. I apologize if I missed someone asking this previously within this thread; I only did a quick scan to see if it's been asked. What data/factors are you using to calculate your BARTHAG (power rankings) before any games have been played? Is it using the data from prior seasons, and returning players' data from prior seasons?

    Thank you!

    1. Yes I do preseason projections: http://adamcwisports.blogspot.com/2015/09/t-rank-2016-preview-nuts-and-bolts.html

  42. I'm sure I missed this in the most obvious place, but is there a blog post on PRPG! ? A quick explanation on this would be awesome.

    Thank you!!

  43. Hello! Love the site! I was wondering if you could add Total Minutes as a show/hide column in your player stats? Right now I believe it is only part of the filter feature >0.

    Also, some of your MIN% may be off. Absolutely random player, but Tyrese Martin of UCONN played 25 minutes in his one game, yet his min% is 31.3%, when it should be 62.5%. Unless Min% is players MIN of total team minutes. UCONN has played two games so this might make sense.

    1. Will consider adding total minutes ... actually quite a production to change that table. Yes, min% by convention is percentage of the team's total minutes. I do keep track of min% per game played for purposes of calculating porpagatu, but it's kind of buried in the innards of the database, hard to surface.

  44. Hi Mr. Torvik, if you do not mind explaining, how are the tournament odds calculated? I have a formula I have been experimenting with to determine H2H probabilities (especially for March Madness) but beyond the first round of games, the amount of permutations is a mathematical headache. MathExchange and my current statistics professor were stumped with an efficient way to calculate probabilities for a certain region, so you are my last hope!

    1. Hello - what I do is run a bunch of "simulations" of the tournament and keep track of the results. (Sometimes called a "monte carlo" for reasons that I don't know.) At some point I tried to figure out a way to do it with pure math but it was way, way, way, way beyond my capabilities.
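The simulation approach described in this reply can be sketched in a few lines. This is not the site's code: it assumes a plain single-elimination bracket, neutral courts, and the log5 formula on fixed Barthags to decide each game. The team names, ratings and function names are made up.

```python
import random


def log5(a, b):
    """Chance that a team rated `a` beats a team rated `b`."""
    return (a - a * b) / (a + b - 2 * a * b)


def simulate_title_odds(bracket, n_sims=20_000, seed=0):
    """Estimate each team's title chance by playing the bracket many times.

    `bracket` is a list of (team, barthag) pairs in bracket order; its
    length must be a power of two. Adjacent entries meet in each round.
    """
    rng = random.Random(seed)
    titles = {team: 0 for team, _ in bracket}
    for _ in range(n_sims):
        alive = list(bracket)
        while len(alive) > 1:
            next_round = []
            for i in range(0, len(alive), 2):
                first, second = alive[i], alive[i + 1]
                first_wins = rng.random() < log5(first[1], second[1])
                next_round.append(first if first_wins else second)
            alive = next_round
        titles[alive[0][0]] += 1
    return {team: wins / n_sims for team, wins in titles.items()}


# Hypothetical four-team field.
field = [("A", 0.95), ("B", 0.50), ("C", 0.70), ("D", 0.40)]
for team, odds in simulate_title_odds(field).items():
    print(team, round(odds, 3))
```

Tracking how often each team reaches each round, rather than only the title, is a matter of counting inside the `while` loop.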

    2. There's a fun history for why they're called "Monte Carlo" simulations. The concept of a bunch of simulations using random numbers was developed as part of the Manhattan Project. As such, it needed a code name. They used Monte Carlo because the inventor's uncle used to always gamble at Monte Carlo.

  45. Mr. Torvik, you have a great site with very interesting and helpful information. I apologize if you have already answered this, but I was wondering if you had thought about or would be interested in adding one more color scheme to your daily games info. You have Red for teams that lose outright, and Blue for teams that win, but how about a color that shows teams that win and also cover your prediction.

    Replies
    1. I will think about this but to be honest I'm not really focused on "covering" or not. What I am looking for ideally is to nail the projected spread on the nose. So if a team is favored by 6, it is a better result for my model if they win by 5 (off by only 1) than if they win by 16 (off by 10, but "covered"). So I think if I were to add more visual information to that page it would be some way to show how close/far the actual spread was to the predicted spread.
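
The distinction in this reply, spelled out as a tiny sketch (the function names are illustrative only):

```python
def spread_error(predicted_margin, actual_margin):
    """How far the actual margin landed from the projected margin."""
    return abs(actual_margin - predicted_margin)


def covered(predicted_margin, actual_margin):
    """True if the favorite won by more than the projected margin."""
    return actual_margin > predicted_margin


# Team favored by 6:
print(spread_error(6, 5), covered(6, 5))    # 1 False
print(spread_error(6, 16), covered(6, 16))  # 10 True
```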

  46. Mr Torvik, love your site. How is rebounding % calculated? Thanks!

    Replies
    1. Offensive rebounding percentage is OR / (OR + opponent's DR) -- basically, what percentage of available rebounds did you corral. Defensive rebounding % on my site is actually the opponent's offensive rebounding percentage (what % of the available rebounds on your defensive end did you let your opponent get), so lower is better.
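
As a quick sketch of the formula in this reply (the box-score numbers below are invented):

```python
def off_reb_pct(own_off_reb, opp_def_reb):
    """OR / (OR + opponent's DR), expressed as a percentage."""
    return 100 * own_off_reb / (own_off_reb + opp_def_reb)


def def_reb_pct_allowed(opp_off_reb, own_def_reb):
    """The opponent's offensive rebounding percentage, so lower is better."""
    return off_reb_pct(opp_off_reb, own_def_reb)


# Hypothetical game: 12 offensive boards against 28 opponent defensive boards,
# while allowing 8 offensive boards and grabbing 24 defensive boards.
print(off_reb_pct(12, 28))         # 30.0
print(def_reb_pct_allowed(8, 24))  # 25.0
```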

  47. I noticed on “T-Rank History for Utah St.” you have Steve Henson as the coach for 2017 and 2018 for Utah St. That is incorrect; Tim Duryea was the coach those years.

  48. Is there any way we could have a search by “head coach”? I want to see who has been the best head coach the last 5 years, for example.

    Replies
    1. Not currently, though that's a good idea and something I've thought about. You can search for games by coach on the game stats page:

      https://barttorvik.com/gamestat.php?

    2. I did later add a "by coach" checkbox to the main page.

  49. What is the meaning of the highlighted team?

    Replies
    1. Depends on the page, but on the main page a team will be highlighted if it was chosen as the "team" to focus on.

  50. Would there be any chance that in the "Team Charts" section, you could have the Close 2 and Far 2 (offense and defense) as various drop down options to correlate to other stats?

    Thanks!

    Replies
    1. Will think about that. Can see the appeal, but unfortunately the way I have the data segregated means it would take some doing esp. to make it "sliceable"

  51. How hard would it be to scan through box score data and be able to quantify the various box score stats such as "Bench Points", "Fast Break Points", "2nd Chance Points", etc. Would be a great feature of the site to include some of that data.

    Replies
    1. It would be a major project for me because I haven't focused on those things (e.g., I have not kept track of starters) and some aren't actually in a lot of box scores.

  52. What a wonderful site, can't believe I had never seen this before. Appreciate all the hard work! I do have a question though, and it basically revolves around "simulating" the last couple weeks of games under the T-Ranketology Forecast. For example, I went to Virginia and clicked "win out" then had them also win the ACC Tournament, which puts them at 22-4. Below in the Teamcast section of the page it has Virginia slotted in as the best 2 seed following these simulations. Now does this take into account how other teams will finish up the season? Should I view the Cavaliers' chances of getting that top 2 seed as unlikely even in this scenario because the other teams around that range (Alabama, Houston, all the Big Ten teams) will also get quality wins down the stretch?

    Replies
    1. Thanks Austin!

      Yes, in a sense it does account for how other teams are expected to do. The scenario for all the other teams is their "average" (expected) results for the rest of the season. It does not simulate conference tourneys for other teams (other than conf tourney games that have already been scheduled).

      One other thing to note is that the baseline T-Ranketology projection for Virginia is also assuming average/expected results for Virginia the rest of the regular season. Currently that's five games in which they're expected to go 4-1 or so. So winning out the regular season doesn't move them up much from that baseline, and then even the ACC tourney (depending on opponents) might get them only 1 more quad win or so.

      So yeah overall I think it's unlikely that UVA moves up to a one seed, but if they win out my guess is it's actually more likely than this projection is giving it credit for. I do run 10,000 full simulations (including an attempt at conf. tourneys, though who knows how those will look this year) and currently UVA gets a 1 seed in 2.6% of the simulations.

      https://barttorvik.com/tourneycast.php?date=20210219&conlimit=All&sort=oneseed

    2. Thanks for the reply, that makes sense!

  53. I love your page. I was just curious how many times you end up adjusting, I know you said you are trying to find the limit, but I'm just curious how many times that ends up being that you usually do that before the numbers aren't changing

    Replies
    1. Thanks! I don't really count, the algorithm just runs until the change gets to a very small number that I chose sort of arbitrarily. Earlier in the season it takes fewer runs, later in the season (when there's more adjusting going on) it takes more. Maybe like 150? Something like that.
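
To make "runs until the change gets to a very small number" concrete, here is a toy version of the loop built on the per-game formula from the FAQ. It is a sketch under heavy assumptions, not the real algorithm: all games are treated as neutral-court, there are no preseason priors or recency weights, the three-team round robin is invented, and the tolerance of 1e-6 is an arbitrary stand-in.

```python
from statistics import mean


def adjust_until_stable(ppp, tol=1e-6, max_iter=1000):
    """ppp[(team, opp)] = points per 100 possessions scored by team against opp."""
    teams = sorted({team for team, _ in ppp})
    avg = mean(ppp.values())
    adj_oe = {t: avg for t in teams}
    adj_de = {t: avg for t in teams}
    for iteration in range(1, max_iter + 1):
        # Offense: divide each game's PPP by the opponent's relative defensive rating.
        new_oe = {t: mean(p / (adj_de[opp] / avg)
                          for (team, opp), p in ppp.items() if team == t)
                  for t in teams}
        # Defense: divide PPP allowed by the opponent's relative offensive rating.
        new_de = {t: mean(p / (new_oe[team] / avg)
                          for (team, opp), p in ppp.items() if opp == t)
                  for t in teams}
        change = max(max(abs(new_oe[t] - adj_oe[t]), abs(new_de[t] - adj_de[t]))
                     for t in teams)
        adj_oe, adj_de = new_oe, new_de
        if change < tol:
            return adj_oe, adj_de, iteration
    raise RuntimeError("ratings did not settle within max_iter passes")


# Hypothetical round robin among three teams.
ppp = {("A", "B"): 110.0, ("B", "A"): 100.0,
       ("A", "C"): 105.0, ("C", "A"): 95.0,
       ("B", "C"): 102.0, ("C", "B"): 98.0}
oe, de, passes = adjust_until_stable(ppp)
print(passes, oe, de)
```

A toy schedule like this settles in far fewer passes than a full D1 season would; the point is only the stopping rule, where the loop ends once the largest rating change falls below the chosen threshold.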

  54. Great data! I was looking at your POY standings and noticed a larger than expected gap between 1 and 2. I'm wondering - how do you calculate this? Is there an input of what's expected, media, odds?

    Replies
    1. Thanks Dan! The POY rating is based on a variety of factors, mostly tempo-free player stats, but there are team & conference components as well. Among player components one somewhat unusual one is a factor related to percentage of a team's shots that a player accounts for. Honestly my subjective opinion is that Garza really is the runaway POY - his efficiency at his usage level is almost unheard of, he hardly comes off the court, and his rebounding and defensive stats are good as well.

  55. Could you please add players' jersey numbers to the player stats on the team pages? Love the page by the way, I use it all the time.

    Replies
    1. I used to have that, but when I went to the method of providing the more interactive table, that particular datum is no longer easily accessible (for reasons of me being a bad programmer). But I will try to add it back at some point. And that actually does show up if you look at a game matchup (by clicking on the projected score), e.g., https://barttorvik.com/trank.php?year=2015&t1l=Wisconsin&t2l=Kentucky&#

  56. Thrill Quotient is a great way to decide which games to watch in December, but this time of year I'm more interested in watching bubble games than top-25 games. For example, I'm more likely to tune into (48) Arizona vs (55) Oregon than (13) Creighton vs (14) Villanova today, because there is more on the line for the teams playing.

    It would be great to have a number on how a win or loss would affect a team's tournament chances; for example, the difference between the team's chances in case of a win and a loss. Maybe it would have to be adjusted so it doesn't overstate the importance of a hypothetical Gonzaga vs Bubble or Quad 4 vs Bubble game, which is likely to be a blowout. A similar calculation could say how many seed-lines a certain game is worth for a surefire tournament team.

    Have you considered a feature like this? Or is there already something like this on the site?

    Replies
    1. Hi Alex, yes - check out the Daycast, which is ordered by the Torvik Tourney Thrill Quotient (T3Q), which is very similar to what you describe.

      https://barttorvik.com/daycast.php

    2. This is really useful; not sure how I missed it before. Thanks!

  57. Is there any way to compare two teams, head to head?

    Replies
    1. Just to see projected score? https://barttorvik.com/cgi-bin/pred.cgi

      If you want to compare stats, if you select a team from the drop down menu on the main page it will allow you to select another team to compare them to. If it's an actual matchup, there are links on the team & schedule pages for each matchup. (And if you examine those URLs you can deduce how to bring that page up for any two teams.)

  58. When you click on a team, you can see their "four factor" stats on the left side, with a checkbox option to "adjust" the stats. I'm wondering how these stats get adjusted (it says it's not iterative), and I'm wondering if there is a single page/database with all of the teams' adjusted four factors, so I don't have to click on each individual team to see the adjusted stats? Love your stuff and thank you!

    Replies
    1. So the adjustment basically looks at their opponent averages compared to the D1 average, and adjusts up or down accordingly. There's also a home/road adjustment based on historical differences in home/road performance for each stat. So if a team's opponents on average force more turnovers than D1 average, their offensive turnover rate will be adjusted down (better) accordingly. It's not iterative in the sense that there's just a single adjustment, instead of running the adjustments back through iteratively like I do with the adjusted efficiencies. This is sort of an experimental feature, not to be taken too seriously (though I think it passes the smell test.)

      To see the whole D1, just click on one of the adjusted numbers in the table and it will take you to the main page but with adjusted stats, e.g. https://barttorvik.com/trank.php?year=2021&mingames=1+&top=347&venue=All&adjall=1&split=0&sort=11 (can also put "adjall=1" into the URL).
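
One way to picture a single-pass adjustment like the one described above. This is a guess at the shape of it, not the site's formula: it assumes a simple ratio to the D1 average, ignores the home/road piece, and uses invented numbers.

```python
def single_pass_adjust(raw_rate, opp_avg_rate, d1_avg_rate):
    """Scale a team's raw rate by how its opponents compare with the D1 average.

    Assumption: a multiplicative ratio; the real adjustment may differ.
    """
    return raw_rate * d1_avg_rate / opp_avg_rate


# Hypothetical offense turns it over on 20% of possessions against opponents who
# force turnovers on 21% of possessions, where the D1 average is 19%.
adjusted = single_pass_adjust(20.0, opp_avg_rate=21.0, d1_avg_rate=19.0)
print(round(adjusted, 1))
```

Because the opponents here force more turnovers than average, the adjusted turnover rate comes out lower (better) than the raw 20%, and nothing is fed back through a second time.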

    2. Got it, thank you so much! Going forward I think it'd be awesome to expand on those adjusted stats and make them iterative. Not sure if that's something you planned on or not, but I'd love to see it if that's something you're interested in trying.

  59. Is there any reason that the trendlines are not working on the Team Charts feature?

  60. Hey Bart,

    Love the site. I have a question though. Is the percentage by the predicted score the chance that the team covers the spread your research has spat out, or is that just the chance that they win straight up?

    Replies
    1. chance of winning. I presume chance of covering the spread is always going to be about 50%.

  61. Any reason why I can't view the 2022 'Team Totals' shooting splits page? I can see the 2022 player totals page just fine, as well as the 2022 team % share page but not the team totals.

    Love the site!

  62. Error on the Denver UTSA game?

  63. Can you give a layman's explanation of WAB and what it is trying to do and how?

    Replies
    1. See response to comment in February 2018 above.

  64. Bart,

    How is your adjusted tempo calculation different from KenPom? Thanks

    Replies
    1. I don't know because I don't know how exactly Kenpom's is calculated. But they are very similar. At this point in the season main difference is probably still preseason prior. Other main source of difference would be how/whether to discount stats from mismatch-blowouts and how to account for recency bias.

  65. I see F.U.N. in the team pages and there is a team rank next to it, but I can't seem to find a spot where you can sort all the teams to see which ones are the FUNnest and which ones are the mopiest...

    Replies
    1. I think you're right that FUN is not listed for all teams in one spot anywhere. It is available in the 2022_team_results.csv file (columns AH and AR)

  66. Hello! love the site and the data you are providing. I was wondering which shots count as Close 2s and Far 2s, in terms of feet from the basket. Thanks!

    Replies
    1. Hi -- it is based on descriptions in the play by play data. So "layups" "tips" "dunks" (etc) are counted as rim attempts, and all other twos are counted as mid-range. Obviously this is far from an exact science, as it relies on scorekeeper descriptions and play-by-play fidelity ... but it's the best I can do, which is good enough for me.

  67. Hi Bart!

    Is there an easy way to get the data for adjusted tempo for each team from conference games only?

    Thanks so much for all the work you've put into this site!

    Replies
    1. Yes, if you change the "type" filter on the main page to "Con (C)" the adjusted tempo will show just conference games.

      Also, on the team pages, if you set the "Vs." filter to conference games the adj. tempo figure will show the conf.-only figure.

  68. Hi, thanks for all the work you do to maintain the site and make the statistics so accessible. I hope my question isn’t too confusingly-worded. When looking at WAB numbers from different points in previous seasons, is the WAB number influenced by future games of the same season? For example, were I to look at USC’s WAB up until January 2021, would that be incorporating the future March success of their PAC-12 opponents or would it only be reflective of how their opponents had played up until that January date?

    Replies
    1. Hi Ben, yes the WAB number is based on the current strength of a team's opponents, so it does change. One caveat to that is that post-season games are not included in the WAB number & ranking shown on the main page (though those numbers do change slightly as the regular season opponents' ratings change during the post-season).

  69. what qualifies as a quad 1-A win?

  70. Love the site. What is the formula for transforming adjusted game efficiencies to a predicted score? Then once you have that predicted score, do you assume a normal distribution to get win probability? If so, what standard deviation do you use? 11 is a value I have heard before.

    Replies
    1. Hello -- see response to comment on 4/15/19 above for how projected scores are calculated. Win probability is calculated directly from venue-adjusted efficiencies: calculate each team's pythag based on adjoe and adjde, then team 1's projected win percentage is

      (t1py - t1py * t2py) / (t1py + t2py - 2 * t1py * t2py)
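
A small sketch of that calculation. The win-percentage line is the formula above; the pythag function is the usual Pythagorean form, and its exponent of 11.5 is an assumption on my part since the reply doesn't state one. The efficiencies are made up and taken to be venue-adjusted already.

```python
def pythag(adj_oe, adj_de, exponent=11.5):
    """Pythagorean expectation from adjusted efficiencies (exponent is an assumption)."""
    return adj_oe ** exponent / (adj_oe ** exponent + adj_de ** exponent)


def win_pct(t1py, t2py):
    """Team 1's projected win percentage from the two teams' pythag values."""
    return (t1py - t1py * t2py) / (t1py + t2py - 2 * t1py * t2py)


# Hypothetical venue-adjusted efficiencies for two teams.
t1py = pythag(115.0, 95.0)
t2py = pythag(108.0, 100.0)
print(win_pct(t1py, t2py))
```

Two sanity checks on the formula: evenly matched teams come out at 50%, and the two teams' win percentages sum to 1.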

  71. Hi,
    What does Quad1A mean? I understand quad1, but can't figure out the 1A part. Thanks.

    Replies
    1. Quad 1A are top 15 home, top 25 neutral, top 40 road ... These used to be specified on the publicly available teamsheets, but aren't any more. My understanding, though, is that committee still has access to this category and in any event they certainly do give special treatment to "top half" Quad 1 wins.
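
Those cutoffs as a lookup, for anyone who wants to tag games themselves. The function name is invented, and which ranking the opponent's rank comes from is left to the caller, since the reply doesn't specify it.

```python
# Cutoffs from the reply above: top 15 home, top 25 neutral, top 40 road.
QUAD_1A_CUTOFFS = {"home": 15, "neutral": 25, "road": 40}


def is_quad_1a(opponent_rank, venue):
    """True if a game against an opponent of this rank at this venue is Quad 1A."""
    return opponent_rank <= QUAD_1A_CUTOFFS[venue]


print(is_quad_1a(15, "home"))     # True
print(is_quad_1a(26, "neutral"))  # False
print(is_quad_1a(40, "road"))     # True
```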

  72. Hi,
    Can you add a filter here to show only active coaches? That would be great. Thanks.
    https://barttorvik.com/cgi-bin/ncaat.cgi?conlimit=&yrlow=2002&yrhigh=2021&type=coach&sort=23

    Replies
    1. If you add "&active=1" to the URL that should work now.

  73. Hi,
    I noticed that your WAB doesn't change regardless of the date you put in. So if I wanted to know what WAB was on March 14, 2022, it will say the same thing on all dates from March 14 through the championship game. It seems the record and other stats change but WAB doesn't. Why don't you show the WAB by date? Thanks, Rich

    Replies
    1. Hello Richard. WAB does actually change by date -- but the caveat is that postseason games do not count. That was just a decision I made years ago because the purpose of WAB is really for evaluating a resume for tourney selection, so I wanted to freeze it on Selection Sunday.

  74. When you look at the upcoming games, it has a TTQ number in the far right column where the game is essentially ranked. What exactly is it ranking?

    Replies
    1. http://adamcwisports.blogspot.com/2014/01/introducing-torvik-big-ten-thrill.html

  75. Hey Bart,
    I noticed the two D-1 independent schools (Chicago State & Hartford) are not listed in the Team Tables page. Not sure if that was on purpose or not, but could those teams be added?

    Replies
    1. thanks John - typo in the code that had never mattered because there hadn't been Independents since that page was created until this year.

  76. Let's say a team has one of their starters injured and ruled out for the entire season just before the season starts, or in the first couple games of the year. Let's also assume they're listed near the top of your top 10 projected contributors for that team. Will your projections start to account for this player being missing from the lineup after the 15 or so games it takes for preseason projections to wear off? My understanding is that your algorithm will take quite a while to pick up on long term injuries once a player has played a good chunk of the season and made measurable contributions, but how will it react to situations like this early season-ending injury to a projected impact player?

    Replies
    1. That is generally correct, with a caveat: I believe there have been a few occasions where I have altered the preseason component to account for an injury after the season started. This doesn't affect the version of the preseason projections shown on the archived pages, but I can change the actual data I use to run the numbers during the season, and I can fairly easily re-run the projections to account for a missing player. I don't remember the precise circumstances, but I believe I have done that at least once. But it obviously has to be something major that I take notice of and take a manual step to address.

  77. Hi Bart,

    I'm wondering if you have metrics on 1H data for these teams? Looking to do some model testing on how teams perform in the 1H. Thank you for putting all of this together this is an amazing CBB reference!

    Thank you!

    Replies
    1. Sorry, I do not keep track of half-by-half stats.

  78. Hi Bart,
    I know that on the main T-Rank page I can adjust the dates to see efficiency rankings in a customized range of dates. I am wondering if there’s a way to see what the projected line for an upcoming game would be using only data from a specific date range. For instance, with the preseason rankings factored in, Wyoming is ranked 101 and a slight favorite against Santa Clara on 11/30. When I change the start date to 11/7, Wyoming drops all the way to 239. Presumably this makes Santa Clara a favorite in the game. Is there a way to see what the projection of the game would be based only on data from select dates? I believe I’ve heard some people say there’s a way, but I’m not sure if it’s something easily accessible to a stats noob like myself or if those people are downloading data and then doing some magic I don’t understand.
    Thanks!

    Replies
    1. Whenever you do a filter on the main page, a link will appear right above the table with the text "Today's games with these ratings" that will take you to the schedule page showing the modified predictions. You can also go there directly and do the filtering there: barttorvik.com/schedule_mod.php

  79. On the schedule page, what do the bold and/or red game times indicate?

    Replies
    1. It's a little easter egg / "secret" that I prefer to keep a mystery.

  80. Hi Bart,

    Not sure if this is a repeat question, don't think my comment went through the other day. I'm wondering how I can go about collecting 1H data for these teams? Appreciate all the work you've done with the site - it's a fantastic resource!

    Thank you.

    Replies
    1. Hello, I believe I responded above -- I don't track stats by half. Half-by-half boxscores are available on the ncaa's stats site (stats.ncaa.org)

    2. You totally did - I missed it my bad. Thank you though!

    3. Thank you for everything you do here truly, this is one of the most useful handicapping resources.

  81. What does 3P/100 mean and how is it calculated? It doesn't appear to be 3 pointers made per 100...am I correct? What does STL mean and how is it calculated? It doesn't appear to be steals total or steals per game??

    Replies
    1. 3P/100 is three pointers attempted per 100 possessions played. It's a stat that draftniks requested I add.

      STL is typically steal percentage - an estimate of opponent possessions that end with a steal by that player while he is on the floor.

    2. Small correction-- I believe the 3P/100 is actually three-pointers MADE per 100 possessions (not attempted)

  82. You write in T-Ranketology: "Note: This uses old T-Ranketology "score" system. Still fun to play with."
    Any reference you could provide which explains the old score system and how the Ranketology weightings work?

    Replies
    1. I believe the "old score system" text you're referring to appears on the "Create your own T-Ranketology" page. That allows you to mess with the inputs and weights, etc., that formed the basis for the original version of T-Ranketology. If you follow the link on the word "T-Ranketology" on that page (and the real T-Ranketology page) you'll go to an explainer that describes the current system and the old system. http://adamcwisports.blogspot.com/2016/02/t-ranketology.html

  83. Hi Bart,

    Could you please tell me how you arrive at the numbers shown in your team table for the following two columns:

    1. EXP
    2. TALENT

    Is it fair to assume that these numbers will not change throughout the season?

    Replies
    1. Both can change during the year because they are minutes-weighted.

      Experience is based on class year (3 for senior, 0 for freshman) with caveat that it actually counts how many years a guy has played 10 games in, so if a guy is listed as a soph even though he's played two full years already, he'll count as a junior.

      Talent is based on recruiting rank.
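
A rough sketch of how a minutes-weighted experience number could be put together from the description above. This is one reading of the reply, not the site's code: each player's experience is taken to be the count of prior seasons in which he played at least 10 games, and the roster is invented.

```python
def player_experience(prior_season_games, min_games=10):
    """Count prior seasons with at least min_games played (0 = freshman-level)."""
    return sum(1 for games in prior_season_games if games >= min_games)


def team_experience(roster):
    """Minutes-weighted average of player experience; roster is (minutes, experience) pairs."""
    total_minutes = sum(minutes for minutes, _ in roster)
    return sum(minutes * exp for minutes, exp in roster) / total_minutes


# A player listed as a sophomore who has already played two full seasons counts as 2.
print(player_experience([30, 28]))  # 2

# Hypothetical two-man rotation: a senior-level player and a freshman, equal minutes.
print(team_experience([(1000, 3), (1000, 0)]))  # 1.5
```

Because the weights are minutes, the team number moves during the season as the rotation changes, which is the point made at the top of the reply.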

  84. Hi Bart! I'm curious if there are ways to access day by day team ratings from before the 2014-15 NCAA Season? Thanks!

    Replies
    1. The site didn't exist before that season, so what exists for those seasons is based on retroactively rerunning the seasons. And there were no (comprehensive) preseason projections, so there is really no way to even do it retroactively for early in the season (because you'd get crazy results). With those caveats, there are compressed files available at timemachine/teamresults/YYYYMMDD_team_results.json.gz starting on Feb 1 back to the 2010 season.

  85. Hi Bart! I was wondering if there's a way to import your T-Rank Line and TTQ into google sheets. Thanks!

    Replies
    1. To be honest I have not used google sheets in this way so I don't know.

  86. Just an observation, I look at your NCAA BB rankings and being somewhat visually impaired have a very hard time seeing the ranking #s to the left. Maybe you can reduce the team names by 20% mag. and increase the ranking #s by 20%. Just a suggestion. Otherwise great work. THX

    Replies
    1. Thanks for the feedback, will see what I can do.

  87. Long time reader/user/lover of the site. This year has been very interesting. Not a criticism but a genuine question: this year I'm seeing outliers at the top of the rankings (i.e. Saint Mary's, Rutgers, Ohio State) where the algorithm is seemingly propping up these teams higher than they probably are from an efficiency standpoint. What in the formula is causing these teams to be somewhat 'out of place'?

    Replies
    1. Thanks for the kind words. There's really nothing special going on with those teams - generally speaking they've just performed very well on a per-possession basis given the level of their competition. They've each been "unlucky" in that their win-loss record is mismatched from what you might expect based on their overall efficiency (all are 300 or lower in Kenpom's luck rating currently). If T-Rank's ratings for these teams were significantly deviating from other respectable systems I'd worry about it, but they're all similar in Kenpom etc.

      There are always teams like this. For a famous example that turned out pretty well, see this comment exchange above dated December 17, 2018 at 9:25 PM

    2. Good point. Will be interesting to watch the rest of the year.

      If OSU fans chase Holtmann out of town this year, I feel like the line for him will go around the block. What's happening in Columbus is crazy.

      Thanks for the reply!

    3. Just want to let people know that you have to select the "load more" button below to see additional comments!

  88. Hello. I am pretty new to your website. Why are the numbers different between the home page and team charts? For example, 2022 regular season Gonzaga BARTHAG is 0.9731 on the home page. 2022 regular season Gonzaga BARTHAG is 0.972 on the team chart. I know they are very very close, but I am curious as to why they are different. Thanks

    Replies
    1. Interesting observation, and I will look into it. I suspect it has something to do with some numbers getting rounded differently but they should be identical.

    2. Thank you for noticing this and bringing it to my attention - actually was a significant coding error regarding calculation of game values that had an insidious effect.
