Sacred Data

If you want to work with any data on the T-Rank site, please get in touch with me—I'm happy to share and most of it is available in bulk on the site without the need to scrape.

For example, much of the data is available at the site in .csv and .json files in the format of XXXX_team_results.csv (or .json) where XXXX = the year. So, for example, http://barttorvik.com/2019_team_results.csv gives final stats from last season. These files update constantly during the season.
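
As a rough illustration of the naming pattern above, here is a minimal Python sketch that builds the file URL for a given year and parses downloaded CSV text. `BASE_URL` and the helper names `team_results_url` and `parse_csv_text` are made up for this example. Whether the files carry a header row is not stated here, so the parser does not assume one.

```python
import csv
import io

# Assumption: the bulk files sit at the site root, as in the 2019 example URL.
BASE_URL = "https://barttorvik.com"


def team_results_url(year, fmt="csv"):
    """Build the URL for XXXX_team_results.csv (or .json) for one season."""
    if fmt not in ("csv", "json"):
        raise ValueError("fmt must be 'csv' or 'json'")
    return f"{BASE_URL}/{year}_team_results.{fmt}"


def parse_csv_text(text):
    """Split already-downloaded CSV text into rows of strings."""
    return list(csv.reader(io.StringIO(text)))


print(team_results_url(2019))
```

Downloading once and caching the file locally, rather than requesting it repeatedly, fits the request above to avoid heavy scraping.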

For player stats, see the first comment below. Also, please search the comments below because I’ve answered many questions in them over the years. 

Sometimes I notice mass scraping operations that are detrimental to site performance, and I take steps to block them. If that happens to you and your aims were not malicious, let me know.

217 comments:

  1. Hi. I wanted to pull player stats from 2009 to 2016 for a school project. Is there any way you could help me get the csv files for each year?

    Thanks

    Replies
    1. csvs for player stats are available on my site at getadvstats.php?year=2009&csv=1 (change the year for other years)

      The column header info is available here: https://www.dropbox.com/s/ryugeykvntto5ji/pstatheaders.xlsx?dl=0

    2. I had a question about the column header info. It seems the order of some columns changes from year to year, which makes it hard to match the headers to the dataframe itself. Any tips on how to match them properly?

    3. I think what's going on is that in 2008 and 2009 the blank columns (stats from play by play, such as shot locations, dunks, etc) are being skipped when the csv is created. All years from 2010 to present should have the same columns as shown in that dropbox link.
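
One way to guard against the column mismatch described in this thread is to attach headers only when the column count matches. The Python sketch below assumes the player CSVs have no header row and that you have copied the header names out of the linked spreadsheet yourself. `player_stats_url` and `label_rows` are illustrative names, and the three headers in the usage note are placeholders, not the real column names.

```python
import csv
import io

BASE_URL = "https://barttorvik.com"  # assumption: endpoint lives at the site root


def player_stats_url(year):
    """URL for one season of player stats, following the pattern quoted above."""
    return f"{BASE_URL}/getadvstats.php?year={year}&csv=1"


def label_rows(csv_text, headers):
    """Pair each row with headers, refusing rows whose width does not match."""
    labeled = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(headers):
            raise ValueError(
                f"expected {len(headers)} columns, found {len(row)}"
            )
        labeled.append(dict(zip(headers, row)))
    return labeled
```

For example, `label_rows("Smith,Memphis,31.5\n", ["player", "team", "min_pct"])` returns one labeled row, while a season file with skipped blank columns (as reported for 2008 and 2009) would raise an error instead of silently shifting the labels.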

  2. Hey I am attempting to pull lineup/player efficiency numbers but cannot find a reliable boxscore api feed with substitutions. Can you share where you are pulling your data?

    Replies
    1. I use a variety of sources. I've a paid subscription to the feed at natstat.com and also fill in gaps from stats.ncaa.org if necessary. But I don't parse play-by-play for subs (on/off) so not exactly sure if this will help you.

  3. Hi Bart, just want to say thanks very much for all your data. Your work is really engaging, and it has been a big hit for us over at No Bid Nation (the only William & Mary-focused basketball blog). I am hoping to put together a model to track the CAA this year, and I will be sure to give you credit!

  4. Hello,
    Do you have a returning production data point? I am happy to compile it myself from a csv file if the compiled data points are available.

    Sincerely,
    Kevin

    Replies
    1. I typically calculate "returning possession minutes" for preseason projections https://www.barttorvik.com/rpms.php

    2. Hey Bart, this website and data is so cool! I just stumbled upon it. I am trying to find returning possession minutes for the 2019-2020 season, 2020-21 season, and 2021-2022 seasons. I am just checking if there is data for all of those?

    3. Thanks! There are files available at YEAR_rpm.json going back to 2017 I believe.

  5. Check out the bigballR R package! Even if you aren't familiar/fluent in R programming, the package has functions that will enable you to download/calculate play-by-play/stats (including lineup and on/off stats) and save the data as a csv with only a couple lines of code. Check out the package's github page (https://github.com/jflancer/bigballR), which includes a handful of examples that should be a big help.

  6. Hi! Is there an easy way to access "Today's Games" with each matchup and its predicted winner, spread, and probability? I'm looking to pull games from 2012-2019.

    Replies
    1. This information is available at YEAR_results.csv - but it only goes back to 2015.

    2. Bart- this is a great site so kudos to you and the rest of the crew for compiling this information. I downloaded the YEAR_results.csv files and cannot figure out what the last two columns represent. Can you tell me or point me to a column headers file? Thanks!

    3. I believe the last two columns are pregame "Torvik Thrill Quotient" and pregame projected tempo.

  7. Hello, big fan of your content. I run a sports betting YouTube channel, and a major focus point is a Monte Carlo simulation model I use. I have used a scraper for ncaa.org for years, but with the mass cancellations this year, it's been a bit of a pain, though I've been able to work around it. However, there are still some games missing data, such as Eastern Illinois-UW Green Bay from December 5: https://stats.ncaa.org/contests/1983012/box_score

    I've only found that game and UTEP-St. Mary's that have returned a "Box Score Not Found". It's only 2 games, but still, it bothers me. So I am interested in your thoughts about NatStat as you said you subscribe. I don't need play by play data, just box score data. Is it worth it for just that? Or should I just let go the very small percentage of games on ncaa.org that have no data and not worry about them.

    Thanks, William

  8. Bart: I just found out about your stats website. My bad! I am IndyStar's Butler beat writer and am surprised to see Aaron Thompson 19th in player rankings. It has long been evident how valuable he is, but somehow you have quantified that. If you don't mind, please send short explanation: david.woods@indystar.com.

  9. Hi Bart! This data is awesome! I'm doing data analysis on home field advantages during COVID, but it looks like there is a slight problem with the first few columns of 2021_results.csv. It looks like it is combining both teams and the date into a single column, so the first game of the 2021 season looks like this: McNeese St.Nebraska11-25. Do you have an easy fix for that?

    Replies
    1. Hi Carver. That is intentional, as that field is what I use as a unique gameID. There is a file at YEAR_super_sked.csv that has more information.

    2. Hi Bart,
      Do you have a guide to what the columns are in YEAR_super_sked.csv?

    3. Not really - best I can do is this:

      muid, date, conmatch, matchup, prediction, ttq, conf, venue,  # 0-7
      team1, t1oe, t1de, t1py, t1wp, t1propt, team2, t2oe, t2de, t2py, t2wp,  # 8-18
      t2propt, tpro, t1qual, t2qual, gp, result, tempo, possessions, t1pts,  # 19-27
      t2pts, winner, loser, t1adjt, t2adjt, t1adjo, t1adjd, t2adjo, t2adjd,  # 28-36
      gamevalue, mismatch, blowout, t1elite, t2elite, ord_date, t1ppp, t2ppp, gameppp,  # 37-45
      t1rk, t2rk, t1gs, t2gs, gamestats, overtimes, t1fun, t2fun, results  # 46-54
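
The column list above can be applied to a downloaded file along these lines. This is only a sketch: it assumes YEAR_super_sked.csv has no header row and exactly these 55 columns in this order, neither of which is guaranteed for every season. `label_super_sked` is an illustrative helper name.

```python
import csv
import io

# Column names copied from the list above, in index order 0-54.
SUPER_SKED_COLUMNS = [
    "muid", "date", "conmatch", "matchup", "prediction", "ttq", "conf", "venue",
    "team1", "t1oe", "t1de", "t1py", "t1wp", "t1propt",
    "team2", "t2oe", "t2de", "t2py", "t2wp",
    "t2propt", "tpro", "t1qual", "t2qual", "gp", "result", "tempo",
    "possessions", "t1pts",
    "t2pts", "winner", "loser", "t1adjt", "t2adjt", "t1adjo", "t1adjd",
    "t2adjo", "t2adjd",
    "gamevalue", "mismatch", "blowout", "t1elite", "t2elite", "ord_date",
    "t1ppp", "t2ppp", "gameppp",
    "t1rk", "t2rk", "t1gs", "t2gs", "gamestats", "overtimes", "t1fun",
    "t2fun", "results",
]


def label_super_sked(csv_text):
    """Turn each CSV row into a dict keyed by the column names above.

    Note: zip() stops at the shorter sequence, so rows with extra or
    missing fields are truncated rather than rejected.
    """
    rows = csv.reader(io.StringIO(csv_text))
    return [dict(zip(SUPER_SKED_COLUMNS, row)) for row in rows if row]
```

After labeling, a game's teams and final tempo can be read as `game["team1"]`, `game["team2"]`, and `game["tempo"]` instead of by position.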

  10. Hey Bart! do you have a .csv file for all team stats?

    Replies
    1. Most are available at YEAR_fffinal.csv

    2. This doesn't seem to be accurate anymore, is this still correct? Putting in 2023_fffinal.csv doesn't download any files.

    3. Hmm, it still works for me and the files are still there.

  11. Hi Bart! Is there any way to download pre-tournament team statistics from the last few years?

    Replies
    1. Couple ways to do this.

      1) You can use the T-Rank Time Machine (https://barttorvik.com/trank-time-machine.php) to get the actual ratings on the day after Selection Sunday. Those data files are available at /timemachine/team_results/YYYYMMDD_team_results.json.gz (compressed json files).

      2) You can filter the main page to just pre-tournament games by selecting only Regular Season games in the "type" drop down. This doesn't give the exact pre-tourney adjusted efficiency because it doesn't account for the recency bias that the actual ratings use. You can accomplish the same thing by setting the date ranges to end at Selection Sunday.

      This data can be pulled at, e.g., teamslicejson.php?year=2019&json=1&type=R (for 2019). Change "json=1" to "csv=1" for a csv. (I leave it as a fun project for you to figure out the columns.)
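
Both options can be expressed as small URL builders. The sketch below is illustrative only: it assumes the paths are relative to https://barttorvik.com, and the function names are invented for this example.

```python
from datetime import date

BASE_URL = "https://barttorvik.com"  # assumption: paths are relative to the site root


def time_machine_url(day):
    """Option 1: daily snapshot file, named YYYYMMDD_team_results.json.gz."""
    return f"{BASE_URL}/timemachine/team_results/{day:%Y%m%d}_team_results.json.gz"


def regular_season_slice_url(year, fmt="json"):
    """Option 2: regular-season-only slice; fmt is 'json' or 'csv'."""
    if fmt not in ("json", "csv"):
        raise ValueError("fmt must be 'json' or 'csv'")
    return f"{BASE_URL}/teamslicejson.php?year={year}&{fmt}=1&type=R"


print(time_machine_url(date(2019, 2, 17)))
print(regular_season_slice_url(2019, "csv"))
```

The date passed to `time_machine_url` is whatever day you want the snapshot for; the day after Selection Sunday has to be looked up per season.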

    2. Hey Bart - Is there any way to get data for 2009 and 2010 using step 1 above? Looks like data only populates back to 2010. Thanks!

    3. Sorry, at this point the time machine only goes back to 2011 season.

    4. Hey Bart,

      First of all, thanks for all of your work. The website and data are really cool and it's clear you're passionate about this stuff!

      I'm trying to pull pre-tournament data for the teams for the 2021 season and tried your method listed above. However, I am noticing that it isn't taking into account conference tournament games if you set the date any earlier than May 1, 2021. For example, the data through May 1 says Abilene Christian's record is 24-5, but if you move it back a day to April 30, 2021, it removes (presumably) the conference tournament data while keeping the NCAA tournament data (ACU's record drops to 20-5, which includes their 1-1 record in the NCAA tournament but does not include their 4-0 conference tournament record). It appears that the other statistics are affected as well (the Barthag and other stats change if you go back a day).

      Thanks!
      Ryan

    5. Hi Ryan - thanks for the kind words. Sorry, I didn't see this comment till now, but I believe we corresponded on Twitter about this. In case anyone else runs into this issue: changing the date to April 30th took the system off the "real" ratings that incorporate recency (this happens whenever you customize anything). The records were different because the non-D1 games, which don't count for the ratings, are only included in the win-loss record when the "real" ratings are displayed, just so that the record matches the official NCAA win-loss records.

  12. Hey Bart, is it possible to get the T-Ranketology Now data in json format?

    Replies
    1. There is a file at now_inprob.json

    2. thank you, is there a way to include the seed or to sort it by the seed?

    3. the "score" is in there (the sixth element for each team) so if you can manipulate the data in your programming language of choice it should be trivial to sort by that.
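
A minimal Python sketch of that sort, under two assumptions that are not confirmed here: that the parsed file is a list with one list per team, and that a higher score means a better seed. The sample rows are invented placeholders, not real data.

```python
def sort_by_score(teams, score_index=5):
    """Order team rows by the sixth element (index 5), highest first."""
    return sorted(teams, key=lambda team: team[score_index], reverse=True)


# Hypothetical rows: only the first and sixth positions matter for this demo.
sample = [
    ["Team A", 0, 0, 0, 0, 3.2],
    ["Team B", 0, 0, 0, 0, 9.1],
    ["Team C", 0, 0, 0, 0, 5.0],
]

for row in sort_by_score(sample):
    print(row[0], row[5])
```

If the file turns out to be a dictionary keyed by team name, the same idea applies to `data.items()` with the key function reading the score out of each value.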

    4. thanks! indeed it does appear that sorting on the sixth element for each team manipulates the data into the correct order for almost all of the 1-12 seeds.

      maybe you can help me further, as i am trying to build a visual representation of the T-Ranketology Now bracket. i can sort on the score element to get most of the 1-12 seeded teams. however, it seems natural that a lot of the First Teams Out have higher scores than the teams that would be seeded 13-16... do you know if it might be possible to use this data to seed teams 13-16 correctly as well?

    5. I've created a new file at now_seeding.json that has the projected tourney teams in order of score

    6. amazing, thank you so much!!!!

  13. Hi Bart,

    Thank you for all you do for the CBB community. Do you have a JSON/CSV file with information on quad 1/2/3/4 wins that includes who team x has beaten in each quadrant?

    Replies
    1. The closest thing I have set up is a file at columns_now.json - it's a poorly organized json file but elements 8 - 11 are dictionary/objects that show who each team has played in each quadrant (8 is Q1, 9 is Q2, etc) but it is not broken down by wins & losses.

    2. Okay, that's a start. Thanks. Is there JSON for each team's schedule with results? Maybe I could map the quadrant names from columns_now.json to values in the results file.

  14. Hey Bart! big fan of the website and thanks so much for making all of that data available to us! I'm trying to use your super_sked dataset for a class I'm in, and I was just wondering though if you'd possibly be able to share what the column headers are for that dataset? Some are pretty self-explanatory but others I'm not quite sure, thanks again!

    Replies
    1. Sorry, I don't actually have this easily accessible in a way that would make much more sense, so I prefer to leave it as a little puzzle ;)

  15. Hi Bart - This is so cool. Is the data from the Teamsheets Rank page available in a .csv?

  16. Hi Bart! Is there a CSV file or JSON for a team's schedule and the result of each matchup? We found this page, https://barttorvik.com/results.php?team=Memphis&begin=20081101&end=20090501&conlimit=All&year=2009&top=0&hteam=&quad=5&rpi=&f=1, and we're hoping to find a source of this data without having to scrape it. The statistics you post are really awesome!

    Replies
    1. getgamestats.php?year=2008&tvalue=Memphis will get you most/all of those stats in json.

    2. Is there any way to grab team stats in a CSV?

    3. see reply to comment on 1/15/21 above

  17. Hi, Bart. Fantastic website! I am doing a school project on NCAA Tournament teams and would love to download your data for just NCAA Tournament teams each year from 2008-2019. Is there a CSV file for that? For instance, I would like to download all data from a page like this for each tournament: https://barttorvik.com/trank.php?year=2008&sort=&top=0&conlimit=All&venue=All&type=T&lastx=0#

    Thanks so much.

    Replies
    1. if you put "&json=1" or "&csv=1" into the URL, you should get the data.
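
One thing to watch for when adding the parameter by hand: a URL copied from the browser may end in "#", and anything typed after the "#" becomes part of the fragment, which is never sent to the server. The Python sketch below adds the flag to the query string itself and drops the fragment. `with_export_flag` is an illustrative helper name, not something the site provides.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_export_flag(url, fmt="csv"):
    """Return url with csv=1 (or json=1) in the query and no fragment."""
    if fmt not in ("csv", "json"):
        raise ValueError("fmt must be 'csv' or 'json'")
    parts = urlsplit(url)
    # Keep blank values such as "sort=" so the filters stay as they were.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Drop any existing export flag so the two formats never conflict.
    query = [(key, value) for key, value in query if key not in ("csv", "json")]
    query.append((fmt, "1"))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), "")
    )


page = (
    "https://barttorvik.com/trank.php?year=2008&sort=&top=0"
    "&conlimit=All&venue=All&type=T&lastx=0#"
)
print(with_export_flag(page, "csv"))
```

The same helper works for filtered views such as home/road splits, since it preserves every existing query parameter.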

  18. Hi Bart, is there any way to view team strength of schedule ranks over a multi year span? (specifically looking for the 3 seasons from 2018-2021)

    Replies
    1. Here is one way: https://barttorvik.com/program-maps.php?tvalue=Wisconsin&year=2021&sort=&t2value=None&avg=all&top=0&quad=4&venue=All&type=All&xax=99&yax=38

  19. Noticed some missing data from Wichita's last game: https://www.barttorvik.com/box.php?muid=CincinnatiWichita+St.3-13&year=2021
    Not sure how this affects anything else related to your ratings.

    Replies
    1. Weird, thanks for letting me know. Shouldn't affect the ratings, but does affect player stats.

  20. Hi Bart,

    Great site. Love all the work you do. I'm curious if versioned Team data is available for download? That is, do you have and would you make available the team data from each day of the past few seasons (e.g. 2/17/2019, etc.)?

    Replies
    1. data files are available at /timemachine/team_results/YYYYMMDD_team_results.json.gz (compressed json files)

    2. Hey Bart, thanks for all the data and help. Along the lines of this ask, is there a way to get the team charts from each day of the past few seasons? Trying that same url with charts instead of result didn't work. Thanks

    3. Hello. Not exactly sure what you're looking for as far as "team charts" data - the charts don't really have separate data sources, they typically use the same data that's in the team_results and super_sked files, but they are also created dynamically depending on the selected filters. So I guess shorter answer is that there are no saved day-by-day files specific to the charts.

  21. Bart,

    Thanks for an amazing resource.

    Any chance you could leave players on the transfer page after they have committed to a new school? It would be interesting to be able to compare incomings based on Porpagatu! (or whatever else you want).

    Replies
    1. Stats for committed transfers are here: https://barttorvik.com/playerstat.php?link=y&year=trans&minmin=0&start=-11101&end=trans0501

  22. Hi Bart,

    Big fan of the site.

    I have been getting the advanced game stats for each game using getgamestats.php?year=2021 and I was wondering if there is any way to get the raw totals for each game (like total turnovers, total rebounds, etc.) in a similar format as well.

    Thanks in advance

    Replies
    1. those are available in the year_super_sked.json file or the year_season.json file.

  23. Hey Bart, thanks for a great resource and being responsive.

    I was wondering if there is a strength of schedule data point? I know you adjust several things based on schedule strength, but I was looking for SOS as a specific number and maybe I'm dumb, but I'm unable to find it.

    If it is available, I am looking for it for multiple years as well.

    Appreciate any help you can provide.

    Replies
    1. Hello,

      There are SOS metrics on the team page, and a summary table here:

      https://barttorvik.com/sos.php?year=2021

    2. OK I could be going brain dead again, but I was able to load the CSV of this for 2021, but 2020 is not working. Or I have just forgotten how to do it.

      Appreciate any help. I was trying to put player PORPAGATU! by year with SOS by year dating back to 2009 (but probably didn't really need to go back that far; that's just what I saw on a previous question, so for some reason I picked it).

    3. Feel free to not post this msg. Just to clarify the previous.

      Actually no, it wasn't the schedule data I got, it was a copy of something I had already loaded.

      I am attempting to use the year=xxxx&csv=1 method.

    4. Hi, not sure I'm following completely, but there is no CSV available for that SOS page. You can just pull down the table, though.

  24. Hey Bart,

    Thanks for all of the amazing data you make available.

    I would like to use the player advanced stats gamelog data to pull stats like ORTG for a player by game. Files in json or csv would be great.

    Thanks,

    Wilson

    Replies
    1. Hi Wilson,

      This data is available at YEAR_all_advgames.json

    2. Hey Bart, love your stuff. This is fantastic!

      Is this page also updated during the season, the game by game stats?

    3. Is this still working? For some reason I'm not getting anything here. Thanks for your help!

    4. because of the size of the files, they are now in a compressed format and the files are called YEAR_all_advgames.json.gz
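
Reading the compressed files takes one extra step after downloading. The Python sketch below works on bytes you have already fetched. It checks for the gzip magic bytes first, because some HTTP clients decompress responses on their own, in which case the payload is already plain JSON. `all_advgames_url` and `decode_json_payload` are illustrative names, and the root-relative URL is an assumption.

```python
import gzip
import json

BASE_URL = "https://barttorvik.com"  # assumption: files sit at the site root

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def all_advgames_url(year):
    """URL for the compressed game-by-game player stats file."""
    return f"{BASE_URL}/{year}_all_advgames.json.gz"


def decode_json_payload(raw):
    """Parse JSON from downloaded bytes, gunzipping first when needed."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


# Round trip on a tiny made-up payload, no network needed.
packed = gzip.compress(json.dumps([["player", 1]]).encode("utf-8"))
print(decode_json_payload(packed))
```

The same decoder should also suit the Time Machine `.json.gz` files mentioned elsewhere in the comments, since they are described as compressed JSON as well.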

  25. Hi Bart,

    I am trying to navigate to the 2021 team shooting split TOTALs on your site, but it keeps taking me to the 2021 T-Rank page when I try to navigate there. It seems 2021 is the only year with this issue. Any other way I can see this data?

    Thanks!
    Jason

    Replies
    1. Sorry about that, should be fixed now (just a bug related to going live with some 2022 stuff and my bad programming skills)

  26. Hi Bart,

    Recently I went to go pull data in the JSON format through the JSONIO package in R, but I received an error that the connection could not be opened. This was odd since I've used this method many times in the past. Through some more digging I found that this may be a problem with the SSL certificate. On September 30th Let's Encrypt (who issued the certificate for your website) had their root certificate expire which meant that some connections will no longer work. They claim that there is some form of fix for this, but it's beyond me. Do you have any knowledge of this issue/potential workaround?

    Much appreciated, thanks.

    Replies
    1. Hello. I'm sorry I don't know anything about the issue with Let's Encrypt's certificate not working. I frankly barely understand the SSL stuff at all, so it's somewhat miraculous that I got it set up at all, and if LE stops working that would be very bad.

      I do call some of my own data files through python, and that still seems to work.

      One thing you might try is just changing the URL you use to pull the data so that it starts "http://" instead of "https://"

    2. Update on this. I discovered that my Tourneycast simulations actually broke after 9/30 for this reason because I run that script on my local PC, and the version of Python/requests on that machine was having this issue with the expired SSL root certificate. I upgraded the requests module for python, and that fixed it. So one thing you may wish to try is seeing if there is an update to the JSONIO package you're using for R, or else trying a different package to pull down the data. If you figure it out, please do let me know.

    3. After a very arduous process I have come up with a solution that so far has proven to work in R using the httr library.

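      library(httr)
      # Caution: this turns off SSL certificate verification for all httr requests in this session.
      set_config(config(ssl_verifypeer = FALSE))
      options(RCurlOptions = list(ssl_verifypeer = FALSE))
      options(rsconnect.check.certificate = FALSE)
      # Download the file, then bind the parsed JSON rows into a data frame.
      WebScrape <- GET("https://barttorvik.com/2021_team_results.json")
      Data <- as.data.frame(do.call(rbind, lapply(content(WebScrape, "parsed"), as.vector)))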

      Obviously this then needs to be cleaned up, but it will return the proper JSON dataset.

  27. Is there any way to see previous year's preseason projections? (such as the preseason ratings for 2019)

    Replies
    1. Yes prior seasons are at trankpure19.php (change the year for other years)

  28. Thank you so much for your data! I am interested in working with the player data from 2009 to present, and all together these excel sheets amount to over 60 thousand players. I was wondering if I can access a .csv file specifically for high-ranking prospects (or in other words, some sort of filter to look at only players that were drafted) or if I would have to do that manually. Thanks!

    Replies
    1. You can pull the table from this: https://barttorvik.com/playerstat.php?link=y&sIndex=45&minGP=15&sortToggle=1&minpick=60&year=all&start=-11101&end=all0501&pickSelect=-1&erk=1500

  29. Hello Bart,

    I was curious if you had an updated schedule spreadsheet for this coming CBB (Men's) season and if you do where one can find it?

    Thank you in advance, Sam!

  30. Hello Bart,

    Huge Fan and I love the site that you have created! I wanted to reach out to see if you had the 2022 schedule in a .csv format that one could download from the website? Thank you!

  31. Hey Bart, absolutely love this site! 2 questions for you.

    First, I can pull individual games stats from YEAR_all_advgames.json for every year 2008-2021, but nothing is popping up for 2022. Is this just a delay thing?

    Second, is there a way to pull a csv for allrostersYY.php? I've tried allrosters19.php?&csv=1 for example, but no dice.

    Thanks a ton, and excited to follow for another season!

  32. Hi Bart,

    Have you ever done some backtesting of your model accuracy vs. Vegas opening and closing lines? I hope you don't mind, but I've been trying to run this analysis on 2 years' worth of historical data and have found some interesting results. It does appear that there's a positive correlation between the magnitude of the discrepancy between your line and Vegas, and the likelihood that your line is closer to the final result. This appears to be especially strong in November/December. As the season progresses however, the frequency and magnitude of prediction discrepancy progressively reduces, and accuracy performance vs. Vegas becomes more random, especially when the magnitude of prediction discrepancy is low. While the frequency of large prediction discrepancies drops dramatically late in the season, the predictive power seems to become stronger.

    This has all been in an endeavor to find a sweet spot of when/where betting based on your predictions is profitable. Curious to see if you've found similar trends?

    Replies
    1. Hello - that is interesting stuff. I have not tracked betting lines prior to this year, just haven't had the data. I also always figured it would be kind of brutal because sports books (1) have access to all kinds of additional useful stuff, including actual betting data (wisdom of the crowd), and (2) can respond to injuries in ways my model does not. (Not to mention they have access to my site!)

      But I have been tracking them this year and have been pleasantly surprised. In games where the discrepancy is more than 3.0 points, at this moment taking the T-Rank suggestion would have you 96-69-5 which seems prettay good. It seems the preseason projections were unusually good this year, which is gratifying (because they are the result of a shit-ton of effort.)

      That said, I don't recommend anyone use my site for gambling purposes and certainly make no warranties.

    2. Yes, 58% success rate against the spread is phenomenal, and if it is repeatable over a large sample is something that professional gamblers salivate over. The 3 to 4 point discrepancy seems to be a key inflection point based on my analysis as well. When you say that you are tracking betting lines starting this year, are you doing that manually and can we find it on your site anywhere or is it available to download like the other stats files you've pointed to in these threads?

      Much respect to the hard work you have put into this project!
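
For reference, the 58% figure follows from the 96-69-5 record if the 5 pushes are left out of the denominator. That convention is an assumption here, chosen because it reproduces the quoted number. A short Python check:

```python
def ats_win_rate(wins, losses, pushes=0):
    """Share of decided games won; pushes are ignored by this convention."""
    decided = wins + losses
    if decided == 0:
        raise ValueError("no decided games")
    return wins / decided


rate = ats_win_rate(96, 69, 5)
print(f"{rate:.1%}")  # 58.2%
```

Counting the pushes as non-wins instead would give 96 out of 170 games.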

  33. Hey Mr. Bart, I am familiar with pulling simple tables off of websites into excel and google sheets but I am having trouble with the Team Table which has some amazing information. Thanks for all your hard work and would love to know if you could help me get this table, specifically for stats like the height and experience.

    Replies
    1. Hello, it should work to add "csv=1" to the URL parameters. E.g., team-tables_each.php?csv=1 will allow you to save the data in the table.

  34. What are the hurdles that you see to adding a feature to account for injuries in future seasons? If it's something you're interested in, I would love to try and help accomplish it.

    Replies
    1. First, there would have to be a pretty robust model for what effect to give to an injury. I don't have that, and it's not something I really have any good ideas on. (I only have my "highly dubious missing player analysis.")

      Second, I would need highly detailed and accessible information about injuries for all 358 teams. (Because for this to be useful it would ideally be prospective, and for retrospective usefulness a DNP-coach's decision is qualitatively different than a DNP-injury.)

      Both of those are pretty much dealbreakers for me.

      One thing I've considered, which I may look into this offseason, is figuring out a way to incorporate betting lines into the ratings, since they really provide the best information about injuries that I can think of.

    2. Just my 2 cents, but I think you will find that incorporating betting lines is not a good adjustment for injuries. It may be a powerful indicator late in the season, but early in the season when there's very few injuries it will have a detrimental impact. Your ratings do so well early in the season it would be a shame to tinker with that.

    3. Thanks for the input. Realistically I probably never will incorporate betting lines, just something I've thought about.

  35. Hi Bart,

    I was looking for shooting split data separated into close 2s, long 2s, and threes like you have on your site here https://barttorvik.com/teampbp.php?year=2021&conlimit=&sort=1 for a school project for the years 2017-2021, but I was having trouble downloading it as a csv file. I would really appreciate it if you could provide directions for how to download the shooting split data. Thank you!

    Replies
    1. Hello, there are files at YEAR_pbp_teamsstats.json - sorry, I don't have those easily accessible in CSV format, but hopefully you can work with the json. Also, if you're just looking for those five years it's pretty easy to just pull the tables from the page into Excel, whether through a simple copy & paste or through an extension like Table Capture (https://chrome.google.com/webstore/detail/table-capture/iebpjdmgckacbodjpijphcplhebcmeop?hl=en)
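
If you would rather convert the json yourself, something like the following Python sketch may be enough. It assumes the file parses to a list of rows, each row itself a list of values. That layout is a guess and is not confirmed anywhere in this thread, so the function raises an error for any other shape. `json_rows_to_csv` is an illustrative name.

```python
import csv
import io
import json


def json_rows_to_csv(json_text):
    """Convert JSON shaped like [[...], [...]] into CSV text."""
    rows = json.loads(json_text)
    if not isinstance(rows, list) or not all(isinstance(r, list) for r in rows):
        raise ValueError("expected a JSON array of arrays")
    buffer = io.StringIO()
    # Use plain "\n" line endings instead of the csv module default "\r\n".
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample rows; the real file's columns are not documented here.
print(json_rows_to_csv('[["Team A", 0.61, 0.38], ["Team B", 0.58, 0.35]]'))
```

The resulting text can be written to a `.csv` file and opened directly in Excel.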

    2. Thank you for the advice. I was able to get the data into an excel file using the table capture tool. Appreciate it!

  36. Hi Bart,

    First off, this website is sensational. Thanks for all that you do with it and for your responsiveness!

    Second, I am working on a school project where we are planning to run a logistic regression model to predict which teams will make the NCAA tournament based on projection data. I'm not trying to go too crazy with this, but there are two important features that we will definitely need -- returning player data and incoming transfer data.

    Starting with returning player data, I found returning possession % and returning minutes % at this link: https://barttorvik.com/trankpure17.php?. However, this only goes back through 2017. I know data for returning minutes % goes back through 2009 because I can find it at this link: https://barttorvik.com/program-maps.php?tvalue=Virginia&year=2022&sort=&t2value=None&avg=all&top=0&quad=4&venue=All&type=All&xax=37&yax=3 . Is there a link I can go to to get this data for each season without having to scrape it myself from these team charts? Ditto for returning poss. % if possible. Lastly on returning player data, I know for a fact I saw returning points % as well, but I cannot find that stat anywhere. To summarize what I am looking for in terms of returning player data, it is as follows:
    1. Returning minutes %
    2. Returning possession %
    3. Returning points %
    for all teams from 2010-present. Is this possible? Even one of these stats would be extraordinarily helpful, but several or all would be better.

    As for transfer player data, this seems a bit more tricky. In one of your player stat pages (https://barttorvik.com/playerstat.php?link=y&xvalue=trans&year=2021) you linked a nice website that lists all known transfers (https://verbalcommits.com/transfers/2012). It wouldn't be a perfect solution since it is often unclear whether the player was immediately eligible or not, and it would require a decent amount of preprocessing, but it is a possibility. But more ideally, I was hoping you had some sort of "one-number" stat that rates the incoming transfers for a particular team in a particular season. I would imagine you had something like that for projection purposes, but I am not sure where to find it.

    Thanks in advance for your help!

  37. Can you tell me where to find each team's adjusted tempo? Most of the team data are in https://barttorvik.com/YEAR_fffinal.csv, but I didn't see tempo stats there. Thanks!

  38. Hey,

    I really appreciate all the information you have provided. I am trying to download 2019-2022(current year) metrics and having trouble downloading it as a csv for a school project. All I am looking for is the data that's provided on the home page.
    Thanks!

    Replies
    1. Hi Cameron. You could try adding &csv=1 to the URL on the main page, should get you most of the info in a CSV although you'll have to do some manual manipulation of the resulting file and you'll have to figure out which columns are which. All the stats are also available directly in files at YEAR_fffinal.csv and YEAR_team_results.csv

  39. Hello Bart, where could I find a list of the oldest to youngest teams in D1 BB, for the 2021-22 season? Thank you, sir!

    Replies
    1. I'm not sure - I do not have/publish age data on the site. I do keep track of an "experience" stat that's based on class year, and you can look at that on the Team Charts page or the Team Tables page.

      Delete
  40. Hello Bart, do you have any downloadable data on home/road splits? Thank you sir for the work. Love the numbers.

    ReplyDelete
    Replies
    1. Nothing prefabricated but if you filter to home/road splits on the main page and then add &csv=1 or &json=1 to the URL parameters, you get the data.

      Delete
  41. Thanks again. I tried adding &csv=1 to the end of the URL, but it didn't download. Is the home split URL "https://barttorvik.com/trank.php?year=2022&sort=&hteam=&t2value=&conlimit=All&state=All&begin=20211101&end=20220501&top=0&revquad=0&quad=5&venue=H&type=All&mingames=0#" ?

    ReplyDelete
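One thing worth checking with a URL like the one quoted above: it ends in "#", and browsers treat anything after "#" as a fragment that is never sent to the server, so a flag typed at the very end would not arrive as a query parameter. Whether that was the problem here is a guess. The sketch below (the function name is hypothetical) adds the flag to the query string itself using only the standard library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_export_flag(url, fmt="csv"):
    """Return url with csv=1 or json=1 added to its query string."""
    if fmt not in ("csv", "json"):
        raise ValueError("fmt must be 'csv' or 'json'")
    parts = urlsplit(url)
    # keep_blank_values preserves empty filters such as "sort=".
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("csv", "json")
    ]
    params.append((fmt, "1"))
    # Dropping the fragment keeps the flag in front of any "#".
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), "")
    )


home_split = (
    "https://barttorvik.com/trank.php?year=2022&sort=&hteam=&t2value="
    "&conlimit=All&state=All&begin=20211101&end=20220501&top=0&revquad=0"
    "&quad=5&venue=H&type=All&mingames=0#"
)
print(with_export_flag(home_split))
```

The printed URL keeps every filter from the original and ends in &mingames=0&csv=1, with no trailing "#".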
  42. Mr. Torvik, Got it. Thanks again!

    ReplyDelete
  43. Mr. Torvik, one last thing. Is there a way to find home/road FT%? Only the FTRate is included in the home page splits. Thank you.

    ReplyDelete
    Replies
    1. You can look at it for individual teams by using the filters on the team pages and for all teams at once on the team_tables page: https://barttorvik.com/team-tables_each.php

      Delete
  44. Hi Bart,

    Is it possible to download Teamsheet data over a given time range? I'm interested in looking at partial season F.U.N. data, but I can only find daily snapshots. Would it be possible to find a team's F.U.N. from, say, 1/15/2020 - 5/1/2020?

    ReplyDelete
    Replies
    1. Unfortunately that's not something I've got set up to work.

      Delete
  45. Hello! Thank you for your hard work and dedication. I was wondering if you had this same type of website but for the NBA? TIA

    ReplyDelete
  46. Is a CSV or JSON of the transfer stats available anywhere, perhaps? I tried adding “&csv=1,” etc., to the url and returned all player stats, not transfers only. Thanks in advance!

    ReplyDelete
    Replies
    1. The transfer stats page is created dynamically by just excluding guys not on the transfer list. Though obviously you can just copy the table.

      Delete
  47. Is there any way to extract all of the season stats available, not just those on the team stats page. I would like to see 3pt attempts, total assists, etc?

    ReplyDelete
    Replies
    1. Not really from my site in a ready format - obviously those kinds of counting stats are generally available and not really what my site is focused on.

      Delete
  48. Hey Bart,
    Appreciate all you have done with the website. I am trying to get all regular season data from 2008 to now from the team tables, as I'm trying to predict NCAA tournament success and don't want tournament data included. Where would I place the csv=1 in the URL to grab this data?

    ReplyDelete
    Replies
    1. You would change the "type" filter to "Regular Season" and then add the "&csv=1" to the resulting URL

      Delete
  49. Hello, I am using your player stats for a project. Thank you very much for the quality data you provide. I have a question about the feature "Min_per". What exactly is this feature? I think it is the percentage of the total game minutes in which the player was on the court. Please correct me if I am wrong.

    ReplyDelete
    Replies
    1. Hello, yes it's the player's percentage of available team minutes played. Available minutes are total team minutes divided by 5.

      Delete
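The definition in the reply above can be written out as a short calculation. This is only a sketch of that arithmetic: the function name and the example season are made up, and it assumes the result is expressed on a 0-100 scale.

```python
def min_per(player_minutes, team_minutes):
    """Percentage of available team minutes that a player was on the floor.

    team_minutes is the sum of minutes over every player on the team, so
    dividing by 5 gives the minutes available to any single player.
    """
    available = team_minutes / 5
    if available <= 0:
        raise ValueError("team_minutes must be positive")
    return 100.0 * player_minutes / available


# Hypothetical season: 30 regulation games of 40 minutes each.
team_total = 30 * 40 * 5  # 6000 player-minutes across the roster
print(min_per(900, team_total))  # 900 of 1200 available minutes -> 75.0
```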
  50. Hello Bart, I have been working with your player data for a machine learning project. While analyzing the features, I encountered one named rec rank. What does that feature stand for? Could you please give some info about it? Thank you very much for the data you provide.

    ReplyDelete
    Replies
    1. Hello that is short for "recruiting rank" i.e. what the player was ranked as a recruit coming out of high school.

      Delete
  51. Hi Bart, in your player data there are features named "rimmade" and "midmade". What do they mean? Thanks a lot.

    ReplyDelete
    Replies
    1. rimmade = shots made at or near the rim; midmade = two point shots that were not made at or near the rim (i.e., midrange)

      Delete
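Since the reply above splits two-point makes into rim and midrange, the two buckets together cover all two-point shots. A small sketch of turning those counts into percentages follows; it assumes matching attempt columns named rimatt and midatt (those names appear in the game-level header list further down the thread), and the function name and sample numbers are invented.

```python
def shooting_splits(rimmade, rimatt, midmade, midatt):
    """Field-goal percentages at the rim, from midrange, and on all twos."""

    def pct(made, attempts):
        # None rather than a division error when a player took no shots.
        return made / attempts if attempts else None

    return {
        "rim_pct": pct(rimmade, rimatt),
        "mid_pct": pct(midmade, midatt),
        # Rim plus midrange covers every two-point attempt.
        "two_pct": pct(rimmade + midmade, rimatt + midatt),
    }


# Invented season line: 60-of-100 at the rim, 30-of-80 from midrange.
print(shooting_splits(60, 100, 30, 80))
```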
  52. Hello Bart, I am using your player data for a machine learning project. I am examining the features at the moment. I have a question about the feature called gbpm, I know what bpm is, but could not figure out what gbpm is. Can you inform me about that? Thanks a lot.

    ReplyDelete
    Replies
    1. Hello. I am not very careful with these internal labels since they only need to make sense to me. And there are some vestigial/superseded stats that for logistical reasons still get tracked & produced. In the data you are looking at, I believe "bpm" is the *original* version of Daniel Myers's BPM, and "gbpm" is the revised & updated version (BPM 2.0). I originally labeled it GBPM because when Daniel was developing it one of his main goals was to remove non-linear variables, which would make it suitable for use on small samples, including a single game (thus: "game box plus minus," GBPM).

      Delete
    2. Thanks a lot, you are the man.

      Delete
  53. Hi Bart, is there any way to get all the box score stats for a team for a season?

    ReplyDelete
    Replies
    1. I don't think I have anything team by team. Obviously there are many sites that publish the box scores.

      Delete
  54. Hi Bart, I am currently studying a Master of IT degree in New Zealand and would like to center my project around a machine learning model which attributes win shares to individual players. This is a purely academic non-profit project. Would it be possible to get access to college player statistics from 2005 to 2022? I would be happy to share my results with you when finished if you are interested. Thanks

    ReplyDelete
  55. Hi Bart. What is the difference between the Time Machine historical ratings and the historical ratings on the other pages (team table/customizable t rank)? I noticed the ratings are vastly different, so I'm just curious which one is most accurate. Also, those ratings are the ratings after the games have been played on those days, correct? Thanks.

    ReplyDelete
    Replies
    1. The time machine is an archive of the ratings as they were on that day (or, for older seasons, would have been on that day). So they incorporate the preseason prior and recency bias as of that day, for example. The main difference with the customizable t-rank is that it doesn't incorporate the preseason prior, so there will be vast differences when comparing early season results, and which one to use in that circumstance depends on what you're trying to do (e.g., if you're trying to see how accurate my ratings are early in the season, you want to use the time machine version, since that's what the ratings actually look like early in the season).

      There shouldn't be vast differences late in the season -- if you see something like that I would be interested in taking a look to make sure nothing has gone haywire.

      Delete
  56. Hello Bart, thank you for all the hard work you put into this! I am attempting to pull in a JSON file that I can work with for each team's schedule for the upcoming season. Unfortunately the platform I'm using (Glide Apps) has an issue with JSON files over 1MB in size, so your master sked won't work. Are there any other files that are more condensed, or separate schedule files for each individual team? Any ideas would be appreciated! Thank you in advance.

    ReplyDelete
    Replies
    1. Hi Colby. Hmm. It looks to me that all the master_sked files (both JSON and CSV) are significantly smaller than 1MB. I don't think I have any smaller schedule files. (The "super_sked" docs are much larger but they have a bunch of extraneous information.) I have recently come across this website that has a pretty simple list of games that may be easily parseable: http://basketball.kislanko.com/2022/schedules/index.html

      Delete
    2. Yep, I'm an idiot. haha I was trying to use the super_sked, not the master. However, I do have one other question. Is there any way to tell from your master_sked file whether a game is to be played at a Neutral site? I see that home and away teams are always in the same order, but I see no designation for neutral.

      Delete
  57. Mr. Torvik,

    Last year I was able to download the T Rank data in CSV format. I was just wondering when the 2023 data will be able to be downloaded, and how I can go about downloading this?

    Thanks for all the work you do.

    ReplyDelete
    Replies
    1. Should be working as far as I know. The team_results.csv file wasn't updating because I forgot to change a 2023 to 2024 but that's fixed now.

      Delete
  58. Hello sir,

    Which file will provide the strength of schedule data?

    Thank you.

    ReplyDelete
    Replies
    1. Hello, there is no handy separate file for SOS data but it is a simple page that you can just cut and copy the data table from (or pull using something like the Table Capture extension for Chrome) https://barttorvik.com/sos.php

      Delete
  59. Hey Bart,
    I am trying to pull this page: https://barttorvik.com/team-tables_each.php into Excel using Get Data from Web and pull it as a refreshable Web Query. I really enjoy your work and want to be able to pull it and have it refresh with updated data day by day. Do you know why Excel does not recognize this page as a table, and how I could pull it? I really appreciate it!

    (Sorry if this is a repost it does not seem that my first message published)

    ReplyDelete
    Replies
    1. Hello. Unfortunately the data in that table is populated dynamically using JavaScript so that may be why you can't easily pull it into Excel programmatically. But if you use something like the "table capture" Chrome extension it's easy to cut and paste the table into Excel. Also, you can download a csv directly by putting &csv=1 in as a URL parameter (e.g. team-tables_each.php?csv=1)

      Delete
  60. Hi Bart!

    Do you have a csv file that imitates your game stats page?

    ReplyDelete
  61. Hello Mr. Torvik,

    Is there a way to download the daily schedules as a CSV file? I tried adding &csv=1 to the end of https://barttorvik.com/schedule.php, but it redirects me back to the t-rank page.

    Thank you.

    ReplyDelete
    Replies
    1. Hello, not in that precise format. All the basic information is available for all days in the YEAR_super_sked.csv files though. Also, the schedule page is just one HTML table which is fairly trivial to cut and paste into excel (or use the table capture tool in chrome)

      Delete
  62. Hi, I visited your website for NCAA Basketball and I love it!

    One question I do have though is where do I find the data for the following:
    a.) the raw number of possessions that each team has for the past x games?
    b.) the pace of each team for the past x games?

    The website includes stats like Adjusted Offensive/Defensive Efficiency (whose equations incorporate the number of possessions that each teams has had) but I'm having trouble seeing the aforementioned stats above.

    ReplyDelete
  63. Hi Bart, truly great site! I love the Teamsheet Ranks in particular. However, it only goes back to 2019. Will those ranks be updated for years prior to 2019?

    ReplyDelete
    Replies
    1. Thanks! Teamsheets only goes back to 2019 because that was when the NET was introduced (along with inclusion of the other metrics on the actual team sheets) so not really possible for that page to go back any further.

      Delete
  64. Hi Richard. The version on the SOS page (for Elite) is for all scheduled games, in other words a projected strength of schedule for the season. The version on the team tables page shows just for games played (or the games in the filtered view).

    ReplyDelete
  65. Hi Bart, could you explain how you sort for record (Rec) on the main page and team table? It doesn't make sense to me.
    Thanks, Rick

    ReplyDelete
    Replies
    1. I believe it sorts by wins, but disregards non-D1 wins so can look off, especially at the bottom where non-D1 wins are included in the displayed total when you're looking at the full season ratings.

      Delete
  66. Hello Bart, Is there a way on your home page web address bar to select only a certain number of teams rather than the default list of all teams?

    ReplyDelete
  67. You can narrow by conference and a couple of other similar designations by that drop-down method. You can also choose a specific team, and if you do so you'll be given an option to compare to one other team. Beyond that, under the FAQ menu there's a "search/filter" input box and if you put in team names separated by the pipe character ("|") the table will narrow to just those teams (you may need to put a tilde ("~") before the search

    ReplyDelete
    Replies
    1. Hi Bart, I tried to do that while I was on the main page and entered: ~duke|villanova|kansas|north carolina Four teams popped up but they weren't these teams. lol I remember the first two: Arkansas and Arkansas Pine Bluff followed by two random teams. What am I doing wrong?

      Delete
  68. You should be able to just filter to NCAA-T teams in the conference selector.

    ReplyDelete
  69. Filtering for NCAA-T teams on the T-Ranks page; it seems that Games <> Wins + Losses for about 1/4 of the 68 teams -- is there a reason for that?

    ReplyDelete
    Replies
    1. Usually when something like that happens it's because non-D1 games are getting taken out of the W-L record. Just filtering for NCAA-T shouldn't do that, and it doesn't look like it does it for me. (Perhaps you have some other filter applied as well.)

      Delete
    2. Looking at it again, the games total just doesn't include non-d1 games but on the unfiltered views I put the non-d1 into the W-L record just to make it match NCAA.

      Delete
  70. Hello love your site! I appreciate you've made all this data so available. I've tried downloading the csv for college player data for 2006 and 2007 but it seems to download 2023 data instead. Do you have the college player data for 2006 or 2007?

    ReplyDelete
  71. Hi, I was wondering if you had recruiting data anywhere? It looks like player pages have a RecruiTRank number, but incoming freshmen don't seem to have accessible player pages. Thanks, love the site!

    ReplyDelete
  72. Hello, I don't really have any independent recruiting data. The "RecruiTRank" is mainly based off the 247 composite, so for incoming freshmen that's mainly what I'm relying on as well.

    ReplyDelete
  73. Hello, thanks for reaching out. During the offseason my site was at times overwhelmed with bots (mostly AI scrapers) so I took a look at the logs and noticed that that particular file was being downloaded hundreds of times a minute every hour (or something like that) by a Google Apps Script, so I blocked Google Apps Scripts. For instance, I just checked the logs and today a Google Apps Script attempted to access the 2024 mastersked file 811 times in less than three minutes.

    I'm guessing this is what affected you. Using a script to pull the data in that file occasionally--even like once an hour--is certainly fine (although it is more or less a static file that, once the season starts, is only going to be updated when the schedule changes, so not very often). But the hundreds of pulls per minute is the kind of thing that gets my attention. I'm sure this wasn't intentional, but if you could look into how you are pulling the data and see if you can streamline it that would probably be good. Feel free to DM on Twitter to see if we can get it worked out.

    ReplyDelete
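One way a script can stay within the "once an hour is fine" guidance above is to keep a local copy and only re-download when that copy is older than an hour. The sketch below is an illustration under that assumption: the function names are hypothetical, and the URL and path are left as parameters because the exact file name is not spelled out in the comment.

```python
import os
import time
import urllib.request


def is_stale(path, max_age_seconds=3600.0, now=None):
    """True when the local copy is missing or older than max_age_seconds."""
    if not os.path.exists(path):
        return True
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path) > max_age_seconds


def fetch_cached(url, path, max_age_seconds=3600.0):
    """Return the file contents, downloading only when the copy is stale."""
    if is_stale(path, max_age_seconds):
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
        with open(path, "wb") as fh:
            fh.write(data)
    with open(path, "rb") as fh:
        return fh.read()
```

Every part of a script that needs the schedule can call fetch_cached freely; at most one request per hour reaches the server.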
  74. Hi there. Is there a way to get the rosters data here: https://barttorvik.com/allrosters24.php

    I've tried the csv trick but couldn't figure it out. I'm trying to get full rosters for all teams. Thank you.

    ReplyDelete
    Replies
    1. I don't exactly track or publish rosters, per se, just the top 10 projected contributors for each team. But if you add "?full=1" to that URL you'll get a table with all the projections for every player that you can easily copy and paste into a spreadsheet.

      Delete
    2. my man! top ten contributors is perfect. tyvm!

      Delete
  75. Hi Bart, Do you store box score stats for each game? I am looking to access playing time data for each player for each game.

    ReplyDelete
    Replies
    1. Player game data is available in the YEAR_all_advgames.json.gz (compressed json) file

      Delete
  76. Hi Bart - love this site, thanks for all your work on it. I was wondering whether there's anywhere we can find game-level adjusted tempo data. On the team pages I see the box on the team pages with its adjusted tempo for the season, but the table doesn't include values for the individual games. Thanks!

    ReplyDelete
  77. Hello Bart. Thank you for all of this! I am currently working on a project where I am using all player game data. I found the data in the YEAR_all_advgames.json.gz but was wondering if you could send a column header for the data set because I am not sure about some of the data points. Thank you again!

    ReplyDelete
    Replies
    1. hopefully this helps

      numdate datetext opstyle quality win1 opponent muid win2 Min_per ORtg Usage eFG TS_per ORB_per DRB_per AST_per TO_per dunksmade dunksatt rimmade rimatt midmade midatt twoPM twoPA TPM TPA FTM FTA bpm_rd Obpm Dbpm bpm_net pts ORB DRB AST TOV STL BLK stl_per blk_per PF possessions bpm sbpm loc tt pp inches cls pid year

      Delete
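A sketch of attaching the header names listed above to the rows of a YEAR_all_advgames.json.gz file. It assumes the decompressed file is a JSON array of arrays, one per player-game, with values in the same order as the headers; that layout is an assumption, not something confirmed in the thread, and the function name is hypothetical.

```python
import gzip
import json

# Header names copied from the reply above, in the order given.
HEADERS = (
    "numdate datetext opstyle quality win1 opponent muid win2 Min_per "
    "ORtg Usage eFG TS_per ORB_per DRB_per AST_per TO_per dunksmade "
    "dunksatt rimmade rimatt midmade midatt twoPM twoPA TPM TPA FTM FTA "
    "bpm_rd Obpm Dbpm bpm_net pts ORB DRB AST TOV STL BLK stl_per "
    "blk_per PF possessions bpm sbpm loc tt pp inches cls pid year"
).split()


def label_rows(gz_bytes):
    """Decode gzip-compressed JSON rows into one dict per player-game."""
    rows = json.loads(gzip.decompress(gz_bytes).decode("utf-8"))
    labeled = []
    for row in rows:
        # Fail loudly rather than silently mislabel a shorter or longer row.
        if len(row) != len(HEADERS):
            raise ValueError(
                f"expected {len(HEADERS)} values, got {len(row)}"
            )
        labeled.append(dict(zip(HEADERS, row)))
    return labeled
```

The length check matters because a row with a different number of values would otherwise be labeled with the wrong names.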
  78. Hey Bart. I was wondering if you would be able to help with the column headers for your timemachine/team_results/YYYYMMDD_team_results.json.gz files. I can get about half the data points but I have not been able to figure out most of the middle data points. Thanks!

    ReplyDelete
    Replies
    1. A lot of them are probably not real useful but here's a key, I think:

      ['rank','team','conf','record','adjoe',"oe Rank",'adjde','de Rank','barthag','rank', #0-9
      'proj. W','Proj. L','Pro Con W',"Pro Con L",'Con Rec.', #10-14
      'sos','ncsos','consos','Proj. SOS','Proj. Noncon SOS','Proj. Con SOS', #15-20
      'elite SOS', 'elite noncon SOS', 'Opp OE', "Opp DE",'Opp Proj. OE', #21-25
      'Opp Proj DE', 'Con Adj OE', "Con Adj DE", 'Qual O', 'Qual D', #26-30
      'Qual Barthag', 'Qual Games', 'FUN', 'ConPF', 'ConPA', 'ConPoss','ConOE','ConDE', #31-38
      'ConSOSRemain','Conf Win%','WAB','WAB Rk','Fun Rk','adjt'] #39-44

      Delete
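A sketch of using the key above to label one row of a time machine team_results file. Two assumptions are built in: the last entry of the key is read as two names, 'Fun Rk' and 'adjt', which is what the #39-44 index comment implies, and each row is taken to be a flat list of 45 values. Because the key lists 'rank' twice (positions 0 and 9), the second occurrence is renamed so it does not overwrite the first; the helper names are hypothetical.

```python
COLUMNS = [
    'rank', 'team', 'conf', 'record', 'adjoe', 'oe Rank', 'adjde', 'de Rank', 'barthag', 'rank',  # 0-9
    'proj. W', 'Proj. L', 'Pro Con W', 'Pro Con L', 'Con Rec.',  # 10-14
    'sos', 'ncsos', 'consos', 'Proj. SOS', 'Proj. Noncon SOS', 'Proj. Con SOS',  # 15-20
    'elite SOS', 'elite noncon SOS', 'Opp OE', 'Opp DE', 'Opp Proj. OE',  # 21-25
    'Opp Proj DE', 'Con Adj OE', 'Con Adj DE', 'Qual O', 'Qual D',  # 26-30
    'Qual Barthag', 'Qual Games', 'FUN', 'ConPF', 'ConPA', 'ConPoss', 'ConOE', 'ConDE',  # 31-38
    'ConSOSRemain', 'Conf Win%', 'WAB', 'WAB Rk', 'Fun Rk', 'adjt',  # 39-44
]


def unique_names(names):
    """Suffix repeated names (rank, rank_2, ...) so dict keys do not collide."""
    seen = {}
    result = []
    for name in names:
        seen[name] = seen.get(name, 0) + 1
        result.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return result


def label_row(row):
    """Pair one 45-value row with the de-duplicated column names."""
    names = unique_names(COLUMNS)
    if len(row) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(row)}")
    return dict(zip(names, row))
```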
  79. Is there a way to download the scheduled games directly into a .csv Excel file? I use this, compiled with some other info for personal bets. The only way I have been able to do it is to copy and paste into Excel, but it takes hours to make the data useable. P.S. Thank you for building this, it has saved me in the past, helped me, and forced me to dig into statistics and data more than I ever would without sports. You rock.

    ReplyDelete
    Replies
    1. Thanks! That page is created from other data but I believe it's all available (at the season level) in the YEAR_super_sked.csv file

      Delete
  80. Have a question about player stats. Trying to do a project using stats across several years, but the player stat csv files seem inconsistent with what columns are being used. Just curious if there's a quick/easy way to get player data from several different years that follow the same format data-wise? Thanks for all the data and the help

    ReplyDelete
    Replies
    1. My guess is that you're running into an issue with the data from 2008 and 2009, which doesn't include play-by-play derived data. See comment from June 6, 2022 at 4:09 PM

      Delete
  81. This is just fantastic data. Wondering if there's any way to get data going back to 2000? I'm looking for player seasonal totals if possible. Thanks!

    ReplyDelete
    Replies
    1. Unfortunately not, all the data I have is on the site and I have no plans to go back further.

      Delete
  82. Hey Bart, would it be possible to get an export to excel feature for the teamsheet ranks page in the future? or if something of this nature already exists where would that be? would be very helpful for automation of updating my bracketology spreadsheet, rather than manually updating the predictive and resume averages, q1a and q1+q2

    ReplyDelete
    Replies
    1. Don't have them in csv but there are files for the teamsheet ranks at teamsheets.json and for the quadrants (in a weirdly organized format) at columns_now.json ... A little information on how to decipher that file is in the comments below.

      Delete
  83. Hey Bart,

    I am doing a data project and was trying to download all of the data from the player stats tab. I have tried all of the different combinations of, .csv, csv=1, etc. and was unable to get the data in csv format. Is there a way to fix this?

    ReplyDelete
    Replies
    1. Hello, see information below in my reply to a comment on November 5, 2020 at 9:59 AM (can control-f for "player stats")

      Delete
  84. Hi Bart, I love the site! Lots of good information especially combined with Kenpom.
    Is the t-rank page downloadable as csv or excel? There are some things that are tough to clean up, like having the scheduled game as part of the row for the team data, or having the averages that show up as part of the header row actually be their own row rather than part of the header. Basically, a simplified version of the HTML page to make the data easier to consume in conjunction with other inputs like Kenpom.

    ReplyDelete
    Replies
    1. Yes, add "&csv=1" to the URL parameters (more info in comments below)

      Delete
    2. Thanks, my screen didn't scroll all the way down to the real "first" comments so I missed some stuff.

      Delete