Monday, July 31, 2017

Badgers' Big Ten schedule

The Big Ten finally released its conference basketball schedule today—well, at least the pairings. Here's the Badgers' schedule, ordered by Torvik Thrill Quotient:

 

Based on the current preseason T-Rank projections, such as they are, it's the 4th toughest conference slate, and T-Rank projects the Badgers to go 9-9 and tie for 5th place. 

Here's a more subjective take:

Big favorite:

IU
OSU
Illinois
Nebraska

Slight Favorite:

at Rutgers
at Nebraska
Michigan
Minnesota
Northwestern

Pick em:

Purdue
at Illinois
at Penn St.

Slight Underdog:

at Iowa
MSU
at Maryland
at Northwestern

Book an L:

at MSU
at Purdue

Based on this, we've got four games in the "should win," two games in the "won't win," and 12 games that could easily go either way. Based on that, I'd say 10 wins is the target. That would likely be enough to get into the tournament, and could well extend the "top 4" streak. If I were setting an over/under, I'd probably go with 9.5.
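One way to sanity-check that target is to treat the three buckets as rough win probabilities. The weights below (sure win, coin flip, sure loss) are illustrative assumptions, not T-Rank output:

```python
# Bucket sizes come from the lists above; the win probabilities are assumed.
buckets = {
    "should win": (4, 1.0),
    "could go either way": (12, 0.5),
    "won't win": (2, 0.0),
}

total_games = sum(games for games, _ in buckets.values())
expected_wins = sum(games * win_prob for games, win_prob in buckets.values())

print(total_games, expected_wins)  # 18 10.0
```

Nudge the coin-flip weight a little in either direction and you land right around that 9.5 over/under.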

What do you think, Chorlton?

Saturday, July 22, 2017

How I built a (crappy) basketball win probability model

The Internet is amazing. Given that I'm a philosophy major / lawyer, I really have no knowledge or skills whatsoever. But I've been able to put together the T-Rank website by just asking the Internet how to do it. Every time I run into a problem, I just ask google how to solve it. Usually this leads me to more or less step-by-step instructions on how to solve the problem, or at least gives me enough information to figure it out.

There has been one big exception to this: for a while I have wanted to see if I could use play-by-play data to build a "win probability model." Not a good win probability model, just a sort of functioning win probability model. Not for any good reason, just because. But my google searches came up empty. Not only were there no step-by-step guides, there weren't even any general instructions to set me down the right path. I was lost. Harumphf.

Nevertheless, I persisted. Although I have no independent "knowledge" or "skills" I do have an epidemiologist wife who does, so one day while we were waiting in the office of a pediatric specialist (don't worry, it was a nothingburger) I ran the problem by her. I had learned just enough by then to explain the problem somewhat coherently, and she was able to set me down the right path.

Now I thought I would fill the void and put a step-by-step guide onto the Internet, mainly so future lawyers can build their own deeply flawed basketball win probability models. Also, if anybody who knows something about this stuff wants to help me improve this, just because, that would be terrific.

Step 1: Get the data.

Okay, I'm not going to walk you through this part, but obviously if you're going to use play-by-play data to make a win probability model, you need play-by-play data. Luckily, there is play-by-play data on various websites, and using google and a little pluck you can figure out how to get it. I started acquiring this data last season to calculate the "game script" average lead/deficit stat. Unfortunately, to get a complete set of data I had to use three different sources, which leads to some problems later on... (This is not just a step-by-step guide, it's also kind of a suspense novel.)

Step 2: Make the model.

Now the easy part: make your model. Done! Thanks for reading my guide.

Now you feel my pain, folks.

Step 2, actually: Figure out what kind of model you need.

This is the point when you have to figure out what you're going to do with the data. What I eventually figured out is that I was going to use the data to run a "logistic regression." As I understand it (and really, I don't understand it), you can use this statistical method to take various variables (like score, time left, strength of teams) and predict the likelihood of another variable (win or loss, 1 or 0).

It's one thing to know that you need to "run a logistic regression" and quite another to actually do it. As we'll see below.

Step 3: Get PBP data ("training data") into usable form.

Here's what I did: I went through (most of) the play-by-play data I have for the past two seasons, and for every second of those games I recorded the following data:

1. Seconds remaining
2. Score difference (team 2 score minus team 1 score)
3. team 2 initial expected win percentage (based on T-Rank)
4. who won (team 2 win = 1, team 2 loss = 0)

I actually recorded some more data, but this is what I ended up using for the current model.
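For the curious, here is a minimal sketch of what "one row per second" might look like. It is not my actual parsing code: the event format (a list of score changes tagged with seconds remaining) and the function name are assumptions for illustration, and it only covers regulation.

```python
GAME_SECONDS = 2400  # 40-minute college game, regulation only


def rows_for_game(events, team2_pregame_win_pct, team2_won):
    """Expand score changes into one training row per second remaining.

    events: list of (seconds_remaining, team1_score, team2_score) tuples,
            one per score change (a hypothetical format).
    Returns rows of (seconds_remaining, score_diff, pregame_win_pct, won).
    """
    # Put the earliest event (most time remaining) first.
    events = sorted(events, key=lambda e: -e[0])
    rows = []
    team1_score = team2_score = 0
    next_event = 0
    for seconds_remaining in range(GAME_SECONDS - 1, -1, -1):
        # Apply every score change that has happened by this second.
        while next_event < len(events) and events[next_event][0] >= seconds_remaining:
            _, team1_score, team2_score = events[next_event]
            next_event += 1
        rows.append((
            seconds_remaining,
            team2_score - team1_score,  # team 2 score minus team 1 score
            team2_pregame_win_pct,
            1 if team2_won else 0,
        ))
    return rows
```

Every game turns into 2,400 rows this way, which is why the training set gets big in a hurry.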

One thing you might notice that's missing: who has the ball. This is a big flaw in my model, and I'll discuss it more below. (Suspense!)  But for the moment I'll just say that although this is a big flaw, I don't think it really makes much difference until the last two minutes.

Another thing that's missing is home court. This is another thing I left out mainly because it was kind of a pain to figure out based on the PBP data. But, also, home-court advantage is already built in to the third variable (expected win%), so there could be kind of a double-counting problem if I included it separately. I dunno, gimme a break.

Step 4: Run the logistic regression for each second

This might not be the best way, but what I did is run a logistic regression for each value of "seconds remaining" (2399, 2398, ... 2, 1, 0), with score difference and initial win percentage as the variables for predicting win/loss (I don't know the proper nomenclature for discussing regression, so bear with me).

Originally I ran a single regression with time remaining as another one of the variables, but the results were unsatisfactory, particularly at the margins. For example, it was obviously wrong very early in the game — I think because linearity was being imposed, but not sure. Anyhow, running it for every second worked out pretty nicely.

As I mentioned above, saying "run a logistic regression" and actually doing it are different things, so here's how I did it: I used Python, a programming language, which has a module for doing this called LogisticRegression. Here's a link to the code.
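If the link isn't handy, here is a rough sketch of the per-second idea. To be clear, this is not the linked code: to keep it dependency-free it swaps in a hand-rolled gradient descent fit in place of LogisticRegression, and the function names, step count, and learning rate are all assumptions. A plain-Python loop like this would also be painfully slow on the full data set.

```python
import math
from collections import defaultdict


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(samples, steps=2000, learning_rate=0.05):
    """Fit win ~ score_diff + pregame_win_pct by plain gradient descent.

    samples: list of (score_diff, pregame_win_pct, won) tuples.
    Returns (intercept, score_coef, pregame_coef).
    """
    b0 = b1 = b2 = 0.0
    n = len(samples)
    for _ in range(steps):
        g0 = g1 = g2 = 0.0
        for score_diff, pregame, won in samples:
            # Gradient of the log loss: (predicted - actual) times each input.
            error = sigmoid(b0 + b1 * score_diff + b2 * pregame) - won
            g0 += error
            g1 += error * score_diff
            g2 += error * pregame
        b0 -= learning_rate * g0 / n
        b1 -= learning_rate * g1 / n
        b2 -= learning_rate * g2 / n
    return b0, b1, b2


def fit_per_second(rows):
    """Fit one model per 'seconds remaining' value.

    rows: (seconds_remaining, score_diff, pregame_win_pct, won) tuples.
    Returns {seconds_remaining: (intercept, score_coef, pregame_coef)}.
    """
    by_second = defaultdict(list)
    for seconds_remaining, score_diff, pregame, won in rows:
        by_second[seconds_remaining].append((score_diff, pregame, won))
    return {sec: fit_logistic(samples) for sec, samples in by_second.items()}
```

The point of the dictionary is that time never appears inside any single regression. Each second gets its own coefficients, so nothing forces the effect of a lead to change in a straight line as the clock runs down.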

Step 5: Test it, see if it passes the smell test

The result of this model is that you can plug in "seconds remaining" (to get the right model), score (in the form of score differential), and initial win percentage expected to get an expected win probability.
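As a sketch of what "plugging in" means, assume each second's model boils down to three numbers: an intercept, a score coefficient, and an initial-win-percentage coefficient. The coefficients below are made up for illustration and are not the fitted values from my model:

```python
import math


def win_probability(models, seconds_remaining, score_diff, pregame_win_pct):
    """Look up the model for this second and turn its output into a probability.

    models: {seconds_remaining: (intercept, score_coef, pregame_coef)}
    """
    intercept, score_coef, pregame_coef = models[seconds_remaining]
    z = intercept + score_coef * score_diff + pregame_coef * pregame_win_pct
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, for illustration only.
example_models = {
    1200: (-1.0, 0.15, 2.0),  # halftime
    60: (-0.5, 0.60, 1.0),    # one minute left
}

halftime = win_probability(example_models, 1200, 4, 0.5)
late = win_probability(example_models, 60, 4, 0.5)
print(round(halftime, 3), round(late, 3))
```

With numbers shaped like these, the same four-point lead is worth much more with a minute left than at halftime, which is the basic behavior you want out of any win probability model.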

For example, here's the result using Minnesota versus Middle Tennessee in last year's NCAA tourney:
For comparison, here's the Kenpom win probability graphic for that game:


Hey, not bad!

Step 6: self-loathing

Based on comparisons like the above, I'm satisfied that the model is "good enough for hobby work." But I'm also aware that the model is flawed. I took shortcuts along the way because I was just trying to see if I could get it to work. Then once I tested it and saw that it worked reasonably well, I had very little desire to perfect it. This serves no purpose and shouldn't be relied on. :(

As mentioned above, a core flaw here is that possession is not part of this model. I actually did subsequently attempt to add possession to the model, but the results were screwy. The core problem, I think, is that I'm not parsing the PBP data correctly for possession. This goes back to the fact that when I originally acquired the data I stripped out some useful info when I saved it. It's not impossible to deduce possession from what I've got, but it's not simple either. In the end, I'm not confident that my parsing was 100% accurate, and I think that led the model-with-possessions to be unstable. 

The second problem, I suspect, is that I'm not using enough data to include possession. I'm training the model with only about 10,000 games. Adding a possession variable slices the data another way which I suspect adds some craziness.

As you can see above, though, the lack of a possession variable usually doesn't matter much. It's instructive to look at the scoreless stretch starting at the 14:00 mark of the first half. In my model, that scoreless stretch is more or less a straight line, since score is really the only variable that affects things much. In the Kenpom model, there are noticeable squiggles as possession changes hands. But, the squiggles are pretty small -- looks like about two percent change in win probability. So my model is presumably cutting that in half and is "wrong" by +/- one percent for most of the game.

Of course, this will have a big effect in late game scenarios. If you're down two with ten seconds left, whether or not you have the ball makes a big difference. My model is significantly wrong in those end-game scenarios, but based on my experimentation it still gets the gist: the team down two with ten seconds left is very likely to lose whether or not it has the ball.

Conclusion

There you have it, googlers, that's how I built an obviously flawed basketball win probability model. May you have better ideas and more energy!

Sunday, March 12, 2017

T-Ranketology note

Well, it's Selection Sunday and the current version of the T-Ranketology algorithm has the same at-large field as the consensus (of which it is a small part) over at Bracket Matrix.

Mission accomplished.

I say this because the point of T-Ranketology isn't to try to predict the most accurate bracket on Selection Sunday. The point is to project a reasonably plausible bracket earlier in the season, so that we can see where things are reasonably likely to end up if teams keep performing like they have been. That T-Ranketology is able to basically produce the consensus field on Selection Sunday, with most teams seeded within one line of the consensus, shows that it is "good enough" to provide those useful projections earlier in the season.

ADDED:

Should note somewhere, so might as well be here, that I added one tweak to the algorithm on Selection Sunday: a good record bonus. One of the notable things about the bracket the algorithm was producing over the past few weeks was that it was down on the three Pac-12 teams. It seemed to me that this was probably a result of the fact that those teams just had really great records in their so-so conference. Whatever you want to say about the Pac-12, it's just hard not to be impressed by a team that's 29-4 or 30-4.

But I resisted adding a good record bonus—until Gonzaga fell off the one-line. I thought it was pretty clear that Gonzaga was going to get a one seed. The only drama on the one line was whether Duke or UNC would get the ACC's slot. After winning the ACC tournament, Duke did indeed sneak onto T-Ranketology's one-line—but at the expense of Gonzaga, not UNC.

So I pulled the trigger on a record bonus (really a "few losses bonus"): teams got one point subtracted from their score for each loss under five. In other words, four-loss teams (like Arizona and UCLA) got one point subtracted, and teams with one loss (Gonzaga) got four points subtracted. This not only got Gonzaga back on the one line, but it was also just enough to push Arizona onto the two line, which was pretty clearly where it was going to end up.
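In code, the tweak amounts to something like the sketch below. The function names and the sample score are made up for illustration; the only part taken from the algorithm is "one point off per loss under five," which implies a lower score is better.

```python
def few_losses_bonus(losses, threshold=5):
    """Points subtracted from a team's score: one per loss under the threshold."""
    return max(0, threshold - losses)


def adjusted_score(score, losses):
    # Lower is assumed to be better, so the bonus is subtracted.
    return score - few_losses_bonus(losses)


print(few_losses_bonus(4))  # 1  (four-loss teams like Arizona and UCLA)
print(few_losses_bonus(1))  # 4  (one-loss Gonzaga)
print(few_losses_bonus(7))  # 0  (five or more losses: no bonus)
```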

The result was that T-Ranketology was one of the few brackets to nail both the ones and the twos. I'll take it.

Wednesday, March 8, 2017

Big Ten Tourney madness

I think Chorlton is on a cruise in the Caribbean (oh, to be childless), so it looks like he'll lose this year's Big Ten Tourney Challenge by default. Nonetheless, I'm about to spend my requisite 20 seconds thinking about this and make my picks.

Before we start, here are the current T-Rank odds, first assuming no home-court advantage for Maryland:


Now, if we give Maryland a one-half home-court advantage:


Play-in games:

Ohio State over Rutgers
Nebraska over Penn State

Second round:
Nebraska over Michigan St.
Northwestern over Ohio State
Iowa over Indiana
Michigan over Illinois

I'll be rooting for either Nebraska or Penn St. to beat Michigan St. so they have to sweat things out a little on Selection Sunday. Although MSU will probably win this game, I don't have a good feeling that they have a run in them, so I'm taking them out early for funsies.

I'd like to root against Northwestern as well, mainly so their fans go through the ultimate Selection Sunday Experience (one way or the other) but when I search my soul I find that I just do not have it in me.

Quarters
Michigan over Purdue
Wisconsin over Iowa
Maryland over Northwestern
Minnesota over Nebraska

My earlier upset is robbing us of a third Minnesota / MSU game, which would be interesting if it happens. Michigan vs. Purdue is probably the game I most want to happen, since we just saw Michigan's spread-offense attack pick Purdue apart—will Purdue be able to adjust? Or will Michigan just not hit shots this time? In any event, Michigan seems like a bad match up for Purdue, and it's a tough draw for the 1-seed in its opening round game (Michigan is actually the third best Big Ten team in terms of adjusted efficiency, and was second-best in conference play).

Badgers would love to get Iowa again, I think, and it's not a team I see them losing to twice in short succession.

Semis
Michigan over Minnesota
Wisconsin over Maryland

Champs
Michigan over Wisconsin

My pick of Wisconsin to the final is pure homer, but I would love to see another Wisconsin-Michigan game. They've played two really tight, interesting games this year, and the Wagner - Happ battles have been great.

There you have it, that's how it's going down. Chorlton, if you're able to rouse yourself from your quarters and shake off the piña colada haze, put your picks in the comments.

Sunday, February 26, 2017

Good training

I found this ad weird. I guess Karl Anthony is not living in the past. He is a very successful pro. Good to know he is still working hard to try and beat badgers.

https://youtu.be/njj7C2Lr1Zk

Wednesday, February 15, 2017

Anatomy of a Loss: Northwestern

About eight minutes into the Northwestern game, I was pretty sure Wisconsin was going to win. Charlie Thomas had just hit a three, and the Badgers were up 14-6. Northwestern had been garbage on offense, relying on floaters and whatnot, which were predictably missing.

The Badgers would go on to score just 8 points on their next 19 possessions, and head into halftime down 31-22 thanks to a three-point barrage from Northwestern (including one lucky bank shot). How did this happen?

The main narrative coming out of the game was that Northwestern's aggressive double-teaming of Ethan Happ shut the Badgers down. This is true in a sense, but the real truth is that the Badgers just didn't make Northwestern pay. Happ was good, really good, the rest of the first half in handling the double teams. Problem was, the rest of the team did nothing. Let's break it down.

Possession number 1: Now up 14-8, Trice tries an ill-advised drive and misses badly. Vitto Brown corrals the long rebound, however. Eventually he gets an open 3:



Possession number 2: After another McIntosh miss, Charlie Thomas passes cross-court to Koenig for a relatively open three, which he misses. Note that Thomas's pass is low, which throws Koenig out of rhythm:


Possession number 3: The Badgers still lead 14-8 after McIntosh misses again. Happ is back in the game and gets aggressively doubled. He finds Koenig cross-court, again he misses. Again, the pass left something to be desired, so you could credit the double team for that.


Possession number 4: Hayes gets called for a ridiculous phantom double dribble after the post-entry pass is deflected. Inexcusably bad officiating. The kind of stuff that makes you embarrassed to be a fan of the sport.

Possession number 5: Another open 3 for Brown, this time he gets the shooter's roll.


Possession number 6: One of the evening's more depressing possessions, as Koenig fumbles a rebound out of bounds. I don't know if this counted as an official turnover or just a team rebound for Northwestern, but it was a harbinger of things to come.


Possession number 7: Another turnover, this time Showalter trying to find a cutting Hayes.


Possession number 8: This will be the only shot Ethan Happ takes in this slideshow, and it's from the top of the key. He had the line, I guess.


Possession number 9: With the shot clock running down, Nigel pulls up for a long two and banks it in. Not pretty. But note that the Badgers have now extended their lead to 7 points while scoring 5 points in 9 possessions. Still had to feel pretty good at this point, as Northwestern had scored just 12 points in 12 minutes, and you had to figure Wisconsin would snap out of their doldrums soon.


Possession number 10: After a Northwestern 3, Ethan Happ turns it over trying to find Brown before the double team arrives.


Possession number 11: This was when the game really turned. On their possession, Northwestern hit a tough two point jumper and Showalter was called for a foul for boxing out too hard on the made basket. This is a call that is pretty much never made, particularly against a home team. Really weird. So Northwestern got a make-it-take-it possession, and capitalized with another bucket to tie the game after a 4-point possession. How do the Badgers attempt to take back control? Charlie Thomas in the post.


Possession number 12: Game still tied, but Northwestern played great D on this possession (with Koenig and Happ resting) and Trice turns it over late in the shot clock.


Possession number 13: Now things are starting to slip away, as Northwestern hits another three and the Badgers again turn to Charlie Thomas to stem the tide. Alas, no.


Possession number 14: After another Northwestern 3, the Badgers again get nothing on offense. Hayes tries to create off the dribble at the end of the shot clock, but mostly is trying to draw a foul. Fruitlessly, it turns out.


Possession number 15: Northwestern has now hit three straight threes and scored 16 points on 6 possessions to take a 28-19 lead. Badgers get an open three for Showalter but he comes up short.


Possession number 16: Happ beats a weak double team and finds Showalter for another open 3, this time he nails it.


Possession number 17: Happ finds Koenig for a nice open 3, but Bronson just doesn't have it tonight.


Possession number 18: Happ again finds an open man, Hayes, but he can't score.


Interlude:

This is the moment when it started to seem pretty likely the Badgers would lose. Northwestern goes "2 for 1" at the end of the half, which consists of McIntosh chucking up a prayer, that gets answered by the backboard. Backbreaker. 


Possession number 19: The final possession of this sad stretch. And so it ends, not with a bang, but with a whimper.


Here's one fun thing that happened during this stretch:




Saturday, February 11, 2017

Simulation Saturday Preview!!

Edited: Crap, I forgot Louisville.

Today, FOR THE FIRST TIME EVER, the NCAA will give us an early look at how the top four seed lines would look if the season ended today.  I call this "Simulation Saturday" (as opposed to the actual "Selection Sunday") and you should too. I'm getting a trademark, probably.

It will be kind of interesting to see how this shakes out. Without the benefit of full conference seasons and conference tournaments, things are really up in the air, and this is obviously a pointless made-for-tv exercise. But that's true for all sporting events.

The 1-Seeds


Right now the clear consensus among bracketeers is that Kansas and Baylor are both deserving of a 1-seed. But I think there's also a general consensus that the Committee will probably not actually award two 1-seeds to the Big 12 on Selection Sunday (unless there's absolutely no other choice). Instead, it seems likely that the leader of the ACC—whoever that turns out to be—will be given the fourth one seed (presuming that Gonzaga and Villanova are the other two).

Right now North Carolina, Louisville, Virginia, and Florida State could all make a case for being that team. (And Duke, I guess...) But things haven't shaken themselves out yet, and at this moment picking any one of those teams for the 1-line would really be an exercise in predicting which one of them comes out on top in the ACC.

So for Simulation Saturday, will the committee grant the 1-seeds based on current resumes, or will it bend its processes a bit so that this ends up looking more like the final product?

I'm predicting they will go with current resumes. The main rationale put out for this exercise is to provide a little "transparency" into the selection process, so I think they'll want to come out armed with their "record against RPI top 50" and "conference affiliation is never mentioned in that room" factoids. Accordingly, I predict the 1 seeds will be:

Gonzaga
Villanova
Baylor
Kansas

The 2-seeds


One of the reasons I think they'll stick to the script on 1-seeds is that there's no clear leader for elevation among the potential 2-seeds. 

You've got the four ACC teams mentioned above, but right now they haven't really differentiated themselves. 

There are three Pac-12 teams arguably in contention—Oregon, Arizona, and UCLA—but all of them lack the magical "top 50 RPI wins" that the committee loves so damn much. (More on this later.) 

Wisconsin is sitting at No. 7 in the AP poll, so you might think they'd be in the mix. But they also lack the magical top 50 wins, and will likely be punished for it. 

Two teams from the SEC, Florida and Kentucky, should make the top 16, but they're both long shots for even the two line at this point. 

One dark horse is Butler, which is up there with Baylor for most impressive resume. For example, they are an incredible 12-3 in tournament quality tests (similar to Kenpom "Category A" games), which is three more wins of that type than anyone else. Unfortunately for them, this doesn't quite translate into the committee's stupid "magic top 50 RPI" wins, where they are a mere 7-2. The committee won't pay much attention to what are actually very impressive wins like at Marquette, at Georgetown, at Utah, vs. Indiana (neutral). Butler also has lost two of three, and it has two "bad losses" (at St. John's and at Indiana State) that are always hard to evaluate. Still, Butler's 7-2 against the top 50 is pretty good, and I think there's a chance they show up higher than people are thinking.

Another contender purely on the numbers is Creighton. But they've been on somewhat of a slide recently, corresponding to their loss of Maurice Watson for the season. I think they'll be dropped at least for purposes of this exercise and used as a talking point.

Although it would be defensible, I don't think the committee is going to come out with four ACC teams on the two-line. But North Carolina and Florida State are probably locks for a 2. Louisville is only 3-5 against the RPI top 50, so I think they'll be demoted.  So, after all that, here's my guess:

North Carolina
Florida State
Virginia
Florida

That final spot on the 2-line is really hard. I think the main contenders are Florida, Kentucky, Arizona, and Oregon. (With a possible "popularity contest" slot for UCLA.) Arizona and Oregon are only in contention because it seems like the Pac-12 should get a 2, but their "resumes" (as traditionally defined by the committee, anyway) are lacking. Florida has only 4 top 50 RPI wins, but it is No. 7 in the RPI and boasts the 6th hardest non-conference SOS (though this is somewhat juiced by "neutral court" games played around Florida while their arena was being renovated). So I'm just taking a wild stab that Florida will be elevated to allow for the talking point of rewarding a tough non-conference schedule.

The 3-Seeds


Let's talk about the Pac-12. It has three really good teams: Arizona, Oregon, and UCLA. It has one other likely tourney team, USC, one bubble team, Cal, and one or two other okay teams (Utah and Colorado?) The rest of the conference is sort of like a mid-major division. As a result, the Big Three are lacking in quality wins, or even opportunities for quality wins. They are also at risk for some "bad losses" when they play the mid-major division on the road.

I'm seeing UCLA on the two-line in some places, and I'm a bit mystified. They are 3-3 against the magic top 50, 21st overall in the raw RPI, with the 280th ranked non-conference schedule. You can obviously make an argument that UCLA is a really good team, but not using objective metrics normally cited by the committee. I'm predicting they'll be demoted and used as a talking point.

Arizona and Oregon are slightly better than UCLA on the top-50 metric, and vastly better on non-con SOS. I think they'll be here on the three line, but could really see either of them anywhere between 2 and 4.

Louisville
Kentucky
Arizona
Oregon
Butler

The 4-Seeds


The Big Ten has a problem similar to the Pac-12's, in that there are few opportunities for magic top-50 wins. The bottom of the conference is much better than the bottom of the Pac-12, but that typically doesn't matter much to the committee. So Wisconsin, with its 2-3 record against the top-50 and 246th rated non-con SOS, will likely be relegated to the 4-line for now—at best. Wisconsin might also suffer from application of the "eye test" given its recent inability to dominate inferior opponents.

Besides the teams already mentioned but not placed (UCLA, Creighton, Wisconsin), other possible contenders for the four-line are Cincinnati (great record, but lacking magic top-50 wins), West Virginia (great team, lacking some of the stuff the committee likes), Duke, and Purdue. Possibly Xavier, I guess. Can make a case for any of these teams, obviously, but here's my guess:

Butler
Duke
Cincinnati
Wisconsin
UCLA

I'm betting Duke will get a Coach K bonus, and that UCLA will get a Hollywood bonus.

Edited to add: in my morning haze, I forgot to put Louisville in there on the three line when I demoted them from the two. Of the original fours, I'm demoting Cincinnati just cuz.




Saturday, January 28, 2017

Head Games

Daylight turns into night
We try and find the answer, but it's nowhere in sight
It's always the same, and you know who's to blame
You know what I'm saying, still we keep on playing
Head games

First of all, Foreigner Rocks. 

Second of all, I hate the stupid NBA 3 point line being on the court in college games. UW kids can’t handle the head games. College kids are either too stupid, or too egomaniacal to pretend like it isn’t there. All it takes is the presence of that line, and they feel like they have to shoot the ball from NBA range.

By my official count, the Badgers launched 18 of their 25 3PA from NBA range or with their feet on the NBA line. They shot 4 more from well behind the college line, and only 3 from what would be considered a normal college 3. I’m not totally making this up. I went through my DVR of the game and took photos of every 3 point shot. Most look like this:



Or this:


Or this:


Or this:


Or this:


I won’t put up every photo, but there were way too many long threes, especially for a team that only made 3 of 25. UW eventually stopped messing around, launching 0 3PA in overtime, while they made 5-6 FG.

One more thought from an ugly game I almost wish I didn’t watch. Was tonight Happ’s coming out moment? You may remember Frank set a Badger record with 43 points vs. North Dakota in an early season game in his Junior Year. After that, Frank wasn’t always dominant, but it was the coming out moment from which his dominance started. It was his team after that day. That team was loaded with good players, but it was Frank's team. Happ is not the same player, and is only a Sophomore, but I feel like this was his coming out moment.

After setting a career high with 28 against Minnesota, he set another with 32 vs. Rutgers. Both games on the road, and both wins in OT. Most important, after the Badgers were down 9 with three and a half minutes to go, Happ scored 8 of the final 13 Badger points, including the basket with 2 seconds left to tie. On that play he could have handed off to Bronson, but Bronson was guarded pretty well, so Happ just backed his guy down and took him. Happ then went on to add 7 more points in overtime, outscoring Rutgers by himself 15-13 in the final 3+ minutes and overtime. 

Wednesday, January 25, 2017

The Committee

Seems like everyone is putting in their 2 cents about changes to the NCAA tourney selection process this week. Since the committee is meeting soon to discuss the process, and put on a TV show so they can make more money, I want to make sure they have my valuable input. I do realize that when it comes down to it, all they really care about is making money, so that will in the end be the determining factor despite my wise advice. From what I read on the interwebs, the committee may revise the metrics part of how they pick the teams and seeds. The demise of the RPI, and the birth of a more advanced metric system to determine what teams get in will be the beginning of a new era where fairness rules over the process.

As you may have guessed by now, I don’t care that much about this. I think the process as it is right now gets everything basically right. You can certainly quibble about team 68 getting in over team 69, or a team getting seeded a couple too high or low. In the end, they all have to win 6 games against good to great basketball teams, and start seeded pretty close to where they should. As for team 69 that doesn’t get that chance, there isn't any system that will not result in a subjective decision between a few teams with very little difference between them. The idea that using advanced metrics will result in significantly different outcomes on selection Sunday is silly. It’s even more silly to think any changes will lead to a better tournament.

Now that I have cleared all that up, I can dive into this, as I don’t really have any problem with changing the metrics that the committee uses, I just don’t have expectations that anything will be different or more fair.

Part of the problem lies with the committee and that they tell us they want to get the “best” teams into the tournament, without telling anyone what "best" means. This is frustrating since there is no way to know why something like overall strength of schedule is valued over top 50 wins or vice versa in any given year. Much of the recent debate seems to be focused on a pretty good indicator of “better”, the margin of victory, and why that should or shouldn’t be used to select and seed teams. Especially as it relates to just victories, and the inevitable example of Maryland.

Coaches will game any system to give themselves any advantage possible as that is their job, and they are all crazy competitive. The RPI is gamed now, and to think that coaches won’t game any new system is naive. It’s not hard to take this to the extreme conclusion that coaches will play starters longer, run up scores, and do whatever unsportsmanlike actions they have to in order to get a higher seed. You can even go to the extreme of a team being down 3 at the end of a game, and taking the easy 2 and losing, because it will close the margin of victory over a higher risk 3 point shot that would tie. Seems crazy and it is, but the question of whether wins should matter more than margin is interesting.

With the system as it is, both wins and margin matter, but some would like to see margin matter more. I don’t like that much. It seems strange to value a metric like margin of victory over wins to select and seed teams in a tournament where each game is single elimination where margin means nothing. It values having “better” teams over having a better season, which brings us back to the example of Maryland. Maryland has recently had great seasons (measured by wins) without having great teams.

Maryland has been winning lots of close games in recent seasons to the chagrin of advanced stats people. They don’t blow out bad teams that they probably should if they were as good as their ranking indicated, but all those close wins don’t hurt rankings much. I concede this leads to Maryland getting into the tournament/seeded higher than they should be, and I think that is good. I think teams should be rewarded for having a great season as defined by wins (close or not). Having a great season should matter over being a great team. If I wanted to watch a college football playoff game with the 2 teams playing the best, I would have seen USC play Alabama. USC was playing unbelievable football at the end of the year, but they lost 3 games during that season. I don’t want the National Champion to have 3 losses, because wins should matter, even if USC was incredible by the end of the year. In the great words of Herm Edwards “You play to win the game”.

If you have made it through these rambling thoughts, I’m sorry I can’t give you those few minutes of your life back to you. Take heart in that it doesn’t matter, the committee pretty much gets it right anyway, and that’s good enough in an imperfect world. 

Wednesday, January 18, 2017

Quick trigger

Someone at ESPN was a bit quick on the trigger tonight. OSU won with a last second shot, but this (below) was on the site right after the win:

2016-17 Big Ten Standings

Team            Conf  GB   Ovr
Maryland        4-1   -    16-2
Wisconsin       4-1   -    15-3
Northwestern    4-2   0.5  15-4
Purdue          4-2   0.5  15-4
Michigan State  4-2   0.5  12-7
Nebraska        4-2   0.5  10-8
Minnesota       3-3   1.5  15-4
Indiana         3-3   1.5  13-6
Penn State      3-3   1.5  11-8
Iowa            3-3   1.5  11-8
Illinois        2-4   2.5  12-7
Michigan        2-4   2.5  12-7
Ohio State      1-5   3.5  11-8

Monday, January 16, 2017

WI is better than MI

Saw a tweet today that during Beilein's time at MI he is 2-15 vs UW. I checked Amaker too, and he was surprisingly better, but still a losing 4-6.
So, since the 2000-01 season, MI is a combined 6-21 vs. UW.

WI is better than MI.

The inbound

Much of what Greg Gard does is similar to what Bo Ryan did. Makes sense, since he was Bo's assistant for his entire career, and why fix what ain't broken. Some of the wrinkles Gard has added have been major, like the experiment with the 3-2 zone (I hope it will RIP), and some minor. One of the little things he added was a change to the inbound play when under your own goal.

This was always a rather annoying play for fans under Ryan, for pretty stupid reasons. The play often started with some screening action toward the rim that almost never resulted in someone getting open. Even if they did, they wouldn't get a pass unless it was 100% open, as Ryan did not want any turnovers, period. Then after waiting about 4 seconds of the allotted 5, the inbounder would throw the ball out past half court to a guard who would run it down. The play always came close to a 5 second call which made fans edgy, but they almost never got the 5 second call. On occasion a defender would intercept the long inbound, and this would lead to a transition opportunity, but again this was a great rarity. 

This was a very reliable way to get the ball inbounds without turning the ball over, so Bo used it almost exclusively. From a fan's perspective (in a league with Izzo, who runs all kinds of inbounds plays with great results), it was an area where it just looked like UW should be better. This is a rather stupid fan opinion, with which I occasionally agreed, to my detriment. When you already have an offense that is spectacularly efficient at running half-court offense, you don't need to design a bunch of schemes to get open shots off the inbounds. Just get it in, and run your normal stuff. Also, don't doubt Bo.


I do like what Gard has done with the inbounds play though. The play is similar in that it is very safe, with an extremely low turnover percentage, but has some advantages over the chuck it deep play. It starts with a guard as the inbounder, and the other 4 players basically in a box formation. One player is right under the rim; another (who is always a big, usually Happ) is on the baseline to the outside of the inbounder. The other 2 stand somewhere between each elbow and the 3 point line.

It looks like this:



The inbound is designed to get it in right away to the big on baseline. The player under the hoop commands the attention of his man for obvious reasons. Having a man under the hoop also tends to have the man guarding the inbounder shade to the hoop to take away any chance at an easy pass under the hoop. The 2 players at the top are far enough away that they draw their man out of the action. The defender guarding the baseline player has to maintain defensive position between him and the basket. This leaves the space to the corner wide open. Since this player is always a big, they just have to create a post position with their body so the space to the sideline is open as they step away from the basket. 

The Badgers will occasionally run a back screen with the player under the hoop screening a defender at the top to open a cutter to the hoop. It's a nice wrinkle, but they don't do it often. They usually just get it in to the big on the baseline. The inbounding guard then runs around the big, who now has the ball and can either do a screen/handoff to the guard or keep the ball as the guard continues through to the 3 point line.

It looks like this:

video

Why is this better than the chuck it in deep play?

1)    While both plays result in a very safe, low turnover rate pass being thrown, in the event of a bonehead mistake on the pass, the turnover is on your side of the floor, so it won’t result in a runout transition play.
2)    With the chuck it in deep play, the ball is received on the other end of the floor from your basket. It takes 5 seconds or so to gather the ball, get it across half court and initiate the offense. By inbounding to the baseline and running the handoff, you are already in your offensive set. You have your triangle all set up with the player at the top who can down screen and go to the post, or you can just let the big with the ball go right to work in the post on the wing, or the guard can take the ball and reverse it to the other side of the floor.

I know it’s not a major difference, but I like the change.

Saturday, January 14, 2017

Fun with Game Script

Bart has added game script (aka average scoring difference) to the team's T-Rank pages. I took a look at two different things with game script just for fun:
1) Does T-Rank's prediction line do a better job of predicting the final scoring difference or two times the average scoring difference?
2) Are particular B1G teams doing better in the final score or in the average scoring difference?

These numbers are for the B1G games through 12 January. The reason I use 2x game script is that if a team steadily increases its margin throughout the game, the final scoring difference will be essentially 2x the game script.
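To see where the factor of two comes from, here is a minimal sketch. It is my own illustration, not anything taken from T-Rank: it assumes a 40-minute game in which the lead grows in a straight line from 0 to the final margin, sampled once per second. The function name `average_margin` is made up for the example.

```python
def average_margin(final_margin, seconds=2400):
    """Average lead over the game, assuming the lead grows
    linearly from 0 at tip-off to final_margin at the buzzer."""
    # Lead at each second under the (hypothetical) straight-line game flow.
    leads = [final_margin * t / seconds for t in range(seconds + 1)]
    return sum(leads) / len(leads)


final = 12
game_script = average_margin(final)
# Under this assumption the average lead is half the final margin,
# so doubling the game script recovers the final margin.
print(game_script, 2 * game_script)
```

Real games are not straight lines, of course, which is why the gap between the final margin and 2x the game script says something about when a team did its scoring.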

First - T-Rank's prediction line is doing significantly better at predicting the final scoring difference than 2x the game script. For 13 of the 14 B1G teams, the final scoring difference is more closely predicted, although for two teams it is essentially the same. Even the one team that leans more towards game script is very close. This makes sense - the T-Rank prediction line is focused on the final difference, not the vagaries of the score on the way to the final score. However, this is something I just wanted to check, and we have the data. Two teams (Penn St and Minn) show the biggest differences.
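For anyone who wants to try the same comparison, here is a rough sketch of one way to do it. The post does not depend on any particular error measure, so mean absolute error is an assumption here, and the three games are invented numbers used only to show the mechanics.

```python
def mean_abs_error(predicted, actual):
    """Average absolute gap between predictions and outcomes."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical games: (predicted line, final margin, game script).
games = [(5.0, 7, 1.5), (-3.0, -4, -4.0), (10.0, 8, 7.0)]

lines = [g[0] for g in games]
finals = [g[1] for g in games]
doubled_script = [2 * g[2] for g in games]

err_final = mean_abs_error(lines, finals)           # line vs. final margin
err_script = mean_abs_error(lines, doubled_script)  # line vs. 2x game script

# A smaller error means the line tracks that target more closely.
print(err_final, err_script)
```

With these made-up games the line sits closer to the final margin than to 2x the game script, which is the same direction as the result described above.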

Second - I looked at each team to see how the average scoring difference compared to the final score, which should be indicative of which teams' scoring is (to use a calculus term) concave up versus concave down. Most teams in the conference are within a couple points per game between the final score and the average scoring difference, but a few teams (Penn St, Minn, NW) look poor in this measure.

In the chart below, I've looked at the difference in each game between the final score difference and 2 x the game script. What this shows is that in many games, Penn St (and to a lesser extent Minn) has a better game script than a final score - showing that they have done well for much of the game and faded at the end - or perhaps had a significant lead early, but gave it up at some point to lose.


Team Final Dif - 2xGS
Penn St -250
Minn -127.4
Northwestern -77.6
Michigan -51.8
Iowa -41
Indiana -28.4
Nebraska -28.2
Mich St -17.4
Ohio St -16.6
Wisconsin -5.8
Purdue 4.2
Rutgers 4.2
Illinois 17.2
Maryland 18.4
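Here is a sketch of how the per-game number could be computed from a scoring timeline. It assumes game script is the time-weighted average margin over a 40-minute game; that is my reading of "average scoring difference", not a confirmed description of how T-Rank calculates it. The function names and the sample game are hypothetical.

```python
def game_script(events, length=2400):
    """Time-weighted average margin over the game.

    events: list of (second, margin_after_event) sorted by time.
    The margin is taken to be 0 until the first event.
    """
    total = 0.0
    prev_time, prev_margin = 0, 0
    for t, margin in events:
        # The previous margin held from prev_time until this event.
        total += prev_margin * (t - prev_time)
        prev_time, prev_margin = t, margin
    # The last margin holds until the final buzzer.
    total += prev_margin * (length - prev_time)
    return total / length


def final_minus_doubled_script(events, length=2400):
    """Final margin minus 2x game script; negative means a late fade."""
    final = events[-1][1] if events else 0
    return final - 2 * game_script(events, length)


# Hypothetical game: up 10 from the 10:00 mark until 35:00, then loses by 2.
events = [(600, 10), (2100, -2)]
print(game_script(events), final_minus_doubled_script(events))
```

A team that leads most of the night and gives it away late ends up with a clearly negative number, which is the pattern the chart suggests for Penn St and Minn.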

I'm not sure that either look at this means anything in the grand scheme of things; they are just my attempts to play with some of the game script data. I'm hoping to do an in-depth look at 3ptA% vs game script in the near future.

Nuke