Today’s guest post, an instant analysis of the 2009 NCAA Tournament (posted within minutes of the brackets being announced), is yet another excellent offering from Erich Doerr. Erich first contacted me prior to the 2006 NBA Draft with a statistical preview in hand. Each subsequent year has seen improvement in the depth and breadth of his analysis. Outside of his basketball writing, Erich does consulting work for major software products by day and has started a fledgling sports-themed Open Source software initiative by night.
Back by popular demand, I have run the numbers on Selection Sunday and am here to offer WoW readers a bracket breakdown with a statistical bent.
Table One: NCAA Tournament based on Pomeroy Numbers
Table Two: NCAA Tournament based on Sagarin Numbers
Last year I employed the Monte Carlo method to generate probabilities. This year I have upgraded to calculating conditional probabilities. Essentially, conditional probabilities give the exact odds that a Monte Carlo simulation approaches as the number of simulations increases toward infinity.
Just like in 2008, I am relying on the two strongest public NCAA metrics: Sagarin Ratings and Ken Pomeroy’s Pythagorean Ratings. Statistics used by the Wages of Wins parallel Pomeroy’s approach, as both build off of offensive and defensive efficiency.
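To illustrate the difference between the two approaches, here is a minimal Python sketch for a four-team bracket. It is not the spreadsheet behind the tables: the team strengths and the `win_prob` head-to-head model are made-up assumptions chosen only to keep the example short. The exact calculation sums over every possible opponent, weighted by that opponent's chance of getting there; the simulation simply plays the bracket many times and counts.

```python
import random

def win_prob(a, b):
    """Hypothetical head-to-head model (an assumption): share of total strength."""
    return a / (a + b)

def exact_advancement(strengths):
    """Exact odds by round. rounds[r][i] = chance team i wins round r+1.

    Teams are listed in bracket order; the field size must be a power of two.
    """
    n = len(strengths)
    alive = [1.0] * n      # chance each team is still playing entering the round
    rounds = []
    size = 1               # number of teams in each sub-bracket decided so far
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            opp_block = (i // size) ^ 1          # neighbouring sub-bracket
            start = opp_block * size
            # Sum over every possible opponent, weighted by its chance of arriving.
            beat_field = sum(
                alive[j] * win_prob(strengths[i], strengths[j])
                for j in range(start, start + size)
            )
            nxt[i] = alive[i] * beat_field
        rounds.append(nxt)
        alive = nxt
        size *= 2
    return rounds

def simulated_title_odds(strengths, trials, seed=0):
    """Monte Carlo estimate of each team's title odds."""
    rng = random.Random(seed)
    titles = [0] * len(strengths)
    for _ in range(trials):
        field = list(range(len(strengths)))
        while len(field) > 1:
            field = [
                a if rng.random() < win_prob(strengths[a], strengths[b]) else b
                for a, b in zip(field[::2], field[1::2])
            ]
        titles[field[0]] += 1
    return [t / trials for t in titles]

strengths = [8.0, 1.0, 4.0, 3.0]   # hypothetical ratings, bracket order
exact_title = exact_advancement(strengths)[-1]
estimated_title = simulated_title_odds(strengths, trials=20000)
```

With enough trials, `estimated_title` settles toward `exact_title`; the exact version gets there in one pass and without sampling noise.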
Table Three: Teams Probability
Since top seeds represent the best teams in the land, this approach will appear to heavily favor those teams due to their quality and favorable draw. The final results here attempt to predict the statistically most probable brackets, which are not necessarily the picks most likely to win an office pool.
By the numbers, who has the right to gripe? Well, the best Pomeroy teams to miss the tournament include San Diego State, New Mexico, and Florida. As for the teams that made it, an average #1 seed would make it through Louisville’s bracket 36% of the time, while the same average #1 seed would reach the Final Four only 21% of the time given Connecticut’s draw. This discrepancy is by design, as Louisville’s #1 overall seed should come with an easier path, though some may say a 15-percentage-point disparity gives Connecticut a reason to complain. Last, and probably least, Pittsburgh’s first round draw against East Tennessee State generates that elusive 16-over-1 upset 6 times out of 100, about twice as often as the other 1 vs 16 matchups.
Stepping back, both the Sagarin & Pomeroy brackets come out with the same predictions in 63 out of 64 matchups. The lone difference comes from each rating’s national champion. Pomeroy’s ratings prefer the Memphis Tigers over Sagarin’s favored Tar Heels. This year’s statistically-not-a-surprise upsets appear to be #12 Wisconsin over Florida State and the #6 WVU Mountaineers’ run to the Elite 8.
The tables linked above also provide odds by conference, seed, and region. The vaunted Big East takes home the title a third of the time, while the combined effort of the 7 Big Ten entrants comes up short of a 5% probability to win it all. Looking for all the number 1 seeds to reach the Final Four? That number checks in around 24%. See the tables above and comment on your own favorite observation.
Finally, please note the Wages of Wins Journal does not condone gambling. These picks are unlikely to be as successful as last year’s batch. In general, an entry in a bracket pool has a 1/N chance of winning, where N = number of entries. Due to the layout of the NCAA tournament, it is highly improbable that a good set of picks could raise the pre-ante odds to even 2/N. Generally, there may be more to gain from shopping for the right office pool (i.e. the one containing the least informed participants) or from game theory analysis, if one were so interested in improving one’s office pool odds. Always note that past returns do not guarantee future performance.
– Erich Doerr
Notes:
Sagarin & Pomeroy stats are as of March 15th
For simplicity I assumed Alabama State would lose in the play-in game
Update:
Some people reported having trouble accessing the above tables. If this is the case, please see NCAA and NIT Analysis from Erich Doerr
– DJ
Erich
March 15, 2009
Small correction- final note should read Alabama State, not Chattanooga. Thanks again for the forum Dave
dberri
March 15, 2009
Erich,
Got it fixed. Thanks for getting this done so quickly.
JoeM
March 15, 2009
Awesome info. What about the tie breaker though…
Erich
March 15, 2009
For Memphis/North Carolina, it would be 147.5 total points. For weighted average participants, it’s 140.4.
Apparently, Table 3 (the teams table) does not display well in Firefox. Either view it in IE or download this pdf.
Team didn’t make it? The NIT analysis is available here, albeit without any home court adjustments or much error checking.
NITTeams (PDF)
Pomeroy Bracket
Sagarin Bracket
Evan
March 16, 2009
As always, your posts are awesome.
nick
March 16, 2009
Actually, the best Pomeroy team to miss the tourney is Georgetown, but given their collapse in the second half of the season, I can’t say this isn’t justified.
30tocure30
March 16, 2009
Aaahh…more info to rack my brains with in search of the elusive perfect bracket. I am beginning to think it is the Holy Grail…a lot of talk about knowing a friend of a friend who did it, but no physical proof that the sheet exists.
The only thing you can count on with March Madness is the unpredictability of it all…
http://tinyurl.com/ceyt7b
Chris Lawnsby
March 16, 2009
Love your posts, but Table 3 seems to be jumbled. Is anyone else unable to see it?
Chris Lawnsby
March 16, 2009
never mind, just saw correction
Rob O'Malley
March 16, 2009
Wait what numbers are you using? on kenpom.com he has arizona state ranked ahead of both oklahoma and syracuse but you have them losing?
Rob O'Malley
March 16, 2009
also syracuse ranked ahead of oklahoma but you have oklahoma beating both az st and syracuse?
Joey
March 16, 2009
Quick question… why do the numbers change after every round? On Ken Pom’s site, and in the first round on his bracket, Syracuse has a higher rating than Oklahoma, but when they meet each other later in the tourney, the numbers change, and Oklahoma’s rating becomes higher. Help would be greatly appreciated.
Erich
March 16, 2009
Rob,
You are looking ahead to matchups that may not ever be. This exercise assesses the likely opponents and chances of advancing.
Arizona State faces off against Pomeroy’s 47th best team in Temple while Syracuse takes on the 93rd best in Stephen F. Austin. Even though Arizona State is rated higher than Syracuse, the Sun Devils are more likely to lose their opening game and therefore less likely to make the Sweet 16.
Using the numbers from the graphical Pomeroy bracket, Syracuse reaches the Sweet 16 26.2% of the time while Oklahoma reaches it 26.4%. While small, this makes just enough of a difference to make Oklahoma more likely to reach the Elite 8 even though Syracuse would win 52% of head-to-head matchups. If Stephen F. Austin had a pythag .02 worse, then Syracuse would have been projected to make the Elite 8.
The bottom line is that the draw matters. Hope this helps.
Joey, the above is related. The numbers on the graphical bracket represent the chance of the team making it that far. For example, Louisville is 96.7% likely to win round 1 and 79.5% to reach the Sweet 16. Going further, Louisville makes the Elite 8 58.8% of the time and final 4 36.7%. For their semi-final and championship percentages, refer to Louisville’s line on the team table, showing a 17.6% chance to make the finals and a 9.9% chance to win it all.
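Because each bracket number is the cumulative chance of getting that far, dividing one round's figure by the previous round's gives the chance of winning that particular game, given the team got there. A minimal Python sketch using the Louisville figures quoted above (the function and variable names are illustrative, not part of the original spreadsheet):

```python
# Louisville's cumulative odds from the Pomeroy bracket and team table above.
louisville = [
    ("win round 1", 0.967),
    ("reach Sweet 16", 0.795),
    ("reach Elite 8", 0.588),
    ("reach Final Four", 0.367),
    ("reach final", 0.176),
    ("win title", 0.099),
]

def per_round_odds(cumulative):
    """Convert cumulative advancement odds into odds of winning each single round."""
    result = []
    previous = 1.0   # every team starts the tournament alive
    for label, reached in cumulative:
        result.append((label, reached / previous))
        previous = reached
    return result

for label, odds in per_round_odds(louisville):
    print(f"{label}: {odds:.1%} given the previous round was won")
```

Multiplying the per-round odds back together recovers the 9.9% title chance, which is why a tough early opponent drags down every later number for that team.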
Rob O'Malley
March 16, 2009
Hmm, that does make sense. But I think a bracket with head to head results included would be helpful. Look at each head to head matchup and advance the team kenpom thinks will win. This way their round one head to head matchup doesn’t affect their rounds after. I think both perspectives would be helpful. I understand it’s looking at two different things.
big ten speed
March 16, 2009
Did you take location into account in your model? The main matchup I’m thinking of is Nova – UCLA in Round two in Philadelphia, which is at least a “semi-home” game for Nova and would make the matchup a lot closer.
Joe
March 16, 2009
Erich,
Looks like your bandwidth limit has been reached.
Just as I go to fill out my bracket too…
Rob O'Malley
March 16, 2009
Also keep in mind when looking at KenPoms numbers on Syracuse that he treated their big time wins over UCONN and WVU as neutral wins. As anyone knows that watched the games, there was a huge home crowd at the garden for the Cuse. So it realistically should have at least been treated as a semi home.
Hussam
March 16, 2009
can some1 post these up somewhere, the server they were originally hosted on has run out of bandwidth.
Scott
March 17, 2009
On your “Table 3,” you have three sets of data (ML Odds, Sagarin, Pomeroy). The MLOdds and Pomeroy sets of data look identical. Can you explain the source for each set? Ultimately, I’d like to know which is the best :), but just knowing the source of each could help.
Thanks!
Erich
March 17, 2009
Scott & Big ten,
I did not take location into account, though I intended to try. The Pomeroy set runs the same formulas as ML Odds. My original intent was to adjust one of them accurately for home court advantage, but unfortunately I haven’t had time.
I have sent the files to Dave to work around the bandwidth issue. If you can host files, drop me an email ( xlssports@gmail.com ) and I’ll provide them to you, given that you link those files in these comments.
dberri
March 17, 2009
I will get all the stuff Erich sent me posted tonight.
Erich
March 17, 2009
In the meantime, one solution is to check out the following file names at
http://www.wideopenwest.com/~edoerr/
09Pomeroy.htm
09Sagarin.htm
09NCAATeams.htm
09NCAATeams.pdf
For the NIT:
09PomeroyNIT.htm
09SagarinNIT.htm
09NITTeams.htm
09NITTeams.pdf
Nick
March 17, 2009
Can you explain how you translated the Pomeroy rankings into the odds?
Erich
March 17, 2009
Nick, this article describes how to convert Pomeroy ratings to a win probability
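The linked article is not reproduced here. A conversion commonly applied to Pythagorean ratings is the log5 formula, so the Python sketch below assumes that is the method meant; the function name and the sample ratings are hypothetical.

```python
def log5(pyth_a, pyth_b):
    """Chance that team A beats team B on a neutral floor under the log5 formula.

    Both inputs are Pythagorean ratings strictly between 0 and 1.
    Assumption: log5 is the conversion intended; no home court adjustment is made.
    """
    numerator = pyth_a - pyth_a * pyth_b
    denominator = pyth_a + pyth_b - 2 * pyth_a * pyth_b
    return numerator / denominator

# Hypothetical ratings, not taken from kenpom.com.
favorite, underdog = 0.95, 0.80
print(f"Favorite wins {log5(favorite, underdog):.1%} of the time")
```

Two sanity checks on the formula: evenly matched teams come out at 50%, and a team facing a perfectly average opponent (rating 0.5) wins at exactly its own rating.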
Erich
March 20, 2009
A Pomeroy update of the remaining 32:
Memphis 19.00%
Louisville 12.19%
Connecticut 11.66%
North Carolina 11.49%
Pittsburgh 8.67%
Gonzaga 8.38%
Duke 5.82%
Kansas 3.71%
UCLA 3.24%
mrparker
March 21, 2009
I’ve been keeping my eye on kenpom.com over the past week. Those numbers are extremely volatile. I made the mistake of not realizing this as I chose UCLA to be a final 8 participant. I’m sure tomorrow the numbers I was looking for UCLA to have will have changed to unfavorable numbers. I hate that I have to wait until next year to try and get this thing right.
Erich
March 22, 2009
As noted previously, there probably should have been an adjustment downward on UCLA due to Villanova’s practical home court edge. I resisted, because I’d prefer not to fudge up the numbers without firm stats to back me up.
UCLA was likely a pretty good pick if you are in a big pool and need to take some chances. I hope to have something more to say about game theory & brackets next year.
MP
March 23, 2009
Erich, could you post an update with odds of FF and winning it all for the remaining 16 teams? Thanks.
Scott
March 23, 2009
Do you have an updated ranking of the final 16 teams for the elite 8, final 4, 2, 1?
Erich
March 24, 2009
MP & Scott,
I submitted material to Dave on Sunday night, though his latest post deserves some time in the spotlight for its quality and lengthy comment thread. I believe Dave intends to post the NCAA updates either today or tomorrow.
As a sneak peek, Memphis & UNC are still the respective Pomeroy & Sagarin favorites, though the odds have certainly shifted a bit…