Jamie Vann Struth is an economist based in Vancouver, BC. He owns an economic development consulting firm and crunches sports statistics for fun. His history with the Grizzlies goes back to the beginning, when they were born in 1995 at the exact time that he arrived in Vancouver to attend graduate school at Simon Fraser University. He has continued to follow the Grizzlies in Memphis and waits patiently for the NBA’s inevitable return to Vancouver.
This is part one of a two-part post on the Memphis Grizzlies. In this first part, Jamie will review the 2009-10 season. In part two, he will address the progress of the team’s young players and whether or not the Grizzlies should re-sign Rudy Gay.
The Memphis Grizzlies appeared to take a major step forward in 2009-10, adding 16 wins to their total and nearly reaching the .500 mark with a record of 40-42. Only the Oklahoma City Thunder, with a 27-win improvement, made a bigger step.
The Grizzlies appear to have made significant progress in Year Two of owner Michael Heisley’s Three-Year Plan to achieve playoff contention. Was this improvement expected, and does it foretell even greater things to come?
Expectations, or Lack Thereof
Entering this past season, the Grizzlies were upstanding members of the NBA’s laughingstock class. Along with the Clippers, Warriors and Timberwolves (and maybe a few others), they were viewed as a frugal and dysfunctional organization that was largely ignored both nationally and by their own fans (ranking 29th in NBA attendance the previous year). This reputation was based on performance (after 15 years in the NBA, the franchise is still seeking its first playoff victory) as well as a series of player transactions in the last several years that were widely ridiculed and often seemed motivated more by financial considerations than helping the team win.
The key personnel moves entering 2009-10 included the following:
- The drafting of center Hasheem Thabeet of Connecticut with the #2 pick in last year’s draft. The scouting report on Thabeet showed a potentially dominant defensive center in the mold of Dikembe Mutombo who was acknowledged to be much less NBA-ready than other top draft picks (including the Memphis college product Tyreke Evans, who was chosen 4th in the draft and became NBA Rookie of the Year). Chad Ford, ESPN’s draft guru, called Thabeet the most likely high draft pick to be a bust in the NBA.
- The acquisition of veteran power forward Zach Randolph in a trade from the Los Angeles Clippers. Randolph had a history of excellent production, but also a very poor reputation for off-court problems as well as selfish and apathetic play. This trade was possible due to the cap space the team acquired in probably the most ridiculed trade in NBA history. On February 1, 2008, the Grizzlies traded Pau Gasol, their top win-producer in the previous several seasons (although not that year), to the Los Angeles Lakers. Gasol has since contributed greatly to the Lakers winning two straight Western Conference titles (and counting). Why, the critics asked, would the Grizzlies agree to pay Randolph $16 million per year just over a year after trading away Gasol, a similarly paid player who did not carry the same baggage?
- The signing, just before training camp, of veteran guard Allen Iverson for backcourt depth. There was widespread doubt about Iverson’s ability to suppress his immense ego and play a backup role behind both of the Grizzlies’ youthful starting guards, Mike Conley and O.J. Mayo. Of course, Iverson is famous in Wages of Wins history as perhaps the best example of a player whose reputation vastly exceeds his actual production of wins. Iverson ended up getting hurt early in training camp and missing the start of the season. Once he returned, he made several appearances off the bench, complained publicly about not starting, and quit the team only 67 minutes into his Grizzlies career.
The 2008-09 team went 24-58 and national expectations were for little change. The average prediction of a pre-season panel of 10 ESPN “experts” was for a 13th-place finish in the Western Conference (the spot that ended up going to the Golden State Warriors, with a record of 26-56). Some of the local media observers of the team were more optimistic, projecting win totals into the 30s, but there was no expectation of playoff contention.
The early-season performance of the team was reviewed in this forum in early January, when the Grizzlies were a surprising 16-16 and still very much in playoff contention. Zach Randolph and Marc Gasol were identified as the key contributors to the team’s better-than-expected start. The following table updates the data in that post to include the entire season. (Please note that the basic WP48 calculations were provided by Andres Alvarez, and then modified to improve the allocation of players by position. This was done using lineup data at 82games.com and personal observation of the team.)
Table 1 – Memphis Grizzlies in 2009-10
Based on their off-season moves, the Grizzlies could be expected to modestly improve from 24 wins to just over 30 wins this season. The rookies contributed only 1 win, as did the two players added in mid-season (Jamaal Tinsley and Ronnie Brewer). The boost in expectations was entirely due to the addition of Zach Randolph (0.150 WP48 last year) in place of Darko Milicic (0.057 WP48 last year).
The team ended up producing 36.4 wins, an improvement of 5.2 wins over expectations. As stated earlier, the biggest boosts came from Zach Randolph and Marc Gasol, who together produced 9.3 more wins than expected. O.J. Mayo and Rudy Gay had modest improvements over the previous year, while Mike Conley regressed significantly with only 3.7 Wins Produced, less than half his expected production.
Evaluating the Team’s Improvement
As the Grizzlies enter Year Three of the Three-Year Plan, it is essential that team management correctly assess the reasons for improvement and whether future gains can be expected. After all, a 16-win improvement is great, but 40-42 is not.
The first thing to understand is that the team’s record reflects some random good fortune. Their efficiency differential (the average points difference between the Grizzlies and their opponents, per 100 possessions) was -2.4, which is consistent with a record of 37-45. The Grizzlies were therefore “lucky” to win three extra games. This offsets the previous season, when the team went 24-58 with the efficiency differential of a 26-56 team. So while the team improved by 16 wins, its improvement in efficiency differential is only consistent with an improvement of 11 wins.
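The arithmetic in the paragraph above can be laid out in a few lines. This is only a sketch that restates the win totals quoted in the text; the "expected" records are taken as given, since the formula that converts efficiency differential into wins is not spelled out here, and the names `ACTUAL_WINS`, `EXPECTED_WINS` and `luck` are made up for illustration.

```python
# Win totals quoted in the text; no new data is introduced here.
ACTUAL_WINS = {"2008-09": 24, "2009-10": 40}

# Wins consistent with each season's efficiency differential, as stated
# in the text (26-56 and 37-45). The conversion itself is not shown.
EXPECTED_WINS = {"2008-09": 26, "2009-10": 37}


def luck(season: str) -> int:
    """Actual wins minus the wins implied by efficiency differential."""
    return ACTUAL_WINS[season] - EXPECTED_WINS[season]


# Year-over-year improvement, measured both ways.
actual_gain = ACTUAL_WINS["2009-10"] - ACTUAL_WINS["2008-09"]
expected_gain = EXPECTED_WINS["2009-10"] - EXPECTED_WINS["2008-09"]

print(luck("2008-09"))              # -2 (two wins "unlucky")
print(luck("2009-10"))              # 3 (three wins "lucky")
print(actual_gain, expected_gain)   # 16 11
```

The five-win gap between the two measures of improvement is simply the swing from two wins of bad luck to three wins of good luck.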
The second key factor is health, and more specifically the team’s remarkably good health. The youthful core of Gasol, Gay, Mayo and Conley missed a total of 17 games (13 of these by Gasol at the end of the year). But even more important was Zach Randolph playing 81 games, after averaging only 61 games in the previous five seasons.
If Randolph had played only 2,115 minutes, which was his average in the previous five years, and had been replaced by some combination of DeMarre Carroll and Darrell Arthur (average WP48 of -0.021), the team would have been worse off by 4.9 wins. In reality, having Randolph miss an extra 20 games likely would have pushed both Rudy Gay and Marc Gasol to play more power forward while Thabeet and Haddadi played more center (and Young played more small forward), in which case the drop-off from Randolph may not have been quite this severe. The key point, however, is that the Grizzlies’ bench was very weak: it not only lacked a single above-average player, but also featured several players who made a negative contribution to winning.
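The replacement calculation above follows from the definition of WP48 (Wins Produced per 48 minutes): a player's wins are his WP48 times his minutes, divided by 48. The sketch below shows the shape of that calculation. The 2,115 minutes and the -0.021 replacement WP48 come from the text; Randolph's actual 2009-10 minutes and WP48 are in Table 1, which is not reproduced here, so the values used for them are hypothetical placeholders and the result will not match the 4.9 wins quoted above. The function names are also invented for this example.

```python
def wins_produced(wp48: float, minutes: float) -> float:
    """Wins Produced = WP48 * minutes / 48."""
    return wp48 * minutes / 48.0


def wins_lost_to_absence(star_wp48: float, actual_minutes: float,
                         reduced_minutes: float,
                         replacement_wp48: float) -> float:
    """Wins lost if the star's missing minutes go to his replacements."""
    lost_minutes = actual_minutes - reduced_minutes
    return (wins_produced(star_wp48, lost_minutes)
            - wins_produced(replacement_wp48, lost_minutes))


# From the text.
REDUCED_MINUTES = 2115        # Randolph's average over the previous five years
REPLACEMENT_WP48 = -0.021     # average for Carroll and Arthur

# HYPOTHETICAL placeholders: the real figures are in Table 1.
ASSUMED_ACTUAL_MINUTES = 3000
ASSUMED_RANDOLPH_WP48 = 0.200

loss = wins_lost_to_absence(ASSUMED_RANDOLPH_WP48, ASSUMED_ACTUAL_MINUTES,
                            REDUCED_MINUTES, REPLACEMENT_WP48)
print(round(loss, 2))
```

Because the replacements have a negative WP48, every minute handed to them costs the team twice: once for the production Randolph would have supplied, and again for the wins the replacements subtract.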
Third is the improvement in Randolph’s performance. He came to Memphis highly motivated to restart his career and ultimately appeared to mature into a solid citizen and good teammate. His production improved over the previous year as well. Specifically, Randolph improved his rebounding, shot-blocking and foul shooting, and committed fewer turnovers. He also took far fewer 3-point shots, which helped improve his shooting percentage, but his effective field goal percentage (which takes the extra value of 3-pointers into consideration) was unchanged.
Can Zach keep it up? It’s certainly possible as he just turned 28 and could have a few more years at nearly the same level of production (although once he moves far past 30 we can expect production to decline noticeably). But one possible red flag is his public demand for a new contract as he enters the final year of his current deal. Will he be able to ignore this possible off-court distraction next season? That is something only those much closer to the situation can determine.
The final issue in evaluating this team is the progress of the team’s young players. That issue, though, will be addressed in the next post on the Memphis Grizzlies.
– Jamie Vann Struth
Lineupsmatter
May 21, 2010
Conley- Mayo- Gay- Randolph- Gasol was +225 for the season and was the most used lineup in the league at nearly 1500 minutes.
Every other Griz lineup combined for the season: -349.
Lineups matter and making that lineup of 5 players who filled roles in a classic way possible was good GM work.
The sum is more important than what metrics say about the individual parts.
Using it a ton, far more than what almost all other coaches did with their top lineup, by itself, was good coaching.
ilikeflowers
May 21, 2010
Lineupsmatter,
indeed, those five players are the Grizzlies’ most productive (wp48-wise) at each position, no surprise that they’d be the best lineup. The coaches of the Grizzlies did a great job of playing their best players. Plus-minus and wp48 agree here, where they disagree I’d trust wp48 as less noisy.
dberri
May 21, 2010
After that line-up, the next most used line-up for Memphis played together 260 minutes (Thabeet in for Gasol). Every other line-up on this team played less than 100 minutes together. What does a performance in under 100 minutes actually tell us? That is one problem with line-up studies. Most line-ups are not together for very long. So how is a decision-maker supposed to use that information?
Lineupsmatter
May 21, 2010
It does appear that there were no tough or confusing leading role by position choices here.
Will have to look for other places to see where the metrics disagree and then think about how to react to that contrast.
Lineupsmatter
May 21, 2010
Use 4 of 5 and the split of good and bad results at least among the most used lineups seems to fall to 50/50 and the team +/- seems to fall way more than the change in sum of lineup WP projects.
Use 3 or fewer and it seems to fail a high % of the time but whether the average size of the fall was about as much as expected by WP or more would take more research.
Lineupsmatter
May 21, 2010
Actually if you look at both good and bad lineups with 4 of the 5 I am not immediately sure if the fall was more or way more than expected by WP. My eye went to the bad ones and that was a temporary slip.
To some extent I think you can look at triplets and quads which have more minutes than unique 5 man lineups for some clues but there of course are issues to address and limitations with that.
I support trying enough 5 man lineups to at least the 200-300 minutes level to have some semi-decent data on them and some vague idea of how they work and which ones to favor or put away before the hoped for playoffs.
Lineupsmatter
May 21, 2010
Having sum of lineup WP or net of what is average available from a central source to compare against lineup +/- results would be useful.
dberri
May 21, 2010
Not sure you ever addressed my question. The sample for most line-ups is very small. Isn’t that a problem for someone using this information?
Another question…
Why do people sympathetic to APBR methods tend to post anonymously? Is this necessary to be part of the group? Maybe the hostility towards the Wages of Wins is entirely motivated by the fact I use my name when I post :)
Lineupsmatter
May 21, 2010
I could answer more, but I’ll say that I’ll answer what I want… like you.
Raw +/- is not an APBR method.
Lineupsmatter
May 21, 2010
At least in my view.
Lineupsmatter
May 21, 2010
I actually gave you my 2 main replies to your question.
Lineupsmatter
May 21, 2010
But to go further, I would not put much weight on any data from a lineup used less than 200 minutes but it is data from real action and I would sift it for information to consider and compare it to what I found elsewhere by all other means and make judgments based on all available data and analysis.
Lineupsmatter
May 21, 2010
Looking at 10 lineups with variations of 4 of the 5 main / best Griz used over 30 minutes they average a team +/- performance of about -5.5 per 48 minutes. Given that the starting lineup with all 5 pulled a +6.5 per 48 minutes I’d say that the full 5 man lineup had some significant power and probably a lot more than net change of WP for the one substitution, though I haven’t got the conversion done exactly yet.
Lineupsmatter
May 21, 2010
There was almost 750 minutes in this 4 of 5 guys test sample. Reasonable I think to compare to the 1500 minutes of the full 5.
That is one way you can use the information to address the question.
Lineupsmatter
May 21, 2010
In this case, the full 5 is the way to go and by this method apparently greater than the sum of the parts.
Lineupsmatter
May 21, 2010
Roughly speaking it looks like a third of a point added to team’s +/- or point differential (a metric worth enough to be widely consulted or relied upon in the ESPN Smackdown and in some cases apparently instead of other metrics) over a season is worth about 1 more win.
Looks like the average WP change for a substitution amongst the Griz starting lineup across all positions is about 6 wins on the WP scale.
So a strict linear expectation might be that keeping 4 of the 5 starters on the court on average across lineups might just cost about 2 points on team point differential per 48 minutes compared to what all 5 together did.
The lineup method employed above showed a much bigger actual change compared to the full 5.
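The back-of-envelope comparison in this comment can be checked with a short sketch. It uses only the commenter's own rough figures (about a third of a point of differential per win, about 6 wins of WP change per substitution, +6.5 per 48 minutes for the full five and -5.5 for the 4-of-5 lineups) and assumes, as the comment does, a strictly linear relationship. The variable names are made up for illustration.

```python
# Rough figures taken from the comments above.
POINTS_PER_WIN = 3.0          # 1 win is worth about 1/3 point, so 3 wins per point
WP_CHANGE_PER_SUB = 6.0       # average WP drop for one substitution, in wins
FULL_FIVE_PER_48 = 6.5        # starting lineup, team +/- per 48 minutes
FOUR_OF_FIVE_PER_48 = -5.5    # average of the 4-of-5 lineups, per 48 minutes

# Strict linear expectation: wins lost converted to points of differential.
expected_drop = WP_CHANGE_PER_SUB / POINTS_PER_WIN

# What the lineup +/- tables actually showed.
observed_drop = FULL_FIVE_PER_48 - FOUR_OF_FIVE_PER_48

print(expected_drop)                  # 2.0 points per 48 minutes
print(observed_drop)                  # 12.0 points per 48 minutes
print(observed_drop / expected_drop)  # 6.0
```

On these numbers the observed drop is about six times the linear expectation, which is the "much bigger actual change" the comment refers to. The sketch says nothing about how much of that gap is noise from the smaller samples.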
Lineupsmatter
May 21, 2010
Probably one of the more dramatic cases, but using different tools can be helpful.
Evan
May 21, 2010
i don’t pay attention to people who post 8 times consecutively. sorry.
John Giagnorio
May 21, 2010
Rosenbaum must be “writing a paper” again :-\
Lineupsmatter
May 21, 2010
That is your choice Evan.
I think detail is helpful and if it comes to me later I’ll post it later. Especially if additional detail is requested by another.
Incorrect guess JG.
ilikeflowers
May 21, 2010
This magic sifting process that turns small samples into less noisy data will revolutionize the field of statistics and be a great boon to science. Can’t wait! Yay!
Lineupsmatter
May 21, 2010
If you want something beyond my response to the topic and the questions put to me, then anyone is free to add them.
Lineupsmatter
May 21, 2010
Better than the alternative of not doing it in this case ilf, in my opinion.
James
May 21, 2010
Lineupsmatter,
What is the argument you are making?
ilikeflowers
May 21, 2010
He brought it down from the mountains and it was consistent and it was good and written upon its brow was the name…….Yay Sifting! And the people rejoiced.
Lineupsmatter
May 21, 2010
My previous posts make the points I wanted to make.
I am not inclined to re-state them at this time.
James
May 21, 2010
You just babble and don’t make sense. You never state a clear argument, so now you’re just making it obvious that you don’t want any rebuttal. The only thing I could sift out of your babbling is that ESPN uses +/- and not Wins Produced, so it MUST be better. Appeal to authority. Brilliant.
By the way, Jamie, this is a terrific article.
Lineupsmatter
May 21, 2010
Whatever ilf. That kind of comment has no impact on me. I spoke to the team under study using tools that I thought added something to the discussion. If you want to try to turn it into a silly juvenile war of sarcasm or flame, I won’t go that way.
Lineupsmatter
May 21, 2010
I’m sorry you can’t get anything out of it James.
Lineupsmatter
May 21, 2010
I don’t mind rebuttal.
Meaningful rebuttal requires comment on the detail. There has been some and I’ve responded each time there was.
You want restatement? Ok, just for you James:
Lineups matter and the Griz starting lineup with 5 players who each filled roles in a classic way was good work by the GM and coach. The sum of lineup production is more important than what metrics say about the individual players. Using it a ton, far more than what almost all other coaches did with their top lineup, by itself, was wise.
I would not put much weight on any data from a lineup used less than 200 minutes but it is data from real action and I would sift it for information to consider and compare it to what I found elsewhere by other means.
To address the challenge of limited minutes by most lineups, you can look at triplets and quads which have more minutes than unique 5 man lineups for some clues. And I support trying enough 5 man lineups to at least the 200-300 minutes level to have some semi-decent data.
Having sum of lineup WP or net of what is average available from a central source to compare against lineup +/- results would be useful.
Looking at 10 lineups with variations of 4 of the 5 main / best Griz used over 30 minutes they average a team +/- performance of about -5.5 per 48 minutes. Given that the starting lineup with all 5 pulled a +6.5 per 48 minutes I’d say that the full 5 man lineup had some significant power and probably a lot more than net change of WP for the one substitution. A strict linear expectation might be that keeping 4 of the 5 starters on the court on average across lineups might just cost about 2 points on team point differential per 48 minutes compared to what all 5 together did. The lineup method employed above showed a much bigger actual change compared to the full 5.
Lineupsmatter
May 21, 2010
I look forward to your rebuttal on the restated arguments James.
“The only thing I could sift out of your babbling is that ESPN uses +/- and not Wins Produced, so it MUST be better. Appeal to authority. Brilliant.”
You entirely missed my point on this side issue James, so I’ll clarify. Several participants in the ESPN Smackdown have said they relied on point differential which is team +/- per game, including Dave Berri. I mentioned that in passing, after +/- was called an “APBR” method. I just found those two different takes on +/- in conflict and mildly interesting.
Dre
May 21, 2010
Lineupsmatter,
Sorry about some of the ribbing. I did like some of your points, and they are interesting. Next time you may want to get your full argument down and in one post.
Anyway the basic point I think many of us have is a simple, predictive model. Is optimizing a 3,4 or 5 person lineup a good thing to do? Absolutely!
However, given injury, trade, constraints it is often hard to find steady lineups. Also, as has been noted several times, the lineup times tend to be low. I bet often because a coach may “shake things up” to try and improve a team. So anyway as a model, this is a hard thing to mimic or find.
Now here’s something cool though. Basically Memphis just played their best player at each position as much as they could AND it worked by the five person model.
Finally the sum being greater than the whole? Just to point out that WP48 is predictive to an average of 2.7 games within actual performance for teams. So to convince me to really use a multi-person model, I would have to see dramatic improvement from the simpler model.
Not saying it’s not worthwhile or cool, just harder and maybe not that much additional added.
ilikeflowers
May 21, 2010
lineupsmatter,
when I see gibberish I treat it as such. I’m glad to see that you have improved upon your initial posts and are now fleshing them out. Still, your sifting mechanism which is apparently your main tool is undefined. Hence, Yay! Sifting!
Lineupsmatter
May 21, 2010
Agreed, full argument down in one post would be better: but, in between doing other things, bit by bit can happen. I might try to change but incremental conversation is not that unusual on the net.
The sum being greater than the whole statement was vague, especially when put in isolation, and I could have had a better sentence at that point.
But the fact that when Memphis just played 4 of their best 5 players the results were way different than the full 5 or what WP would predict for 4 of the 5 seemed worth adding to the conversation.
It was not really intended as argument for one model “over” another. I want both and more. Predicting at the season level and even the season level for a lineup are very different things. I’ve said that different tools are each useful in their own way and in one view or another.
Arturo
May 21, 2010
ilikeflowers,
QFT
Lineupsmatter,
I think +/- is a good tool for optimization in the short term (i.e. boy Shaq should not be out there or Jamison on Garnett = bad times). But too many factors can affect it. I’ve yet to see any data showing high correlation in the long term. It’s meant as a slice or a sample but is not a representative sample.
Lineupsmatter
May 21, 2010
ilf,
Everything I re-stated was in the original posts.
I am glad you have enhanced your view of those same statements, or the presentation of them, or your comprehension of them.
Yay! enhanced view by one reader!
Lineupsmatter
May 21, 2010
“Still, your sifting mechanism which is apparently your main tool is undefined.”
My sifting mechanism was clearly stated as “lineups with variations of 4 of the 5 main / best Griz used over 30 minutes” (for the season) at 4:40 pm.
Lineupsmatter
May 21, 2010
It was a rough cut that fit the need pretty well and the time I was willing to give it.
Lineupsmatter
May 21, 2010
Arturo:
“I think +/- is a good tool for optimization in the short term … But too many factors can affect it.”
I agree with your sense of possible usefulness and need for caution.
I am not putting all my faith in the results of any one tool.
Lineupsmatter
May 21, 2010
Year to year lineup performance is something that needs to be looked at further.
I have looked at just a bit so far in a handful of cases, and it does give reason for caution; but I intend to look at it more because some lineups do perform consistently and that might be useful to consider and try to learn more about why.
Lineupsmatter
May 21, 2010
Sorry ilf, my sifting “criteria” was “lineups with variations of 4 of the 5 main / best Griz used over 30 minutes” (for the season). The implicit “mechanism” was adding up the team +/- for these qualifying lineups from tables provided by basketballvalue and finding the average per 48 rate for that set of the biggest lineups with 4 of the 5 main Griz and making comparisons.
I thought the basic approach and findings would be pretty clear from what I said at 4:40, though I probably should have said …Looking at “the” 10 lineups… and maybe a little more.
Tindall
May 22, 2010
4 replies when I checked in earlier today. 40 replies now. This is confusing, annoying, and a bit hilarious. In the future please consolidate your posts, lineupsmatter, or take some lithium to stop the mania.
Please consolidate your posts, lineupsmatter.
Tindall
May 22, 2010
Consolidate.
Leon
May 22, 2010
So a case where there was no “stumbling”. Interesting to note that Randolph knew what he had to do to get better (and it worked without having to shoot less efficiently, earning himself an all-star berth). The coach knew what he had to do, play his players as much as possible (something we’ve seen Larry Brown do often).
So perhaps the correct decisions are able to be made, but the people making the decisions are unwilling to make them.
Arturo
May 22, 2010
Leon,
The interesting question that’s raised if you’re running a team is about players like Randolph. Randolph improved his rebounding, shot-blocking and foul shooting, and committed fewer turnovers without reducing his shooting efficiency. If these skills (which directly correlate to winning) can be improved with training then I don’t see why a team wouldn’t focus their efforts on coaching and talent development. You could sign young, athletic players who are weak in trainable areas to long-term, low-money deals and just work on developing them. Now that I think about it, isn’t that what the Spurs are doing with their D-League team?
Lineupmatter
May 22, 2010
Ok, so you can’t handle it Tindall. Noted.
Good points Leon.
dberri
May 22, 2010
Leon asked this question:
“I have a question that is a bit off topic Dr. Berri, but I figure perhaps this is as good a time as any to ask. At what point does a player outweigh the advantages of scoring against the advantages of being more productive and thus helping your team become a more serious contender?”
A quick answer…. I really think when players focus on scoring that many think they are also focused on winning. In fact, I think Iverson said something like this. I can’t find where I saw the quote, but once Iverson argued that his teams do best when he scores the most. My sense is that many players believe their scoring does lead to more wins, and they do what others do when this doesn’t work out (i.e. blame team chemistry).
Marparker
May 22, 2010
Lineupsmatter,
Your posting style reminds me of when I decided I wasn’t going to shower for 3 months. My smell wasn’t really hurting my roommates but ultimately it was just common courtesy to treat a space that had to be shared with a certain respect.
ilikeflowers
May 22, 2010
lineupsmatter,
at 4:24 you said
But to go further, I would not put much weight on any data from a lineup used less than 200 minutes but it is data from real action and I would sift it for information to consider and compare it to what I found elsewhere by all other means and make judgments based on all available data and analysis.
at 4:40 pm
Looking at 10 lineups with variations of 4 of the 5 main / best Griz used over 30 minutes they average a team +/- performance of about -5.5 per 48 minutes. Given that the starting lineup with all 5 pulled a +6.5 per 48 minutes I’d say that the full 5 man lineup had some significant power and probably a lot more than net change of WP for the one substitution, though I haven’t got the conversion done exactly yet.
then
There was almost 750 minutes in this 4 of 5 guys test sample. Reasonable I think to compare to the 1500 minutes of the full 5.
You’re not going to find anything useful and reliable from sifting those low sample sizes. For every predictive success you’ll have two or three failures – short term and long. How do you tell when it fails from when it’s successful? For this reason the sifting is useless.
Once you get to a high enough sample size then plus-minus is likely useful. With your almost 750 minute example (still a low sample size btw) – it’s going to tell you something about the 4 players held constant and the fifth ‘main’ player who was removed, it’s not going to tell you anything useful about any of the other 5th players for the same reason that I mentioned above. All it’s going to do really is lessen the predictive power of that 750 minute sample since now on top of a still smallish sample size you’re changing one of the players. The noise from the small-sample backups remains ever present.
I’m guessing that if you want to apply it to a specific player then you have to start with a 5 man unit, remove one of the players and see what happens without that player. You’ll have some sort of marginal measure for the removed player but it’s going to be very susceptible to noise from the varying nature and low sample size of the backups still.
I don’t see how your approach is measuring anything very well. No one says that +/- isn’t good given a large enough data set, it’s just hard to get these data sets for most of the combos and individual players by its very nature.
If someone forced me (they’d have to) to incorporate +/- into evaluating a player/lineup, I’d only use it where the sample size was very large (meaning that for most player combos it would be entirely useless). I would also use it in small sample scenarios as part of a two-of-three observational approach. That is, one measure is via eyeball, one measure is via wp48, one measure is via +/-. If all three (or maybe 2 of 3) say that a given lineup or player is playing very well or poorly then maybe I pay attention. Still, even in this case small sample size problems dominate. Who’s to say that even if all three approaches say the same thing that they are better measures of what’s going to happen next than a full-season measure or at least a much larger sample size measure?
dberri
May 22, 2010
I am willing to say that +/- still has problems even when the sample is quite large. For the five-year numbers about half the players have coefficients that are statistically insignificant. So even when you expand the data set tremendously (well beyond one year), the model has problems.
I will add this as well… the correlation between +/- and Wins Produced increases as you add more years to +/-. So it appears that these measures converge as you add more and more data to the +/- analysis. Perhaps with many more years, the numbers would be about the same.
Not necessarily
May 22, 2010
People that complain about too many posts would probably complain about one long post. People that complain about too much detail would probably also complain about not enough detail or an argument not carried to completion.
I think it takes a certain respect to go to the trouble to make a case and to pursue a case farther.
Long sections of writing are not uncommon here.
The way it looks may be a bit unusual. But the content is there, if you care to focus on that.
Lineupsmatter
May 22, 2010
ilf, I hear your perspective but we see it differently.
The difference in the results of the two lineup samples of 4 of 5 compared to all 5 is so large that it is statistically significant.
It is useful information to me. It shows just how demanding the conditions of success for this team were and how fragile the success was.
Lineupsmatter
May 22, 2010
Removing a specific player and checking the results would have additional value.
I was trying to show a different way to manage with the small sample sizes. It is not the only way. I wasn’t focused on the impact of one player, I was focused on variation from one specific lineup.
Lineupsmatter
May 22, 2010
Adjusted +/- for players and raw team +/- for lineups are different things. They are related but I’ll note that nothing I said above used Adjusted +/- for players.
John Giagnorio
May 22, 2010
Dave,
What do you think the 5 year estimates of adjusted +/- converge with? A player’s average WP48 over those 5 years?
I have a hard time taking these multiple season APM numbers seriously. They take a long time to accumulate, and once you have the number it’s unclear how to interpret it. Does a high 5 year APM mean that a player was good every year? That he had a poor first 2 and improved greatly the last 3? That he was great early but declined in the later years?
I’m not sure why having some statistic that tells you how a player performed over a 5 year sample is helpful for decision making, although it might tell you who belongs in the Hall of Fame.
ilikeflowers
May 22, 2010
Lineupsmatter,
Even if it’s a statistically significant difference what’s the range for the true value? If the range for the true value is large (due to small sample size) then what’s the use of such data? The existence of some statistically significant effect when you change the lineup isn’t necessarily telling you much. Any change in the lineup is likely to produce a real (i.e. statistically significant) effect in performance. What matters is how precise the range for the effect is. The range for the true value has to be small enough to be useful.
ilikeflowers
May 22, 2010
I think that I implied that a real effect will necessarily be measurable in a statistically significant way. This is not necessarily true. A statistically significant effect is also not necessarily real.
I’m not a stats expert (just a technician/hack) so someone correct me if I’m wrong here.
ilikeflowers
May 22, 2010
professor,
is the convergence mostly one way? I’m assuming that +/- converges more towards wp48 than the other way around. Is this correct?
Lineupsmatter
May 22, 2010
What’s the range for “the true value” for anything in basketball by any method?
This is an estimate by any means based on sample of one size or another.
Lineupsmatter
May 22, 2010
Randolph wasn’t in Memphis last season, but I will watch to see what the 5 man starting lineup with him does in next season’s sample, how it compares to 4 out of 5 in general and to 4 out of 5 without him in particular, and how it compares to what the sum of WP from last season, the current season, or career suggests.
Lineupsmatter
May 22, 2010
In another example, the Celtics starting lineup has been consistently strong in the samples of 3 regular seasons and the 2 playoffs it was available. There are a few 4 out of 5s of modest size that were stronger but they were few and noise can play a role.
ilikeflowers
May 22, 2010
Lineupsmatter,
The difference in the results of the two lineup samples of 4 of 5 compared to all 5 is so large that it is statistically significant.
What are the results for the two samples and what are the sample sizes for the two samples?
ilikeflowers
May 22, 2010
That should read, what are the sample sizes for the two groups.
ilikeflowers
May 22, 2010
Lineupsmatter,
nevermind, I see the numbers in one of your previous posts.
ilikeflowers
May 22, 2010
Lineupsmatter,
[1] Ascertaining the ‘true’ value:
10 lineups, -5.5 average +/-
1 lineup, +6.5 average +/-
The confidence interval for ‘true’ +/- value of the 10 lineups is unknown: no standard deviation data provided.
The confidence interval for ‘true’ +/- value of the 1 starting 5 lineup is unknown: no standard deviation data provided.
[2] Ascertaining the significance of the +/- difference:
Assuming that both groups’ +/- values are the ‘true’ values (a bad idea)
10 lineups – 750 minutes total (one group)
1 full 5 lineup – 1500 minutes total (one group)
This looks like two groups each with a sample size of 1 to me. We can’t really determine whether the observed difference is meaningful or not.
We could use the per-game +/- of the 10 lineups group and the starters group to set the sample sizes and standard deviations. This would allow us to generate confidence intervals and determine whether or not the effect size is big or small.
Without the standard deviation and sample size data how are you determining the significance of the effect (difference in +/- between the two groups)? What am I missing?
LM
May 22, 2010
The sample sizes are 750 and 1500 minutes.
The per-game +/- (or per 48 as stated) of the 10 lineup group and the starters were the results.
When Adjusted +/- is calculated (something I wasn’t doing), the standard deviation of a 1500 minute sample is listed at a bit above 3 points. The standard deviation of a 750 minute sample is listed at a bit above 4.5 points.
The difference in results is 12 points. If the standard deviation is similar for this use then the difference in results may be 2 or more standard deviations. Even if it is a bit less, that is still sizable.
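As a rough sketch of that arithmetic: the snippet below assumes the two samples are independent and roughly normal, and that the Adjusted +/- standard deviations (rounded here to 3.0 and 4.5) carry over to raw lineup +/-. Neither assumption has been established in this thread, and the function name is only illustrative.

```python
import math

# Figures quoted in the thread; the standard deviations are rounded down
# from "a bit above 3" and "a bit above 4.5", so treat them as assumptions.
starters_mean, starters_sd = 6.5, 3.0            # full starting five, ~1500 minutes
four_of_five_mean, four_of_five_sd = -5.5, 4.5   # ten 4-of-5 lineups, ~750 minutes


def difference_summary(mean_a, sd_a, mean_b, sd_b, z_crit=1.96):
    """Return (difference, its standard deviation, z-score, interval) for a - b.

    Assumes the two estimates are independent and approximately normal.
    """
    diff = mean_a - mean_b
    sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2)  # variances add for independent samples
    z = diff / sd_diff
    interval = (diff - z_crit * sd_diff, diff + z_crit * sd_diff)
    return diff, sd_diff, z, interval


diff, sd_diff, z, interval = difference_summary(
    starters_mean, starters_sd, four_of_five_mean, four_of_five_sd
)
print(f"difference = {diff:.1f} points")
print(f"combined standard deviation = {sd_diff:.2f}")
print(f"z = {z:.2f}")
print(f"approximate 95% interval = {interval[0]:.1f} to {interval[1]:.1f}")
```

Under these assumptions the 12 points works out to a bit more than 2 combined standard deviations, but the interval around the difference is wide.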
LM
May 22, 2010
I should have said …”I’d think it is very likely” that it is statistically significant.
But meeting the full statistical significance test is not essential to the idea that the very different lineup results are worth considering in the overall mix.
ilikeflowers
May 22, 2010
LM,
Without regard to sigfigs we have 95% confidence intervals of:
Group 1: -2.5 to -8.5
Group 2: 2 to 11
So 95% of the time the diff is from 4.5 to 19.5; that’s a large spread.
Regarding whether the diff is significant though:
If the sample size for group 1 is 750 then we need the 750 one-minute observations of +/- to determine the standard deviation. I really doubt that this is what was done. More likely the samples are by games that group 1 played in or by the number of separate instances that group 1 was used. Similarly for group 2. It sounds like the sample sizes for both groups are still unknown and likely at least an order of magnitude smaller than 750 and 1500. Correct me if I’m wrong.
The significance of the average diff still looks like a total unknown to me.
Mark
May 22, 2010
Here are some comparisons of 82games.com “win percentage” versus the number of games WP predicts the team would win in a season if those five players played every minute.
Memphis
Lineup: Conley, Mayo, Gay, Randolph, Gasol
+/- expected wins out of 82 games: 53
WP prediction out of 82 games: 49
Some other lineups with 1000+ minutes
Atlanta (1170 minutes)
Lineup:
Bibby, Johnson (Joe), Williams (Marvin), Smith (Josh), Horford
+/- expected wins out of 82 games: 43
WP prediction out of 82 games: 64
Boston (1179 minutes)
Lineup:
Rondo, Allen, Pierce, Garnett, Perkins
+/- expected wins out of 82 games: 55
WP prediction out of 82 games: 76
Oklahoma City (1307 minutes)
Lineup:
Westbrook, Sefolosha, Durant, Green, Krstic
+/- expected wins out of 82 games: 41
WP prediction out of 82 games: 51
(Expected wins out of 82 games calculated from Win % from 82games.com for the lineups)
(WP prediction out of 82 games calculated from WP48 multiplied by 82 games. WP numbers taken from Dre’s awesome site.)
WP is fairly consistent so I believe this indicates that line up data even lineups with 1000+ minutes have a lot of noise in them.
It might be considered interesting that only Memphis has a higher +/- expected wins than WP expected wins, but again, I think the wildly varying numbers indicate there is a huge noise component at work.
LM
May 22, 2010
The lineup data is based on stints between substitutions within a game. Given a probable average of 15-20 total player substitutions per team per game (I’ve checked it a few times) that would make stints on average about 2.5 to 3 minutes in length. So if you went with that I guess you are talking up to around 600 stints vs 300. That has its importance if you are trying to prove Statistical Significance with a capital “S”. But that is against the nebulous concept of true value and I am not really focused on that comparison.
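A quick sketch of that estimate, assuming a 48-minute game and that every substitution ends exactly one stint (simultaneous substitutions would lower the count). The function name and the 15 and 20 substitution bounds are just the illustrative figures from above.

```python
def estimate_stints(total_minutes, subs_per_game, game_minutes=48.0):
    """Rough number of stints in a sample of lineup minutes.

    Assumes each substitution ends one stint, so the average stint
    length is game_minutes / subs_per_game.
    """
    avg_stint_minutes = game_minutes / subs_per_game
    return total_minutes / avg_stint_minutes


for label, minutes in (("starting five", 1500), ("4 of 5 lineups", 750)):
    low = estimate_stints(minutes, subs_per_game=15)   # longer stints, fewer of them
    high = estimate_stints(minutes, subs_per_game=20)  # shorter stints, more of them
    print(f"{label}: roughly {low:.0f} to {high:.0f} stints")
```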
A 5 point minimum difference in the confidence interval at the lower extreme really isn’t the right measure.
I did throw in a fair number of qualifiers along the way though including
I was trying to show a different way “to manage” with the small sample sizes.
Not “prove” the exact significance level. Manage within what is available. I’d rather base my decisions on an overall evaluation that included some consideration of lineup data than not.
LM
May 22, 2010
I called for comparisons like the ones Mark just posted. Regardless of what conclusions you draw from it I think it might be helpful to look at such comparisons. That was and is my basic message. Everyone will do it differently or not do it, and that is all fine to me.
But to respond to the issue of noise in lineup win estimates, I would say at this point you could potentially reduce the noise by accounting for home court and quality of opponent and maybe game situation (i.e. possibly addressing garbage-time). But then you are into the value of Adjusted +/- lineup results debate. I was stopping short of that earlier. I wouldn’t completely shy away from it but I am not sure if I want to go for it right now.
I was making fairly limited, specific observations & conclusions earlier.
LM
May 22, 2010
In Mark’s comparisons it is actual scoreboard results projected to 48 minutes vs win projection from sum of player ratings with team adjustment.
LM
May 22, 2010
actual scoreboard results projected to 48 minutes… “and then to wins based on point differential”
LM
May 22, 2010
Mark said “WP is fairly consistent so I believe this indicates that line up data even lineups with 1000+ minutes have a lot of noise in them.”
I would probably say … the WP estimate varied from the projection out to wins of the actual scoreboard results in the 4 cases cited by an average of about 25-30%.
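One way to arrive at a figure in that range from the four pairs Mark posted is sketched below. Which number serves as the denominator is an assumption, since it was not specified, so the sketch tries both; the function and variable names are illustrative only.

```python
# (+/- expected wins, WP predicted wins) out of 82, as posted by Mark.
lineups = {
    "Memphis": (53, 49),
    "Atlanta": (43, 64),
    "Boston": (55, 76),
    "Oklahoma City": (41, 51),
}


def mean_pct_gap(pairs, base="plus_minus"):
    """Average absolute gap between the two win estimates, as a fraction.

    base="plus_minus" divides by the +/- expected wins;
    base="wp" divides by the WP predicted wins.
    """
    gaps = []
    for plus_minus_wins, wp_wins in pairs:
        denominator = plus_minus_wins if base == "plus_minus" else wp_wins
        gaps.append(abs(wp_wins - plus_minus_wins) / denominator)
    return sum(gaps) / len(gaps)


gap_vs_plus_minus = mean_pct_gap(lineups.values(), base="plus_minus")
gap_vs_wp = mean_pct_gap(lineups.values(), base="wp")
print(f"average gap relative to +/- wins: {gap_vs_plus_minus:.1%}")
print(f"average gap relative to WP wins: {gap_vs_wp:.1%}")
```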
LM
May 22, 2010
WP is based on all minutes- that lineup and all the rest- so this is not surprising, but worth considering / keeping in mind I think.
LM
May 22, 2010
Correction: some of the substitutions are simultaneous so the number of stints is smaller than I estimated, but I don’t know how much smaller.
LM
May 22, 2010
I kind of got turned around on this; but looking at it again, it is probably not more than about 20% lower.
LM
May 22, 2010
I think you can say both Mark’s statement and mine and not have to reject the other.
And I think you can say: there is, or can be expected to be, performance variation (or call it noise) in player and lineup data, both in segments of actual data and when season averages are used as a basis for projections in a particular smaller-minute circumstance.
LM
May 22, 2010
The variance comes from individual players and the impact of lineups that constitute a production unit that the scoreboard measures. How much lineups matter is a subject worthy of further study.
Italian Stallion
May 23, 2010
In horse racing, when I am looking for profitable situations, the primary problem is that the sample sizes are often too small for me to be certain about anything.
However, what I’ve learned over time is that if I find a lot of things like that, some of them turn out to be legitimately profitable and almost none of them turn out to be negative.
Applied to basketball, if a lineup is performing exceptionally well based on +/-, it may be random, but if you always go with lineups like that, at least a portion of the time it will prove to be a very positive thing and it will almost never turn out to be a bad thing (neutral or good).