mrjames
Professor
Posts: 6062
Loc: Montclair, NJ
Reg: 11-21-04
|
02-18-12 06:39 PM - Post#121170
In response to Brian Martin
There are going to be a lot of teams that have consistently good and bad 3PT% allowed. The point is that if teams could really control 3PT% allowed, you'd at least see some weak correlation.
You can make an argument that teams who have poor 3PT% allowed make a more concerted effort to guard the 3PT line during the second half of conference play and vice versa. But even then, you'd still expect a weak correlation between first and second half. That there is ZERO correlation is totally striking. Especially when there's a high correlation of 3PT attempts between the first and second half. If teams changed their defense, you'd expect both the rate of makes and the rate of attempts to rise and fall.
This also meshes pretty well with the argument of why teams that are reliant on 3PT shooting might be high variance. If you can't control your 3PT percentage that much, and you rely on threes for many of your points, you're going to get vastly different outcomes night after night. To that extent, it makes sense.
I find this to be fascinating, and I'm working up a new metric that incorporates this and FT defense. I think it has the ability to enhance the predictive nature of college basketball models.
|
Brian Martin
Masters Student
Posts: 963
Loc: Washington, DC
Reg: 11-21-04
|
02-18-12 08:37 PM - Post#121189
In response to mrjames
Please do me a favor. If you have the data handy, plot the Ivy League teams on their 3-pt fg defense in the 1st half and 2nd half of league play and tell me if your chart looks like Pomeroy's.
I have added up the numbers for four teams from last season and I have not yet found a team with a significant difference between the 1st half and 2nd half.
Harvard 37.1, 36.7
Dartmouth 38.1, 38.4
Penn 36.2, 36.6
Princeton 28.6, 31.0
Princeton's 2.4% difference is the greatest, but Princeton played 5 of the first 7 at home, so some of the better shooting by opponents in the second half may be due to shooting at home vs. shooting at Jadwin.
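For anyone who wants to check the arithmetic, here is a minimal sketch that runs the four pairs above through a plain Pearson correlation. The helper name `pearson_r` is just for illustration. Four teams is far too few to settle anything, and Princeton sits well apart from the other three, so one point dominates whatever number comes out.

```python
import math

# First-half and second-half 3PT% allowed, copied from the post above.
splits = {
    "Harvard": (37.1, 36.7),
    "Dartmouth": (38.1, 38.4),
    "Penn": (36.2, 36.6),
    "Princeton": (28.6, 31.0),
}

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

first = [a for a, _ in splits.values()]
second = [b for _, b in splits.values()]
r = pearson_r(first, second)

# Largest first-half vs. second-half gap, in percentage points.
largest_gap = max(abs(b - a) for a, b in splits.values())

print(f"r = {r:.2f}, largest gap = {largest_gap:.1f} points")
```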
|
SRP
Postdoc
Posts: 4910
Reg: 02-04-06
|
02-19-12 06:21 AM - Post#121317
In response to Brian Martin
Here's an interesting question: Is 2-pt. FG% defense more correlated across season halves than 3-pt FG%, and if so, why? I mean, it would be odd for defense to have no consistent effect from three but to have a big effect from just inside the arc.
One hypothesis would be that FG% defense is generally more consistent as you look at shots closer to the basket. That would be a very interesting thing, if true.
|
mountainred
Masters Student
Posts: 513
Age: 57
Loc: Charleston, WV
Reg: 04-11-10
|
02-19-12 11:16 AM - Post#121337
In response to SRP
Looking at the Penn, Princeton and Columbia games -- Cornell's 3pt D was 32.7% the first time through and 40% the second time. That's the difference between being top 100 and bottom 10 nationally if maintained for an entire season.
As for the Cornell/Princeton game -- it proved to be as bad a match-up as I feared. The Tigers shot the lights out and dominated the glass, at least until the game was completely out of reach. As I said on another forum that will go nameless, this isn't the first time, nor will it be the last, that Cornell ran into a buzzsaw at Jadwin. And it has happened to better teams than this year's edition.
|
Brian Martin
Masters Student
Posts: 963
Loc: Washington, DC
Reg: 11-21-04
|
02-19-12 04:32 PM - Post#121376
In response to mountainred
I looked up the last two Ivy seasons and 11 teams were consistent at defending threes in league play in the first half and second half of Ivy play (4% or less difference between 1st and 2nd half d3fg%).
Generally, those teams had a few games where they held opponents under 25% and a few in which opponents exceeded 45%, and more games in between, but the highs and lows were fairly evenly distributed between early and late games in the league schedule.
Of the five outliers, the biggest by far was the 2010 Brown team: 26% in the 1st 7 games and 45% in the last 7 games.
Brown 2010 had its three best or luckiest defensive games against threes in the first seven games: Yale 3 for 18, Penn 3 for 15, and Dartmouth 0 for 7. They had their three worst or unluckiest defensive performances against threes in the last two weekends of the season: Harvard 12 for 19, Cornell 20 for 30, and Columbia 6 for 10.
Six of the seven Ivy opponents shot better from three in the second game against Brown than they had in the first game. I do not know what that proves. While there may have been some difference in Brown's defensive and opponents' offensive strategies, a fair amount of the difference between the two halves was the fluke of not having a hot opponent earlier in the season and not having a very cold one later in the season.
Seven games is a small sample that can be distorted by an extreme performance or two. If Cornell had made 20/30 in the first meeting and 7/25 in the second meeting instead of the reverse, Brown's splits would have been 35.8% and 37.0%.
This does not show that defense has no effect on opponents' three-point shooting. Just because there is high variability game to game does not mean that the season average is random. A team that holds opponents to an average of 30% from three is not just luckier than a team that holds opponents to an average of 40%.
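To put a rough size on the seven-game noise, here is a small sketch using the binomial standard error. The inputs are assumptions for illustration only, not figures from this thread: opponents with an underlying 35% rate and 18 three-point attempts per game.

```python
import math

def split_standard_error(true_pct, attempts):
    """Binomial standard error of a made-shot percentage over `attempts` shots."""
    return math.sqrt(true_pct * (1 - true_pct) / attempts)

# Assumed inputs (hypothetical, chosen only to illustrate the scale):
TRUE_PCT = 0.35          # opponents' underlying 3PT%
ATTEMPTS_PER_GAME = 18   # opponent 3PT attempts per game
GAMES = 7                # one half of the Ivy schedule

se = split_standard_error(TRUE_PCT, ATTEMPTS_PER_GAME * GAMES)
low, high = TRUE_PCT - 1.96 * se, TRUE_PCT + 1.96 * se

print(f"standard error: {se:.3f}")
print(f"approx. 95% range for one split: {low:.3f} to {high:.3f}")
```

Under those assumptions a single seven-game split can land several points either side of the underlying rate on chance alone, which says nothing either way about whether the underlying rates differ between teams.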
|
mrjames
Professor
Posts: 6062
Loc: Montclair, NJ
Reg: 11-21-04
|
02-19-12 05:23 PM - Post#121387
In response to Brian Martin
You still don't have anywhere near enough data to come close to refuting KenPom's findings. I'm sorry. I don't know how to explain this any more clearly.
|
mrjames
Professor
Posts: 6062
Loc: Montclair, NJ
Reg: 11-21-04
|
02-19-12 05:28 PM - Post#121389
In response to mrjames
I'll elaborate: if someone shows you a huge sample of data, you can't take a fractional subset of that data which implies a different result and deem that the latter trumps the former. You need to either take the same dataset or produce an equally or more robust dataset with a new previously omitted variable that explains away some of the findings from the first dataset.
|
mountainred
Masters Student
Posts: 513
Age: 57
Loc: Charleston, WV
Reg: 04-11-10
|
02-19-12 05:56 PM - Post#121400
In response to Brian Martin
I looked up the last two Ivy seasons and 11 teams were consistent at defending threes in league play in the first half and second half of Ivy play (4% or less difference between 1st and 2nd half d3fg%).
In 2011, the difference between a top 100 3 pt defense and a bottom 100 3pt defense was 2.5%. So, a 4% swing is actually fairly significant. The bell curve for that stat is pretty steep.
I'll be curious if KenPom runs the numbers on 2012 as well and gets the same results.
|
mrjames
Professor
Posts: 6062
Loc: Montclair, NJ
Reg: 11-21-04
|
02-19-12 05:58 PM - Post#121401
In response to mountainred
I'm going to (try to find the time to) run the numbers on Ivies historically going back as far as I can. We'll see what shakes out (and how far back I can get).
|
SRP
Postdoc
Posts: 4910
Reg: 02-04-06
|
02-19-12 05:59 PM - Post#121402
In response to mrjames
You shouldn't do data mining to find subsets that differ from the overall, but if you have a prior causal theory (such as "Ivy League teams are more likely to have consistent 3-pt% defense") then you can perform statistical tests to see if the subset is non-randomly different from the overall.
Our Bayesian friends have a whole other way of looking at it, of course...
|
mrjames
Professor
Posts: 6062
Loc: Montclair, NJ
Reg: 11-21-04
|
02-19-12 07:00 PM - Post#121411
In response to SRP
That's absolutely true. But to test that point you need more than 16 data points. That's why I'm going to try to dig back into the past. If I can get 15 years or so, then we'll be on our way to at least a decently sizeable dataset.
|
mrjames
Professor
Posts: 6062
Loc: Montclair, NJ
Reg: 11-21-04
|
02-19-12 08:25 PM - Post#121427
In response to mrjames
I just ran this for all Ivy teams back to 1997. The R^2 I get on first half 3PTFG% vs. second half 3PTFG% is 0.02. The R^2 I get on first half attempts vs. second half attempts is 0.45.
That's pretty close to what KenPom got in his blog post, so I'm prepared to reject the hypothesis that the Ivy style of play is somehow exempt from the same lack of control over 3PTFG%.
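One way to read an R^2 that small is to ask how much real team-to-team spread it would allow. The sketch below uses the standard split-half relation r = true variance / (true variance + noise variance). Everything except the 0.02 is an assumption: a positive correlation, binomial noise, a 35% average rate, and 126 opponent attempts per half (7 games at 18 attempts). The function name `implied_true_sd` is made up for this example.

```python
import math

def implied_true_sd(r_squared, mean_pct, attempts_per_half):
    """Spread of underlying team rates implied by a split-half R^2.

    Assumes the correlation is positive and that each half's percentage is
    the team's true rate plus independent binomial sampling noise.
    """
    r = math.sqrt(r_squared)
    noise_var = mean_pct * (1 - mean_pct) / attempts_per_half
    # Rearranged from r = true_var / (true_var + noise_var).
    true_var = r / (1 - r) * noise_var
    return math.sqrt(true_var)

# 0.02 is the R^2 reported above; the other two inputs are assumptions.
sd = implied_true_sd(r_squared=0.02, mean_pct=0.35, attempts_per_half=126)
print(f"implied true spread: {sd * 100:.1f} percentage points")
```

If those assumptions are anywhere near right, an R^2 of 0.02 is compatible with a small but nonzero spread in true 3PT% defense, so "not very controllable" fits the number better than "no control at all".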
|
SRP
Postdoc
Posts: 4910
Reg: 02-04-06
|
02-19-12 10:55 PM - Post#121453
In response to mrjames
On reflection, the first-half v. second-half test used by KenPom is invalid for the purpose of the inference he wants to make. What we want to know is whether Team A is significantly (and importantly) different from Team B in 3 pt.% defense. A lack of correlation for the same team across season halves is suggestive but does not speak to that cross-sectional difference. Maybe a look at the correlation of the order statistics across season halves would be okay, but a test of whether two teams' performances are distributed with the same parameters would be more direct.
I also think you have to be careful about proving too much. If a similar correlation analysis appeared to show that overall FG% defense was random, that would strongly indicate a flaw in the procedure. Plausibility would go up if teams showed smaller cross-sectional variation in 3pt % defense than they did in 2pt% defense.
The initial finding is certainly provocative, though.
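A quick simulation makes the distinction concrete. This is only a sketch under assumed numbers (none of them come from KenPom's data): every simulated team gets its own fixed underlying rate, drawn around 35% with a 2-point standard deviation, and then allows 126 attempts in each half.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_split_half(n_teams, mean_pct, true_sd, attempts, seed=0):
    """Correlation of first-half vs. second-half 3PT% allowed when every
    team has a real, fixed underlying rate and the rest is shot-level luck."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_teams):
        p = rng.gauss(mean_pct, true_sd)  # this team's true rate allowed
        first.append(sum(rng.random() < p for _ in range(attempts)) / attempts)
        second.append(sum(rng.random() < p for _ in range(attempts)) / attempts)
    return pearson_r(first, second)

# All four inputs are assumptions chosen for illustration.
r = simulate_split_half(n_teams=2000, mean_pct=0.35, true_sd=0.02, attempts=126)
print(f"split-half r = {r:.2f}, R^2 = {r * r:.3f}")
```

With these inputs the expected correlation is only about 0.18 (R^2 around 0.03), even though by construction the teams really do differ. A weak split-half correlation therefore bounds how large the differences can be relative to the noise, but it does not by itself show that they are zero.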
|
Brian Martin
Masters Student
Posts: 963
Loc: Washington, DC
Reg: 11-21-04
|
At Penn/Princeton 02-20-12 12:14 AM - Post#121461
In response to SRP
That was my original objection. How does comparing the 1st half of league play to the 2nd half of league play prove that defense has no effect on 3 point percentage? It proves that 3 point shooting is highly variable. If you looked at offensive three point shooting percentage for teams you probably would find similar lack of correlation. Would that mean that the offense also has no effect on 3 point shooting percentage? No. It means that 3 point shooting is a highly variable statistic.
The question that matters is whether Harvard (30.5% in league play) is better at defending threes than Brown (55.0%).
Does anyone really believe that Pomeroy has proven that Harvard is only much, much luckier than Brown and has had no effect on its opponents' three-point shooting?
Edited by Brian Martin on 02-20-12 12:16 AM. Reason for edit: No reason given.
|
Brian Martin
Masters Student
Posts: 963
Loc: Washington, DC
Reg: 11-21-04
|
Re: At Penn/Princeton 02-20-12 01:08 AM - Post#121463
In response to Brian Martin
Correction: Brown's league opponents shoot 44% from three, not 55%.
That was my original objection. How does comparing the 1st half of league play to the 2nd half of league play prove that defense has no effect on 3 point percentage? It proves that 3 point shooting is highly variable. If you looked at offensive three point shooting percentage for teams you probably would find similar lack of correlation. Would that mean that the offense also has no effect on 3 point shooting percentage? No. It means that 3 point shooting is a highly variable statistic.
The question that matters is whether Harvard (30.5% in league play) is better at defending threes than Brown (44.0%).
Does anyone really believe that Pomeroy has proven that Harvard is only much, much luckier than Brown and has had no effect on its opponents' three-point shooting?
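The Harvard vs. Brown question can be framed as a two-proportion z-test. The sketch below uses the 30.5% and 44.0% figures from this post, but the attempt counts are not given anywhere in the thread, so the 200 attempts per team is a purely hypothetical placeholder.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z-test for a difference between two shooting percentages."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

ATTEMPTS = 200  # hypothetical opponent 3PT attempts per team, not from the thread

z, p_value = two_proportion_z(0.440, ATTEMPTS, 0.305, ATTEMPTS)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Two cautions apply. The answer depends heavily on the real attempt counts, and picking the best and worst of eight teams after the fact overstates the significance compared with a pair chosen in advance.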
Edited by Brian Martin on 02-20-12 01:09 AM. Reason for edit: No reason given.
|
Tiger69
Postdoc
Posts: 2814
Reg: 11-23-04
|
At Penn/Princeton 02-20-12 09:49 AM - Post#121480
In response to SRP
You guys could do some serious numbers-kicking in a sports bar.
Let's have a good "Yo' Mama" algorithm joke.
Edited by Tiger69 on 02-20-12 09:53 AM. Reason for edit: No reason given.
|
1LotteryPick1969
Postdoc
Posts: 2272
Age: 73
Loc: Sandy, Utah
Reg: 11-21-04
|
Re: At Penn/Princeton 02-20-12 10:09 AM - Post#121485
In response to Brian Martin
I am hesitant to weigh in on this because I am no statistician. The observation is intriguing, but...
I agree with Brian on this: Pomeroy merely proved that he can't MEASURE three point defense, not that it doesn't exist. In other words, his premise was flawed.
Three point defense does not exist as a separate parameter. It is one component of team defense, and team defense has many aspects, most of which are difficult to quantitate and very interrelated.
Also, you are looking at a number (3-pt shooting percentage) which has a high coefficient of variability, and the sample size is small.
We have much the same problem in clinical medicine, trying to measure benefits of treatment on parameters which have high coefficients of variability. The solution is to make more measurements before and after treatment, but that is not an option in this case.
|
1LotteryPick1969
Postdoc
Posts: 2272
Age: 73
Loc: Sandy, Utah
Reg: 11-21-04
|
Re: At Penn/Princeton 02-20-12 10:14 AM - Post#121486
In response to Tiger69
Let's have a good "Yo' Mama" algorithm joke.
Here's my lame attempt:
Yo' mama so ugly, her love life is a Bortkiewicz distribution.
|
mrjames
Professor
Posts: 6062
Loc: Montclair, NJ
Reg: 11-21-04
|
02-20-12 11:31 AM - Post#121496
In response to SRP
Now we're on the right track. The questioning of the data is ridiculous, but the questioning of why is important.
The problem that I have with saying that conference play is too small of a sample is that in the grand scheme of things, the whole season isn't that much larger of a sample. So, the question is whether even a team's full-season 3PT% allowed is actual ability or random variation around the weighted average of what their opponents have shot for the season.
The 3PT%/2PT% analysis is an important one, and I'll get to that next. It would make sense, to me at least, that 3PT% defense would be more controllable than FT% defense, but far less controllable than 2PT% defense. This just has to do with ability to defend. You can't defend the FT line. You can't always defend threes. But two pointers are easily the most defensible of the three.
I'm not trying to say that 3PT defense is completely uncontrollable, especially since forcing people off that line is an important element of defense. My point is that Pomeroy's data would seem to indicate that it isn't very controllable. Not just that it can't be measured, but that even the full season result might be a lie - or, otherwise stated, the best predictor of a team's 3PT% defense is the weighted average of its opponents' 3PT shooting percentages.
Most importantly this might be an important part of the variance piece that comprises a team's profile, so it's worth digging into deeper to understand.
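As a minimal sketch of that baseline: the post does not say what the weights are, so this assumes each opponent's season 3PT% is weighted by the number of threes it attempted against the team in question. The schedule below is invented for illustration.

```python
def expected_pct_allowed(opponents):
    """Attempt-weighted average of opponents' season 3PT%.

    `opponents` is a list of (season_3pt_pct, attempts_against_this_team).
    Weighting by attempts is an assumption; the thread does not specify it.
    """
    total_attempts = sum(att for _, att in opponents)
    return sum(pct * att for pct, att in opponents) / total_attempts

# Hypothetical two-game schedule: a 30% shooting team that took 10 threes
# and a 40% shooting team that took 30.
hypothetical_schedule = [(0.30, 10), (0.40, 30)]
baseline = expected_pct_allowed(hypothetical_schedule)
print(f"expected 3PT% allowed: {baseline:.3f}")
```

The gap between a team's actual 3PT% allowed and this baseline is the part that would need to persist from one sample to the next for it to count as defensive ability.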
|
Silver Maple
Postdoc
Posts: 3770
Loc: Westfield, New Jersey
Reg: 11-23-04
|
Re: At Penn/Princeton 02-20-12 11:53 AM - Post#121502
In response to 1LotteryPick1969
Let's have a good "Yo' Mama" algorithm joke.
Here's my lame attempt:
Yo' mama so ugly, her love life is a Bortkiewicz distribution.
Yo mama is so mean that she has no standard deviation.
or,
Yo mama is so fat that her derivative is strictly positive.
|