October 14, 2009 at 12:00 pm by Capitol Avenue Club under Research Studies, Statistical Analysis
Updated 10/15/2009 : See Bottom
In an effort to deviate from prospects (I’ve been doing them constantly for a while now), I’ve written an article on “Beane Count”. Enjoy.
Sabermetricians are constantly talking about walks and homers. They are, perhaps, the two most important secondary offensive skills. We frequently judge players based on their slash statistics: batting average, on-base percentage, and slugging percentage. To produce secondary offense (offense independent of batting average), hitting homers and drawing walks is hugely important. In addition to boosting secondary offensive production, walks and homers are relatively free of statistical noise. That is, there aren’t a lot of outside influences that distort the meaning of the data. When a player draws a walk, he’s drawn a walk. No ball is put in play, and the outcome is a function of three things: plate discipline, pitchers’ control, and umpires. A batter has no control over a pitcher’s ability to throw strikes or an umpire’s ability to call balls balls and strikes strikes, but those two things mostly even out over the course of a season, leaving only plate discipline. The same can be said about home runs. There’s relatively little luck involved in drawing a walk or hitting a homer. The opposite is the case with batting average, a metric that is much more luck-dependent.
Eight or nine years ago, I came up with a silly little thing called Beane Count, which was a way of looking at how teams fared in a couple of sabermetrics-friendly measures: home runs and walks. How many you get, and how many you give up.
The metric is calculated on a ranking scale. If you rank 16th in the league in home runs hit, you get 16 points. The same for home runs allowed, walks drawn, and walks allowed. It’s scored by the golf method–lower is better. Here’s what the final 2009 Beane Count standings look like:
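The ranking scheme described above can be sketched in a few lines of Python. This is only an illustration: the team names and totals are made up, the dictionary keys (`hr`, `bb`, `hr_allowed`, `bb_allowed`) are hypothetical, and the tie-breaking rule (tied teams share the better rank) is an assumption, since the description doesn't say how ties are handled. Ranks are meant to be taken within a single league.

```python
def rank_positions(values, higher_is_better):
    """Return 1-based ranks (1 = best) for each team.

    Assumption: tied teams share the better rank.
    """
    ordered = sorted(values.values(), reverse=higher_is_better)
    return {team: ordered.index(v) + 1 for team, v in values.items()}


def beane_count(teams):
    """Sum each team's league rank in the four categories (lower is better)."""
    # True means "more is better" (offense); False means "fewer is better" (pitching).
    categories = {"hr": True, "bb": True, "hr_allowed": False, "bb_allowed": False}
    totals = {name: 0 for name in teams}
    for cat, higher_is_better in categories.items():
        ranks = rank_positions({n: s[cat] for n, s in teams.items()}, higher_is_better)
        for name, rank in ranks.items():
            totals[name] += rank
    return totals


# Made-up season totals for three hypothetical teams in one league.
teams = {
    "Team A": {"hr": 200, "bb": 600, "hr_allowed": 150, "bb_allowed": 450},
    "Team B": {"hr": 150, "bb": 500, "hr_allowed": 180, "bb_allowed": 500},
    "Team C": {"hr": 180, "bb": 550, "hr_allowed": 160, "bb_allowed": 550},
}
counts = beane_count(teams)
for name, total in sorted(counts.items(), key=lambda kv: kv[1]):
    print(name, total)
```

In this toy league, Team A ranks first in all four categories and so has the best possible Beane Count of 4.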
I love the simplicity and good intentions of the metric, but how well does it predict a team’s ability to win baseball games?
We’ll take a look at how well Beane Count did in 2009.
For this study, I used two sets of data. First, I compared the complement of each team’s winning percentage (1-W%) to its Beane Count. I used 1-W% rather than the actual winning percentage to generate a positive correlation rather than a negative one of the same magnitude: Beane Count works like golf (lower is better), winning percentage works the other way, and 1-W% works like golf too. I ran three comparisons: one using only the AL’s numbers, one using only the NL’s numbers, and one using all 30 teams’ numbers. The following charts should be self-explanatory.
- Some things of note: the coefficient of determination (R squared) was 0.373 for the NL, 0.497 for the AL, and 0.418 overall. This number attempts to tell us what percentage of the variation in 1-W% (and therefore in winning percentage) can be explained by variation in Beane Count. Of course, since Beane Count influences winning percentage (i.e., the variables are not completely independent), the data isn’t perfect. And we aren’t working with an extremely robust sample size here. But still, I’m shocked at how well the two correlate.
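For anyone who wants to run the same kind of comparison, here is a minimal sketch of the calculation, assuming a simple linear fit in which R squared is the square of the Pearson correlation. The function names and the five data points are hypothetical; they are not the 2009 figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared(beane_counts, win_pcts):
    """R squared between Beane Count and 1-W% for a simple linear fit."""
    losing_pcts = [1 - w for w in win_pcts]
    return pearson_r(beane_counts, losing_pcts) ** 2


# Made-up values for five hypothetical teams.
beane_counts = [20, 30, 40, 50, 60]
win_pcts = [0.600, 0.580, 0.500, 0.470, 0.450]

r_losing = pearson_r(beane_counts, [1 - w for w in win_pcts])
r_winning = pearson_r(beane_counts, win_pcts)
print(r_losing, r_winning, r_squared(beane_counts, win_pcts))
```

Using 1-W% flips only the sign of the correlation, so `r_losing` and `r_winning` have the same magnitude and R squared is identical either way.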
I previously mentioned I used two sets of data. The second set used not the teams’ actual winning percentage but their 3rd order winning percentage, which gives a better snapshot of how good each team is. It attempts to answer the question: what would this team’s winning percentage be in a luck-neutral environment? The three graphs follow:
- Some more things of note: the coefficients of determination for this data set are even higher: 0.539 for the AL, 0.445 for the NL, and 0.460 overall. Again, the sample size is small, there is some statistical noise in the data, and I didn’t perform any sort of test of significance. Still, this is a fairly shocking result. 0.460 isn’t a particularly strong coefficient of determination, but we’re not shooting for 1.0 here. If it were really 1.0, Adam Dunn would be paid $40 million a year*.
*An exaggeration, of course. But maybe not.
Interpreted literally, this would mean that 46% of a team’s ability to win baseball games is derived solely from its ability to draw walks, hit homers, and keep opponents from doing the same. We’re not including the ability to hit singles, doubles, and triples, steal bases, run the bases well in general, or field the ball, or pitchers’ ability to get ground balls, strike batters out, etc. We’re talking about two things here: walks and homers.
I’m sure there are other metrics that correlate just as well, if not better, with winning percentage. The thing is, I don’t think you’ll find another metric with as little statistical noise as walks and homers that correlates as well with winning percentage as Beane Count does.
In addition to being a “silly little thing” (as Rob calls it), it’s actually a pretty important thing to maximize (or minimize, golf scoring, remember) if you’re concerned with winning baseball games.
Final Thought: I don’t think I’ve discovered anything new here. But quantifying it makes it that much more real, I guess.
Final Thought #2: The eight playoff teams were all among the top 12 teams in Beane Count. The four in the top 12 that missed the playoffs? The White Sox, Rangers, Rays, and Braves. The latter three came very close to making the playoffs and were very good teams. The White Sox probably stumbled on a lot of bad luck.
The p-values for the study have been requested. P-values are stated as probabilities. In this case, they indicate the probability of seeing a correlation at least as strong as the one in the data purely by random chance if no underlying correlation existed:
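As a rough cross-check, the R squared values reported above can be converted into t statistics with the standard formula t = sqrt(R² × (n − 2) / (1 − R²)), where n is the number of teams (14 in the AL, 16 in the NL, 30 overall in 2009). The sketch below assumes a simple two-variable linear correlation and compares each statistic with the usual two-tailed 5% critical value from a t table. It does not reproduce the p-values themselves, which would need a t-distribution function from a statistics library.

```python
from math import sqrt


def t_statistic(r_squared, n_teams):
    """t statistic for a Pearson correlation, given R squared and sample size."""
    return sqrt(r_squared * (n_teams - 2) / (1 - r_squared))


# (R squared, number of teams) as reported in the article.
reported = {
    ("actual W%", "AL"): (0.497, 14),
    ("actual W%", "NL"): (0.373, 16),
    ("actual W%", "MLB"): (0.418, 30),
    ("3rd order W%", "AL"): (0.539, 14),
    ("3rd order W%", "NL"): (0.445, 16),
    ("3rd order W%", "MLB"): (0.460, 30),
}

# Two-tailed 5% critical values from a standard t table, keyed by
# degrees of freedom (n - 2).
critical_05 = {12: 2.179, 14: 2.145, 28: 2.048}

results = {}
for key, (r2, n) in reported.items():
    t = t_statistic(r2, n)
    results[key] = (t, t > critical_05[n - 2])
    print(key, round(t, 2), "significant at 5%" if t > critical_05[n - 2] else "not significant at 5%")
```

Under these assumptions, all six correlations clear the 5% threshold.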
June 6, 2009 at 12:50 am by Capitol Avenue Club under Atlanta Braves, Research Studies, Statistical Analysis
I just finished up a little study that I’m calling “Stinginess Ranks”. I got the idea for this study from watching Braves games all season and getting angry when we do stupid things or just incompetent things in general that either a) cost our team outs on the offensive side of the ball or b) give the opposition free outs. Outs are the currency of baseball and recently teams have been shifting towards a philosophy of treating them with more respect than any other thing in the game. Using an out is bad, not using one is good. Making an out on defense is good, not making one is bad. Every out a hitter uses decreases the probability that his team will win (unless of course a run scores, but a non-out run increases the probability more, so in general this statement is true). Every out a defense converts increases the probability that their team will win. Giving them away for free is never a good thing. I wanted to see just how bad my Braves are at giving away free outs.
The formula for this isn’t too complex. It is 1 minus on-base percentage (the percent of PAs that result in outs), plus 1 minus defensive efficiency (the percent of balls put in play that a defense fails to convert into outs), plus 1 minus another metric I created, baserunning stinginess. Baserunning stinginess is (runs plus runners left on base) divided by (hits plus walks plus intentional walks plus hit batsmen), i.e., the percent of baserunners who aren’t called out on the basepaths through double plays, caught stealings, or other stupid or incompetent errors. Add the three terms up and you have the opposite of a stinginess rating. Rank the teams by that sum, subtract their ranks from 31, and you have a ranking of each team’s stinginess. Here’s how it came out:
4. New York Yankees
6. Tampa Bay
7. Los Angeles Dodgers
8. Minnesota Twins
12. Chicago Cubs
14. St. Louis
18. Los Angeles Angels
19. New York Mets
21. San Francisco
23. San Diego
27. Chicago White Sox
29. Kansas City
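The stinginess formula described above can be sketched as follows. The function names, team names, and inputs are all made up for illustration, and the code assumes teams are first ranked from most wasteful (rank 1) to least wasteful, so that subtracting from the number of teams plus one (31 in a 30-team league) puts the stingiest team at number 1.

```python
def baserunning_stinginess(runs, lob, hits, walks, ibb, hbp):
    """Share of baserunners who either scored or were left on base."""
    return (runs + lob) / (hits + walks + ibb + hbp)


def out_waste(obp, der, brs):
    """The 'opposite of stinginess' sum: higher means more outs given away."""
    return (1 - obp) + (1 - der) + (1 - brs)


def stinginess_ranks(waste_by_team):
    """Rank by waste (most wasteful = 1), then flip so the stingiest team is 1."""
    n = len(waste_by_team)
    most_wasteful_first = sorted(waste_by_team, key=waste_by_team.get, reverse=True)
    # With 30 teams, (n + 1) is the 31 used in the article.
    return {team: (n + 1) - (i + 1) for i, team in enumerate(most_wasteful_first)}


# Made-up inputs for three hypothetical teams.
waste = {
    "Team X": out_waste(0.350, 0.700, baserunning_stinginess(400, 600, 700, 300, 20, 30)),
    "Team Y": out_waste(0.320, 0.680, baserunning_stinginess(350, 550, 650, 250, 15, 25)),
    "Team Z": out_waste(0.335, 0.690, baserunning_stinginess(380, 570, 680, 280, 18, 22)),
}
ranks = stinginess_ranks(waste)
print(ranks)
```

Here Team X gives away the fewest outs and lands at number 1, while Team Y gives away the most and lands last.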
I was not at all surprised to see the Braves were near the bottom of the list. We’ve been giving outs away like candy on both sides of the ball all season. Our defensive efficiency has improved greatly, but we’re still not awesome. I’m not surprised by any of the first 9. Boston I figured would be higher because they embody the stingy philosophy. But sometimes your philosophy and execution don’t coincide. Pittsburgh is near the top largely because of their much improved defense.
It is not necessary to be stingy with outs to win. But winning is a lot harder if you aren’t. And I hope the Braves learn this. Soon.