Fun With Beane Count

October 14, 2009 at 12:00 pm, under Research Studies, Statistical Analysis

Updated 10/15/2009 : See Bottom

In an effort to deviate from prospects (I’ve been doing them constantly for a while now), I’ve written an article on “Beane Count”.  Enjoy.

Sabermetricians are constantly talking about walks and homers.  They are, perhaps, the two most important secondary offensive skills.  We frequently judge players based on their slash statistics: their batting average, their on-base percentage, and their slugging percentage.  In order to produce secondary offense (offense independent of batting average), hitting homers and drawing walks are hugely important.

In addition to boosting one’s secondary offensive production, walks and homers are also relatively free of statistical noise.  That is, there aren’t a lot of outside influences that distort the meaning of the data.  When a player draws a walk, he’s drawn a walk.  There’s no ball put in play and it’s simply a function of three things: plate discipline, pitchers’ control, and umpires.  While a batter doesn’t have any control over a pitcher’s ability to throw strikes or an umpire’s ability to call balls balls and strikes strikes, those two things mostly even out over the course of a season, leaving only the first: plate discipline.  The same can be said about home runs.  There’s relatively little luck involved in the ability to draw a walk or hit a homer.  The opposite is the case with batting average, a metric that is much more luck-dependent.

Rob Neyer writes on May 6, 2009:

Eight or nine years ago, I came up with a silly little thing called Beane Count, which was a way of looking at how teams fared in a couple of sabermetrics-friendly measures: home runs and walks. How many you get, and how many you give up.

The metric is calculated on a ranking scale.  If you rank 16th in the league in home runs hit, you get 16 points.  The same for home runs allowed, walks drawn, and walks allowed.  It’s scored by the golf method–lower is better.  Here’s what the final 2009 Beane Count standings look like:

American League

Team            Beane Count
Red Sox         14.0
White Sox       21.0
Yankees         21.8
Rays            22.1
Rangers         26.0
Angels          26.7
Twins           27.0
Blue Jays       28.8
A’s             29.7
Tigers          37.0
Indians         39.1
Mariners        39.2
Royals          42.0
Orioles         45.2

National League

Team            Beane Count
Rockies         12.0
Braves          21.0
Cardinals       21.0
Phillies        25.0
Diamondbacks    27.0
Dodgers         27.5
Cubs            30.1
Brewers         36.0
Marlins         37.1
Nationals       39.0
Reds            41.0
Pirates         41.0
Padres          44.0
Giants          44.5
Astros          46.0
Mets            51.0
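
For anyone who wants to play along at home, here is a rough Python sketch of the ranking scheme described above. The team names and totals are made up for illustration, and the tie handling (tied teams share the average rank) is my assumption; the article does not say how ties are broken or where the fractional values in the standings come from, so don't expect this to reproduce the published figures exactly.

```python
def rank_points(values, higher_is_better):
    """Map each team to its rank (1 = best). Tied teams share the average rank
    (an assumption; the source does not specify tie handling)."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    points = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j across every team tied with the team at position i.
        while j + 1 < len(ordered) and values[ordered[j + 1]] == values[ordered[i]]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for team in ordered[i:j + 1]:
            points[team] = shared_rank
        i = j + 1
    return points


def beane_count(stats):
    """Sum a team's league ranks in the four categories. Lower is better."""
    categories = [
        ("hr_hit", True),       # more homers hit is better
        ("bb_drawn", True),     # more walks drawn is better
        ("hr_allowed", False),  # fewer homers allowed is better
        ("bb_allowed", False),  # fewer walks allowed is better
    ]
    totals = {team: 0.0 for team in stats}
    for name, higher_is_better in categories:
        column = {team: row[name] for team, row in stats.items()}
        for team, pts in rank_points(column, higher_is_better).items():
            totals[team] += pts
    return totals


# Hypothetical three-team league, not real 2009 data.
league = {
    "Team A": {"hr_hit": 200, "bb_drawn": 600, "hr_allowed": 150, "bb_allowed": 450},
    "Team B": {"hr_hit": 180, "bb_drawn": 650, "hr_allowed": 170, "bb_allowed": 500},
    "Team C": {"hr_hit": 150, "bb_drawn": 500, "hr_allowed": 160, "bb_allowed": 520},
}
counts = beane_count(league)
for team in sorted(counts, key=counts.get):
    print(team, counts[team])
```

Ranks are computed within whatever set of teams you pass in, so run each league separately to match the per-league ranking described above.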

I love the simplicity and good intentions of the metric, but how well does it predict a team’s ability to win baseball games?

We’ll take a look at how well Beane Count did in 2009.

For the purposes of this study, I used two sets of data.  First, I compared each team’s Beane Count to the complement of its winning percentage (1-W%).  I used 1-W% rather than the actual winning percentage so that the correlation comes out positive rather than negative; the magnitude is the same either way.  Since Beane Count works like golf (lower is better) and winning percentage works the other way, 1-W% also works like golf.  I ran each comparison three ways: once using only the AL’s numbers, once using only the NL’s numbers, and once using all 30 teams’ numbers.  The following charts should be self-explanatory.

[Chart: 1-W% vs. Beane Count, AL]

[Chart: 1-W% vs. Beane Count, NL]

[Chart: 1-W% vs. Beane Count, both leagues]
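
The comparison described above boils down to a Pearson correlation. Here is a minimal sketch with made-up numbers (the Beane Counts and winning percentages below are hypothetical, not the 2009 data) that also shows why swapping W% for 1-W% flips the sign of the correlation without changing its size.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical teams: low Beane Count paired with a high winning percentage.
beane = [12.0, 20.0, 28.0, 36.0, 44.0]
win_pct = [0.600, 0.560, 0.500, 0.470, 0.410]

r_raw = pearson_r(beane, win_pct)                       # negative
r_flipped = pearson_r(beane, [1 - w for w in win_pct])  # positive, same size
r_squared = r_flipped ** 2                              # coefficient of determination

print(r_raw, r_flipped, r_squared)
```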

I previously mentioned I used two sets of data.  The second set used not the teams’ actual winning percentage, but their 3rd order winning percentage.  This gives you a better snapshot of how good each team is.  It attempts to answer the question: what would this team’s winning percentage be in a luck-neutral environment?  The three graphs follow:

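[Chart: 1-Pct3 (3rd order winning percentage) vs. Beane Count, AL]

[Chart: 1-Pct3 (3rd order winning percentage) vs. Beane Count, NL]

[Chart: 1-Pct3 (3rd order winning percentage) vs. Beane Count, both leagues]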

*An exaggeration, of course. But maybe not.

Interpreted literally, the R-squared of 0.460 for all 30 teams against 3rd order winning percentage would mean that 46% of a team’s ability to win baseball games is derived only from their ability to draw walks, hit homers, and keep opponents from doing the same.  We’re not including the ability to hit singles, doubles, or triples, steal bases, run the bases well in general, or field the ball, or pitchers’ ability to get ground balls, strike batters out, etc.  We’re talking about two things here: walks and homers.

I’m sure there are other metrics that correlate just as well, if not better, with winning percentage.  The thing is, I don’t think you’ll find another metric with as little statistical noise as walks and homers that correlates as well with winning percentage as Beane Count does.

In addition to being a “silly little thing” (as Rob calls it), it’s actually a pretty important thing to maximize (or minimize, golf scoring, remember) if you’re concerned with winning baseball games.

Final Thought: I don’t think I’ve discovered anything new here.  But quantifying it makes it that much more real, I guess.

Final Thought #2: The eight playoff teams among the top-12 teams in Beane Count.  The 4 in the top-12 that missed the playoffs?  The White Sox, Rangers, Rays, and Braves.  The latter three teams came very close to making the playoffs and were very good teams.  The White Sox probably stumbled on a lot of bad luck.

Update

The p-values for the study have been requested.  P-values are stated as probabilities.  In this case, each one indicates the probability of seeing a correlation at least this strong through random chance alone if no underlying correlation existed:

Data set       R      Rsquared  P-Value
1-W% Both      0.647  0.418     0.000112
1-W% NL        0.611  0.373     0.011924
1-W% AL        0.692  0.479     0.006103
1-Pct3 Both    0.678  0.460     0.000038
1-Pct3 NL      0.667  0.445     0.004767
1-Pct3 AL      0.734  0.539     0.002802
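
For readers who want to check figures like the ones above, here is a sketch of one common way to turn a correlation and a sample size into a p-value: a two-sided test of Pearson's r against the null hypothesis of no correlation. That this is the test behind the reported numbers is my assumption; the sample sizes (30 teams overall, 16 in the NL, 14 in the AL) come from the standings. The function name and the numerical integration are illustrative choices, using only the standard library.

```python
import math


def pearson_p_value(r, n, steps=2000):
    """Two-sided p-value for a Pearson correlation r from n paired observations,
    under the null hypothesis of no correlation (assumes roughly normal data).

    Under the null, r has density (1 - x^2)^((n - 4) / 2) / B(1/2, (n - 2) / 2)
    on [-1, 1]; this integrates the upper tail with Simpson's rule and doubles it.
    """
    if n < 4:
        raise ValueError("this sketch needs at least 4 observations")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals

    exponent = (n - 4) / 2
    # Normalising constant: the beta function B(1/2, (n - 2) / 2).
    beta = math.gamma(0.5) * math.gamma((n - 2) / 2) / math.gamma((n - 1) / 2)

    def density(x):
        return max(0.0, 1.0 - x * x) ** exponent / beta

    lower, upper = abs(r), 1.0
    h = (upper - lower) / steps
    total = density(lower) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(lower + i * h)
    tail = total * h / 3
    return min(1.0, 2 * tail)


# Correlations and team counts taken from the figures above.
for label, r, n in [("1-W% Both", 0.647, 30), ("1-W% NL", 0.611, 16), ("1-W% AL", 0.692, 14)]:
    print(label, pearson_p_value(r, n))
```

Because the R values above are rounded to three decimals, the output should land close to the listed p-values rather than match them digit for digit.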

8 Responses to “Fun With Beane Count”

  1. Keith R says:

    What if you did the same analysis but with pitching? In other words, how does a team’s pitchers not giving up HRs and walks correlate with winning?

  2. Beane Count includes walks drawn, homers hit, walks allowed, and homers allowed. If I used only walks allowed and homers allowed, though, my belief is that the coefficient of determination would be about half of what it was for Beane Count, since Beane Count, like baseball, is half hitting and half pitching. If you’re only using the pitching side, your correlation is going to be about half of what it was using the entire picture.

  3. Is the reverse true? Hitting HR and taking walks vs giving up homers and issuing walks?

  4. I’m afraid I don’t understand the question.

  5. Jesse says:

    What’s your p value on these variables? Unless you tell us at what level of significance the variables are accurate these graphs don’t tell us anything.

  6. Here are the P-Values:

    Data set       R      Rsquared  P-Value
    1-W% Both      0.647  0.418     0.000112
    1-W% NL        0.611  0.373     0.011924
    1-W% AL        0.692  0.479     0.006103
    1-Pct3 Both    0.678  0.460     0.000038
    1-Pct3 NL      0.667  0.445     0.004767
    1-Pct3 AL      0.734  0.539     0.002802

    With that said, it’s not the point. Like I said, I didn’t perform any test of significance. That’s not what we’re after, here.

    A test of significance would attempt to answer the question “does Beane Count determine winning percentage”. Of course it doesn’t. This study would fail any test of significance I attempted to perform on it at a reasonable confidence level.

    But it’s a systematic problem. Of course Beane Count doesn’t determine winning percentage, we’re talking about walks and homers in isolation.

    I’m not attempting to shock the world with a revelation. If I were, a test of significance would certainly be in order. Just making an amateur attempt to quantify the importance of Beane Count as it relates to winning baseball games. And no, the data isn’t particularly meaningful. But like I said, I’m not testing pharmaceuticals here.

    The correlation isn’t powerful, but it is somewhat useful if interpreted carefully and correctly.

  7. Matt C says:

    “The eight playoff teams among the top-12 teams in Beane Count.”

    um, not quite right… the D-backs and Twins are actually tied for 11th, ahead of the Dodgers at #13.

    not saying BC is meaningless, but what does it mean that the D-backs (a terrible team) are virtually tied with the Dodgers (best team in the NL)?

  8. Good catch, Matt. You’re right. My bad.

    What that means is the Diamondbacks were bad at a number of things other than walks and homers. I want to be clear, I’m not asserting that maximizing Beane Count is the only thing you have to do to win baseball games. That’s certainly not the case. But it does help.
