Statistics, Metrics, and Statistical Analysis
August 3, 2009 at 12:55 pm by Capitol Avenue Club under Atlanta Braves
Talking about sports-writers. Some writers you read because they’re entertaining. Some writers you read because they’re informative. Some writers you read because they stimulate your brain (or possibly something else). But there’s another breed of writers, a breed that has a certain way with words and thoughts. A breed of writers that can take what you’re thinking and put it onto paper in a way that–no matter how much thought, effort, and time you spent trying to do so–you simply can’t. These are the truly talented ones. The ones that form a perfect marriage between thought and art.
Do you ever read a piece and think to yourself, “This piece basically sums up my entire position on an issue, but I’m not nearly talented enough to express it the way he just did”? I do. Rather frequently. And I had one such moment this morning when I read a brilliant blog post from Joe Posnanski*. And I quote**.
*Joe Posnanski qualifies as part of the select group I mentioned, but he transcends it. Because he’s able to mix in all of the above. He’s funny, he’s entertaining, he’s informative, he’s just a brilliant writer. One of the best–if not the best–pure writers in the sports world. Yet he is as sports-savvy as they come. A truly rare combination of attributes. A sort of “once in a generation” writer.
**This blog post is entitled “Yuni Watch 7/31”. He’s been doing a “Yuni Watch” as a sort of comical critique of the Yuniesky Betancourt acquisition that Dayton Moore made. An incredibly stupid move made by a GM who, by all intelligent accounts, should be out of a job right now.
Today’s Yuni watch is a lot about defense and statistics and is brought to you by the Detroit Tigers, who always seem to do something reasonably big at the trade deadline. I don’t know if Jarrod Washburn wraps up the American League Central — you can’t ever count out Gardy’s boys — but as a fan it has to be awesome to know that your team is going to be aggressive and go for it the last couple of months.
Yuniesky Betancourt on Royals:
49 plate appearances
.130 average
.128 on-base percentage
.174 slugging percentage
1 extra base hit
0 walks
2 sac hits
2 grounded into double plays
-20 OPS+
Well, a 1-for-4 day in Baltimore raises the ol’ batting average 11 points. I guess it was a pretty weak hit that should have been barehanded by Melvin Mora, but hey at this point I would never take hits away from Yuni. He’s hit a couple of line drives at people the last couple of days, and frankly I would like to give him credit for those.
The defensive question with Yuni, meanwhile, is fascinating. Thursday’s Royals game, unlike the revolution, was not televised and so I did not see this myself: But apparently Nick Markakis hit a scorching ground ball to short that (at the very least) COULD have been fielded by a top defensive shortstop. Among those who saw it, a couple of people thought it actually was an error and a play that should have been made. More thought it was a difficult defensive play but one that could have been made.
Betancourt did not make the play, of course, and that was a key hit in a four-run rally and another loss.
Now, I bring this up because I’ve heard from a couple of people I respect who watch Betancourt play live every day and are actually reasonably upbeat about his defense. He seems to have made a couple of nice plays — especially coming in on slow ground balls — and he has only made one error, and he seems so far to be pretty solid making the routine play, something that does not go unnoticed in Kansas City where the routine has so often turned into the comical.
And while I obviously do not have anything approaching complete trust in eyeball evaluations — more on that in a minute — there are a couple of defensive stats that sort of back this up. According to ESPN’s stats, Yuni is fifth in baseball among shortstops in Zone Rating which is a rather simplistic but not useless way to measure the percentage of balls a player fields in his zone. And his range factor the last couple of years has been right around league average (it’s below average this year but, hey, work with me here). The Gold Glove talk that some have connected to Betancourt is pretty much indefensible sky-is-purple-polka-dot nonsense, but on a day-to-day basis watching … I could see how he might look to be OK defensively if you squint hard enough.
This is why the defensive question is fascinating … because as we all know, a couple of the more advanced stats show that Betancourt is a horrendous defensive shortstop — worst-in-the-game bad — and has been horrendous even in his short stay in Kansas City. According to the Dewan plus/minus he’s ALREADY minus-3 in Kansas City, meaning that (using extreme video study) Yuni has made three fewer plays than the average Major League shortstop would have made.
Then there’s Ultimate Zone Rating … his UZR is already -2.3 in Kansas City, which means he has ALREADY allowed two more runs than the average big league shortstop because of his lousy defense.
So what gives? Average or dismal? Promising or depressing? You probably remember Bill James’ famous point that the difference between a .275 hitter and a .300 hitter over 600 at-bats is 15 hits a year. That’s about 2.5 hits per month over a full baseball season. That’s about one extra hit every 10 or 11 games. Bill asks: Would you notice that?
It’s easy to say you would … but you probably wouldn’t. Or anyway, I wouldn’t. First off, to notice it you would pretty much have to watch EVERY SINGLE GAME because if you watched only, say, 125 games, there’s a chance you would see the .275 hitter have more hits than the .300 hitter. You would also have to watch every inning of every game because some of those extra hits might actually come when you’re off mowing the lawn or shopping for razors or flipping channels to see who is winning the golf tournament.
And even then — even if you watched every inning of every game and were paying close attention, I would suggest you STILL would not be able to tell the difference because the .275 hitter might hit with more power. He might have a sweeter looking swing. He might get a few of his hits in clutch situations that burn in the memory. Seems to me that we often talk about how baseball is a long season, but we don’t always consider what that means. It means that in baseball we enjoy the moments, and we’re swayed by the moments, and we long for the moments. But context? We get our context from the numbers. It’s simply too long a season to process.
Now, all that revolves around something really simple — batting average. Hits divided by at-bats. Simple and stark stuff — there are few vagaries or complexities in those numbers (OK, yes, there are a couple of complexities — errors, sacrifices, walks and so on, but generally speaking it’s pretty simple). But defense is much, much more complex. A defensive play involves a thousand tiny pieces — positioning, pre-pitch reaction, post-pitch reaction, speed of the ball, spin on the ball, situation on the field, quality of the field, luck of the bounce, brightness of the sun, glare of the lights, ability of teammates, speed of the runner, sound of the ball hitting bat and a bunch of other stuff.
So it’s much, much more complicated. And it’s much, much more subjective. Look there was just one Yuni play on Thursday, and I talked to seven people about it and two thought it was a terrible defensive play, and four thought it was a really tough play, and one thought it was an impossible play. That’s just ONE PLAY in a long, long, long season.
So my feeling is this: if you had three big league shortstops (so obviously — based on them being big leaguers — you know all three are gifted in their own way), watched them closely for 162 games, I have no doubt you would be able to tell certain things. You surely could tell which one has the strongest arm, who makes the most diving plays, who seems the most sure-handed, who seems to go better to his left, who seems to go better to his right, who seems to have the best balance, who seems to stand in better on the double play and a bunch of other things. I don’t think Dayton Moore is wrong — I do think you can tell who CAN play defense well by watching. But would you really be able to say who had the best defensive year? I say there’s no way. I say it’s like trying to pick between the .275 and .300 hitters … multiplied by about 100.
I say that if pushed to make that choice without access to any statistics you would do it one of two ways:
1. You would go by some sort of aesthetic opinion based on style and form and tools, which (it seems to me) would tell you who SHOULD be the best of the three, but not necessarily who IS the best of the three.
2. You would end up counting in your head … you would count “errors” or you would count “diving plays” or you would find yourself swayed by “clutch defensive plays” — this guy made a great stop with the bases loaded with two outs in the ninth, that guy bobbled a grounder with the winning run on third in the eighth — you would try to figure out who makes the most plays. And your numbers, no doubt, would be off or too subjective.
And that’s why I look to the advanced fielding statistics. They’re not as good as they could be or will be … I think everyone would agree about that. But they try (and often succeed) at taking an objective look at how effectively a player performs defensively. Those numbers say Yuni is a disaster at shortstop. Flawed numbers or not, I would tend to believe those over my lyin’ eyes.
I don’t really care about the Yuniesky Betancourt rant. That’s sort of a Royals-fans-only thing. It’s funny, but not something I really care about. What I love about this piece is he uses Yuni to make a general point. And a very good one. We can not possibly know enough from observation alone to make informed decisions. It isn’t possible. And that is why I rely on statistics.
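The .275 versus .300 point is easy to poke at with a quick simulation. This is only a rough sketch, and the assumptions are mine, not Posnanski's or Bill James': every at-bat is an independent coin flip at the hitter's true average, both hitters get 600 at-bats spread evenly over 162 games, and you watch 125 of those games. The function names are made up for illustration.

```python
import random


def simulate_hits(true_avg, at_bats, rng):
    """Count hits over `at_bats` independent at-bats with a fixed hit probability."""
    return sum(rng.random() < true_avg for _ in range(at_bats))


def chance_worse_hitter_looks_better(games_watched=125, trials=2000, seed=0):
    """Estimate how often a true .275 hitter out-hits a true .300 hitter
    over the at-bats you actually saw (assumed model, see lead-in)."""
    rng = random.Random(seed)
    # Assumption: 600 at-bats spread evenly over a 162-game season.
    at_bats = round(600 * games_watched / 162)
    upsets = 0
    for _ in range(trials):
        worse = simulate_hits(0.275, at_bats, rng)
        better = simulate_hits(0.300, at_bats, rng)
        if worse > better:
            upsets += 1
    return upsets / trials


if __name__ == "__main__":
    share = chance_worse_hitter_looks_better()
    print(f"The .275 hitter had more hits in {share:.0%} of simulated seasons")
```

Under those assumptions a back-of-the-envelope normal approximation puts the answer somewhere around one season in five, which is a lot more often than most of us would guess from the couch.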
But this sort of raises the question: what are statistics and why are they useful? I found a lot of definitions of the word “statistics” around the web, but I think the best one I found was from Merriam-Webster Online:
statistics: noun. a collection of quantitative data
What I like about the definition are the words “quantitative” and “data”. What does quantitative data mean? Quantitative practically means numeric. Short and to the point. Easy to define as most adjectives in definitions are. But data is a little bit tougher to define. And we’ll again use the help of Merriam-Webster Online:
data: noun. factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation
The key words here are “factual” and “information”. I could keep going, posting the definitions of these words, but what I take from all of these definitions, and as it relates to baseball, is that statistics are a record of what happened presented in a way that is easy to digest and understand. Through score-keeping and box scores or even video studies (as they use for Dewan’s +/- system), what actually happened on the field is recorded as information. It actually happened, so it is factual by definition. Recorded in numeric form, so quantitative. And presented in a way that is easy to digest and understand.
I like Ron Gant, so we’ll use his 1990 season as an example (won’t be the first time I’ve talked about Ron Gant’s 1990). During his first game of the season (April 11, game 1), he appeared as a pinch-hitter in the bottom of the 5th inning for Tom Glavine* and struck out. This was recorded by the official scorer and translated into a box score after the game. As the season goes on, the data (i.e. factual information) from all of the box scores that Ron Gant appears in is extracted and converted into statistics. Things we can look at and get a glimpse of his season from. Things like his batting average, his on-base percentage, his slugging percentage, his stolen-base percentage, how many home runs he hit, etc. This way, we have a snapshot of his season in a manner that is a) meaningful (i.e. it actually means something. Unlike a sportswriter talking out of his rear about how great a player is. With the numbers, we know how great he is. Or at least we can attempt to answer the question in a meaningful way) b) concise, and c) unscathed. We know there’s a lot of statistical noise involved in things like RBI, batting average, etc. But statistics are unscathed because they don’t attempt to eliminate the statistical noise. They’re only trying to report what happened.
*Ah, yes. Maybe Bobby was a better GM than a manager. Russ Nixon pinch-hit for Glavine after he had gone 5 innings and given up 4 runs that day (a day that would encapsulate the spirit of the 1990 season, the Braves lost 8-0 to the Giants). I don’t argue that Tom Glavine should’ve been done after 5, but to that point in the game, nobody except Tom Glavine had an extra-base hit. And for the rest of the game, nobody except Tom Glavine would record an extra-base hit. The Braves had 3 hits and 4 total bases that day. Glavine accounted for 1/3 of the hits and 1/2 of the total bases. Bobby eventually fired Nixon and replaced him with himself. Though I’m fairly confident Bobby would’ve made the exact same decision. Or sent Greg Norton up to hit. I know, I’m being picky here. I’m just having fun. Who doesn’t love scrutinizing managers?
What percent of the time did Ron Gant reach base safely in 1990? The answer is 35.7 percent. I simply looked up his on-base percentage. That’s it. Clear and concise. I know just from looking at statistics what happened. Does this mean I know how good Ron Gant was? No. I simply know what he did. And this is what statistics are useful for. Answering the question, “what happened?”.
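To show how little is going on when a stat line gets built, here's a minimal sketch using the standard formulas for batting average, on-base percentage, and slugging. I'm not using Gant's totals here. The counting numbers below are an assumption: they're one set of totals I worked backward to that is consistent with the Betancourt line quoted above (49 plate appearances, .130/.128/.174, 0 walks, 2 sac hits, 1 extra-base hit), not official figures. The `BattingLine` class is just an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    at_bats: int
    hits: int
    total_bases: int
    walks: int
    hit_by_pitch: int
    sac_flies: int

    def batting_average(self):
        # Hits divided by at-bats.
        return self.hits / self.at_bats

    def on_base_percentage(self):
        # Sacrifice hits (bunts) are left out of both halves of the fraction.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = self.at_bats + self.walks + self.hit_by_pitch + self.sac_flies
        return reached / chances

    def slugging_percentage(self):
        # Total bases divided by at-bats.
        return self.total_bases / self.at_bats


# Assumed totals, inferred from the quoted rate stats (not official figures):
# 46 AB + 2 sac hits + 1 sac fly = 49 PA; 5 singles + 1 triple = 8 total bases.
yuni = BattingLine(at_bats=46, hits=6, total_bases=8,
                   walks=0, hit_by_pitch=0, sac_flies=1)

if __name__ == "__main__":
    print(f"AVG {yuni.batting_average():.3f}")
    print(f"OBP {yuni.on_base_percentage():.3f}")
    print(f"SLG {yuni.slugging_percentage():.3f}")
```

Nothing in there judges anybody. It just counts what happened and divides, which is the whole point of a statistic.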
Answering the question, “how good is Ron Gant?” is a whole ‘nother animal. That’s where metrics and statistical analysis come into play. What metrics do is they attempt to gather statistics and manipulate them such that they turn into a more specifically meaningful number. What is more meaningful, batting average or VORP? It depends on what question you’re trying to answer, but if said question is, “how good is X player” then VORP is much more meaningful. VORP doesn’t really report what happened. It takes what happened and manipulates it. That’s the difference between statistics and metrics. Statistics report what happened, metrics manipulate statistics to serve a specific purpose. wOBA, xFIP, tERA, etc. They’re all metrics and they all do a great job, I think, of answering the questions they’re designed to answer. Metrics are useful in their own way. Who will have a better 2nd half, Javier Vazquez or Jair Jurrjens? Fire up xFIP and you’ll see that Vazquez, in all likelihood, will have a better 2nd half than Jair Jurrjens. Even though Jurrjens owned a better first-half ERA than Vazquez. Interestingly, Jurrjens got touched up for 4 runs in 5 innings last night. One of the many examples of the predictive power of xFIP and various other metrics.
With statistics and metrics, we are armed with tools to answer the tough questions. Questions we couldn’t possibly answer intelligently solely from observation. Questions that if we attempted to answer without the aid of statistics and metrics, we might as well just guess. Humans are flawed. They’re biased and often victims of selective memory. Joe Posnanski expressed the point I’m getting at better than I possibly could in his piece, so I won’t elaborate. But you get the idea, observations can’t be solely relied upon to answer the tough questions.
Statistical analysis, the use of statistics and metrics to examine a question, is the route I take when trying to answer these questions. It’s important to give the statistics and the metrics proper context and to fully understand them, but armed with them, we’re capable of much more than we ever were before. Rather than simply addressing an issue from memory, we make arguments using factual evidence. And in my mind, that’s the only way to properly address a question. Armed with facts. Armed with things that can not be disputed. Build your house upon a rock. And statistics are rocks, they’re factual and sound.
The attempt to quantify success and to answer a question with a finite, numerical entity is no new thing. In baseball or in life. How well did you do in school? 4.0 GPA. How did you do on your SAT? 1260. How much money do you make? $50,000/year. These are all attempts at quantifying success. And none of these are perfect. There’s plenty of statistical noise involved with GPA (and all the other ones I mentioned and failed to mention). Cheating, luck, etc. But they’re an attempt to quantify success. They’re statistics. They’re metrics. They’re as American as the game of baseball itself.
I’d like to bring into this piece a quote from a Bill James article entitled: “Intro to Sabermetrics (subscription required)”:
Sabermetrics is descended from traditional sportswriting. Sportswriting consists of two types of things—reporting, and analysis. Sabermetrics came from that part of sportswriting which consists of analysis, argument, evaluation, opinion and bullshit. I can tell you very precisely when and how we parted ways with traditional analysis.
Sportswriters discuss a range of questions which are much the same from generation to generation. Who is the Most Valuable Player? Who should go into the Hall of Fame? Who will win the pennant? What factors are important in winning the pennant? If Boston won the pennant, why did they win it? If Kansas City finished last, why did they finish last? How has baseball changed over the last few years? Who is the best third baseman in baseball today? Who is better, Mike Lowell or Eric Chavez?
The questions that we deal with in our work are the same as the questions that are discussed by sports columnists and by radio talk show hosts every day. To the best of my knowledge, there is no difference whatsoever in the underlying issues that we discuss. The difference between us is very simple. Sportswriters always or almost always begin their analysis with a position on the issue. We always begin our analysis with the question itself.
If you find a sportswriter debating who should be the National League’s Most Valuable Player this season, his article will probably begin by asserting a position on the issue, and then will argue for that position. If you find 100 articles by sportswriters debating issues of this type, in all likelihood all 100 articles will do this.
What we do is simply to begin by asking “Who is the National League’s Most Valuable Player this season?” rather than to begin by stating that “Albert Pujols is the National League’s Most Valuable Player this season, and let me tell you why.” That’s all. That is the entire difference between sabermetrics and traditional sportswriting. It isn’t the use of statistics. It isn’t the use of formulas. It is merely the habit of beginning with a question, rather than beginning with an answer.
From what I gather, he’s talking about scientific process. We’re taught the scientific method in 3rd grade. Bill James is just applying it to sportswriting. Sabermetrics is a scientific field. As is statistics. When you apply science to baseball statistics, you have sabermetrics. I buy into the sabermetric point of view because it’s solid. It uses tried and true scientific process and a rock-hard foundation, factual statistics. That’s why I prefer to use statistics rather than purely observational anecdotes to answer questions. I’m not suggesting my way is superior, that’s just what works for me.
This is not to say that scouting, watching games, and forming opinions through observation is bad, either. If you try to build a successful MLB roster without any trace of scouting, you will fail. Without a doubt. There are limitations to statistics, they’re only part of the picture when managing an MLB roster. Watching games, not answering questions through statistical analysis, is what I enjoy the most.
But I’ve always thought that if you want to make a point and you want to convince people you’re right you better have some empirical evidence to back up your point. And I believe the only empirical evidence we have is numerical. What happened, written in the language of mathematics and statistics, constitutes the entire body of evidence we can use to answer questions that demand such evidence. Observation is entertainment. Observation is fun. Statistics are evidence. Statistics are facts.
A professor of mine used to always use this line in class. It really hits home when talking about this subject:
The ancient Greeks used to sit around all day and debate whether or not men or women had more hair on their head. Did they ever think to count?
We can sit around all day and debate who is the more productive player, Chipper Jones or David Wright. But if we’re more interested in the answer than the actual debate, the use of statistical analysis rather than observational anecdotes is absolutely necessary. Absolutely.

Great article, by both of you. I don’t mind sabermetrics to a certain point. There is a statistic for just about everything but you can’t judge a player completely on statistics.
Certainly not. And even if you could, what’s the fun of judging players purely based on statistics and not watching them? The whole point of the game is to entertain. Statistical analysis is good for answering questions. But you don’t get the whole picture if you only look at the stats.
Well written piece. On both accounts. Bravo
Very well written piece. The irony, if you want to call it that, is that the “numbers debate” is probably one of the most hotly contested issues among baseball fans. While there are many moderates on the issue, there are also plenty of folks who argue passionately about the value of statistics and numbers on either side of the equation. Best I can tell, those that are “purists” and downplay the importance of statistical analysis tend to do so because they hate the de-humanizing that such analysis entails. Reducing a “hero” to numbers and trying to place him among baseball’s hierarchy solely on such data seems to be really unappealing to some. The lesson Joe Posnanski makes so well is that “hey, these numbers might not be perfect, but they certainly avoid natural human bias.” There’s a lot of value in that so long as folks utilizing numbers also recognize their limits. Again, really well written. Keep up the good work.
Thanks guys.
LaRoche trade analysis is up, by the way.