Away Games: A Minnesota Twins Blog

Could Joe Mauer hit .400?

July 9, 2009

Well, of course he could. But what we (and by we, I mean baseball nerds with way too much time on our hands) want to know is, how likely is it that Mauer will hit .400?

The conventional wisdom is that it’s no longer possible to hit .400–or at least it’s much more difficult than it used to be. The absence of any .400 hitters since Ted Williams would seem to confirm that diagnosis. But John Bonnes, the Twins Geek, has a provocative post up today arguing that in fact, it’s getting easier to hit .400. His evidence is simply that of all the players who have come close to .400 since Williams, the majority have done it in the past 15 years. On top of that, some–like George Brett and Tony Gwynn–have come within a few hits of the achievement. On the basis of this observation, the Geek says that maybe Mauer has more of a shot than we think.

This is, on the face of it, intriguing evidence. But the Geek is a numerically astute guy, and so I was a little disappointed that he didn’t mention a sabermetric classic on this topic: the late paleontologist Stephen Jay Gould’s essay on the disappearance of the .400 hitter (I couldn’t find the actual essay online). Gould was arguing against people who thought that the decline of .400 hitting was due to the declining quality of hitters. He argued that, paradoxically, the decline of .400 hitting was due to the fact that all players were actually getting better. Because the general level of play was higher, it was more difficult for any player to be so far ahead of the pack that he could hit .400.

Gould supported this argument by showing that batting averages had become less variable. That is, there are both fewer really good hitters and fewer really bad hitters, because everyone is more bunched together within a narrower range of batting averages. When he did this analysis originally, back in the 1980’s, he painstakingly put together the data by hand, while he was laid up in bed recovering from an illness. But today, of course, we have the statistics at our fingertips. So, using data from the Lahman database, I thought I’d extend Gould’s analysis and see what it has to say about the Twins Geek’s hypothesis.

All the graphs below are based only on players who meet the modern definition of qualifying for a batting title: 3.1 plate appearances per team game played. This wasn’t actually the rule used prior to 1957, but I applied it anyway for simplicity.

First off, here’s a picture that demonstrates what I mean when I talk about the decreasing variation in batting averages. It’s a comparison of the distribution of batting averages in 1900 and in 2000.

avgdist

You can see that the hitters in 1900 were more spread out than the hitters in 2000. There are more hitters with really high averages, but also more hitters with really low averages. Even though the average hitter had a higher batting average in 1900 (signified by the peak of the curve being farther to the right), there were still more hitters down around the .200 mark (the red line is above the black line at the left end.) Back in 1900, a “good glove, no hit” infielder could still find a starting job in a way he couldn’t today.

To get a general idea of how batting average has become less variable over time, we can look at the standard deviation of batting average by season. The standard deviation essentially measures how spread out the distribution of batting averages is. (Technically, it’s the square root of the average squared distance from the mean.) The higher the standard deviation, the more spread out the batting averages are. See this graph, which is adapted from one that Gould originally produced:

sdavgtrend

You can see that up through the early 1980’s, when Gould’s analysis was done, batting average was becoming less and less variable. This happened even as the average level of batting average bounced around between “pitcher-friendly” and “hitter-friendly” eras:

meanavgtrend

But if you look at those graphs, you’ll notice that something happened in the ’90’s: batting averages went up overall, and the variation in averages also went up. Whether that was because of expansion, or the steroid era, or whatever, I can’t say. But that’s not what I’m interested in explaining. I want to know how easy it is to hit .400 these days. Higher batting averages + more variability should equal a better chance of a .400 hitter. But how much better?

Fortunately, it’s possible to get an answer to this question that’s at least reasonably precise. If you look back at the first graph, you’ll see that the batting averages of all the qualifying hitters in any one season approximate a bell curve, or what’s called a normal distribution. And the nice thing about things that are normally distributed is that we can predict the probability that a normally distributed variable will land at or above any particular value. If we know what the average of all batting averages is, and we know the standard deviation of batting averages, we can predict the probability that a particular player will hit .400 or above.

This means that we can predict the probability of hitting .400 in each year. In order to smooth out year-to-year fluctuations in the mean and standard deviation of batting averages, I took the average of the previous five years. Then I calculated, for each year, the probability that some hitter would hit .400 or better that year. For comparison, I also calculated the probabilities for hitting .380 and .390. Keep in mind this is the probability of any hitter getting to .400 (based on the number of people who qualified for the batting title that year), not the probability of any one particular hitter doing it.
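For the curious, here is a minimal sketch of the calculation described above. It is not my actual analysis script: it assumes each qualifying hitter's average is an independent draw from the same normal curve, and the yearly means, standard deviations, and the count of 150 qualifiers are made-up placeholders rather than Lahman numbers. The function names are mine, too.

```python
from statistics import NormalDist, mean, pstdev


def season_summary(averages):
    """Mean and standard deviation of one season's qualifying batting averages."""
    return mean(averages), pstdev(averages)


def smoothed(values, window=5):
    """Average of the last `window` yearly values (fewer if less history exists)."""
    recent = values[-window:]
    return sum(recent) / len(recent)


def prob_any_hitter(mu, sigma, n_qualifiers, target=0.400):
    """Chance that at least one of n_qualifiers hitters reaches `target`.

    Assumes each hitter's average is an independent draw from Normal(mu, sigma).
    """
    p_one = 1.0 - NormalDist(mu, sigma).cdf(target)  # one hitter at or above target
    return 1.0 - (1.0 - p_one) ** n_qualifiers       # at least one of them gets there


# Made-up inputs for illustration only, not values from the Lahman database.
yearly_means = [0.276, 0.279, 0.281, 0.280, 0.278]
yearly_sds = [0.026, 0.027, 0.028, 0.027, 0.026]
mu, sigma = smoothed(yearly_means), smoothed(yearly_sds)
for target in (0.380, 0.390, 0.400):
    p = prob_any_hitter(mu, sigma, 150, target)
    print(f"P(someone hits {target:.3f} or better) = {p:.4%}")
```

The second line of `prob_any_hitter` is where "any hitter" differs from "one particular hitter": the more qualifiers there are, the more chances somebody gets lucky.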

probavg

The first thing I have to say here is that Twins Geek was really onto something. The chances of somebody hitting .400 jumped up in the last 15 years, to levels not seen since before World War II.

That’s the good news for Joe Mauer. The bad news is that this trend seems to have reversed itself, and things are back to the way they were in the 1980’s. If you look at the charts above, you’ll see that this is not necessarily because averages have come down overall (although they have, some), but because they’ve become less variable, more bunched together.

The other bad news for Joe, of course, is that even in the batting bonanza of the late 1990’s, the chances that anyone would hit .400 were never even 0.5%. It’s just a really hard thing to do. Which is all the more reason that despite all that I’ve said here, I’m going to keep on watching and rooting for Mauer along with the Twins Geek.


Game 8: Ick.

April 13, 2009

Well, that wasn’t much fun. Kevin Slowey had a bad night, and judging from the pitch f/x data, the culprits were his curveball and changeup. Or the lack thereof:

Kevin Slowey's pitch locations


The filled circles are curveballs, and the triangles are changeups. Slowey threw a grand total of six curveballs and four changeups. And it’s probably just as well that he didn’t throw any more, since five of those ten pitches were turned into hits. But the fact remains that Slowey doesn’t have overpowering stuff, so he won’t be very successful if he has to rely on only his fastball and his slider.

Hopefully Slowey will find the feel for those offspeed pitches the next time around. Things look less good for Toronto’s Jesse Litsch, who came out of the game in the fourth inning with an arm injury of uncertain severity. Watching the game, it appeared that Litsch suddenly experienced some kind of pain and then immediately took himself out of the game. But look at this plot of the spin and speed of his pitches, by inning:

Jesse Litsch's pitches


It looks like something was already wrong with Litsch after the first inning–his velocity went down, his slider wasn’t breaking as hard, and his fastball had a different spin, more like a changeup. Maybe that explains why he got hit so hard.


One Pitch

September 30, 2008

That’s what the Twins’ season comes down to: one pitch. Specifically, Nick Blackburn’s 76th pitch, which Jim Thome blasted over the centerfield wall to score the only run of game #163. Below is the entire at-bat, as it appeared from the perspective of the umpire, with estimated pitch trajectories calculated from pitch f/x:

Jim Thome's fateful at-bat in Twins game number 163

Jim Thome goes deep

The pitch was a hanging changeup, and it richly deserved to be hit a long way. Still, how heartbreaking to have the season end because of that one mistake.


Series of the Year

September 26, 2008

Obviously, sweeping the White Sox was huge for the Twins’ season. But how huge? I decided to run a little simulation, to see what the team’s chances of going to the playoffs were before and after this series. For most of the relevant games (the Twins-Royals series, the Sox-Indians series, the possible Sox-Tigers makeup game) I assumed each game is a toss-up, with each team having a 50-50 chance of winning. That’s obviously not quite right, what with pitcher matchups, home-field advantage, the difference in opponent quality, and so on, but it’s a good rough guide. To account for the fact that the Twins-Sox series was at the Dome and the one-game playoff would be at US Cellular, I gave the Twins a 60% chance of winning the home games, but only a 40% chance of winning a playoff on the road.

Granted, this isn’t as complex or as accurate as something like Baseball Prospectus’s Postseason Odds Report, but the overall playoff odds that come from my quick-and-dirty method are pretty similar. And doing my own simulation allows me to look a little deeper into the likely scenarios, beyond what the BP page shows.

So after simulating the end of the season 100,000 times, here’s what I got:

Before the sweep:

  • Probability of the Twins making the playoffs: 19%
  • Probability that the Sox have to play their makeup game: 27%
  • Probability that the season is decided by a one-game playoff: 13%
  • Most probable outcome: Sox win the division by 1.5 games (18%)

And after the sweep?

  • Probability of the Twins making the playoffs: 61%
  • Probability that the Sox have to play their makeup game: 55%
  • Probability that the season is decided by a one-game playoff: 27%
  • Most probable outcome: Twins win the division by one game (28%)

To say that this series saved the Twins’ season is putting it lightly.
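If you want to play along at home, here is a rough sketch of the kind of simulation described above. It is a reconstruction, not my original script. The win totals going in (Twins 84 and Sox 86 before the series, 87 and 86 after) are assumptions not stated in this post, as are the three non-head-to-head games left for each team and the single game the Sox have in hand. The function names are made up for illustration.

```python
import random


def simulate_once(rng, twins_w, sox_w, h2h_left, other_left,
                  p_home=0.6, p_road_playoff=0.4):
    """Play out one season ending.

    Returns (twins_make_playoffs, makeup_game_played, one_game_playoff_played).
    """
    # Remaining Twins-Sox games at the Dome: Twins win each with probability p_home.
    twins_h2h = sum(rng.random() < p_home for _ in range(h2h_left))
    # Everything else (Twins-Royals, Sox-Indians) is treated as a coin flip.
    twins = twins_w + twins_h2h + sum(rng.random() < 0.5 for _ in range(other_left))
    sox = sox_w + (h2h_left - twins_h2h) + sum(rng.random() < 0.5 for _ in range(other_left))

    # The Sox have one game in hand; it is only played if it can change the outcome.
    lead = twins - sox
    makeup = lead in (0, 1)
    if makeup and rng.random() < 0.5:  # Sox win the makeup game
        lead -= 1

    if lead > 0:
        return True, makeup, False
    if lead < 0:
        return False, makeup, False
    # Level after 162 games each: one-game playoff on the road.
    return rng.random() < p_road_playoff, makeup, True


def simulate(n, seed=0, **kwargs):
    """Fraction of n simulated endings with each of the three outcomes."""
    rng = random.Random(seed)
    results = [simulate_once(rng, **kwargs) for _ in range(n)]
    return tuple(sum(r[i] for r in results) / n for i in range(3))


# Assumed win totals (not given in the post).
before = simulate(100_000, twins_w=84, sox_w=86, h2h_left=3, other_left=3)
after = simulate(100_000, twins_w=87, sox_w=86, h2h_left=0, other_left=3)
for label, (twins_in, makeup, playoff) in (("before", before), ("after", after)):
    print(f"{label}: Twins in {twins_in:.0%}, makeup game {makeup:.0%}, playoff {playoff:.0%}")
```

Under those assumptions the sketch should come out roughly in line with the percentages listed above, give or take a point for simulation noise and whatever details of my original setup it doesn't capture.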


Youth Movement

July 24, 2008

Dave Studeman has an article in the Hardball Times today asking whether veteran players “know how to win”. Unsurprisingly, at least to me, there’s no evidence that older players are somehow better at handling the pressure of pennant races. But what really caught my attention was the appendix to his article reproduced below:

References and Resources
As a reference, here is a list of each team’s Win Shares Age this year. Win Shares age is essentially a team’s age weighted by the contribution of each player (as measured by Win Shares). There are several youth movements to note: the Giants are 3.2 years younger than last year’s team, the Twins are 2.5 years younger, the Dodgers are 2.3 years younger and the Rangers are 2.1 years younger.

Team        WSAge
MIN         25.7
TB          25.7
ARI         26.2
OAK         26.4
FLA         26.7
WAS         27.0
LAN         27.0
CLE         27.2
KC          27.2
TEX         27.2
ATL         27.3
PIT         27.4
LAA         27.4
COL         27.4
MIL         27.8
CIN         27.9
BAL         28.0
SF          28.1
STL         28.1
SEA         28.3
CHA         28.5
NYN         28.6
SD          28.7
BOS         28.7
CHN         28.9
DET         29.2
PHI         29.2
TOR         29.3
HOU         30.8
NYA         31.6

This “Win Shares Age” statistic is a little convoluted, but as best I can tell it measures: a) how young a team is; and b) how much the younger players are contributing to the team’s on-field success. And the table shows that the Twins are tied with the Rays for the title of “best young team”. And that’s even with Carlos Gomez batting leadoff for half the year. I have to say, this makes me optimistic about the next few seasons, even if I’m still skeptical that this is a playoff year for the Twins.

Also: oh boy, were the Giants ever old last year! They had the biggest youth movement in the majors and they’re still in the older half of the league.
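To make the “Win Shares Age” idea concrete, here is a small sketch of a contribution-weighted average age. This is only my reading of Studeman’s one-line description, not his actual method, and the roster below is invented for illustration.

```python
def win_shares_age(players):
    """Average age weighted by Win Shares.

    `players` is a list of (age, win_shares) pairs.
    """
    total_ws = sum(ws for _, ws in players)
    if total_ws == 0:
        raise ValueError("no Win Shares to weight by")
    return sum(age * ws for age, ws in players) / total_ws


# Invented roster for illustration only: (age, Win Shares).
roster = [(24, 10), (30, 5), (35, 0)]
print(f"Win Shares Age: {win_shares_age(roster):.1f}")  # prints 26.0
```

Note that the 35-year-old with zero Win Shares doesn’t move the number at all, which is why a team can carry a few old benchwarmers and still come out “young”.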


Announcifying

May 17, 2008

I’m watching the Twins-Rockies game on mlb.tv, and man is this Colorado broadcast team awful. An inning ago, the color guy talked over the play-by-play guy so he could finish some pointless story about eating on the road. And now they just got themselves totally confused about which Hernandez brother is which.

Announcer 1: “El Duque is ten years younger than him.”

Announcer 2: “What? What did you just say? This guy is ten years older than El Duque? He’s, like, 61.”

They went around like this for a while until they figured out that they had the whole thing ass backwards. Truly terrible. I didn’t think a TV baseball crew could be this bad without involving Hawk Harrelson.

For the record, the Detroit broadcast team is the best one I’ve seen while watching mlb.tv. That color guy really teaches you things about the game.

Also, I’ve always wondered when someone would do webstreaming alternative play-by-play that you could run concurrently with the video feed of a game while muting the official announcers. I feel like a lot of people would be psyched about that.


Gomez Update: Walk On

May 10, 2008

I had already finished and posted my Carlos Gomez analysis when the G-man went ahead and became the winning run in tonight’s exciting game. So of course I couldn’t go to sleep without writing up a coda on this game.

As Dick & Bert noted, this was only Gomez’s fourth walk of the year(!), so that in itself is noteworthy. Beyond that, though, I wondered: how did Gomez’s performance accord with my analysis? Any good scientist will tell you, after all, that one of the best tests of a model is how well it fits new data.

First, here’s a plot of all the pitches Gomez saw tonight:

After having just spent all day with Gomez’s pitch data, this graph immediately looked really weird to me. So much so, in fact, that I went and loaded up Gameday just to make sure I hadn’t plotted the data wrong.

What’s so odd? Well, the Red Sox decided to pitch Gomez inside tonight. And if you look back at the pitch plots in my earlier post, you’ll see that virtually no one has done that this year. It’s been away, away, away.

Other than that, though, tonight mostly seems in keeping with my analysis of Gomez’s recent transformation into a better hitter. He laid off the pitches low and away, just as he has done since April 23rd. And the Red Sox didn’t give him much offspeed stuff, which is also consistent with the recent data. When Gomez did swing at pitches out of the strike zone, they were high pitches–again, consistent with what we’ve seen lately.

But the big news, of course, was that our man worked a walk. Here’s how he did it against Jonathan Papelbon. He got nothing but fastballs; he fouled off the ones in the strike zone, and he let the other ones go for balls. Simple as that. Observe how it’s done (red means foul, black means ball):

Of course, Carlos’s new plate discipline could still be a fluke. But you can’t help but love this at bat!

Meanwhile, the other hero of the game was Mike Lamb, who blooped a single to bring in the winning runs. That was a welcome change from what Lamb has done for most of this season, which is make tons and tons of outs. In fact, just as Lamb was coming up in the ninth inning, I was thinking that I needed to start working out my next in-depth analysis, tentatively titled “Why Does Mike Lamb Suck So Much?” And I’m still planning on doing it. But maybe if we’re lucky, tonight was the beginning of Lamb’s Gomez-like transformation from scrub into impact player.


Bedtime for Bonser

May 9, 2008

Well, after that nice game where he got shelled in the first and then buckled down, Boof came out and did some garden-variety sucking tonight. How did he approach each batter? Here’s a plot, fresh and hot out the pitchf/x kitchen. Different at bats are color-coded, and the pitch result is marked by the symbol. Click to enlarge:

What immediately jumps out at me is:

  1. Boof didn’t want anything to do with David Ortiz. And sure enough, he threw him one hittable pitch, and Ortiz hit it.
  2. In three at bats, Mike Lowell didn’t see a single pitch on the inner two-thirds of the plate! Pretty cool.


Slowey Train Coming

May 9, 2008

Kevin Slowey made his return yesterday, and the results were uneven. I didn’t watch the game, but apparently he was doing OK until the fifth inning.

It’s crucial that Slowey contribute quality innings, especially with Pat Neshek injured. We’ll have to wait and see how things go for him.

In the meantime, here’s a chart I whipped up, comparing yesterday’s start to the one Slowey made in April. I don’t have any particular point to make with these. This isn’t an analytical post (unlike the Gomez post). This is more along the lines of, “I figured out how to get all this neat pitch data, so why not share it with the world!”

On the left you see Slowey’s location, with pitches coded according to pitch type and inning. This just uses MLB’s somewhat inaccurate pitch classifications. On the right is a plot of pitch type using speed and “spin”, the latter calculated according to Fast & Nathan’s equation (see here for more). I coded these pitches by their MLB classifications, and by the outcome of the pitch (ball, strike, or in play).

The graph isn’t really readable this small, so click on it to get the full size version. From what I can ascertain, Slowey is supposed to throw a fastball, a slider, a curveball and a changeup. I think I can pick them all out on the speed/spin graph–the curveball on the lower left, the changeup on the upper left, the fastball on the upper right and the slider just down and to the left of the fastball. Or maybe that’s just a different kind of fastball, I don’t know. Either way, it looks like his pitches were a little less distinct this time than in his first start. You can also see that he hung a bunch of curveballs in the fifth, generally not a recipe for success. Not a lot to conclude from this, but it will be interesting to see what his next few starts look like.


Hello world!

May 9, 2008

This blog is going to be a place for my commentary about the Minnesota Twins. Hopefully, it will be heavy on sabermetric analysis using pitch data from baseball’s pitch f/x system.

The name of the blog refers to the fact that I haven’t lived in Minnesota for almost ten years, yet I can’t quit my Twins. So as much as I love living here in New York, it always feels a little like an “away game” to me.

Check out the next post for my first attempt to crunch some numbers!
