Posted by: Peter | April 11, 2011

Amazing new baseball ID database

From Tango comes news of something amazing: a regularly updated cross-reference table for all the different player ID numbers out there. I immediately added this to my baseball database (which already includes Baseball Databank, Retrosheet, and Pitch f/x data). Since I already have a script that updates the pitch and player-level data every night, I figured I might as well add the ID data too. I use R to do it, even though something like Python would probably be more appropriate, simply because I know R better, so it was quicker to do it this way. The script as written assumes a Linux system and a MySQL database backend, but it could be adapted for other scenarios.

This script will download the latest ID database, unzip it, set up a MySQL database based on the fields that are present in the ID files, save the ID data into new files that are formatted for MySQL import, and then load the data files into the newly-created database. Note that the database “baseballid” must already exist, but it doesn’t need to have any tables in it.

Now I can just run R CMD BATCH update_baseballid.R every night, and I get a freshly updated database in the morning.

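library(RMySQL)

# Download the latest ID archive to a temporary file.
# (The download address is not shown in this listing; substitute the real one.)
tmp <- paste(tempfile(),'zip',sep='.')
download.file("<URL of the ID database zip file>", tmp)

# Read both CSV files straight out of the archive, keeping text columns as character.
register <- read.csv(pipe(paste("unzip -p",tmp,"baseballid*/register.csv")),
 stringsAsFactors=FALSE)
rosters <- read.csv(pipe(paste("unzip -p",tmp,"baseballid*/rosters.csv")),
 stringsAsFactors=FALSE)

# Build one column definition, sized from the data in that column.
sqline <- function(var,varname) {
 if(is.character(var)) {
  line <- paste("`",varname,"`"," varchar(",max(nchar(var)),") NOT NULL default ''",sep="")
 } else {
  line <- paste("`",varname,"`"," int(",floor(log10(max(var,na.rm=TRUE)))+1,") default NULL",sep="")
 }
 line
}

# Assemble the CREATE TABLE statements from the fields present in each file.
register.query <- paste("CREATE TABLE register  (",
 paste(lapply(names(register), function(x) sqline(register[[x]], x)), collapse=',\n'),
 ", PRIMARY KEY (`key_uuid`)",
 ")")

rosters.query <- paste("CREATE TABLE rosters  (",
 paste(lapply(names(rosters), function(x) sqline(rosters[[x]], x)), collapse=',\n'),
 ", PRIMARY KEY (`key_uuid`)",
 ")")

drv <- dbDriver("MySQL")
con <- dbConnect(drv, username="<your username>", password="<your password>", dbname="baseballid", host="localhost")

# Recreate both tables from scratch.
dbGetQuery(con,"DROP TABLE IF EXISTS register")
dbGetQuery(con,"DROP TABLE IF EXISTS rosters")
dbGetQuery(con,register.query)
dbGetQuery(con,rosters.query)

# Write MySQL-friendly copies of the data: empty strings become NULL.
for(v in names(register)) if(is.character(register[[v]])) register[[v]][which(register[[v]]=='')] <- NA
write.csv(register, '/home/pefrase/Misc/pitchdb/tmp/register.csv', row.names=FALSE, na='NULL')

for(v in names(rosters)) if(is.character(rosters[[v]])) rosters[[v]][which(rosters[[v]]=='')] <- NA
write.csv(rosters, '/home/pefrase/Misc/pitchdb/tmp/rosters.csv', row.names=FALSE, na='NULL')

# Load the files. The import options below are a reconstruction chosen to match
# write.csv's default output (comma-separated, quoted strings, one header row).
dbGetQuery(con,"LOAD DATA LOCAL INFILE '/home/pefrase/Misc/pitchdb/tmp/register.csv'
 INTO TABLE register
 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
 LINES TERMINATED BY '\\n'
 IGNORE 1 LINES")

dbGetQuery(con,"LOAD DATA LOCAL INFILE '/home/pefrase/Misc/pitchdb/tmp/rosters.csv'
 INTO TABLE rosters
 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
 LINES TERMINATED BY '\\n'
 IGNORE 1 LINES")

# Clean up the connection and the temporary files.
dbDisconnect(con)
system("rm /home/pefrase/Misc/pitchdb/tmp/*")
unlink(tmp)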
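
To make the table-building step concrete, here is a small self-contained sketch of how per-column definitions turn into a CREATE TABLE statement. The data frame `toy`, its values, and the column name `key_mlbam` are made up for illustration (only `key_uuid` comes from the script), and `column_def` is a stand-in that mirrors the idea of the script's sqline function rather than a copy of it.

```r
# Toy stand-in for the register file; the values are invented.
toy <- data.frame(key_uuid  = c("a1b2", "c3d4e5"),
                  key_mlbam = c(123456L, 7890L),
                  stringsAsFactors = FALSE)

# Same idea as sqline: size each column from the data it holds.
column_def <- function(var, varname) {
  if (is.character(var)) {
    paste0("`", varname, "` varchar(", max(nchar(var)), ") NOT NULL default ''")
  } else {
    paste0("`", varname, "` int(", floor(log10(max(var, na.rm = TRUE))) + 1, ") default NULL")
  }
}

# One definition per column, then glue them into a single statement.
defs <- vapply(names(toy), function(x) column_def(toy[[x]], x), character(1))
toy.query <- paste0("CREATE TABLE toy (\n",
                    paste(defs, collapse = ",\n"),
                    ",\nPRIMARY KEY (`key_uuid`)\n)")
cat(toy.query, "\n")
```

Sizing columns from the data keeps the schema in step with whatever fields show up in the files, at the cost of rebuilding the tables on every run.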




Posted by: Peter | April 3, 2011

Game Number 2: The Misery Continues

Well, this is certainly a poor start to the season. I’m not too worried about two losses. What I am worried about is this:

Liriano Fastballs, 2008-2011

It looks like Liriano had less fastball velocity yesterday than at any point last season–he looked closer to the bad Liriano from the injury-recovery season of 2009. I’m not going to come to any firm conclusions until we see a few more starts–the fastball speed may pick up as the season progresses, or it may be that this is related to some kind of pitch f/x calibration issue. But it’s certainly something to keep an eye on.

Posted by: Peter | January 18, 2011

Rating the Raters: Bill James 1994

Over at FanGraphs, Carson Cistulli has a nice couple of posts where he looks back to some Bill James player forecasts from 1993 and 1994, in an attempt to see how the predictions of player value held up. Tom Tango adds some additional insight based on running a regression of career WAR on the “dollar value” James assigned to players. Tango ends up using a polynomial fit of the data, resulting in the following equation:

This is good stuff (although I’m not sure why Tango decided to fix the intercept at 0 in that regression). However, I felt like both posts could have really benefited from some kind of graphical display of the data. Especially because I think that looking at a graph changes one’s interpretation of the results somewhat. Fortunately, Carson was nice enough to supply his data. So here are the players from 1994:

Bill James player ratings 1994, and career WAR

The color codes reflect James’s letter grade, or “Z” for ungraded players. The line is a locally-weighted regression (“LOESS”) line that shows the general trend.

What does this picture say? Well, the way I read it, it tells us that for the top prospects–those with A or B ratings–James’s dollar values really tell us something. For players who got valuations over $20 or so, there is a clear positive relationship between the dollar ratings and eventual career WAR. What’s more, there is a clear floor under these players: beyond the $25-30 threshold, none of them ended up being total busts.

On the other hand, it doesn’t look like the dollar values below $20 really contain any information. Among all those C and D grade prospects, James’s ratings don’t help us distinguish between surprise stars like Jim Edmonds (valued at $15, 68.1 career WAR) and total washouts like Stanton Cameron (also $15, 0 WAR). If I run a linear regression using only the players valued over $20, I get:

I think this is probably the more useful relationship, since it throws out all the fringe prospects who aren’t really adding any information.
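
For anyone who wants to try the same comparison, here is a rough R sketch of the three fits discussed above. The data are simulated, not the real James ratings or WAR figures, and the quadratic form of the through-the-origin fit is an assumption, since the degree of Tango's polynomial isn't stated here.

```r
set.seed(1)
# Simulated stand-in data: these are NOT the real dollar values or career WAR.
dollars <- round(runif(100, 1, 60))
war <- pmax(0, 0.8 * pmax(dollars - 20, 0) + rnorm(100, 5, 8))
players <- data.frame(dollars, war)

# Polynomial fit with the intercept fixed at 0 (a quadratic is assumed here).
fit.poly <- lm(war ~ 0 + dollars + I(dollars^2), data = players)

# Plain linear fit restricted to players valued over $20.
fit.top <- lm(war ~ dollars, data = players, subset = dollars > 20)

# LOESS trend of the kind drawn through the scatterplot.
fit.loess <- loess(war ~ dollars, data = players)

print(coef(fit.poly))
print(coef(fit.top))
```

Dropping the `0 +` term from the first formula would let the intercept float, which is the alternative raised in the parenthetical about Tango's regression.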

I can only add my ringing endorsement to Tango’s call for forecasters to release more of this kind of information. I’d love to know whether this pattern is a fluke of this data, or of Bill James as a forecaster, or whether it generalizes. If forecasts are useful for top prospects but not marginal ones, that would be very important to know.

Posted by: Peter | August 12, 2010

Now that’s what bad umpiring looks like

I see that Twins Geek is complaining about umpire Mike DiMuro’s strike calls in yesterday’s game. After my last post, I was prepared to make yet another post on rulebook-versus-true strike zones and the bias of fans. But this time, you’ll get no argument from me. This is legitimately some BS right here:

(These charts are from the catcher’s perspective.)

The Twins pitchers were getting squeezed all over the strike zone, to right handers and left. Look at those two inside strikes Danks got on righties. Or those high strikes to lefties. The Twins seriously couldn’t catch a break. That doesn’t mean DiMuro has some kind of pro-White Sox bias–we know umpires are inconsistent, and this could have been a particularly unlucky night for the Twins.

Anyway, count me as another vote for the robot strike zone.

Posted by: Peter | August 8, 2010

Balk, then Run

This post came about by a very circuitous and meandering process, as the end result of one of those late-night random Internet-surfing sessions. Sometimes after I watch a Twins game, I’ll switch to a late game from the west coast and keep it on in the background while I do something else. When I do, I often turn on a Dodgers game in order to listen to Vin Scully’s play-by-play. Doing this recently led me to do a little reading up on Scully, and in the process I stumbled onto this wonderful old Sports Illustrated article from 1964, profiling a 36-year-old Scully.

There are all kinds of great little nuggets in that article, but this is the thing that sent me off on another Internet-trawling expedition:

The National League had told its umpires to enforce strictly the balk rule, which provided that with men on base a pitcher had to stop for one full second in the course of his windup before throwing the ball to the plate. Many pitchers were violating the rule unintentionally, and the umpires soon made so many balk calls that they sounded like a flock of crows in a cornfield. The league office eventually backed down and everything became serene again, but before that happened one of the real crises of the Great Balk War occurred at Los Angeles during a game between the Dodgers and the Cincinnati Reds. The Reds, the Dodgers and the umpires became embroiled in a loud, long discussion on the question of whether or not a pitcher had stopped for one full second. The argument went on and on, and up in the broadcasting booth Scully was obliged to keep talking. He reviewed the balk rule, the National League’s effort to enforce it, the numbers of balks that had been called thus far in league play compared to the number of balks called in previous seasons, and so on. Finally, with the argument still dragging on down below, Scully brought up the obvious but intriguing fact that one second is a surprisingly difficult length of time to judge. He asked his audience if they had ever tried to gauge a second precisely. He said, “Hey, let’s try something. I’ll get a stopwatch from our engineer…” And with thousands of spectators watching him as he sat in the broadcasting booth, he reached up and back and took a watch from the engineer. “…I’ll push the stopwatch and say, ‘One!’ and when you think one full second has elapsed you yell, ‘Two!’ Ready? One!”

There was a momentary pause and then 19,000 voices yelled, “Two!” The managers, the umpires, the players, the batboys, the ball boys all stopped and looked around, startled. Scully said into the microphone, “I’m sorry. Only one of you had it right. Let’s try it again. One!” And again, a great “Two!” roared across Dodger Stadium and out into Chavez Ravine. The ballplayers were staring up at the broadcasting booth, and one of them got on the dugout phone, called the press box and asked, “What the hell is going on?” The crowd, immensely pleased with itself, waited patiently for the argument on the field to end.

The stuff about Scully playing with the fans is pretty great, but what struck me was the stuff about the balk. I’ve always thought that balks were one of the most ambiguous, complicated, and subjective calls that an umpire can make. I can see why the balk rule is necessary–without it, base-stealing would be nearly impossible–but the existing rule is not a very elegant solution. There are 16 different ways to balk, and many of them rely on judgement calls by the umpire. It often seems that balk calls are random and arbitrary, which causes lots of problems and controversy even before Joe West gets involved. In general, the whole rule seems to me like a kludge, indicating an area where the rules of baseball just aren’t fundamentally designed that well.

But back to the Scully article. I had never before really thought about the way the balk rule was enforced from year to year, nor did it occur to me that there might be big differences between seasons. Now, changing the enforcement of the balk rule isn’t a huge rule change–it’s not like lowering the mound or changing the strike zone–but nevertheless it could impact how the game is played.

So I fired up the Baseball Databank to see if I could verify this alleged one-year explosion of balks in 1963. And sure enough, there it is:

(I defined “balks per 162 games” as 162*9*balks/inning. So it’s an approximation of the average number of balks charged to each team over 162 9-inning games.)
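
As a rough sketch of that calculation in R: the numbers below are invented, and the column names (`yearID`, `lgID`, `BK`, `IPouts`) follow my understanding of the Baseball Databank pitching table, where `IPouts` counts outs recorded, so innings are `IPouts / 3`.

```r
# Made-up pitching totals in the style of the Baseball Databank pitching table.
# BK is balks; IPouts is outs recorded (three per inning).
pitching <- data.frame(yearID = c(1962, 1962, 1963, 1963),
                       lgID   = c("AL", "NL", "AL", "NL"),
                       BK     = c(40, 50, 45, 150),
                       IPouts = c(43200, 43500, 43300, 43400))

# Sum balks and outs within each league-season.
totals <- aggregate(cbind(BK, IPouts) ~ yearID + lgID, data = pitching, FUN = sum)

# 162 games * 9 innings * (balks per inning pitched).
totals$balks162 <- 162 * 9 * totals$BK / (totals$IPouts / 3)
print(totals)
```

With real data there would be one row per pitcher stint, and the `aggregate` step would do the work of collapsing them to league-season totals.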

Just as the article suggested, the spike in balks in 1963 was only in the National League. But that apparently wasn’t the only time that the league did something like this–something similar happened in 1950, this time in both leagues, but then too it only lasted for one season. And then there’s the 1980’s, when there was both a general rise in balks over the decade and an enormous spike in 1988.

The latter year, it turns out, was the “year of the balk”. The spike in balks resulted from an actual rule change, in which the balk rule was modified to say that after the stretch, a pitcher had to come to “a single complete and discernible stop, with both feet on the ground.” (There’s even a Twins connection here, as the rule change was allegedly prompted by Bert Blyleven’s near-balks in the ’87 World Series.) This change caused balks to skyrocket, to the point that the previous single-season record for balks was broken by seven different pitchers, and the AL record for team balks was broken by every team but one.

Just as before, the status quo was restored the next season and balks came back down. But as I noted above, the status quo in the ’80’s still involved more balks than in any other era. So this made me wonder: how might this have affected play?

Baseball in the ’80’s was distinguished by less power hitting and more base-stealing than you see in today’s game. I found that interesting in light of the rise in balks: if there’s one aspect of play that you’d expect to be directly affected by balk enforcement, it would be base-stealing.  All things being equal, more balk calls ought to lead to more stolen base attempts, since pitchers will have more trouble effectively holding runners on without getting called for a balk.

Let’s look at the balk chart again, followed by a chart of stolen bases. I’ve defined base-stealing using this formula:

SBrate = ( SB ) / ( (H - 2B - 3B - HR) + BB + HBP)

This follows the method used by Bill James in his speed scores. It counts stolen bases relative to the number of times that a runner ended up on first base.

[Note that I’m not including caught-stealings–that’s because the data isn’t available consistently before 1950. But when I tested this analysis with caught-stealing included, it didn’t really change the result. Even beyond this issue, this measure isn’t perfect, since steal attempts will usually only happen when second base is empty, which in turn is more likely if the league has a lower on-base percentage overall. It also doesn’t really deal correctly with steals of third (or home). But these are minor issues, and I’m confident the patterns I show below are legitimate.]
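
Here is a minimal R sketch of the stolen-base rate above. The league totals are invented and the column names are only illustrative (`X2B` and `X3B` stand for doubles and triples, since R names can't start with a digit).

```r
# Invented league-season batting totals; column names are illustrative.
batting <- data.frame(yearID = c(1975, 1985),
                      SB = c(1100, 1600), H = c(17000, 17500),
                      X2B = c(2700, 3000), X3B = c(450, 470), HR = c(1200, 1400),
                      BB = c(6500, 6400), HBP = c(350, 330))

# Times a runner ended up on first: singles + walks + hit by pitch.
on_first <- with(batting, (H - X2B - X3B - HR) + BB + HBP)

# Stolen bases per time on first.
batting$SBrate <- batting$SB / on_first
print(batting[, c("yearID", "SBrate")])
```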

So here are the graphs:

As expected, the rise in stolen bases parallels the rise in balks. But the really interesting part here is the comparison between the AL and the NL. During the 1980’s–and only then–the two leagues showed very different levels of both stolen base attempts and balks called, with the NL having more balks and more stolen bases. This suggests that there really was some connection between the two, and it’s not just a coincidence.

So what explains this association between balks and stolen bases? In general, there are three possibilities:

  1. More balks caused more stolen base attempts, because pitchers had a harder time controlling baserunners.
  2. More stolen bases led to more balks, as pitchers spent more time throwing over to hold the runner and therefore had more opportunities to balk.
  3. Some other factor caused the change in both stolen bases and balks.

Possibility (1) seems like the most likely to me, but it’s worth considering the others. Option (2) is plausible in theory but it seems less likely to me–primarily because in other high-stolen base eras, we don’t see a comparable rise in balks. In addition, I can’t really think of a reason stolen bases would diverge so much between leagues otherwise, whereas there’s a good reason the number of balks would be different: the leagues had different officials and separate presidents at the time, so it’s possible the NL just decided to crack down on balks a little harder. And as for option (3), it’s certainly possible that there’s some other causal factor here, but I can’t think what it might be.

Note also that while the long-run increase in balk calls in the ’80’s is associated with more stolen bases, stolen bases don’t seem to change in the seasons after the big one-year spikes in 1950, 1963, and 1988. This makes sense, too–baserunners aren’t likely to change their behavior immediately; rather, they’ll likely adapt once they realize that the officiating has permanently changed.

None of this is definitive proof–it’s more of an initial exploration. And anyway, the increase in stolen bases in the 1970’s and ’80’s obviously wasn’t all because of the balk rule. But it does look like balk calls were a part of the story. It’s something I’ve never seen discussed, although no doubt there’s a study out there about this that I don’t know about. Regardless, I think this stuff is really interesting to think about. After all, the whole reason it’s possible to do statistical studies of baseball like this one is that the game has been played with mostly the same rules for over 100 years. Because of that, we sometimes overlook the ways in which the game hasn’t stayed the same, and how that might affect what happens on the field.

Posted by: Peter | August 6, 2010

Did that umpire really screw Jesse Crain?

Yesterday’s game was certainly the strangest Twins game I’ve seen in a while. Kubel’s game-winning catwalk pop-up would have been odd enough, especially coming right after a 6-run Rays comeback. But on top of all that, there was Jesse Crain’s face-off with Willy Aybar, which ended in a bases loaded walk. Over at Twinkie Town, they’re up in arms about that at-bat, because it appears that home plate umpire Chris Guccione was totally unwilling to call a strike, even on balls that appeared to be in the middle of the strike zone:

That sure looks bad, but when I saw it I was a little suspicious. To see whether this was really as egregious as it looks, we need to put these pitches in context. Yes, it’s true that two of those pitches were called balls even though they were pretty clearly within the rulebook-defined strike zone. But as John Walsh showed in a seminal analysis, umpires don’t call the rulebook strike zone. Not only is the “empirical” strike zone different from the rulebook one, it’s also different for right- and left-handed hitters! So just observing that the umpire wasn’t calling the rulebook zone isn’t that remarkable. What we really want to know is whether he was calling a consistent zone. So those pitches by Crain really need to be put in the context of all the pitches that were called in the game, split up by right- and left-handed hitters:

You can see those two questionable balls from Crain on the right-hand side. Only now they don’t look quite as questionable. Relative to the zone Guccione had been calling throughout the game, they were really borderline pitches, because the up-and-in pitch to left-handers hadn’t been called a strike all day.

As a Twins fan, I’d love to be able to tell a story about an incompetent ump screwing the Twins over, and the baseball gods setting things right through the power of the catwalk. And before I did this analysis, that’s basically what I thought had gone down. Unfortunately, the evidence just doesn’t back that up. It just goes to show how easy it is to misinterpret evidence when you have a strong motivation to fit it into your pre-conceived worldview.

Posted by: Peter | April 19, 2010

Kansas City Slumming

Another series, another series victory for the Twins, despite the ugliness of a bad Pavano start followed by a Crainwreck in yesterday’s game. But loath though I am to complain about winning, sometimes watching the Royals just makes me feel bad for the team and their fans. The first game of the Twins-Royals series featured Zack Greinke, one of the best pitchers in baseball, which should have favored the Royals even with the completely respectable Scott Baker taking the mound for the Twins. But the two pitchers were probably on an even footing, given the lineups they were facing:

               Twins                         Royals
               Name  wOBA WAR                Name  wOBA  WAR
          Joe Mauer 0.401 7.3        Billy Butler 0.372  2.3
     Justin Morneau 0.366 2.7    Alberto Callaspo 0.336  1.7
        Jason Kubel 0.353 1.2       David DeJesus 0.336  2.7
    Michael Cuddyer 0.351 1.6     Scott Podsednik 0.315  1.0
        Denard Span 0.347 3.3        Jose Guillen 0.312 -0.3
       Delmon Young 0.346 1.0         Rick Ankiel 0.310  1.0
     Orlando Hudson 0.320 1.6   Willie Bloomquist 0.305  0.6
     Brendan Harris 0.312 0.4 Yuniesky Betancourt 0.301  0.3
         J.J. Hardy 0.312 2.3       Jason Kendall 0.284  0.7

These are the CHONE projections for weighted On Base Average and Wins Above Replacement for these players. Look at the wOBA numbers, which just take account of hitting ability and not defense or playing time.  A third of the Royals lineup was worse than anyone the Twins sent out.  And their second-best hitter would have been a distant seventh in the Twins lineup. Basically, it was Billy Butler and a bunch of scrubs. And this probably underestimates the talent disparity, because I think J.J. Hardy is going to have a better year than that.

So combine that lineup with an off night from Greinke, and it’s not surprising that the Twins had an AVG/OBP/SLG line of .344/.455/.406 on the night. That means that the team’s hitters were, on average, the equivalent of . . . well, of no-one, because baseball players who have to face real opponents don’t put up that kind of on-base percentage with that little power. In the series overall, the Twins hit .289/.415/.412.

Unfortunately, yesterday’s game saw the Twins waste a lot of scoring opportunities. It already seems like a million times that they’ve ended an inning with the bases loaded. As frustrating as that is, though, it’s encouraging in a way. I don’t really think the team’s current poor hitting with runners in scoring position is anything more than bad luck, just as it was luck when they hit unusually well with RISP a couple of years ago. So if anything, we can expect this offense to be even better in the weeks ahead.

To quantify this, I went over to FanGraphs and picked up some team hitting data. My plan was to compare the number of runs the teams are predicted to score, based on the hitting statistics of individual players, and the number of runs actually scored. However, picking the right statistic for this job turned out to be a bit tricky. While FanGraphs has some nice statistics–such as wRC and wRAA–that model run creation for individual players, these don’t seem to work that well as predictors of team performance: when I looked at previous seasons, both of them seemed to overpredict for good-hitting teams and under-predict for bad ones. So I ultimately ended up using a version of BaseRuns, an old statistic invented by David Smyth (I used the second of the three formulas given at the link). Even this required a tweak: in order to make it work as an estimator, I had to normalize it so that the total number of league runs scored predicted by BaseRuns equaled the actual total scored.
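
The normalization step can be sketched in R like this. Two caveats: the team totals are invented, and the formula shown is the basic published form of BaseRuns as I understand it, which may not be the variant used for the chart in this post.

```r
# Invented team batting totals for a three-team "league".
teams <- data.frame(team = c("A", "B", "C"),
                    AB = c(700, 690, 710), H = c(190, 170, 180),
                    TB = c(310, 260, 280), HR = c(25, 15, 20),
                    BB = c(80, 60, 70),   R  = c(110, 75, 90))

# Basic BaseRuns: baserunners (A), advancement (B), outs (C), home runs (D).
# This is an assumed variant, not necessarily the one used in the post.
baseruns <- function(d) {
  A <- d$H + d$BB - d$HR
  B <- (1.4 * d$TB - 0.6 * d$H - 3 * d$HR + 0.1 * d$BB) * 1.02
  C <- d$AB - d$H
  D <- d$HR
  A * B / (B + C) + D
}

raw <- baseruns(teams)

# Rescale so that predicted league runs equal actual league runs.
teams$predicted <- raw * sum(teams$R) / sum(raw)

# Positive = scoring more than the hitting stats would suggest.
teams$luck <- teams$R - teams$predicted
print(teams[, c("team", "R", "predicted", "luck")])
```

Because of the rescaling, the over- and under-achievement figures always sum to zero across the league, so each team is being judged relative to the others.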

When I did all that, this was the result:

Predicted and actual MLB runs, as of April 19th 2010

The line running through the middle is where a team would fall if it was scoring exactly as many runs as you’d expect. So the great performance of the Phillies is a little bit lucky, but not much–their hitting really has just been fantastic. The Twins, meanwhile, are a little below the line, meaning that they’re underachieving relative to the number of runs you’d expect them to score. The discrepancy isn’t all that great, though. They’ve scored 69 runs against a prediction of about 71, and that -2 margin places them in the middle of the pack relative to the rest of baseball. The Red Sox are the biggest underachievers at around -8 runs, while the Braves are the biggest overachievers at +8.

Still, this indicates that the Twins’ current (excellent) pace of run scoring is sustainable, and could even increase. Of course, it’s only sustainable if the hitting performances themselves are sustainable. But nothing I’ve seen suggests that the Twins hitters are on an especially hot streak–some have been great, but others like Denard Span and Jason Kubel have underperformed a bit. So I’m hopeful that there are many more offensive explosions to come.

Posted by: Peter | April 10, 2010

Evaluating Liriano

The first few games of the season have been fun to watch, but the one I was really waiting for was last night’s start by Francisco Liriano. This was our first chance since last season to see him pitch in a meaningful game in front of the pitchf/x cameras, which means it’s our first chance to start evaluating all the winter’s hype. Is Liriano really back to something approaching his former greatness? Or are we in for another season of suckitude?

On the surface, the performance was decent but not great, with too many walks and too few strikeouts. But that could be a matter of luck or early-season jitters, and doesn’t definitively tell us what to expect from this point on. Digging a bit deeper into the numbers, I had three basic questions: how fast is Liriano’s fastball, how much break is there on his slider, and how well is he locating his pitches?

To get a handle on the first two, we can compare the speed and break of Liriano’s pitches this year to the last two years:

The vertical axis here is pitch velocity, in miles per hour. The horizontal axis is “spin”, which is a measure that uses some basic physics to approximate the amount of spin that the pitcher is putting on the ball. For more on precisely how it’s calculated, see here. Liriano’s three pitches form clear clusters: the fastballs on top, the changeups directly under the fastballs, and the sliders off to the right.

From this plot, it appears that last night, Liriano had a faster fastball, and more spin on his slider, compared to the past couple of years. This is definitely a good sign. To put the fastball in perspective, here’s what the average and standard deviation of Liriano’s fastball speed has been in each game going back to 2008:

On this graph, it looks like the improvement in velocity started at the end of last season. But those late-season appearances all came in relief, and most pitchers throw a bit harder coming out of the bullpen. If Liriano can keep this elevated velocity as a starter, it bodes well for him–as Mike Fast recently showed, increased fastball velocity generally does translate into fewer runs allowed.

Now let’s look at slider spin by game:

This is another good sign. Except for one game that was probably a fluke, this is the most bite Liriano has had on his slider in a long time.

Of course, a blazing fastball and a wicked slider aren’t that useful if you don’t know where they’re going. And clearly the biggest problem with Friday’s start was that Liriano threw too many balls. But ball-strike ratio is a somewhat crude measure of a pitcher’s ability to locate pitches. Sometimes, you want to throw a ball, and sometimes a strike right down the middle will lead to worse results than a pitch out of the zone.

Ideally, we’d have information about where the catcher sets his target, so that we could evaluate a pitcher’s results in relation to that.  We don’t have that information, unfortunately, so I thought I’d try another way of looking at location. The basic idea is to look at the percentage of a pitcher’s throws that result in “quality strikes”: balls that are around the edge of the strike zone, neither way outside nor right down the middle.

To determine what counts as a quality strike, I’ll use the strike zone model I described in this post. Basically, I assign each pitch a value between 0 and 1, indicating how likely that pitch is to be called a strike by the umpire. I call a pitch a “quality strike” if it has a more than 10% and less than 90% chance of being called a strike. This sounds like it would include a large area, but actually it’s just the area around the edges of the strike zone–the pitches in the middle of the zone have strike probabilities above 90%, while most pitches out of the zone are close to 0%.
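
In code, the classification is just a pair of cutoffs applied to the model’s strike probabilities. A minimal R sketch, using ten invented probabilities in place of real model output:

```r
# Hypothetical called-strike probabilities for ten pitches (values invented).
p_strike <- c(0.02, 0.95, 0.50, 0.88, 0.10, 0.91, 0.30, 0.00, 0.75, 0.99)

# A quality strike has more than a 10% and less than a 90% chance of a strike call.
quality <- p_strike > 0.10 & p_strike < 0.90

# Share of pitches that count as quality strikes, as a percentage.
pct_quality <- 100 * mean(quality)
print(pct_quality)
```

Pitches sitting exactly on a cutoff (like the 0.10 here) are excluded, since the definition uses strict inequalities.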

Anyway, here is the percentage of quality strikes for Francisco Liriano–and, as a comparison, for Scott Baker.

player                  2008        2009       2010
Scott Baker             36.5        36.3       29.3
Francisco Liriano       29.4        32.6       38.0

In 2008 and 2009, Baker threw more quality strikes–which makes sense, since he was a better pitcher. But in 2010 (which includes only the first start for each pitcher), Liriano looks better than Baker (who struggled in his season debut) and better than he has in previous years.

This doesn’t mean Liriano is going to turn into an ace this year, and it’s clear that he really does have some control issues yet to work out. But all in all, I’m happy with his first start, and I think it was more encouraging than the numbers in the box score would suggest.

Posted by: Peter | March 31, 2010

Sabermetric Classics

Tom Tango had a nice little post about an old Bill James stat called “Secondary Average”, which linked to a discussion here. The formula is just:

As Tango puts it, this is “everything that batting average isn’t”. Batting average counts singles and extra base hits the same and ignores other means of gaining bases (like walks and stolen bases). Secondary average includes all these “missing bases”.

It turns out that across the league, batting average and secondary average have roughly the same average value, but secondary average is more consistent, varies more between players, and isn’t closely associated with batting average. This makes it a neat stat, although it’s of mostly historical interest today–we have fancier and more accurate ways of measuring production. Still, I thought it would be fun to compare the batting average and secondary average of this year’s primary Twins hitters. I used the above formula, except that I also included hit batters.
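
For reference, the commonly cited definition of secondary average is (TB - H + BB + SB) / AB, though some versions also subtract caught stealing, and I can’t be sure which one the linked discussion used. A small R sketch with hit batters added in, using two invented hitters:

```r
# Invented season lines for two hitter types.
hitters <- data.frame(name = c("slugger", "singles hitter"),
                      AB = c(500, 600), H = c(130, 190), TB = c(270, 230),
                      BB = c(90, 30), HBP = c(5, 2), SB = c(2, 15))

hitters$AVG <- hitters$H / hitters$AB

# Extra bases on hits + walks + hit batters + steals, per at-bat.
# (Assumed form of the stat; caught stealing is left out.)
hitters$SecA <- with(hitters, (TB - H + BB + HBP + SB) / AB)
print(hitters[, c("name", "AVG", "SecA")])
```

The made-up slugger ends up with a secondary average well above his batting average, while the singles hitter lands well below his, which is the split the stat is meant to surface.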

First, here’s how they did for 2009:

The line indicates where a hitter would fall if his batting average and secondary average were equal. You can clearly see how Twins hitters are grouped into different types of hitters. There are the power hitters, who contribute more bases than their batting average would indicate. There are the “empty batting average” guys who contribute little beyond singles. There are guys like Nick Punto and J.J. Hardy, who just sucked. And there’s Joe Mauer, who’s just all-around great at everything.

Looking at these players’ career numbers shows mostly the same patterns:

Everyone is closer together, as variation is evened out in the larger sample size. Everyone except Jim Thome, that is, who shows just how awesome he is at doing the things that aren’t hitting singles. And with the start of the season less than a week away, I can’t wait to see him start doing those things in a Twins uniform.

Posted by: Peter | March 3, 2010

Expectations and uncertainty

I share the general consensus that the Twins had a very good winter. They basically did everything I could have asked for, short of picking up a third baseman. I’m excited about J.J. Hardy, Orlando Hudson is a solid addition, and Jim Thome is just the frosting on top.

It makes me a little nervous that, to a lot of people, the Twins are favorites in the AL Central now. After surprising everyone with last season’s ridiculous comeback, I’d hate to see them let everyone down this year, especially since it would bring on a cavalcade of idiotic stories about how they can’t stand playing outside or how they’re lost without the advantage of the Metrodome roof.

This got me thinking about the performances I’m expecting from various players this year, and what they mean for the team’s chances of either overachieving or disappointing. Now, it’s easy to find lots of projections for how different players will perform, and by combining those you can get an idea of the most likely win total for the Twins this year. What you don’t generally see, though, is much discussion of the uncertainty around these various projections. (PECOTA does some of this with their “Breakout rate” and “Collapse rate”, but you have to pay to look at those numbers.) Particularly on this year’s team, the level of uncertainty surrounding different players varies wildly. With some guys, we more or less know what to expect, for better or for worse. With other guys, almost anything could happen.

So with that in mind, I decided to do a quick-and-dirty little study of the probable starting lineup and pitching rotation. I wanted to know two things:

  1. Which players have more uncertainty surrounding their performance this year?
  2. Is the uncertainty on the upside or the downside? That is, if things don’t go according to the projections, is it more likely to be because the player had a breakout year, or because they fell apart?

Naturally, I had some prior beliefs about the answers to these questions, but it seemed worth looking at some numbers. So–and again, this is not all that rigorous, but I still think it’s helpful–here’s what I did.

  1. I chose to measure hitting performance using wOBA, and pitching performance using FIP. These are both statistics that give a pretty good indication of a player’s performance, independent of luck and the performances of other players. They aren’t ballpark-adjusted, which is fine because I don’t want to deal with the uncertainty around how the new park will play.
  2. For each guy, I went to FanGraphs and got five numbers: the actual wOBA/FIP for 2009, and the predicted 2010 value from four different projection systems: Marcel, Bill James, CHONE, and the FanGraphs community projection.
  3. To get a sense of prediction uncertainty, I compared these four systems. The more varied they are, the more uncertain the projection. Because Marcel is just a “dumb” projection based on previous season performances, age, and regression to the mean, I treated it as my baseline, and then looked at how the other projections compare to it. If the “smart” systems generally project better numbers than Marcel, I took that as evidence of uncertainty on the upside. If they project worse numbers, I took that as evidence of uncertainty on the downside.
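
The comparison in step 3 can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the charts. The `projection_uncertainty` helper is a hypothetical name, using the range as the measure of spread is my assumption, and the numbers are made-up placeholders rather than real projections for any player.

```python
from statistics import mean


def projection_uncertainty(projections, baseline="Marcel", higher_is_better=True):
    """Return (spread, direction) for one player's projections.

    projections: dict mapping system name -> projected value.
    spread: max minus min across all systems (one simple measure of disagreement).
    direction: "upside" if the non-baseline systems are, on average, more
    optimistic than the baseline, "downside" if more pessimistic.
    Set higher_is_better=False for stats like FIP, where lower is better.
    """
    values = list(projections.values())
    spread = max(values) - min(values)

    others = [v for name, v in projections.items() if name != baseline]
    diff = mean(others) - projections[baseline]
    if not higher_is_better:
        diff = -diff  # a lower FIP than the baseline counts as optimism

    if diff > 0:
        direction = "upside"
    elif diff < 0:
        direction = "downside"
    else:
        direction = "neutral"
    return spread, direction


# Placeholder wOBA projections (in points) for an unnamed hitter.
hitter = {"Marcel": 340, "Bill James": 352, "CHONE": 346, "Fans": 355}
print(projection_uncertainty(hitter))

# Placeholder FIP projections for an unnamed pitcher; lower is better.
pitcher = {"Marcel": 4.5, "Bill James": 3.75, "CHONE": 4.0, "Fans": 4.25}
print(projection_uncertainty(pitcher, higher_is_better=False))
```

With only four systems per player, the range is about as much as the data can support; a standard deviation would work too, but it wouldn't tell a very different story.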

Now, to the numbers. First, the hitters:

Predicted wOBA, Twins hitters

wOBA is scaled to correspond to on-base percentage, so basically 300 is really bad, and 400 is really good. But it’s not the levels we’re interested in, it’s the uncertainty.

The projections are the most spread out for Hudson, Hardy, and Young. That’s not hugely surprising, since they all have major question marks attached to them. How much will Hudson decline due to age? Will Hardy bounce back from his awful year? Will Delmon ever figure it out?

On the flip side, the projections are more clustered for Morneau, Cuddyer, and Punto. These guys are known quantities–small quantities, in LNP’s case. It should be noted, though, that everyone is predicting that Cuddyer will backslide from his unusually strong power numbers–nobody expects him to slug over .500 again.

Now, what about upside vs. downside uncertainty?

The guys who have a chance to put up surprisingly good numbers are:

  • Mauer. The issue is basically whether he can come close to repeating his MVP numbers, or whether he is, in fact, just a mortal human being.
  • Morneau. The upside scenario here is basically just that Justin stays healthy and his power doesn’t disappear after the all-star game. We still don’t have enough data to know whether his second half disappearing act is due to something real, or whether it’s just a coincidence.
  • Cuddyer. Basically, he would have to show that last year wasn’t a total fluke.
  • Delmon Young. This, of course, would be the scenario where Delmon finally, after all these years, shows evidence of becoming a useful player. Maybe this much-discussed weight loss of his will be the ticket, after all.

Now, as for the possible disappointments:

  • Span. This is basically just a question of what his true talent level is. Marcel thinks that overall, he’ll basically be as good as last year. The most pessimistic projection, from Bill James, has him losing both some on-base percentage and some power.
  • Hudson. When you’re talking about a middle infielder in his thirties, there’s always a risk he’ll fall off a cliff. Some age-related decline is expected, it’s just a question of how much.
  • J.J. Hardy. If you don’t believe the story that Hardy’s brutal season was mostly down to bad luck, then you might expect him to disappoint.

That covers the hitters–although note that I didn’t include Jim Thome, who could be a factor if he ends up getting significant playing time. Now on to the more interesting and unpredictable group, the starting pitchers:

Twins starting rotation, predicted 2010 FIP

Since a low FIP is better (as with ERA), being to the left is good on this graph. And interestingly, the “smart” projections like all the Twins starters better than Marcel does, with the exception of Brian Duensing (who is unlikely to make the rotation out of spring training). So there’s a possibility that the whole rotation could exceed expectations.

Unsurprisingly, the biggest wild card is Francisco Liriano. Marcel projects him to be worse than any of the other five options, and nearly as bad as he was last year. Bill James, meanwhile, projects him to be the ace of the staff. Which of these is closer to the mark is probably the biggest single factor impacting the Twins’ fate this year.

To summarize, let’s consider two possible outcomes for 2010: an overachieving Twins team that wins 95 games and makes a run at the World Series, and a disappointing year where they lose the division to the Tigers or White Sox.

The overachieving scenario looks something like this:

  • Joe Mauer comes close to repeating his MVP campaign, and shows that his newfound power was for real.
  • Justin Morneau stays strong all year, making a bid for his second MVP.
  • Delmon Young takes a major step forward.
  • Francisco Liriano finds his confidence and his slider, leading a starting rotation that has improved across the board.

The underachievers, meanwhile, look like this:

  • J.J. Hardy turns out to have lost it, and puts up putrid numbers again.
  • Orlando Hudson turns into a pumpkin.
  • Liriano frustrates us again, prompting most to give up on him.

Now, there are a couple of loose ends I haven’t mentioned. The first is injuries–I’ve ignored them here, but obviously if a key player gets hurt, it could change the whole season. The other question mark is a player who I haven’t mentioned, but whose future is also rather uncertain: Joe Nathan.

Last year, a lot of people, including me, started to wonder if Nathan was starting to show his age. He had some close calls and blown saves, and gave up home runs at the highest rate of his Twins career. It made me start to think that maybe the Twins had made a mistake by not trading Nathan at the peak of his value, since closers tend to be over-valued anyway.

Looking at the projections doesn’t really clear things up. Marcel expects Nathan to post his worst FIP since becoming the Twins’ closer, based on extrapolating his declining strikeouts, rising walks, and rising home run rate. The “smart” systems are much more favorable, however–CHONE and the fans expect more or less a repeat of last year, while Bill James gives Nathan his best FIP since 2006. Compared to Liriano and some of the hitters, I don’t think this is quite as big an issue for the team, but it will certainly be an aggravating year if Twitchy suddenly starts to look like a question mark in the ninth inning.

We’ll just have to wait and see–in just another month, we’ll be done with meaningless spring training games and we can start to see which of these storylines will play out.
