Posted by: Peter | January 18, 2011

Rating the Raters: Bill James 1994

Over at FanGraphs, Carson Cistulli has a nice pair of posts where he looks back at some Bill James player forecasts from 1993 and 1994, in an attempt to see how the predictions of player value held up. Tom Tango adds further insight by running a regression of career WAR on the “dollar value” James assigned to players. Tango ends up using a polynomial fit of the data, resulting in the following equation:

This is good stuff (although I’m not sure why Tango decided to fix the intercept at 0 in that regression). However, I felt like both posts could have really benefited from some kind of graphical display of the data, especially because I think that looking at a graph changes one’s interpretation of the results somewhat. Fortunately, Carson was nice enough to supply his data. So here are the players from 1994:

[Figure: Bill James player ratings 1994, and career WAR]

The color codes reflect James’s letter grade, or “Z” for ungraded players. The line is a locally-weighted regression (“LOESS”) line that shows the general trend.
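For anyone curious what a LOESS line is actually doing, here is a minimal sketch in plain Python. It is not the code behind the plot, just one common variant of the idea: at each point, fit a straight line to the nearby points, weighting closer neighbors more heavily with a tricube kernel. The function name `loess`, the `frac` parameter, and every number in the example data are assumptions made up for illustration; they are not Carson's data.

```python
def loess(xs, ys, frac=0.5):
    """Locally weighted linear fit, evaluated at each x in xs.

    frac is the share of points used in each local fit (an assumed
    default; real LOESS implementations expose a similar span setting).
    """
    n = len(xs)
    k = max(2, min(n, round(frac * n)))  # neighbors per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest point (x0 itself included).
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12  # guard against a zero bandwidth
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        # Weighted least-squares slope; flat if all weight sits on one x.
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Placeholder (dollar value, career WAR) pairs, invented for illustration.
dollars = [1, 3, 5, 8, 10, 15, 15, 20, 25, 30, 40, 55]
war = [0.5, 4.0, 1.0, 12.0, 3.0, 30.0, 0.0, 8.0, 15.0, 22.0, 35.0, 60.0]

for d, f in zip(dollars, loess(dollars, war, frac=0.5)):
    print(f"${d:>2}: smoothed WAR {f:.1f}")
```

A smaller `frac` makes the line follow local wiggles more closely, while a larger one pulls it toward an ordinary straight-line fit.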

What does this picture say? Well, the way I read it, it tells us that for the top prospects (those with A or B ratings), James’s dollar values really tell us something. For players who got valuations over $20 or so, there is a clear positive relationship between the dollar ratings and eventual career WAR. What’s more, there is a clear floor under these players: beyond the $25-30 threshold, none of them ended up being total busts.

On the other hand, it doesn’t look like the dollar values below $20 really contain any information. Among all those C- and D-grade prospects, James’s ratings don’t help us distinguish between surprise stars like Jim Edmonds (valued at $15, 68.1 career WAR) and total washouts like Stanton Cameron (also $15, 0 WAR). If I run a linear regression using only the players valued over $20, I get:

I think this is probably the more useful relationship, since it throws out all the fringe prospects who aren’t really adding any information.
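The restricted regression is easy to reproduce once you have the data. Below is a rough sketch in plain Python of the general approach: drop everyone at or below a dollar threshold, then fit an ordinary least-squares line (with a free intercept) to what remains. The helper names `ols` and `fit_above` are my own, and apart from the Edmonds and Cameron rows quoted above, every row in the example list is a made-up placeholder, so the printed coefficients mean nothing about the real 1994 data.

```python
def ols(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_above(players, threshold=20):
    """Fit career WAR on dollar value using only players above threshold.

    players is a list of (dollar_value, career_war) pairs.
    Returns (intercept, slope, number_of_players_used).
    """
    kept = [(d, w) for d, w in players if d > threshold]
    intercept, slope = ols([d for d, _ in kept], [w for _, w in kept])
    return intercept, slope, len(kept)


players = [
    (15, 68.1),  # Jim Edmonds, as quoted in the post
    (15, 0.0),   # Stanton Cameron, as quoted in the post
    # Everything below is an invented placeholder, not real data.
    (5, 1.0), (10, 3.0), (25, 14.0), (30, 22.0), (40, 31.0), (55, 58.0),
]

intercept, slope, used = fit_above(players, threshold=20)
print(f"players used: {used}")
print(f"career WAR ~ {intercept:.2f} + {slope:.2f} * dollars")
```

Moving `threshold` up or down is a quick way to check how sensitive the fitted line is to where the cutoff sits.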

I can only add my ringing endorsement to Tango’s call for forecasters to release more of this kind of information. I’d love to know whether this pattern is a fluke of this data, or of Bill James as a forecaster, or whether it generalizes. If forecasts are useful for top prospects but not marginal ones, that would be very important to know.

