Wednesday, October 28, 2009

Predicting Runs and OBP

OK, so we've talked about Batting Average, now it's time to move on to OBP and Runs. OBP measures a hitter's ability to get on base, while runs counts the number of times he's crossed the plate. OBP is a product of a batter's batting average, combined with his ability to take walks. Runs are a product of a runner's ability to get on base and run the bases, plus some help from his teammates.

First, how do we project On Base Percentage? Just like batting average, OBP fluctuates a lot from year to year, based on a hitter's luck with balls in play (BABIP). Since we've already determined batting average, the main thing to look at now is a player's walk percentage. Unlike batting average, this is more about the player's skills, and thus it's easier to predict. It's a stat that players tend to improve on as they develop, so it's reasonable to expect a player to repeat, or even improve upon, last year's walk% (and thus improve his OBP). If a player's walk% has remained relatively constant over his career, I'll use that walk%. If he has shown improvement in recent years, I'll lean towards those numbers. Rarely does a player's walk% actually decline (though it does happen).

Unfortunately, OBP also has a player's hit-by-pitches and sacrifice flies built into it. This makes it impossible to simply project a player's OBP from batting average and BB% alone (those extras vary a lot from player to player, and even from year to year). So the way I project it is to pick the year in the player's career that best represents the walk% I predict for him (if one exists), and then add to or subtract from that year's OBP based on the batting average I projected for him. So if his batting average is 20 points better in my projection than it was that year, I'll add 20 points to that year's OBP, and use that as my projection.

For younger players this is more difficult, and I often find myself just taking an existing OBP (career, or even a single year) and tweaking it upwards. For young players I'll also look at their minor league numbers as a point of reference, since they tend to move close to (and sometimes exceed) their minor league numbers as time goes on.
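The add-or-subtract step is just arithmetic, and a few lines of Python make it concrete. This is only a rough sketch: it assumes the projected batting average is compared against the average from the same reference season, and the function name, parameter names, and the sample player are made up for illustration.

```python
def project_obp(reference_obp, reference_avg, projected_avg):
    """Shift a reference season's OBP by the projected change in batting average."""
    # Points of batting average gained (or lost) relative to the reference season.
    avg_delta = projected_avg - reference_avg
    # Carry that difference straight over to OBP, rounded to three decimals.
    return round(reference_obp + avg_delta, 3)


# Hypothetical player: the reference season was .270 AVG / .350 OBP,
# and the batting average projection is .290 (20 points better).
print(project_obp(0.350, 0.270, 0.290))
```

Here the 20 extra points of average move the reference .350 OBP up to .370. Choosing the reference season (the one whose walk% matches the projection) is still a judgment call that the code doesn't capture.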

Alright, so there's OBP. My methods aren't highly mathematical, but I think taking trends in BB% into account, and taking out the batting average fluctuations, makes for a fairly accurate OBP projection. Now it's on to predicting a player's runs, and this is where my research gets a little more interesting. Runs are based on a few things. Some of them (OBP, speed, plate appearances) are statistical in nature, while others (where a player hits in the lineup, and how well the hitters behind him are knocking him in) are out of the player's control and difficult to project. So what I've done is throw out what's out of the player's control, and figure out a way to predict a player's runs based on his skills alone. What you get is "skill runs", that is, the number of runs that a player's skills should allow him to score. In a better batting slot he will score more than his skill runs, while in a worse one he will score fewer. But batting slot is very difficult to predict, so for the sake of our projections, let's throw that out entirely.

So how did I do it? I took a large sample of data (3 years' worth of player data from FanGraphs), and I ran some statistical analysis of players' runs scored as compared to their stolen bases, OBP, and PA. That analysis gave me the following equation to predict a player's "skill runs":

Skill Runs = -90.241129 + (Plate Appearances x -90.241129) + (OBP x 200.8088179) + (SB x 0.293131537)
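For anyone who wants to plug in their own projections, here is a rough Python sketch of that equation. One caveat: the plate appearance coefficient as printed above is identical to the intercept, which looks like a copy error, so the sketch takes that coefficient as a parameter (`pa_coef`) rather than guessing at it. The function and constant names are made up for illustration.

```python
# Coefficients as printed in the equation above.
INTERCEPT = -90.241129
OBP_COEF = 200.8088179
SB_COEF = 0.293131537


def skill_runs(pa, obp, sb, pa_coef):
    """Skill runs from plate appearances, OBP, and stolen bases.

    pa_coef is the per-plate-appearance coefficient. It is left as a
    required argument because the printed value looks like a copy error.
    """
    return INTERCEPT + pa * pa_coef + obp * OBP_COEF + sb * SB_COEF


# The effect of OBP and steals doesn't depend on the PA term, so a
# placeholder pa_coef of 0.0 is enough to compare two stat lines.
obp_gain = skill_runs(700, 0.370, 0, pa_coef=0.0) - skill_runs(700, 0.350, 0, pa_coef=0.0)
sb_gain = skill_runs(700, 0.350, 10, pa_coef=0.0) - skill_runs(700, 0.350, 0, pa_coef=0.0)
print(round(obp_gain, 1), round(sb_gain, 1))
```

With the printed coefficients, 20 extra points of OBP are worth about 4 skill runs, and 10 extra steals are worth about 3.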

Using this equation and my projections, here's a sample of what I came up with for the projected leaders in runs scored in 2010, and I'm pretty happy with the results:

Player      Skill Runs
Pujols      115
Ellsbury    111
Reyes       109
Figgins     108
Abreu       107

Now obviously there is a good chance that team factors will push these guys, and others, up and down the list, but given skills alone, this is where I project them to be. Note: I currently have everyone set to 700 plate appearances, which also skews the results. More accurate PA projections will change this.

At the bottom end of the spectrum is Bengie Molina with 80 runs. Remember, that's with him projected for 700 plate appearances, which he's not going to reach, nor has he at any point in his career. Interestingly, with everyone held at the same 700 PA, there is not a huge difference in runs scored between the top and bottom players. This suggests that plate appearances are, by a large margin, the biggest factor in a player's ability to score runs (which makes sense).
