Back to the offense and pitching. My goal with these posts is to estimate how many runs we can expect the offense to score and how many runs we can expect the pitchers to give up. Then we can plug these numbers into something called the Pythagorean win percentage, which gives a rough estimate of how many wins to expect this year, since runs scored vs. runs allowed correlates pretty well with overall winning percentage.
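For anyone who wants to see the arithmetic, here is a minimal sketch of the Pythagorean win percentage. It assumes the classic exponent of 2; other exponents are sometimes used, and nothing in this post settles which one fits college baseball best. The function names and the example run totals are made up for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate winning percentage from runs scored and runs allowed.

    The exponent of 2 is the classic choice and is an assumption here.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Scale the estimated winning percentage to a number of games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Hypothetical totals, not a projection for any real team.
    print(round(pythagorean_win_pct(500, 400), 3))
    print(round(expected_wins(500, 400, games=60), 1))
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.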
Let me just emphasize again that I am by no means an expert, and because of several limitations in the data, these will end up as very rough, back-of-the-envelope calculations. In other words, don't go placing wagers based on any of this. Still, I think in a general sense this will be instructive on what to expect from the Tigers, and what to look for as the season progresses. Please let me know if you see me doing something stupid and/or obviously wrong...
Luckily for me, the Tigers aren't expected to start a bunch of freshmen all over the field like last year, and there is at least a partial season's worth of data at every position going back over the past few years, which will be used to project this year's performance. Here is the first limitation in the data: I'm going to assume that no freshmen will beat out the cast from last year for significant playing time, because it is next to impossible to project performance from high school stats (O.K., it's also hard with just a year's worth of college stats, but I'll get to that in a second). High school stats tend to vary tremendously from school district to school district, and even from year to year within the same district. So I'm just going to plug returning players from last year into the slots where they saw the most time, also taking into account the media coverage coming out of practice. So right away, if the coaching staff feels one of the new kids is a better bet to improve production in the field, that likely means that I'll be underrating the overall offensive output.
So how is an individual's contribution to runs scored at the plate measured? Tom Tango, Andrew Dolphin, and Mitchel Lichtman wrote about this in The Book: Playing the Percentages in Baseball, and came up with one unifying statistic that combines a player's ability to get on base with his slugging percentage. In other words, it captures how often he gets on base and how many bases he ends up taking once he does. (This also includes things like base-stealing ability, etc.) They call it the weighted on-base average, or wOBA, and it is scaled to fit regular on-base percentage, so you can think of an average wOBA as being around .350, and an excellent wOBA as being .400 or above.
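To make the idea concrete, here is a rough sketch of how a wOBA-style number is put together: each way of reaching base gets a run-value weight, and the weighted total is divided by plate appearances. The weights below are the ones commonly quoted from The Book and are an assumption here; they are not necessarily the weights used for the numbers in this post, and they leave out any base-stealing component. The function name and event keys are also just for illustration.

```python
# Assumed weights, as commonly quoted from The Book; treat them as
# illustrative rather than as the exact weights behind this post.
ASSUMED_WEIGHTS = {
    "nibb": 0.72,    # non-intentional walks
    "hbp": 0.75,     # hit by pitch
    "single": 0.90,
    "roe": 0.92,     # reached base on error
    "double": 1.24,
    "triple": 1.56,
    "hr": 1.95,
}


def woba(counts, plate_appearances, weights=ASSUMED_WEIGHTS):
    """Weighted on-base average: weighted events per plate appearance.

    `counts` maps event names to totals; missing events count as zero.
    """
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    total = sum(weight * counts.get(event, 0) for event, weight in weights.items())
    return total / plate_appearances


if __name__ == "__main__":
    # A made-up stat line, not a real player's.
    line = {"nibb": 25, "hbp": 5, "single": 45, "double": 15, "triple": 2, "hr": 10}
    print(round(woba(line, plate_appearances=260), 3))
```

The point of the weights is that a home run is worth roughly twice a single in run terms, which plain on-base percentage ignores.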
Now, in order to get a good prediction of a player's wOBA, ideally you would have several seasons' worth of prior data. Unfortunately, that just isn't possible in this case. For most Clemson players, we have one or maybe two years' worth of data. So another assumption I have made is that last year's performance (or, when available, the average of multiple past years) is a true gauge of a player's ability. This of course is nonsense: last year Kyle Parker may have had a flukishly brilliant year, or he may have been bad by his own standards. But it's an assumption we will have to live with. One thing to keep in mind, though, is that at this stage in a player's career, it's perfectly natural to expect improvement. To be conservative, however, I'm not going to try to work in any kind of adjustment.
So, without further ado, here is the projected wOBA for each Clemson player, alongside their expected position and several of the counting stats:
The next step is to adjust wOBA for park effects and for strength of schedule. This can be done with the help of Boyd Nation's stat page, which has kept track of how individual college parks tend to affect the run environment for several years now. In other words, in some parks it's easier to score runs than in others (because of shorter fences, higher altitudes, less foul territory, etc.), so this is a way to normalize so that every player is more or less playing in an equal environment. As it turns out, the Doug Kingsmore Stadium rating for 2005-2008 is 109, with 100 being an average park. That means we expect a few more runs to score over the course of a season relative to an average park, making it something of a "hitter's park". But we're going to use Clemson's total weighted park factor for the same time period, which accounts for the run environments of all the stadiums that Clemson plays in over the course of a season. This number comes out to 105, which means that Clemson tends to play on the road in parks that are friendlier to pitchers than Doug Kingsmore. Here are the numbers for player park-weighted wOBA:
Next, we adjust for strength of schedule (SOS). This is a bit trickier, because we need 2009 SOS for our predictions, and of course it's impossible to know exactly how the season is going to play out and how good our opponents will be. Luckily, though, Boyd Nation calculates the intended SOS for 2009, factoring in the performance of teams over the past several years to get a rough estimate. Clemson is currently projected at a 108.4 SOS, good for 44th in the country. Here, then, are the numbers for player park-weighted, SOS-adjusted wOBA:
Whew, so now we have an idea of each player's wOBA for 2009, but we need to convert this to runs. Basically, what we'll do is calibrate every player's contribution relative to an ACC league-average offensive run-producer. This can be done by comparing the projected wOBA to an expected ACC league-average wOBA and multiplying the difference by expected plate appearances. I calculated the 2008 ACC average wOBA by taking all players with over 100 plate appearances and running their numbers through the park effect and SOS adjustments described above, with the final league-average wOBA for 2008 coming out to .387. Then we can calculate the total projected runs above or below league average based on total runs scored by the 2008 ACC teams. Now a couple more limitations need to be mentioned. First, I'm comparing 2009 projections to 2008 numbers, and second, I'm calculating the league-average runs from 2008 alone. It would probably be better to project these numbers for 2009, or at least to use the past several seasons to come up with a better average. On the other hand, the best leagues (ACC, Pac-10, Big 12) are known to be pretty stable stats-wise, because of the high-quality influx of talent that comes in year after year, so I feel pretty comfortable doing this as an exercise in rough estimates.
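Here is a rough sketch of that conversion. It follows the usual definition of wRAA, which divides the wOBA difference by a "wOBA scale" before multiplying by plate appearances. The scale of 1.15 is an assumption taken from that usual definition, not a number from this post; the .387 league average is the 2008 ACC figure described above. The function names, player wOBAs, plate appearances, and league-average team runs in the example are all hypothetical.

```python
ACC_2008_LEAGUE_WOBA = 0.387   # adjusted 2008 ACC average from this post
ASSUMED_WOBA_SCALE = 1.15      # assumption: the commonly used wOBA scale


def wraa(player_woba, league_woba, plate_appearances,
         woba_scale=ASSUMED_WOBA_SCALE):
    """Runs above (positive) or below (negative) a league-average hitter."""
    return (player_woba - league_woba) / woba_scale * plate_appearances


def projected_team_runs(player_wraa_values, league_avg_team_runs):
    """League-average team runs plus the lineup's combined wRAA."""
    return league_avg_team_runs + sum(player_wraa_values)


if __name__ == "__main__":
    # Hypothetical (wOBA, expected plate appearances) pairs.
    lineup = [(0.410, 250), (0.375, 240), (0.350, 200)]
    values = [wraa(w, ACC_2008_LEAGUE_WOBA, pa) for w, pa in lineup]
    for (w, pa), v in zip(lineup, values):
        print(f"wOBA {w:.3f} over {pa} PA: {v:+.1f} runs")
    # 400 is a placeholder for the league-average team run total.
    print(round(projected_team_runs(values, league_avg_team_runs=400), 1))
```

A hitter sitting exactly at the league-average wOBA comes out at zero, so the team total is just the league-average run total nudged up or down by each player's difference.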
With the several mentioned limitations in mind, the final numbers come out as follows (with wRAA as the runs above/below average):
Startlingly low numbers, to be sure, but remember it's not just how many runs you score, it's also how many runs you give up. I'll look at that side of the equation next time when projecting the pitching staff.