Thursday, March 26, 2009

baseballcalculus.com

Check out the new website, www.baseballcalculus.com. I've got a first pass at projections for the coming season up right now, but these will be updated soon using current depth charts (the original projections use last year's rosters and playing time). Also, keep checking back for updated season and game-by-game projections for players and teams, and much more!

Tuesday, March 10, 2009

Coming Soon

OK, so after furiously throwing out some predictions last year, I was conspicuously absent for a long time. But it is that time again, so over the next couple of weeks I will be churning out a whole new, bigger and better set of predictions. But that's not all.

This time I will not only post the player and team predictions, but I am also planning to set up a site to update these as the season goes on. On the site, we will also predict every single game next year, and much more.

All of these results will be built using the models I created for my Ph.D. thesis. (You can check out some of the specs in the research section of my website: www.stanford.edu/~null).

Details to follow. Stay tuned...

Monday, March 31, 2008

2008 Projected Player Stats

Here are some 2008 projected stats.


http://www.stanford.edu/~null/bradnullblogfiles/projected2008stats.xls


This is very preliminary, but I thought it was cool. Anyway, the counting stats (hits, HRs, etc.) are based upon a projection of 600 plate appearances for everyone (including several players who are out of baseball). For switch hitters, there is a quick approximation of how many PA they get from each side, and the stats are broken down by side.
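As a rough illustration of the 600-PA scaling described above, here is a minimal Python sketch. The per-PA rates, the 70/30 split between batting sides, and the function names are all made up for the example; they are not taken from the spreadsheet or from the actual model.

```python
def project_counts(rates_per_pa, plate_appearances=600):
    """Turn per-plate-appearance rates into counting stats."""
    return {stat: rate * plate_appearances for stat, rate in rates_per_pa.items()}


def project_switch_hitter(rates_as_lefty, rates_as_righty, share_as_lefty,
                          plate_appearances=600):
    """Split the PA total by batting side, then project each side separately."""
    pa_lefty = plate_appearances * share_as_lefty
    pa_righty = plate_appearances - pa_lefty
    return {
        "L": project_counts(rates_as_lefty, pa_lefty),
        "R": project_counts(rates_as_righty, pa_righty),
    }


# Hypothetical hitter: the rates below are illustrative, not real projections.
print(project_counts({"H": 0.25, "HR": 0.05}))
print(project_switch_hitter({"HR": 0.05}, {"HR": 0.03}, share_as_lefty=0.70))
```

Holding everyone at 600 PA makes the counting stats comparable across players, at the cost of ignoring real differences in playing time.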

The e at the beginning of each header just means “estimated”. teOPS is the total estimated OPS for all hitters. For guys who aren’t switch hitters, this is the same as eOPS; for switch hitters, it is a weighted average of the eOPS from each side (even though that isn’t the most precise way to do it).
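The weighted average for switch hitters might look something like the sketch below. It assumes the weights are the projected plate appearances from each side, which the post implies but does not spell out; the function name and sample numbers are hypothetical. It is only an approximation because OPS is the sum of OBP and SLG, and those two have different denominators, so a PA-weighted average of OPS will not exactly match an OPS computed from the combined raw stats.

```python
def total_ops(ops_left, pa_left, ops_right, pa_right):
    """PA-weighted average of a switch hitter's OPS from each side of the plate."""
    total_pa = pa_left + pa_right
    if total_pa <= 0:
        raise ValueError("need at least one plate appearance")
    return (ops_left * pa_left + ops_right * pa_right) / total_pa


# Hypothetical switch hitter: 420 PA batting left-handed, 180 batting right-handed.
print(round(total_ops(0.850, 420, 0.760, 180), 3))
```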

This is my first pass, and I haven’t looked through it all, so any feedback is appreciated. If you see anything that looks funny, let me know, since that should help me review and improve the model, as well as catch any bugs.

Also, I find it interesting that the projected best hitter in 2008 (legal troubles notwithstanding) is currently out of baseball.

Tuesday, March 25, 2008

2008 MLB playoff probabilities and such

Following on my post yesterday, here is a link to some 2008 predictions as well as probabilities of all teams to win the division and to make the playoffs.

http://www.stanford.edu/~null/bradnullblogfiles/divwinprob.xls

In the attached spreadsheet, the teams are sorted by division in order of most likely finish based upon 250,000 simulations. In short, the model predicts Cleveland, New York, and Seattle to win their divisions with Boston as the Wild Card in the AL. Milwaukee, New York, and LA are predicted to win their divisions in the NL with Atlanta as the Wild Card. The Yankees are the most likely team to win their division with a 58% chance. KC is the least likely with a 0.01% chance.

Pred Wins represents the team's most likely number of wins assuming they finish in this position in the standings.

E Wins is the expected value for team wins.

%division, %wild card, and % out are the probabilities that the team wins the division, wins the wild card, and doesn’t make the playoffs, respectively.

WDiv, WWC, and Wout are the expected number of wins for the team conditioned on the team winning the division, winning the wild card, and not making the playoffs, respectively.
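To make these columns concrete, here is a toy Python simulation for a single division. It is not the model behind the spreadsheet: it assumes each team's win total is an independent normal draw around its projection (using the AL Central projections and the 8.5-win standard deviation from the March 24 post), and it ignores the wild card entirely. The function and variable names are hypothetical.

```python
import random

# Projected AL Central wins from the March 24 post.
PROJECTED = {"CLE": 87.93, "DET": 87.88, "CHA": 83.20, "MIN": 76.62, "KC": 62.23}
SD = 8.5  # assumed spread around each projection (independent normal noise)


def simulate_division(projected, sd, n_sims, seed=0):
    """Estimate %division, E Wins, and WDiv for every team in one division."""
    rng = random.Random(seed)
    titles = {team: 0 for team in projected}
    wins_sum = {team: 0.0 for team in projected}
    wins_sum_as_champ = {team: 0.0 for team in projected}
    for _ in range(n_sims):
        season = {team: rng.gauss(mu, sd) for team, mu in projected.items()}
        champ = max(season, key=season.get)
        titles[champ] += 1
        wins_sum_as_champ[champ] += season[champ]
        for team, wins in season.items():
            wins_sum[team] += wins
    return {
        team: {
            "pct_division": titles[team] / n_sims,
            "e_wins": wins_sum[team] / n_sims,
            # Conditional average: only seasons in which this team finished first.
            "w_div": wins_sum_as_champ[team] / titles[team] if titles[team] else None,
        }
        for team in projected
    }


for team, row in simulate_division(PROJECTED, SD, 20000).items():
    print(team, row)
```

In this sketch a team's WDiv always comes out higher than its E Wins, because the seasons in which a team wins the division are, on average, the seasons in which it had a good year.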

Note also that the expected wins are bunched closer together than the actual final standings will be, since each one is an average over many simulated seasons.
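A small Python sketch shows the effect. Again this is only an illustration under stated assumptions: it takes the AL East expected wins from the March 24 post, adds independent normal noise with an assumed standard deviation of 8.5 wins, and then averages the win totals by finishing position rather than by team. The names are hypothetical and the noise model is a stand-in for the real simulations.

```python
import random

# AL East expected wins from the March 24 post.
EXPECTED = {"NYY": 95.53, "BOS": 91.00, "TOR": 89.04, "TAM": 73.90, "BAL": 72.84}
SD = 8.5  # assumed season-to-season noise


def average_wins_by_rank(expected, sd, n_sims, seed=0):
    """Average simulated win total of the 1st-place team, 2nd-place team, etc."""
    rng = random.Random(seed)
    totals = [0.0] * len(expected)
    for _ in range(n_sims):
        season = sorted((rng.gauss(mu, sd) for mu in expected.values()), reverse=True)
        for rank, wins in enumerate(season):
            totals[rank] += wins
    return [total / n_sims for total in totals]


by_rank = average_wins_by_rank(EXPECTED, SD, 20000)
print("expected wins, sorted:      ", sorted(EXPECTED.values(), reverse=True))
print("average wins by final place:", [round(w, 1) for w in by_rank])
```

The first-place team in a simulated season averages more wins than the highest expected-wins figure, and the last-place team averages fewer than the lowest, so the realized standings are more spread out than the list of expected wins.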

Monday, March 24, 2008

2008 MLB Projections

2008 Projected MLB Standings

team W L
CLE 87.93 74.07
DET 87.88 74.12
CHA 83.20 78.80
MIN 76.62 85.38
KC 62.23 99.77

NYY 95.53 66.47
BOS 91.00 71.00
TOR 89.04 72.96
TAM 73.90 88.10
BAL 72.84 89.16

SEA 82.41 79.59
OAK 82.30 79.70
LAA 78.76 83.24
TEX 76.32 85.68

MIL 84.37 77.63
CHC 83.62 78.38
STL 82.30 79.70
HOU 80.45 81.55
CIN 77.59 84.41
PIT 71.24 90.76

NYM 89.69 72.31
ATL 88.77 73.23
PHI 85.77 76.23
FLA 69.09 92.91
WAS 66.80 95.20

LAD 87.64 74.36
SDG 85.71 76.29
COL 80.26 81.74
ARI 79.31 82.69
SFO 77.44 84.56

These projections are based on a Markov chain model that estimates the probability of each team winning each game from the projected starting lineups. Based on results from prior years, the estimated standard deviation of these projections is about 8.5 wins. Since the top two or three teams in most divisions are projected within a handful of wins of each other, well inside that margin, several divisions are up for grabs.
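For a rough sense of what an 8.5-win standard deviation means for a division race, the sketch below estimates the chance that one team finishes ahead of another. It assumes the two teams' projection errors are independent and normally distributed, which is a simplification rather than anything the model itself guarantees; the function name is hypothetical.

```python
import math


def prob_finishes_ahead(wins_a, wins_b, sd=8.5):
    """P(team A ends with more wins than team B) under independent normal errors."""
    # The difference of two independent normals has standard deviation sd * sqrt(2).
    sd_of_difference = sd * math.sqrt(2.0)
    z = (wins_a - wins_b) / sd_of_difference
    # Standard normal CDF written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print("CLE over DET:", round(prob_finishes_ahead(87.93, 87.88), 3))
print("NYY over BOS:", round(prob_finishes_ahead(95.53, 91.00), 3))
```

Under these assumptions, Cleveland over Detroit is essentially a coin flip, and even the Yankees' 4.5-win edge over Boston leaves a lot of room for the order to flip.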

There are a lot of caveats I could throw in here. But the biggest ones are that the current analysis does not account for relievers, bullpens, or the relative value of a team’s prospects. Thus, if a team has better than average subs and/or better than average prospects expected to contribute heavily, that team is likely to do better than we project here.

Nonetheless, most of these projections seem to pass the sniff test. The biggest eye opener is the Angels at 79 wins. In the previous 3 datasets, the Angels are the one team that consistently outperformed this model, and most people think they should do it again. I don't know yet if it is because they have great subs and relievers, a great farm system, or just consistently do better than past performance would indicate. But hopefully over the next few weeks I'll be able to figure that out.

Welcome (Why are we here?)

I am a PhD candidate in Management Science and Engineering at Stanford University. For those of you who don’t know what Management Science is, think optimization: building quantitative models to analyze and improve decision making. My thesis focuses on Prediction and Optimal Decision Making in Baseball, and for the last couple of years I have been building models to analyze decision making and predict results for Major League Baseball games.

I’ll get into the gritty details of the models later (and there are a lot of them), but right now I just want to throw out some entirely premature projections before the 2008 season gets under way.

I plan to update these projections both as I add more features to the model and as the season progresses. I also plan to add a lot more projections and analysis. In the next week, I will use these projections to evaluate the probability of each team making the playoffs and winning the World Series.

If you want more info about my methods or would like to comment on anything, please feel free to contact me at null at stanford dot edu.