Sunday, June 7, 2009

And Now For Something Completely Similar

There is obviously a big difference between knowing what happened and knowing what will happen. Many of us cannot even agree on what did happen (the well-publicized problems with the BCS, for example), so how can we expect to know what will happen? However, knowing our limitations and the limitations of numbers can sometimes help us get closer to predicting the future. There is a very good book called The Wisdom of Crowds by James Surowiecki, which describes how large groups of people containing mostly non-experts and a few experts are very good at making predictions. He gives countless examples, including financial markets, the Iowa Electronic Markets (an extremely accurate predictor of political elections, sometimes years in advance), and even a county fair game of guessing the weight of an ox.

I should digress for a short moment to give the basic theory behind the book for those who do not take the time to read the linked articles. The idea is fundamentally simple: in a large group of people with good and bad information, the bad information cancels out the other bad information, leaving only the good information. The guessing game probably illustrates the point best. Imagine going to a county fair where there is a competition to guess the weight of an ox. At this fair, besides you, are other members of your community: lawyers, doctors, janitors, teachers, bus drivers, etc. But there are also butchers, farmers, chefs, and food wholesalers. The first group represents the non-experts. Though they may have very keen eyes, they probably don't know exactly what to look for when judging the weight of an ox. Some will guess much too low and some will guess much too high; some will guess a little low and some a little high. However, there is no reason to assume that their errors will be systematic. Quite simply, there is no reason to believe the errors will pile up on the low end or the high end. In a group of large enough size, the bad low guesses and the bad high guesses should cancel each other out. Then there are the experts. They too will guess lower and higher than the actual weight, and some may guess almost exactly. Again, though, there is no reason to assume their errors will be mostly high or mostly low. In the end, you have a large number of low guesses and a large number of high guesses, which, when averaged, should be very near the actual weight of the ox. This example is based on a real occurrence at a 1906 country fair attended by the statistician Francis Galton, where the average of all the guesses came within 1 pound of the actual weight, while no individual guess was nearly as close.
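
To see the cancellation argument in action, here is a tiny simulation. The true weight and the rough size of the crowd come from the Galton story; the 150-pound spread of individual errors is purely an illustrative assumption.

```python
import random

random.seed(1)
TRUE_WEIGHT = 1198   # the actual weight of Galton's ox, in pounds
N_GUESSERS = 800     # roughly the size of the crowd that entered guesses

# Every guess is noisy, but the errors have no systematic lean high or low
guesses = [TRUE_WEIGHT + random.gauss(0, 150) for _ in range(N_GUESSERS)]
crowd_estimate = sum(guesses) / len(guesses)
print(round(crowd_estimate))  # typically lands within a few pounds of 1,198
```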

With this concept in mind, I pursued what I thought was the best way of predicting what will happen in the upcoming season. One more digression: I should mention that I used offensive and defensive efficiencies from the Relative Ratings, as opposed to adjusted pts/game, for predicting 2009. I theorized that although a team's adjusted points for and against may fluctuate, how those points relate to the rest of the nation should remain more consistent from year to year. For instance, say Team A had an adjusted points for of 40 pts/game in 2008 and 30 pts/game in 2007. The percent difference is 33%, which is pretty large. But let's say the national average adjusted points for was 20 pts/game in 2008 and 17.5 pts/game in 2007. That means Team A had an adjusted offensive efficiency of 2.00 (40 / 20) in 2008 and 1.71 (30 / 17.5) in 2007, a difference of only 17% from 2007 to 2008, half as much as when using adjusted pts/game. So although they "scored" 10 pts/game more, they weren't as drastically more efficient.
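
Written out in code, the efficiency calculation is just a ratio against the national average. This reproduces the Team A arithmetic from the paragraph above:

```python
def efficiency(team_adj_pts_per_game, national_avg_pts_per_game):
    # Adjusted efficiency: a team's adjusted points relative to the national average
    return team_adj_pts_per_game / national_avg_pts_per_game

# Team A from the example above
eff_2008 = efficiency(40, 20.0)                    # 2.00
eff_2007 = efficiency(30, 17.5)                    # ~1.71
change_in_pts = (40 - 30) / 30                     # ~33% jump in adjusted pts/game
change_in_eff = (eff_2008 - eff_2007) / eff_2007   # only a ~17% jump in efficiency
```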

Ok. Great. So what? Using the last 10 years (the BCS era), I went looking for trends. How did teams' efficiencies change from year to year? Were there three/four/five-year cycles, gradual improvements or declines, so-called unpredictable ascensions and collapses? Remembering The Wisdom of Crowds, I decided to build projections with all of these possibilities in mind. I used regression and my own analytical ideas to calculate 11 separate offensive and defensive efficiencies for each team. Some have names like "Payback" or "Course Correction"; others have more boring names like "ADV" and "5YRAVG". Each method tries to model a different aspect of how a team may change from year to year depending on what happened in the past. Most importantly, each method has its own strengths and weaknesses; some of the predictions for 2009 will be too high, some will be too low.
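
The 11 methods aren't spelled out here, so purely as an illustration, here is what two of them might look like if "5YRAVG" means a five-season average and one of the regression-based methods is a simple trend line extrapolated one season ahead. These are my guesses at the names, not the actual formulas behind the projections.

```python
def five_year_avg(effs):
    # Guessing that "5YRAVG" is the mean of a team's last five seasons of efficiency
    recent = effs[-5:]
    return sum(recent) / len(recent)

def linear_trend(effs):
    # A simple regression-style projection (my construction): fit a line through
    # a team's past efficiencies and extrapolate it one season into the future
    n = len(effs)
    xs = list(range(n))
    x_bar = sum(xs) / n
    y_bar = sum(effs) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, effs)) / \
            sum((x - x_bar) ** 2 for x in xs)
    intercept = y_bar - slope * x_bar
    return intercept + slope * n  # projected efficiency for next season
```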

I then figured out which methods were best, and when, and assigned a probability to each method being selected. This is kind of like asking, "What is the population of experts and non-experts in my sample?" So let's say "ADV" was the best predictor of a team's efficiencies about 10% of the time in the past. Then 10% of the time, any given team would have the "ADV" 2009 efficiency assigned to it.
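
A minimal sketch of that weighted selection, in Python rather than the actual spreadsheet. The 10% figure for "ADV" is the hypothetical from the paragraph above; the other weights are made up for illustration, and the remaining methods would fill out the rest of the probability mass.

```python
import random

# Hypothetical historical "hit rates": how often each method was the best
# predictor in past seasons (only the ADV figure comes from the text above)
method_weights = {
    "ADV": 0.10,
    "5YRAVG": 0.12,
    "Payback": 0.08,
    "Course Correction": 0.07,
    # ...the other seven methods would account for the remaining probability
}

def pick_method(weights):
    # Draw one projection method, weighted by how often it was best historically
    methods = list(weights)
    return random.choices(methods, weights=[weights[m] for m in methods], k=1)[0]
```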

I then wrote a fairly simple macro in Excel that would calculate a random number for every team, for both offense and defense, and then assign one of the 11 offensive or defensive efficiencies based on that random number (and on the previously calculated probabilities). Every game in the season would then be simulated and the wins, losses, and points tabulated. The process repeats until the requested number of simulations is completed. There is also some randomness built into each individual game, which prevents a team like Florida from beating a team like Troy in 100% of simulations. After all, unthinkable upsets sometimes happen (Appalachian State vs. Michigan comes to mind).
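
For the curious, here is roughly what that loop could look like outside of Excel. The scoring model (average points scaled by offensive and defensive efficiency, plus Gaussian noise) is my own assumption standing in for the macro's math, and the noise term plays the role of the per-game randomness described above. For brevity this sketch holds each team's ratings fixed, whereas the macro re-draws them from the 11 methods for every simulated season.

```python
import random

NATIONAL_AVG_PTS = 27.0  # illustrative national average points per game

def simulate_game(off_a, def_a, off_b, def_b, noise=7.0):
    # Returns True if team A wins; a defensive efficiency below 1.0 is assumed
    # to mean a better-than-average defense (it scales the opponent's points down)
    pts_a = NATIONAL_AVG_PTS * off_a * def_b + random.gauss(0, noise)
    pts_b = NATIONAL_AVG_PTS * off_b * def_a + random.gauss(0, noise)
    return pts_a > pts_b

def simulate_seasons(schedule, ratings, n_sims=30000):
    # schedule: list of (team_a, team_b) games; ratings: team -> (off_eff, def_eff)
    wins = {team: 0 for team in ratings}
    for _ in range(n_sims):
        for a, b in schedule:
            off_a, def_a = ratings[a]
            off_b, def_b = ratings[b]
            if simulate_game(off_a, def_a, off_b, def_b):
                wins[a] += 1
            else:
                wins[b] += 1
    return {team: total / n_sims for team, total in wins.items()}  # average wins
```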

Hopefully, if you've simulated enough times, you are left with what amounts to the truth. As The Wisdom of Crowds argues, whether you have experts (good predictions) or non-experts (bad predictions) in a group, given enough people (simulations) the bad will cancel each other out and the good will be left. I have included the results of 30,000 simulations below. Why 30,000? Because there are 121 teams, 11 possible offensive ratings, and 11 possible defensive ratings, which means there are 14,641 different combination possibilities (121 x 11 x 11). I figured that doubling that number should give a good representation of all the possibilities, and I then rounded to the nearest 10,000 because I like round numbers.

A note about the table below: Log5 Wins uses the Log5 formula and each team's Relative Rating (reminder: RR = Off^2.8 / (Off^2.8 + Def^2.8)) to get a probability of each team winning, while ABS (Absolute) Wins uses the points scored by each team to declare a winner (a short sketch of the Log5 calculation follows the table). You will also notice a team called DIV I-AA GROUP. This represents all teams in what is now called the Football Championship Subdivision, or FCS for short. I don't like that I have to do this, but as of yet I haven't been able to come up with a good way to properly value the skill level of an FCS opponent on an individual basis. Interestingly, I have them winning 7 games in 2009, which is their average over the last 10 seasons. Is that evidence my methods are somewhat reasonable? I don't know. Onto the numbers:

TEAM | 2008 WINS | AVG LOG5 WINS | AVG ABS WINS | MAX ABS WINS | % OF SIMS
AIR FORCE 8 7.6 7.5 12 0.82
AKRON 5 6.3 6.1 12 0.22
ALABAMA 12 9.7 9.6 12 10.83
ARIZONA 7 6.8 6.8 12 0.27
ARIZONA ST 5 6.7 6.7 12 0.27
ARKANSAS 5 5.7 5.8 12 0.11
ARKANSAS ST 6 6.0 5.9 12 0.09
ARMY 3 4.5 4.6 12 0.03
AUBURN 5 6.9 6.9 12 0.72
BALL ST 12 8.7 8.5 12 3.88
BAYLOR 4 4.6 4.7 12 0.03
BOISE ST 12 10.9 10.9 13 17.25
BOSTON COLLEGE 9 8.4 8.1 12 2.19
BOWLING GREEN 6 5.8 5.6 12 0.19
BUFFALO 8 6.3 6.1 12 0.35
BYU 10 8.0 8.1 12 1.83
CALIFORNIA 8 7.5 7.3 12 0.96
CENTRAL FLORIDA 4 6.2 6.2 12 0.23
CENTRAL MICHIGAN 8 6.0 6.1 12 0.17
CINCINNATI 11 7.2 7.2 12 0.76
CLEMSON 6 8.1 7.5 12 1.49
COLORADO 5 4.9 4.9 12 0.05
COLORADO ST 6 5.4 5.6 12 0.07
CONNECTICUT 7 7.0 7.0 12 0.49
DIV I-AA GROUP 2 6.4 7.0 47 0.00
DUKE 4 4.5 4.7 12 0.01
EAST CAROLINA 9 6.6 6.5 12 0.25
EASTERN MICHIGAN 3 4.4 4.6 12 0.02
FLORIDA 12 10.4 10.3 12 22.61
FLORIDA ATLANTIC 6 5.8 5.7 12 0.24
FLORIDA INTL 5 3.6 3.8 12 0.01
FLORIDA ST 7 6.8 6.7 12 0.58
FRESNO ST 7 7.0 7.2 13 0.22
GEORGIA 9 7.5 7.1 12 1.33
GEORGIA TECH 8 7.2 7.2 12 0.89
HAWAII 7 8.1 7.9 13 0.72
HOUSTON 7 7.5 7.3 12 0.86
IDAHO 2 3.0 3.4 12 0.00
ILLINOIS 5 6.5 6.3 13 0.10
INDIANA 3 3.9 4.3 12 0.00
IOWA 8 8.0 7.8 12 1.35
IOWA ST 2 4.6 4.8 12 0.03
KANSAS 7 7.1 6.8 12 0.65
KANSAS ST 5 6.3 6.3 12 0.17
KENT ST 4 5.0 5.4 12 0.08
KENTUCKY 6 6.1 6.1 12 0.18
LA LAFAYETTE 6 5.6 5.8 12 0.12
LA MONROE 4 6.0 6.2 13 0.04
LOUISIANA TECH 7 4.7 4.6 12 0.00
LOUISVILLE 5 5.9 6.2 12 0.31
LSU 7 8.3 8.3 12 2.18
MARSHALL 4 4.8 5.0 12 0.05
MARYLAND 7 5.7 5.9 12 0.24
MEMPHIS 6 5.6 5.8 12 0.10
MIAMI FL 7 6.2 6.3 12 0.33
MIAMI OH 2 3.8 4.1 12 0.02
MICHIGAN 3 6.5 6.4 12 0.24
MICHIGAN ST 9 7.2 6.8 12 0.46
MIDDLE TENN ST 5 5.1 4.9 10 0.38
MINNESOTA 7 5.2 5.4 12 0.09
MISSISSIPPI 8 7.4 7.3 12 0.83
MISSISSIPPI ST 4 3.5 3.6 12 0.01
MISSOURI 9 8.5 8.3 12 3.02
NAVY 8 8.5 8.4 14 0.45
NEBRASKA 8 7.2 7.2 12 0.73
NEVADA 7 7.4 7.1 12 0.71
NEW MEXICO 4 5.5 5.5 12 0.17
NEW MEXICO ST 3 4.3 4.6 13 0.00
NORTH CAROLINA 8 6.4 6.5 12 0.26
NORTH CAROLINA ST 6 5.2 5.6 12 0.06
NORTH TEXAS 1 2.8 3.1 11 0.01
NORTHERN ILLINOIS 6 7.2 7.1 12 0.86
NORTHWESTERN 9 6.7 6.6 12 0.32
NOTRE DAME 6 5.9 5.9 12 0.44
OHIO 4 5.4 5.5 12 0.15
OHIO ST 10 9.8 9.9 12 12.85
OKLAHOMA 12 9.2 8.9 12 7.21
OKLAHOMA ST 9 7.1 7.1 12 0.77
OREGON 9 7.8 7.4 12 1.10
OREGON ST 8 7.6 7.4 12 0.75
PENN ST 11 10.2 10.2 12 18.25
PITTSBURGH 9 7.5 7.3 12 1.00
PURDUE 4 5.5 5.6 12 0.16
RICE 9 5.3 5.5 12 0.11
RUTGERS 7 8.9 8.6 12 5.00
SAN DIEGO ST 2 4.1 4.1 12 0.01
SAN JOSE ST 6 4.7 4.7 12 0.01
SMU 1 4.5 4.8 12 0.01
SOUTH CAROLINA 7 6.5 6.4 12 0.31
SOUTH FLORIDA 7 7.5 7.6 12 1.18
SOUTHERN MISS 6 7.5 7.4 12 0.83
STANFORD 5 5.2 5.4 12 0.05
SYRACUSE 3 3.2 3.3 11 0.03
TCU 10 9.8 9.8 12 16.74
TEMPLE 5 4.8 4.9 12 0.02
TENNESSEE 5 6.6 6.5 12 0.43
TEXAS 11 9.9 9.8 12 12.72
TEXAS A&M 4 5.5 5.5 12 0.11
TEXAS TECH 10 8.2 7.9 12 1.47
TOLEDO 3 4.6 4.8 12 0.04
TROY 8 7.2 7.0 11 1.48
TULANE 2 3.1 3.3 12 0.00
TULSA 10 7.4 7.1 12 1.10
UAB 4 3.9 4.4 12 0.03
UCLA 4 4.9 5.1 12 0.05
UNLV 5 4.6 4.7 12 0.02
USC 11 10.2 10.4 12 26.00
UTAH 12 8.4 8.3 12 2.88
UTAH ST 3 4.3 4.2 12 0.01
UTEP 5 6.2 6.1 12 0.21
VANDERBILT 6 5.6 5.8 12 0.10
VIRGINIA 5 5.5 5.8 12 0.13
VIRGINIA TECH 9 8.7 8.3 12 3.66
WAKE FOREST 7 6.9 6.9 12 0.57
WASHINGTON 0 3.7 3.7 12 0.01
WASHINGTON ST 2 2.7 2.9 11 0.02
WEST VIRGINIA 8 8.8 8.5 12 4.33
WESTERN KENTUCKY 1 3.8 3.8 10 0.25
WESTERN MICHIGAN 9 7.0 6.8 12 0.61
WISCONSIN 7 7.0 6.6 12 0.65
WYOMING 4 3.6 3.7 12 0.01
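
As promised in the note above the table, here is the Log5 step written out. The Relative Rating formula is the one quoted there; the Log5 formula is the standard one, and the only assumption is that the two ratings are plugged straight into it.

```python
def relative_rating(off_eff, def_eff, exponent=2.8):
    # RR = Off^2.8 / (Off^2.8 + Def^2.8), as defined above
    return off_eff ** exponent / (off_eff ** exponent + def_eff ** exponent)

def log5_win_prob(rr_a, rr_b):
    # Standard Log5: probability that team A beats team B given their ratings
    return (rr_a * (1 - rr_b)) / (rr_a * (1 - rr_b) + rr_b * (1 - rr_a))

# Example: RR 0.80 vs RR 0.60
# 0.80*0.40 / (0.80*0.40 + 0.60*0.20) = 0.32 / 0.44 ≈ 0.73
```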

1 comment:

  1. You lost me after the ox story. Can you figure out the probability of a crowd of Raul Ibanezs hitting a Citz Bank Park HR off a crowd of Cole Hamels throwing a change up?
