July 21, 2009

Proposal for Changes in Computing RPI - Part 3

Greg Van Zant, the head coach at West Virginia University, recently sent the following recommendation to the NCAA Division I Baseball Committee to change the formula for calculating the Ratings Percentage Index (RPI).

The Weighting of Home Wins and Road Wins

Another problem with the RPI in college baseball is that it doesn't factor in home field advantage.  College baseball is unique because early in the season the vast majority of teams from the northern part of the country cannot play at home due to the weather.  Northern teams do not choose to play on the road early in the season; they have to.  No other NCAA sport has this kind of scheduling inequity.

The home team wins approximately 60% of the time in Division I Baseball (61% in 2008) because of many factors such as hitting last, knowledge of the home playing field, supportive fans, knowledge of the umpires, sleeping at home, etc.  The reasons why could be listed and debated at length.  Regardless, it is a fact that, for whatever reason or reasons, the home team wins roughly 60% of the time.

There is a simple solution to this problem: mathematically remove the home field advantage from the RPI.  Let me explain.

Let's look at the factors or components of the current RPI:

1.   A team's winning percentage against all the Division I teams on its schedule.  This is the first 25% of the RPI.  This is where it all starts.  If a team is 30-10 against Division I competition, the RPI for this team starts at .750 before the other two factors adjust this .750 winning percentage up or down based on the strength of schedule of the 40 opponents this team played.

2.   The average Division I winning percentage of all your opponents when not playing you.  This is 50% of the RPI.  This second factor rewards a team for playing opponents that have high winning percentages.

3.   A team's opponents' opponents' Division I winning percentage.  This counts 25% and is an effort to make sure your opponents legitimately earned their winning percentage by playing good competition.

Factor #2 has a huge bearing on the final RPI of each team as it counts for 50% of the total. This factor is based on the winning percentage of your opponents.  Factors #1 and #3 also are based on winning percentage.  Obviously, winning percentage is very important.
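As a rough illustration of how the three factors combine, here is a minimal Python sketch using only the 25/50/25 weights listed above.  The function name rpi and the two opponent figures are made up for the example, and the sketch leaves out the bonus and penalty points the current formula also applies.

```python
def rpi(wp, owp, oowp):
    """Combine the three factors with the 25% / 50% / 25% weights.

    wp   -- the team's own Division I winning percentage (factor 1)
    owp  -- average winning percentage of its opponents (factor 2)
    oowp -- its opponents' opponents' winning percentage (factor 3)
    """
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# The 30-10 team from factor 1; the .550 and .500 figures are
# hypothetical values chosen only to show the arithmetic.
team_wp = 30 / 40
print(round(rpi(team_wp, 0.550, 0.500), 4))
```

Because factor 2 carries half the weight, a change in the opponents' winning percentage moves the result twice as much as the same change in the team's own record.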

Suppose we had two identical college baseball teams playing each other in a 20-game series.  These two hypothetical teams, which I will name Team A and Team B, are identical in every aspect.  This 20-game series will be played on a neutral field and each team will be the home team 10 times.  Over time, each team will win half of the games because everything is equal.  Therefore, each team's win-loss record will be 10-10.

Now, let's take the same two identical teams and play all 20 games of the series at one of the team's home ballparks, Team A's for example.  Since over time it has been shown that the home team wins 60% of the time in college baseball, the home team will now win 60% of the time in this series and Team A will finish with a 12-8 record.  The visiting team, Team B, will finish 8-12.  Two identical teams, but vastly different win-loss percentages, simply because the 20-game series was not played fairly (i.e., 10 games at home and 10 on the road).

The RPI that we currently use would rank Team A much higher than Team B based on the .600 winning percentage of Team A and the .400 winning percentage of Team B, even though these two teams are identical in every aspect.  So the first 25% of the RPI for these two identical teams would be vastly different.

The second factor of the RPI, which is currently 50%, can also unfairly reward Team A and unfairly punish Team B if Team A is from the south and Team B is from the north.  This is because most of Team A's opponents are also from the south and they have also beaten northern teams at home, while most of Team B's opponents are from the north and they have lost to southern teams on the road.

Factor three of the RPI, the final 25%, magnifies the geographical bias of home field advantage even further.  Basically, the RPI rewards the teams that can play the most home games.

This is not just a geographical problem of north vs. south; it is a problem of trying to accurately measure the strength of teams from any geographic region.  Some northern teams play big home schedules.  However, as we all realize, most of the time southern teams do play more home games and thus have a built-in advantage that our current RPI system does not account for.

My proposal is to use an Adjusted Winning Percentage in all three factors of the RPI instead of the actual winning percentage.  This Adjusted Winning Percentage would factor out the advantage of playing at home so that the RPI could more accurately measure the strength of the teams instead of measuring who plays the most home games.

Right now in the winning percentage, a win counts as 1.0 wins whether it is a home win, a road win, or a neutral win.  My proposal will still count a neutral field win as 1.0 wins, but will count a road win as 1.25 wins and a home win as 0.8333333333 wins.  This will negate the statistical advantage of playing at home.  This will also eliminate the need for the RPI bonus and penalty points that we currently have in place.
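The two weights appear to follow from the 60% home winning rate: dividing an even team's .500 target by its expected .600 home rate gives 0.8333, and dividing by its expected .400 road rate gives 1.25.  That derivation is an inference from the numbers rather than something spelled out in the proposal, and the function name win_weights is hypothetical.  A short Python sketch:

```python
from fractions import Fraction


def win_weights(home_win_rate):
    """Return (home_weight, road_weight) for a given home winning rate.

    Assumption: each weight rescales an evenly matched team's expected
    winning rate at that site (p at home, 1 - p on the road) back to .500.
    """
    p = Fraction(home_win_rate)
    half = Fraction(1, 2)
    return half / p, half / (1 - p)


# 60% home winning rate, written as an exact fraction.
home_weight, road_weight = win_weights(Fraction(3, 5))
print(home_weight, float(home_weight))   # home-win weight
print(road_weight, float(road_weight))   # road-win weight
```

Exact fractions show that the home weight is 5/6 (0.8333 repeating) and the road weight is 5/4.  If the home winning rate were measured at something other than 60%, the same function would give the matching pair of weights.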

Using my proposed system in the example that I used earlier, both teams would get 1.0 wins for each win at the neutral site and both would have a .500 Adjusted Winning Percentage for having 10 wins in 20 games.  However, when the series was switched to all 20 games at one site and the home team was 12-8 due to home field advantage, this team would still have a .500 Adjusted Winning Percentage (12 wins x .8333333333 = 10 wins in 20 games).  Furthermore, the visiting team for all 20 games that went 8-12 would also have a .500 Adjusted Winning Percentage (8 wins x 1.25 = 10 wins in 20 games).

The Adjusted Winning Percentage will more accurately reflect the strength of teams when they are not equal as well.  For example, if Team C is a better team than Team D and Team C plays all 20 games of their series with Team D at home and wins 15 of 20, Team C's winning percentage is .750 but their Adjusted Winning Percentage will be 15 x .8333333333 = 12.5 wins in 20 games = .625.  Team D's winning percentage is 5 wins in 20 games = .250 but their Adjusted Winning Percentage will be 5 x 1.25 = 6.25 wins in 20 games = .3125.
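The series examples can be checked with a few lines of Python.  This is only a sketch of the calculation as described here: the function name adjusted_wp and its argument layout are assumptions, and the home weight is taken as exactly 5/6 rather than the rounded .8333333333.

```python
HOME_WEIGHT = 5 / 6      # 0.8333..., value of a home win
ROAD_WEIGHT = 1.25       # value of a road win
NEUTRAL_WEIGHT = 1.0     # a neutral-site win is unchanged


def adjusted_wp(home_wins, road_wins, neutral_wins, games):
    """Weighted wins divided by total games played."""
    weighted_wins = (home_wins * HOME_WEIGHT
                     + road_wins * ROAD_WEIGHT
                     + neutral_wins * NEUTRAL_WEIGHT)
    return weighted_wins / games


# (home wins, road wins, neutral wins, games) for each series example
examples = {
    "Team A, 12-8 at home": (12, 0, 0, 20),
    "Team B, 8-12 on the road": (0, 8, 0, 20),
    "Team C, 15-5 at home": (15, 0, 0, 20),
    "Team D, 5-15 on the road": (0, 5, 0, 20),
}
for label, record in examples.items():
    print(label, round(adjusted_wp(*record), 4))
```

Teams A and B both come out at .500, while Teams C and D come out at .625 and .3125, matching the figures worked by hand.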

Currently some schools play almost their entire non-conference schedule at home.  In the current RPI system, this is a huge advantage for these teams.  A 56-game schedule with 27 conference games leaves 29 games out of conference.  Assuming all 29 non-conference games were played at home, and 15 of the 27 league games were also at home, this school could play 44 home games.  For this example, let's say this team goes 39-5 at home and 5-7 on the road.  This team would have a .7857 winning percentage (44/56) in the current RPI system but would have a .692 Adjusted Winning Percentage: (39 home wins x .8333333333) + (5 road wins x 1.25) = 32.5 + 6.25 = 38.75 wins in 56 games = .692.

Since a road win will be worth 1.25 wins using the Adjusted Winning Percentage, some coaches may try to schedule easy road wins against weak teams.  This may help in the first 25% of the RPI, but the remaining 75% of the RPI, factors #2 and #3, will penalize teams that try this tactic.  However, it could help college baseball for some teams to play on the road once in a while.

Imagine what an advantage a conference could have if all of the teams in the conference played the majority of their non-conference games at home.  In a conference of 12 teams that plays 30 conference games, each team could play 26 non-conference games (12 x 26 = 312 games).  Suppose the conference as a whole played 300 of those games at home and 12 on the road, and the win-loss record in these games was 265-35 at home and 6-6 on the road.  The current RPI formula would assign a non-conference winning percentage of 271/312 = .8686, while the Adjusted Winning Percentage would be .7318: (265 home wins x .8333333333) + (6 road wins x 1.25) = 220.83 + 7.5 = 228.33 wins in 312 games = .7318.
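To see the gap between the two measures side by side, the sketch below recomputes the 56-game team example and the 12-team conference example.  The helper name raw_and_adjusted is hypothetical, and the home weight is again taken as exactly 5/6.

```python
HOME_WEIGHT = 5 / 6   # 0.8333..., value of a home win
ROAD_WEIGHT = 1.25    # value of a road win


def raw_and_adjusted(home_wins, road_wins, games):
    """Return (actual winning pct, Adjusted Winning Percentage)."""
    raw = (home_wins + road_wins) / games
    adjusted = (home_wins * HOME_WEIGHT + road_wins * ROAD_WEIGHT) / games
    return raw, adjusted


# Team example: 39-5 at home, 5-7 on the road, 56 games.
team_raw, team_adj = raw_and_adjusted(39, 5, 56)
# Conference example: 265-35 at home, 6-6 on the road, 312 games.
conf_raw, conf_adj = raw_and_adjusted(265, 6, 312)

print("team      ", round(team_raw, 4), round(team_adj, 4))
print("conference", round(conf_raw, 4), round(conf_adj, 4))
```

In both cases the adjustment removes a large share of the winning percentage, because nearly all of the wins came at home.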

Average teams in conferences that can play most of their non-conference games at home tend to be ranked higher in the RPI team rankings due to the compounding effect of the winning percentages in the RPI once these conference teams start playing each other.  These teams are disproportionately bumped up in the RPI rankings due to the advantages the current system gives to teams playing at home.

The current RPI system in baseball also favors conferences that are geographically located adjacent to conferences that play more home games.  This is because the RPI rewards teams that play other teams with winning records.  As has been noted, a certain percentage of these wins are based on home field advantage.

The ranking of the conferences using the RPI seems to be a critical factor in the number of at-large teams selected for the NCAA Tournament.  Usually, when a conference is ranked high in the RPI, more teams are picked out of that conference for the playoffs.

In my opinion, this ranking is based mainly on how well each conference performs in its non-conference games.  The RPI is based on winning percentage, and every conference will always have a .500 winning percentage in conference games because if State wins, Tech loses.

For example, if Conference X has 10 teams and each team plays 30 conference games and 26 non-conference games, Conference X will play a total of 260 games against the other conferences.  These non-conference games are the ones that matter.  If non-conference games were not allowed, each conference's winning percentage would be .500.

Let's assume that Conference Y, a very similar league to Conference X, also has 10 teams and plays 260 total non-conference games.  However, Conference X plays 200 of the 260 games at home while Conference Y plays 130 of the 260 at home.  It doesn't take a rocket scientist to figure out who has the home field advantage and who will most likely have a higher RPI but it does take a better RPI to figure out who is the best.
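A rough Python sketch can put numbers on the Conference X and Conference Y comparison.  It rests on assumptions that go beyond the text: both conferences are evenly matched with their opponents, they win exactly 60% of their home games and 40% of their road games, and every non-home game is a true road game rather than a neutral-site game.  The function name expected_percentages is hypothetical.

```python
HOME_WIN_RATE = 0.60  # the roughly 60% home winning rate cited earlier
HOME_WEIGHT = 5 / 6   # value of a home win
ROAD_WEIGHT = 1.25    # value of a road win


def expected_percentages(home_games, road_games):
    """Expected (actual, adjusted) winning pct for an evenly matched side."""
    home_wins = HOME_WIN_RATE * home_games
    road_wins = (1 - HOME_WIN_RATE) * road_games
    games = home_games + road_games
    raw = (home_wins + road_wins) / games
    adjusted = (home_wins * HOME_WEIGHT + road_wins * ROAD_WEIGHT) / games
    return raw, adjusted


x_raw, x_adj = expected_percentages(200, 60)    # Conference X
y_raw, y_adj = expected_percentages(130, 130)   # Conference Y

print("Conference X", round(x_raw, 4), round(x_adj, 4))
print("Conference Y", round(y_raw, 4), round(y_adj, 4))
```

Under these assumptions Conference X would be expected to post roughly a .554 non-conference winning percentage against .500 for Conference Y, even though the two leagues are equally strong, while the Adjusted Winning Percentage puts both at .500.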

Using the Adjusted Winning Percentage in the RPI formula will allow the RPI to more accurately measure the strength of each team and each conference.

(photo by Jimmy Jones)