The Elo rating system is a method to rate players in chess and other competitive games. A new player starts with a rating of 1000. This rating will go up if they win games, and go down if they lose games. Over time, a player’s rating becomes a true reflection of their ability, relative to the population.
My video was mostly based on A Comprehensive Guide to Chess Ratings by Prof Mark E Glickman f
Below are some of the things I wanted to talk about, but cut so the video wasn’t too long!
Some explanations of the Elo rating system say it is based on the normal distribution, which is not quite true. Elo’s original idea did model each player’s ability as a normal distribution. The difference between the two players’ strengths would then also be a normal distribution. However, the formula for a normal distribution is a bit messy, so today it is preferred to model each player using an extreme value distribution. The difference between the two players’ strengths is then a logistic distribution. This has the property that if a player has a rating 400 points higher than another player, they are 10 times more likely to win than to lose, which makes the formula nicer to use. Practically, the difference between a logistic distribution and the normal distribution is small.
See Logistic distribution on Wikipedia. We replace e with base 10 and set s=400, mu=R_A - R_B and x=0 in the cdf.
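Here is a minimal sketch of that substitution in Python. The function name `expected_score` and the example ratings are my own choices for illustration, not something from the video or the paper.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B (logistic model, base 10)."""
    # The logistic cdf with e replaced by 10, s = scale and
    # mu = rating_a - rating_b, evaluated at x = 0, is the chance that
    # B comes out ahead. A's expected score is the complement of that,
    # which simplifies to the expression below.
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # A 400-point advantage gives odds of 10 to 1, i.e. 10/11.
    print(expected_score(1400, 1000))
    # Equal ratings give an even chance.
    print(expected_score(1000, 1000))
```

With a 400-point gap the result is 10/11, which is the "10 times more likely" property mentioned above.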
For the update formula, I said that your rating can increase or decrease by a maximum of 32 points, and that there was no special reason for that number. This value is called the K-factor. The higher the K-factor, the more weight you give to a player’s tournament performance (and so the less weight to their pre-tournament rating). High-level chess tournaments use a K-factor of 16, because the players’ pre-tournament ratings are believed to be about right, so their ratings will not fluctuate as much. Some tournaments use different K-factors.
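To see what the K-factor does, here is a rough sketch of the usual Elo update, new rating = old rating + K × (actual score − expected score). The function names and the ratings are made up for the example, and a score of 1 means a win, 0 a loss.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score against an opponent (logistic model, base 10, s = 400)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def update_rating(rating: float, opponent: float, score: float, k: float = 32.0) -> float:
    """Return the new rating after one game.

    score is the actual result (1 for a win, 0 for a loss);
    k is the K-factor, the largest possible change from a single game.
    """
    return rating + k * (score - expected_score(rating, opponent))


if __name__ == "__main__":
    # Two equally rated players: the winner expected 0.5 and scored 1,
    # so they gain half of K.
    print(update_rating(1000, 1000, 1.0, k=32))
    print(update_rating(1000, 1000, 1.0, k=16))
```

Because the actual and expected scores both lie between 0 and 1, the change can never be more than K points in either direction, and halving K halves every change.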
In the original Elo system, draws are not modelled separately. Instead, a draw is considered to be equivalent to half a win and half a loss. The paper by Mark Glickman above contains a formula that includes draws. Similarly, the paper contains a formula that includes the advantage to white.
Another criticism of Elo is the reliability of the rating. The rating of an infrequent player is a less reliable measure of that player’s strength, so to address this problem Mark Glickman devised Glicko and Glicko2. See descriptions of these methods at l
On the plus side, the Elo system was leagues ahead of what it replaced, known as the Harkness system. I originally intended to explain the Harkness system as well, so here are the paragraphs I cut:
“In the Harkness system an average was taken of everyone’s rating, then at the end of the tournament if the percentage of games you won was 50% then your new rating was the average rating.
If you did better or worse than 50% then 10 points were added to or subtracted from the average rating for every percentage point above or below 50.
This system was not the best and could produce some strange results. For example, it was possible for a player to lose every game and still gain points.”
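The Harkness rule as I described it can be sketched in a few lines of Python. The function name `harkness_rating` and the numbers in the example are hypothetical, chosen only to show the strange result.

```python
def harkness_rating(average_rating: float, games_won: float, games_played: int) -> float:
    """New rating under the Harkness rule described above.

    average_rating is the average rating in the tournament. A 50% score
    gives exactly that average, and every percentage point above or
    below 50 adds or subtracts 10 points.
    """
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    percentage = 100.0 * games_won / games_played
    return average_rating + 10.0 * (percentage - 50.0)


if __name__ == "__main__":
    # Hypothetical case: a 1200-rated player enters a tournament where the
    # average rating is 2000 and loses all 10 games.
    old_rating = 1200
    new_rating = harkness_rating(2000, games_won=0, games_played=10)
    print(new_rating, new_rating - old_rating)
```

Losing every game puts the player 500 points below the tournament average, so in this made-up case they finish on 1500 and gain 300 points without winning a game.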