SPRAGGETT ON CHESS
In 1970 FIDE started to use Arpad Elo’s statistical basis for calculating a player’s chess rating. Since then the Elo system (as it has become known) has acquired a life of its own and has been used for other games and even some sports: college football and basketball, and Major League Baseball!
Ratings are meant to give an indication of relative strength, but some feel that the chess world has given them far too much significance and has distorted their true meaning. Arpad Elo himself expressed surprise and disappointment towards the end of his life. Ditto the widely respected Elmer Sangalang, who recently published a short article on ChessBase alerting us to the dangers to our game if nothing is done. He writes of an often irrational and unfounded perception and understanding of ratings.
Elmer Sangalang edited the second edition of Elo’s book in 1986 and today is a regular contributor and consultant to the World Chess Federation (FIDE) on the Elo Rating System. Here is the article that appeared on ChessBase (http://www.chessbase.com/newsdetail.asp?newsid=6973). I have taken the liberty of underlining certain points for emphasis.
Demythologizing the chess player’s Elo rating
By Elmer Dumlao Sangalang
The invention and employment of the Elo rating system may be the best, but not the perfect, thing that ever happened to chess playing and organizing. The late Professor Arpad E. Elo expressed strong sentiments over the inordinate importance being attributed to the rating. He regretted that the Elo Rating System contributed greatly to the prevailing opinion that regards chess as first and foremost a sport. As a result, the chessplayer’s Elo Rating has been overvalued in significance by top-rank players and organizers of major and prestigious chess competitions. And for that matter, even by FIDE, itself.
This problem has serious undesirable consequences, among them:
1. Top players will tend to protect their ratings from decreasing at great cost. When rating is given undue importance, its preservation or improvement will, of necessity, lead the player to abandon his desire to create and innovate. The proliferation of colorless draws is inimical to the interest of organizers and sponsors of tournaments, and spectators.
2. In many a chessplayer and chess aficionado, it will foster an attitude of irrational respect (euphemism for fear) for higher-rated opponents and unwarranted contempt for lower-rated ones.
3. Organizers and sponsors of competitions will be misguidedly confining their choice of participants to a narrow field of high-rated players to the discrimination of the greater majority of their equally qualified associates.
4. Gratuitous payment of excessive appearance fees by organizers of competitions will be made to high-rated players, on the latter’s demand, the effect of which is the depletion of funds that should, otherwise, be part of the prizes whose legitimate recipients are the winners of tournaments.
5. FIDE can be misdirected in its policymaking efforts regarding International Titles and Ratings requirements and regulations.
6. FIDE can unjustifiably be compelled, at considerable cost, to grant requests for recalculation to restore the loss of a few rating points due to accidental exclusion of some game results from the rating calculation.
7. FIDE can wrongfully be exposed and dragged into costly legal disputes in cases where individual ratings are inadvertently miscalculated on account of clerical error such as data omission. Players would argue that the failure to receive an invitation to a chess competition could lead to economic loss. Participation in a prestigious tournament means the receipt of an appearance fee and the possibility of winning substantial prize money.
FIDE, and all quarters concerned, should act immediately to address the problem and rectify this erroneous perception about the real significance of a chessplayer’s Elo Rating. Short of requiring everyone to read Prof. Elo’s book, The Rating of Chessplayers – Past and Present, FIDE should disseminate to all its member national federations, to all chess publications, and require all chessplayers (amateurs and professionals alike) to read, a circular explaining that:
The Elo Rating System is a statistical system. Though the calculation processes involved are mathematical (they use formulas that are precise mathematical statements), the underlying concepts employed (such as probabilities, confidence intervals, margins of error, measures of reliability) in the derivation of these formulas are statistical in nature. Even the data that form the basis of the calculations are themselves fluctuating units of human performance that is subject to variability. Professor Arpad E. Elo so eloquently stated the process for the benefit and appreciation of the layman in each of us: “The measurement of the rating of an individual chessplayer might be well compared with the measurement of the position of a cork bobbing up and down on the surface of agitated water with a yardstick tied to a rope which is swaying in the wind” and held by a trembling hand [last phrase is mine].
Therefore, the Elo Rating of an individual player is not a mathematically precise figure. It is statistically derived with an accuracy in direct correlation to the amount of data (game results) on which it is based. A measurement based on the standard 30 games provides a rating that is 95% probable to be within plus or minus 100 Elo points of its true value. That’s the reason why it is not a certainty that the higher-rated player will always beat his lower-rated opponent. Take this simplistic illustration: Player A rated Elo 2600 plays with Player B rated Elo 2500. The true strength of A lies in the range of 2500 – 2700 while that of B is in the range of 2400 – 2600. If A plays languidly at 2550 while B plays inspired chess at 2550, we will have an even match and the game should result in a draw!
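The illustration above can be sketched in a few lines of Python. This is only a sketch: the function names are hypothetical, and the 100-point margin is the 30-game figure quoted in the article, taken as given rather than derived.

```python
def rating_interval(rating, margin=100):
    """Return the (low, high) range in which the true strength probably lies.

    Assumption: the default margin of 100 points is the 95% figure the
    article quotes for a rating based on 30 games.
    """
    return (rating - margin, rating + margin)


def overlap(interval_a, interval_b):
    """Return the shared part of two intervals, or None if they do not meet."""
    low = max(interval_a[0], interval_b[0])
    high = min(interval_a[1], interval_b[1])
    return (low, high) if low <= high else None


player_a = rating_interval(2600)  # (2500, 2700)
player_b = rating_interval(2500)  # (2400, 2600)
shared = overlap(player_a, player_b)
print(shared)  # (2500, 2600)
```

Any playing strength inside the shared range, such as the 2550 of the example, is plausible for both players on a given day.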
In the Elo Rating System, the absolute value of the player’s rating is meaningless. It is the difference in ratings between players which has significance and it represents their relative scoring capabilities. (As far as only this pair of chessplayers is concerned, Alexander Grischuk’s Elo 2773 and Wesley So’s Elo 2673 may as well be arbitrarily changed to Elo 2000 and Elo 1900, respectively.)
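A short Python sketch shows why only the difference matters. It assumes the widely used logistic approximation to the expected-score curve with a 400-point scale; official conversion tables may differ slightly, and the function name is hypothetical.

```python
def expected_score(rating, opponent):
    """Expected score (between 0 and 1) for the first player.

    Assumption: the common 400-point logistic approximation, not an
    official table. Only the rating difference enters the formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


original = expected_score(2773, 2673)  # the ratings quoted in the article
shifted = expected_score(2000, 1900)   # same 100-point gap, lower level
print(round(original, 2), round(shifted, 2))  # 0.64 0.64
```

Under this assumption, a 100-point edge is worth an expected score of about 64% whether the pair is rated around 2700 or around 1950.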
When the rating system is conducted on a continuous basis, as FIDE does now, ratings are computed after each event by the current rating formula: Rn = Ro + K(W - We). The self-correcting characteristics of this equation, when applied continually with statistically adequate interplay within the rating pool, will automatically generate proper relative ratings after sufficient time. Therefore, minor errors in rating calculations due to data omissions will not affect the overall integrity of the system in the long run.
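In the formula, Rn is the new rating, Ro the old rating, K a development coefficient, W the actual score and We the expected score. The sketch below applies it in Python under stated assumptions: the expected score uses the common logistic approximation, K = 10 is chosen purely for illustration, the opponent's rating is held fixed, and the player is assumed to score exactly what his true strength predicts on average. The function names are hypothetical.

```python
def expected_score(rating, opponent):
    """Expected score We under the 400-point logistic approximation (assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def update_rating(old_rating, k, actual_score, expected):
    """Apply Rn = Ro + K(W - We)."""
    return old_rating + k * (actual_score - expected)


def drift_back(true_rating, listed_rating, opponent, k=10, games=300):
    """Show the self-correcting effect after a listing error.

    Simplification: each game is scored at the average result that the
    player's true strength would produce, so the run is deterministic.
    """
    rating = listed_rating
    for _ in range(games):
        average_result = expected_score(true_rating, opponent)
        rating = update_rating(rating, k, average_result,
                               expected_score(rating, opponent))
    return rating


# A 2600 player beats, then draws with, a 2500 player (K = 10 assumed).
we = expected_score(2600, 2500)
print(round(update_rating(2600, 10, 1.0, we), 1))  # 2603.6
print(round(update_rating(2600, 10, 0.5, we), 1))  # 2598.6

# A player of true strength 2600, wrongly listed 10 points too low.
recovered = drift_back(true_rating=2600, listed_rating=2590, opponent=2550)
print(round(recovered, 1))
```

In this simplified run the listed rating climbs back to within a point of 2600, which is the sense in which a small omission washes out over time.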
When all the participants in a tournament have ratings that fall within a rating interval of 200 Elo points, the players are said to belong to one playing class and good all-around competition results. No one is badly outclassed and no one badly outclasses the field. The weakest player on his good day will play about as well as the strongest player on the latter’s off day.
In the choice of a particular participant to be invited to a chess event, the organizer should not rely on the player’s rating as his sole criterion for selection but, rather, also the overall character of the player which certainly will have a greater impact on the conduct and success of the competition.
With the foregoing demythologizing of the chessplayer’s Elo Rating we have come to realize that its obsessive valuation by players, organizers and FIDE alike, is unfounded on fact.