A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

doi:10.1126/science.aar6404

David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis

Multidisciplinary

One program to rule them all

Computers can beat humans at increasingly complex games, including chess and Go. However, these programs are typically constructed for a particular game, exploiting its properties, such as the symmetries of the board on which it is played. Silver et al. developed a program called AlphaZero, which taught itself to play Go, chess, and shogi (a Japanese version of chess) (see the Editorial, and the Perspective by Campbell). AlphaZero managed to beat state-of-the-art programs specializing in these three games. The ability of AlphaZero to adapt to various game rules is a notable step toward achieving a general game-playing system.

Science, this issue p. 1140; see also pp. 1087 and 1118
