Exploring whether move sequences predict chess player skill
Revealed that opening moves (first 10-15) are significantly more predictive than later play; demonstrated challenges of single-game skill prediction
This project explores a simple question: can you predict a chess player's skill level just from the moves they make in a single game? I trained an LSTM model on ~30,000 games in PGN format to see whether move sequences alone carry enough signal to estimate a player's Elo rating.
The results were modest at best. The model's peak performance was ~16.5% accuracy within a ±200 Elo window, only a 2% improvement over baseline. What was more interesting was where the signal appeared. The model performed best when analyzing just the first 10–15 moves; adding more moves consistently reduced accuracy. That drop-off highlights how difficult it is for neural networks to generalize across chess's enormous possibility space.
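To make the metric above concrete, here is a minimal sketch of "accuracy within a ±200 Elo window" compared against a baseline. The function names, the median-rating baseline, and the toy ratings are all assumptions for illustration; the summary does not say how the project defined its baseline.

```python
def within_window_accuracy(predicted, actual, window=200):
    """Fraction of games whose predicted rating is within +/- window of the true rating."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one game")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= window)
    return hits / len(actual)


def constant_baseline(ratings):
    """Hypothetical baseline: always predict the (upper) median rating."""
    ordered = sorted(ratings)
    return ordered[len(ordered) // 2]


# Toy ratings, invented for illustration only (not the project's data).
true_elo = [1200, 1500, 1850, 2100]
model_elo = [1350, 1900, 1800, 1400]
baseline_elo = [constant_baseline(true_elo)] * len(true_elo)

model_acc = within_window_accuracy(model_elo, true_elo)
baseline_acc = within_window_accuracy(baseline_elo, true_elo)
print(f"model: {model_acc:.2f}, baseline: {baseline_acc:.2f}")
```

Reporting the model next to a trivial baseline like this shows how much of the accuracy comes from the rating distribution itself rather than from the moves.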
A few patterns stood out. Opening moves were far more predictive than late-game play, likely because openings are more standardized and reflect study depth. There was a clear point of diminishing returns around 10–15 moves. The experiments also suggested that richer board representations (like FEN encoding) would likely help, but only with a much larger dataset. Finally, the project reinforced a core limitation: single games are noisy, and human play is highly variable.
From a technical perspective, the project focused on applying sequence models to structured gameplay data. That included building preprocessing pipelines to convert PGN into numerical encodings, training an LSTM with cross-entropy loss, and experimenting with ways to balance sequence length against generalization. The accompanying writeup walks through the methodology and results in more detail, along with ideas for how this type of modeling could apply to chess education or matchmaking.
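The preprocessing step described above can be sketched roughly as follows. This is not the project's pipeline: the regex, the helper names (extract_moves, build_vocab, encode), the reserved PAD/UNK ids, and the reading of "10–15 moves" as full moves (two plies each) are all assumptions. It uses only the standard library and handles plain SAN movetext with brace comments, not variations or other PGN features.

```python
import re

# Matches one SAN move: castling, or an optional piece letter, optional
# disambiguation, optional capture, target square, optional promotion.
SAN_MOVE = re.compile(
    r"(?:O-O(?:-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](?:=[QRBN])?)[+#]?"
)
PAD, UNK = 0, 1  # reserved ids (hypothetical convention)


def extract_moves(pgn_text):
    """Return the SAN moves of one game, dropping header lines and {comments}."""
    lines = [ln for ln in pgn_text.splitlines() if not ln.startswith("[")]
    movetext = re.sub(r"\{[^}]*\}", " ", " ".join(lines))
    return SAN_MOVE.findall(movetext)


def build_vocab(games):
    """Assign an integer id to every distinct move seen in the training games."""
    vocab = {}
    for moves in games:
        for move in moves:
            vocab.setdefault(move, len(vocab) + 2)  # ids 0 and 1 are reserved
    return vocab


def encode(moves, vocab, max_moves=15):
    """Encode the first max_moves full moves (2 plies each), right-padded with PAD."""
    max_plies = 2 * max_moves
    ids = [vocab.get(m, UNK) for m in moves[:max_plies]]
    return ids + [PAD] * (max_plies - len(ids))


pgn = '''[Event "Example"]
[WhiteElo "1500"]

1. e4 e5 2. Nf3 {a comment} Nc6 3. Bb5 a6 4. O-O Nf6 1-0'''

moves = extract_moves(pgn)
vocab = build_vocab([moves])
encoded = encode(moves, vocab, max_moves=3)
print(moves)
print(encoded)
```

Truncating at a fixed number of moves is one simple way to trade sequence length against generalization: the max_moves parameter can be swept to look for the point where extra moves stop helping.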
Future directions include incorporating board-state awareness via FEN encoding, analyzing multiple games per player for stability, and combining learned signals with opening theory to better understand where skill differences actually emerge.
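As a rough illustration of what board-state awareness via FEN could look like, the sketch below turns the piece-placement field of a FEN string into twelve 8x8 binary planes, one per piece type and color. This layout is one common choice, not something the project has implemented; the function name and plane ordering are assumptions.

```python
PIECES = "PNBRQKpnbrqk"  # white pieces uppercase, black lowercase


def fen_to_planes(fen):
    """Convert the piece-placement field of a FEN string into 12 8x8 binary planes."""
    placement = fen.split()[0]
    ranks = placement.split("/")
    if len(ranks) != 8:
        raise ValueError("FEN piece placement must have 8 ranks")
    planes = [[[0] * 8 for _ in range(8)] for _ in PIECES]
    for row, rank in enumerate(ranks):  # row 0 is rank 8, as in FEN order
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)  # a digit means that many empty squares
            else:
                planes[PIECES.index(ch)][row][col] = 1
                col += 1
        if col != 8:
            raise ValueError(f"rank {8 - row} does not describe 8 files")
    return planes


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
planes = fen_to_planes(start)
piece_count = sum(cell for plane in planes for row in plane for cell in row)
print(piece_count)
```

Each position becomes 768 binary features, far more than a single move token, which is consistent with the observation that richer representations would likely need a much larger dataset.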