# Connect 4 AI: How the Computer Thinks


## 1. The Virtual Board


The computer doesn't see colored discs on a grid. It sees a table of numbers:


- **0** = Empty space
- **1** = Yellow disc
- **2** = Red disc

The board has 7 columns and 6 rows. After every move, a scan function checks all directions (horizontal, vertical, and both diagonals) to see if anyone has four in a row.


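
The representation and scan described above could look roughly like the following sketch. It is not the project's actual code: the names `Board` and `hasFourInRow`, the row-major layout, and the scan order are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int ROWS = 6;
constexpr int COLS = 7;

// Cell values as described above: 0 = empty, 1 = yellow, 2 = red.
using Board = std::array<std::array<uint8_t, COLS>, ROWS>;

// Returns true if `player` has four discs in a line anywhere on the board.
// Every cell is tried as the start of a line in four directions:
// right, down, down-right and down-left. The opposite directions are
// covered when the scan starts from the cell at the other end of the line.
bool hasFourInRow(const Board& board, uint8_t player) {
    assert(player == 1 || player == 2);
    static const int DIRS[4][2] = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            for (const auto& d : DIRS) {
                int count = 0;
                for (int k = 0; k < 4; ++k) {
                    const int rr = r + k * d[0];
                    const int cc = c + k * d[1];
                    if (rr < 0 || rr >= ROWS || cc < 0 || cc >= COLS) break;
                    if (board[rr][cc] != player) break;
                    ++count;
                }
                if (count == 4) return true;
            }
        }
    }
    return false;
}
```

Scanning the whole board is the simplest version; checking only the lines through the disc that was just dropped would be a natural optimization.
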
## 2. What is a "Ply"?


A **ply** is one move by one player. If the AI is set to ply 6, it looks 6 individual moves into the future. Since players alternate turns, ply 6 means the AI considers 3 of its own moves and 3 of the opponent's moves.

More plies mean stronger play but a longer calculation. On the ESP32-C3, ply 4 is nearly instant, ply 6 takes about a second, and ply 8-10 can take several seconds. The AI shows a pulsing light while it is thinking.


## 3. The Minimax Strategy


### The basic idea


Imagine you are playing Connect 4 against a friend. Before you drop your disc, you think: "If I put my disc here, what will my friend do? And then what would I do after that?"

That is exactly what the computer does, except that it considers **every** possible move up to its search depth, not just a few.


### Two players, two goals


The AI calls the two players **Max** (itself) and **Min** (you):


- **Max** wants the highest possible score (the AI winning).
- **Min** wants the lowest possible score (you winning).

The AI assumes you will always make your best move. It doesn't hope you'll make a mistake.


### A simple example


Imagine there are only 3 columns left and the AI can look 2 plies ahead (its own move and your reply). It builds a tree like this:


```
  AI's turn (Max - pick the highest)
           /       |       \
     col 2       col 3       col 4
     /   \       /   \       /   \
   Your turn (Min - pick the lowest)
   ...   ...   ...   ...   ...   ...
   +5    -3    +2    +8    -1    +4
```


1. After column 2: you would pick the move scoring -3 (lowest = best for you).
2. After column 3: you would pick the move scoring +2.
3. After column 4: you would pick the move scoring -1.

The AI compares -3, +2, and -1, and picks column 3 because +2 is the best it can guarantee.


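
The same reasoning can be replayed in a few lines of C++. This sketch hard-codes the six leaf scores from the example instead of generating moves; `EXAMPLE_TREE`, `minValue`, and `bestBranch` are names invented here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Leaf scores of the example tree: one inner vector per AI move,
// one entry per possible reply by the opponent.
const std::vector<std::vector<int>> EXAMPLE_TREE = {
    {+5, -3},  // after the AI plays column 2
    {+2, +8},  // after the AI plays column 3
    {-1, +4},  // after the AI plays column 4
};

// Min's turn: the opponent picks the reply with the lowest score.
int minValue(const std::vector<int>& replies) {
    assert(!replies.empty());
    return *std::min_element(replies.begin(), replies.end());
}

// Max's turn: returns the index of the branch whose guaranteed
// (worst-case) score is the highest.
int bestBranch(const std::vector<std::vector<int>>& tree) {
    assert(!tree.empty());
    int best = 0;
    for (int i = 1; i < static_cast<int>(tree.size()); ++i) {
        if (minValue(tree[i]) > minValue(tree[best])) best = i;
    }
    return best;
}
```

Calling `bestBranch(EXAMPLE_TREE)` yields index 1, the second branch, which corresponds to column 3 in the diagram.
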
### Scoring


The AI assigns scores to board positions:


- **+1000 or more:** The AI wins. A faster win gets a higher score, so the AI goes for the quickest victory.
- **-1000 or less:** The opponent wins. A faster loss gets a more negative score, so the AI fights hardest against immediate threats.
- **0:** Nobody has won and the search depth ran out. The position is neutral.

This scoring is why the AI has "killer instinct" - it doesn't just try to win, it tries to win as fast as possible.


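
One common way to get this "faster is better" behaviour is to add the remaining search depth to the win score. The sketch below assumes that approach; the exact formula used by the project is not stated above, and `scorePosition`, `Outcome`, and `WIN_SCORE` are hypothetical names.

```cpp
#include <cassert>

constexpr int WIN_SCORE = 1000;

enum class Outcome { None, AiWins, OpponentWins };

// `remainingDepth` is how many plies the search still had left when the
// position was reached. A win found early in the search has more depth
// left, so it scores further from zero than a win found late.
int scorePosition(Outcome outcome, int remainingDepth) {
    assert(remainingDepth >= 0);
    switch (outcome) {
        case Outcome::AiWins:       return WIN_SCORE + remainingDepth;
        case Outcome::OpponentWins: return -(WIN_SCORE + remainingDepth);
        case Outcome::None:         return 0;
    }
    return 0;  // not reached; keeps compilers from warning
}
```

With this formula, a win with 5 plies to spare scores 1005 and a win on the last searched ply scores 1000, so Max prefers the quicker one.
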
## 4. Alpha-Beta Pruning: The Smart Shortcut


### The problem


Looking ahead 8 plies in Connect 4 means exploring millions of board positions. Even a fast microcontroller can't check them all in a reasonable time.


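
A quick calculation shows where "millions" comes from. The helper below (a hypothetical `maxSequences`) computes an upper bound that assumes all 7 columns stay playable at every ply; real searches see fewer positions because columns fill up and games end early.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on the move sequences a search `plies` deep can reach when
// every one of `columns` columns is playable at every step: columns^plies.
uint64_t maxSequences(int columns, int plies) {
    assert(columns > 0 && plies >= 0);
    uint64_t total = 1;
    for (int i = 0; i < plies; ++i) {
        total *= static_cast<uint64_t>(columns);
    }
    return total;
}
```

For example, `maxSequences(7, 4)` is 2401 and `maxSequences(7, 8)` is 5764801, so every two extra plies multiply the work by up to 49.
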
### The solution


**Alpha-Beta pruning** is a way to skip branches of the tree that can't possibly change the final decision.

Think of it like shopping for a birthday present. You visit Shop A and find a nice toy for 10 euros. Then you go to Shop B. The first item you see costs 15 euros, and you notice everything else in Shop B is even more expensive. You don't need to check every item in Shop B - you already know Shop A is better. You leave Shop B and save time.

The AI does the same thing:


- **Alpha** is the best score the AI (Max) has found so far. Think of it as "I already know I can do at least this well."
- **Beta** is the best score the opponent (Min) has found so far. Think of it as "The opponent already knows they can limit me to at most this."

When the AI is exploring a branch and discovers that the score can never beat what it already has (beta <= alpha), it **prunes** (cuts off) that entire branch. It skips all remaining moves in that branch because they can't change the outcome.


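
To make the cut-off concrete, here is a generic alpha-beta sketch that runs on a small hand-built tree rather than a real board. The `Node` type, the helpers `leaf` and `branch`, and the `leavesVisited` counter are assumptions for illustration and are not taken from the project.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

// A tiny game tree: a node is a leaf with a score, or it has children.
struct Node {
    int score = 0;  // only meaningful when `children` is empty
    std::vector<Node> children;
};

Node leaf(int score) { Node n; n.score = score; return n; }
Node branch(std::vector<Node> children) {
    Node n;
    n.children = std::move(children);
    return n;
}

int leavesVisited = 0;  // counts evaluated leaves to show what pruning saves

int alphaBeta(const Node& node, int alpha, int beta, bool maximizing) {
    assert(alpha < beta);  // holds on entry because callers stop at beta <= alpha
    if (node.children.empty()) {
        ++leavesVisited;
        return node.score;
    }
    if (maximizing) {
        int best = INT_MIN;
        for (const Node& child : node.children) {
            best = std::max(best, alphaBeta(child, alpha, beta, false));
            alpha = std::max(alpha, best);
            if (beta <= alpha) break;  // Min would never let play reach here
        }
        return best;
    }
    int best = INT_MAX;
    for (const Node& child : node.children) {
        best = std::min(best, alphaBeta(child, alpha, beta, true));
        beta = std::min(beta, best);
        if (beta <= alpha) break;  // Max already has a better option elsewhere
    }
    return best;
}
```

On a tree with leaf pairs (+5, -3), (+2, +8), (-1, +4), started with `alphaBeta(root, INT_MIN, INT_MAX, true)`, the result is +2 and only 5 of the 6 leaves are evaluated: after seeing -1 in the last branch, the +4 can no longer matter.
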
### How much does it help?


In practice, pruning lets the AI skip 50-90% of the positions it would otherwise need to check. How much it skips depends on the column order, so the AI checks the center column first (column 3, counting from 0), then works outward. Good moves tend to be near the center, so checking them first leads to better pruning and faster search.


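
A center-first order can be generated rather than hard-coded. In this sketch the left neighbour is tried before the right one at each distance; that tie-break, and the name `centerFirstOrder`, are assumptions.

```cpp
#include <array>
#include <cassert>

constexpr int COLS = 7;

// Builds the search order: center column first, then alternating
// left/right neighbours moving outward (3, 2, 4, 1, 5, 0, 6 for 7 columns).
std::array<int, COLS> centerFirstOrder() {
    std::array<int, COLS> order{};
    const int center = COLS / 2;
    int n = 0;
    order[n++] = center;
    for (int offset = 1; offset <= center; ++offset) {
        order[n++] = center - offset;
        if (center + offset < COLS) order[n++] = center + offset;
    }
    assert(n == COLS);  // every column appears exactly once
    return order;
}
```

The search loop would then iterate over this array instead of counting from column 0 to column 6.
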
## 5. The Three-Phase Move Strategy


Before running the expensive minimax search, the AI tries two quick shortcuts. Together with the search itself, that makes three phases:


1. **Can I win right now?** The AI tries placing its disc in each column. If any column completes four in a row, it takes that move immediately. No need to think further.


2. **Can my opponent win next turn?** The AI checks if the opponent could win by playing in any column. If so, it blocks that column. Missing this would be a fatal mistake.


3. **Deep search.** Only if there are no immediate wins or threats does the AI run the full minimax search with alpha-beta pruning.

This three-phase approach makes the AI both fast (instant reactions to obvious moves) and smart (deep strategic thinking when needed).


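
The three phases can be sketched as one small function. Everything here is illustrative: the helper names (`drop`, `winsAt`, `findWinningColumn`, `chooseMove`), the choice of which disc value belongs to the AI, and passing the deep search in as a function pointer are assumptions that keep the example self-contained.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int ROWS = 6, COLS = 7;
constexpr uint8_t EMPTY = 0, AI = 1, HUMAN = 2;  // assumed colour assignment
using Board = std::array<std::array<uint8_t, COLS>, ROWS>;

// Drops a disc into `col`. Row 0 is the top, so discs settle toward ROWS - 1.
// Returns the row the disc landed in, or -1 if the column is full.
int drop(Board& b, int col, uint8_t player) {
    assert(col >= 0 && col < COLS);
    for (int r = ROWS - 1; r >= 0; --r) {
        if (b[r][col] == EMPTY) { b[r][col] = player; return r; }
    }
    return -1;
}

// True if the disc at (r, c) is part of a line of four for `player`.
bool winsAt(const Board& b, int r, int c, uint8_t player) {
    static const int DIRS[4][2] = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};
    for (const auto& d : DIRS) {
        int count = 1;  // the disc at (r, c) itself
        for (int sign = -1; sign <= 1; sign += 2) {
            int rr = r + sign * d[0], cc = c + sign * d[1];
            while (rr >= 0 && rr < ROWS && cc >= 0 && cc < COLS &&
                   b[rr][cc] == player) {
                ++count;
                rr += sign * d[0];
                cc += sign * d[1];
            }
        }
        if (count >= 4) return true;
    }
    return false;
}

// Returns the first column where `player` would win immediately, or -1.
int findWinningColumn(Board& b, uint8_t player) {
    for (int col = 0; col < COLS; ++col) {
        const int row = drop(b, col, player);
        if (row < 0) continue;            // column full
        const bool wins = winsAt(b, row, col, player);
        b[row][col] = EMPTY;              // undo the trial move
        if (wins) return col;
    }
    return -1;
}

// Phase 1: win now. Phase 2: block. Phase 3: fall back to the deep search.
int chooseMove(Board& b, int (*deepSearch)(Board&)) {
    int col = findWinningColumn(b, AI);
    if (col >= 0) return col;
    col = findWinningColumn(b, HUMAN);
    if (col >= 0) return col;
    return deepSearch(b);
}
```

Both shortcuts cost at most 7 trial moves each, which is negligible next to a multi-ply search.
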
## 6. Demo Mode: Asymmetric Skill


In demo mode, two AI players play against each other. To make the games interesting (rather than always ending in a draw), each player is randomly assigned a different search depth. One player might look 5 moves ahead while the other only looks 3 moves ahead. The stronger player can find winning setups that the weaker one misses, leading to exciting games with real winners. Who gets the advantage is randomized each game.


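
A minimal sketch of such an assignment, using the 5-versus-3 example from the paragraph above as default values. The function name `pickDemoDepths` and the use of the C++ `<random>` facilities are assumptions; firmware on the ESP32-C3 may well draw its random numbers differently.

```cpp
#include <cassert>
#include <random>
#include <utility>

// Picks search depths (in plies) for the two demo players: one gets
// `strongPly`, the other `weakPly`, and a coin flip decides who is who.
// Returns {depth of player 1, depth of player 2}.
template <typename Rng>
std::pair<int, int> pickDemoDepths(Rng& rng, int strongPly = 5, int weakPly = 3) {
    assert(strongPly > weakPly && weakPly > 0);
    std::bernoulli_distribution coin(0.5);
    return coin(rng) ? std::make_pair(strongPly, weakPly)
                     : std::make_pair(weakPly, strongPly);
}
```

Calling this once at the start of each demo game gives every game a stronger and a weaker side without favouring either colour in the long run.
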
## 7. Responsive Controls


The ESP32-C3 is a single-core processor. When the AI is thinking, it could block all input for several seconds. Two techniques keep the game responsive:


1. **Mid-search button checks:** During the minimax search, the AI periodically checks whether the player has pressed the button. If so, it immediately abandons the search.


2. **Abort flag:** A global flag (`abortAi`) propagates through all levels of the recursive search. Once set, every level of the search returns immediately, unwinding the entire calculation in microseconds.


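
The skeleton below shows how the two techniques can fit together. Only the flag name `abortAi` comes from the text above; the `search` signature, the `buttonPressed` hook, the `nodesVisited` counter, and the choice to poll every 256 nodes are hypothetical, and the game logic is left out so that the abort handling stays visible.

```cpp
#include <cassert>

bool abortAi = false;  // set once the player presses the button
int nodesVisited = 0;  // used here to decide when to poll the button

// Stand-in for reading the button; real firmware would read a GPIO pin.
bool (*buttonPressed)() = [] { return false; };

// Recursive search over a tree with `branching` moves per node.
// Scores are dummies; the point is how the abort unwinds the recursion.
int search(int depth, int branching) {
    assert(depth >= 0 && branching > 0);
    if (abortAi) return 0;  // every level gives up immediately
    ++nodesVisited;
    // Poll the button only every 256 nodes to keep the overhead small.
    if ((nodesVisited & 0xFF) == 0 && buttonPressed()) {
        abortAi = true;
        return 0;
    }
    if (depth == 0) return 0;
    int best = 0;
    for (int move = 0; move < branching; ++move) {
        const int score = search(depth - 1, branching);
        if (abortAi) return 0;  // discard partial results, stop looping
        if (score > best) best = score;
    }
    return best;
}
```

The caller is expected to clear `abortAi` before starting a new search and to ignore the returned score whenever the flag is set afterwards.
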
## Further Reading


- [Connect Four - Mathematical Solution (Wikipedia)](https://en.wikipedia.org/wiki/Connect_Four#Mathematical_solution)
- [Minimax Algorithm (Wikipedia)](https://en.wikipedia.org/wiki/Minimax)
- [Alpha-Beta Pruning (Wikipedia)](https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning)