[update] Documentation.

2026-03-27 13:41:47 +01:00
parent 1c370f80a6
commit 54bae2faf5
2 changed files with 44 additions and 18 deletions
+21 -8
@@ -55,20 +55,33 @@ The AI compares -3, +2, and -1, and picks column 3 because +2 is the best it can
### Scoring: How the AI Rates a Board
After playing out a "what if?" scenario, the AI needs to decide: is this a good result or a bad one? It uses a very simple scoring system with only three possible outcomes:
After playing out a "what if?" scenario, the AI needs to decide: is this a good result or a bad one? It uses a layered scoring system:
- **+1000 or more: "I win!"** The AI found a way to get four in a row. The bonus points above 1000 depend on how quickly it can win. Winning in 2 moves scores higher than winning in 6 moves. This is why the AI always goes for the fastest victory — it never wastes time when it can finish the game.
- **-1000 or less: "I lose!"** The opponent gets four in a row. Losing sooner gets an even worse score. This makes the AI fight hardest against moves that threaten an immediate loss.
- **0: "I don't know yet."** The AI looked as far ahead as it could (it ran out of plies) and nobody won. It simply calls this position "neutral" — not good, not bad.
- **Heuristic score: "I don't know yet, but I can tell how good this looks."** When the AI has looked as far ahead as it can (it ran out of plies) and nobody has won, it evaluates the position using a heuristic — a quick estimate of who is in a stronger position.
That's it — the AI does not give extra points for having three in a row, controlling the center, or any other clever trick. It relies entirely on looking many moves ahead to figure out which moves lead to wins and which ones don't. If it can't see a win or loss within its search depth, every position looks the same.
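The win and loss scores from the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `WIN_SCORE` and `terminal_score`, and the choice of "plies left in the search" as the speed bonus, are assumptions made for the example.

```python
WIN_SCORE = 1000  # base value for a decided game (from the text above)


def terminal_score(ai_won: bool, plies_left: int) -> int:
    """Score a finished game from the AI's point of view.

    plies_left is how much search depth remained when the four in a row
    was found. More plies left means the game ended sooner, so the
    bonus (hypothetical formula) is larger.
    """
    magnitude = WIN_SCORE + plies_left
    return magnitude if ai_won else -magnitude
```

With this shape, a quick win outranks a slow win, and a quick loss is rated worse than a slow one, which matches the behaviour described in the bullets.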
### The Heuristic: Reading the Board
Instead of calling every unsolved position "neutral," the AI examines every possible group of four consecutive cells on the board (horizontal, vertical, and both diagonals — 69 groups in total). For each group, it counts pieces:
- **3 AI pieces + 1 empty:** This is a strong threat — the AI is one move away from winning here. Score: **+50**.
- **2 AI pieces + 2 empty:** A promising setup that could develop into a threat. Score: **+5**.
- **3 opponent pieces + 1 empty:** A dangerous opponent threat. Score: **-50**.
- **2 opponent pieces + 2 empty:** The opponent is building something. Score: **-5**.
- **Mixed groups** (both players have pieces in the same group): Blocked — nobody can win here. Score: **0**.
On top of that, the AI adds a small bonus (**+3**) for each of its own pieces in the center column and a penalty (**-3**) for each opponent piece there. The center column is involved in more winning lines than any other column, so controlling it is valuable.
All these small scores add up. The maximum possible heuristic score is well below 1000, so it never interferes with actual win/loss detection — a guaranteed win always beats the best heuristic position.
This heuristic means the AI can now tell the difference between a strong position (many threats being built) and a weak one (the opponent has all the threats), even when it can't see a forced win or loss within its search depth.
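The heuristic described in this section could look something like the sketch below. It assumes a standard 6-row, 7-column board stored as a list of rows, with the constants `EMPTY`, `AI`, and `OPP` and the function names chosen only for illustration. The weights (50, 5, 3) are the ones given in the text.

```python
ROWS, COLS = 6, 7
EMPTY, AI, OPP = 0, 1, 2  # hypothetical cell encoding


def all_windows():
    """Yield every group of four consecutive cells as (row, col) tuples."""
    for r in range(ROWS):
        for c in range(COLS):
            # right, up, and the two diagonal directions
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS:
                    yield [(r + i * dr, c + i * dc) for i in range(4)]


WINDOWS = list(all_windows())  # 69 groups on a 6x7 board


def score_window(cells):
    """Score one group of four cell values from the AI's point of view."""
    ai, opp = cells.count(AI), cells.count(OPP)
    if ai and opp:
        return 0      # mixed group: blocked for both players
    if ai == 3:
        return 50     # 3 AI pieces + 1 empty
    if ai == 2:
        return 5      # 2 AI pieces + 2 empty
    if opp == 3:
        return -50    # 3 opponent pieces + 1 empty
    if opp == 2:
        return -5     # 2 opponent pieces + 2 empty
    return 0


def heuristic(board):
    """Sum all window scores, then add the center-column bonus/penalty."""
    total = sum(score_window([board[r][c] for r, c in w]) for w in WINDOWS)
    center = [board[r][COLS // 2] for r in range(ROWS)]
    return total + 3 * center.count(AI) - 3 * center.count(OPP)
```

For example, three AI pieces side by side on the bottom row with the next cell free contribute one +50 group and one +5 group, so the position scores 55 before any center bonus.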
### Why the Center Column Matters
Even though the AI doesn't give bonus points for playing in the center, it always checks the center column first (column 3), then works outward (2, 4, 1, 5, 0, 6).
The center column is involved in more possible winning lines than the edges, so checking it first helps the AI find good moves faster and skip bad ones sooner (thanks to alpha-beta pruning).
The AI always checks the center column first (column 3), then works outward (2, 4, 1, 5, 0, 6). The center column is involved in more possible winning lines than the edges, so checking it first helps the AI find good moves faster and skip bad ones sooner (thanks to alpha-beta pruning). The heuristic also gives a small bonus for center control, reinforcing this natural advantage.
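One simple way to produce this center-first order is to sort the columns by their distance from the middle. The function name `center_first_order` is made up for this sketch; the real code may simply hard-code the list.

```python
def center_first_order(cols: int = 7) -> list[int]:
    """Return column indices ordered from the center outward."""
    center = cols // 2
    # Sort by distance from the center; ties go to the lower column index.
    return sorted(range(cols), key=lambda c: (abs(c - center), c))
```

For a 7-column board this yields 3, 2, 4, 1, 5, 0, 6, the same order given in the text.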
## 4. Alpha-Beta Pruning: The Smart Shortcut
@@ -100,11 +113,11 @@ In practice, pruning lets the AI skip 50-90% of the positions it would otherwise
Before running the expensive minimax search, the AI takes two quick shortcuts:
1. **Can I win right now?** The AI tries placing its disc in each column. If any column completes four in a row, it takes that move immediately. No need to think further.
1. **Can I win right now?** The AI checks **all** columns for a winning move. If any column completes four in a row, it takes that move immediately. No need to think further. Importantly, the AI scans every column for its own win before checking for threats — this ensures it never accidentally blocks an opponent's threat when it could win the game outright.
2. **Can my opponent win next turn?** The AI checks if the opponent could win by playing in any column. If so, it blocks that column. Missing this would be a fatal mistake.
2. **Can my opponent win next turn?** Only after confirming there is no instant win, the AI checks all columns for opponent threats. If the opponent could win by playing in any column, the AI blocks it. Missing this would be a fatal mistake.
3. **Deep search.** Only if there are no immediate wins or threats does the AI run the full minimax search with alpha-beta pruning.
3. **Deep search.** Only if there are no immediate wins or threats does the AI run the full minimax search with alpha-beta pruning and the heuristic evaluation.
This three-phase approach makes the AI both fast (instant reactions to obvious moves) and smart (deep strategic thinking when needed).
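The three phases can be sketched as below. This is an illustrative outline under stated assumptions: the board is a list of rows with row 0 at the bottom, the helper names (`valid_columns`, `drop`, `has_four`, `choose_move`) are hypothetical, and the deep search is passed in as a callback (`deep_search`) rather than implemented here.

```python
ROWS, COLS = 6, 7
EMPTY, AI, OPP = 0, 1, 2  # hypothetical cell encoding


def valid_columns(board):
    """Columns that still have room (top row is the last row)."""
    return [c for c in range(COLS) if board[ROWS - 1][c] == EMPTY]


def drop(board, col, player):
    """Return a copy of the board with player's disc dropped into col."""
    new = [row[:] for row in board]
    for r in range(ROWS):
        if new[r][col] == EMPTY:
            new[r][col] = player
            return new
    raise ValueError("column is full")


def has_four(board, player):
    """True if player has four in a row in any direction."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS
                       and board[rr][cc] == player for rr, cc in cells):
                    return True
    return False


def choose_move(board, deep_search):
    cols = valid_columns(board)
    # Phase 1: scan ALL columns for an instant win before anything else.
    for c in cols:
        if has_four(drop(board, c, AI), AI):
            return c
    # Phase 2: only then look for an opponent win to block.
    for c in cols:
        if has_four(drop(board, c, OPP), OPP):
            return c
    # Phase 3: no immediate win or threat, so run the full search.
    return deep_search(board)
```

Keeping the two scans as separate loops is what guarantees the ordering described above: a winning move in column 5 is found before a block in column 0 is even considered.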