[update] Documentation for heuristic, fork detection, and playable threats.

Updated Background information.md, Achtergrondinformatie.md, and README.md to describe the improved AI: playable vs non-playable threat scoring, fork detection bonus, and the split Phase 1 strategy. The README now lists all three implementations and includes an AI Strategy section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:00:10 +01:00
parent b27032762e
commit 3341c3922a
3 changed files with 49 additions and 15 deletions
@@ -69,17 +69,22 @@ Na het doorspelen van een "wat als?"-scenario, moet de AI beslissen: is dit een

 In plaats van elke onbesliste positie "neutraal" te noemen, bekijkt de AI elke mogelijke groep van vier opeenvolgende cellen op het bord (horizontaal, verticaal en beide diagonalen — 69 groepen in totaal). Voor elke groep telt hij de schijfjes:

- **3 AI-schijfjes + 1 leeg:** Dit is een sterke dreiging — de AI heeft nog maar één zet nodig om hier te winnen. Score: **+50**.
+- **3 AI-schijfjes + 1 leeg (speelbaar):** De lege cel kan nu meteen gevuld worden (hij zit op de onderste rij of er zit een schijfje onder). Dit is een directe dreiging. Score: **+100**.
+- **3 AI-schijfjes + 1 leeg (nog niet speelbaar):** De lege cel zweeft in de lucht — de dreiging bestaat maar kan nog niet benut worden. Score: **+40**.
 - **2 AI-schijfjes + 2 leeg:** Een veelbelovende opbouw die zich tot een dreiging kan ontwikkelen. Score: **+5**.
- **3 tegenstander-schijfjes + 1 leeg:** Een gevaarlijke dreiging van de tegenstander. Score: **-50**.
+- **3 tegenstander-schijfjes + 1 leeg (speelbaar):** Een direct gevaar. Score: **-100**.
+- **3 tegenstander-schijfjes + 1 leeg (nog niet speelbaar):** Een toekomstig gevaar. Score: **-40**.
 - **2 tegenstander-schijfjes + 2 leeg:** De tegenstander bouwt iets op. Score: **-5**.
 - **Gemengde groepen** (beide spelers hebben schijfjes in dezelfde groep): Geblokkeerd — niemand kan hier winnen. Score: **0**.

-Daarbovenop geeft de AI een kleine bonus (**+3** per schijfje) voor het beheersen van de middelste kolom, en een straf (**-3** per tegenstander-schijfje) daar. De middelste kolom is betrokken bij meer winnende lijnen dan elke andere kolom, dus het beheersen ervan is waardevol.
+Daarbovenop gebruikt de AI twee extra scorebonussen:

-Al deze kleine scores tellen bij elkaar op. De maximale heuristiek-score ligt ruim onder 1000, dus het verstoort nooit de echte winst/verlies-detectie — een gegarandeerde winst wint altijd van de beste heuristiek-positie.
+- **Controle over de middelste kolom:** +3 per AI-schijfje in de middelste kolom, -3 per tegenstander-schijfje. De middelste kolom is betrokken bij meer winnende lijnen dan elke andere kolom, dus het beheersen ervan is waardevol.
+- **Vorkdetectie:** Als een speler **twee of meer** drie-op-een-rij dreigingen tegelijk heeft, is dat een vork — de tegenstander kan er maar één per beurt blokkeren, dus de andere wint het spel. De AI geeft een grote bonus (**+200** of **-200**) wanneer hij een vork detecteert, waardoor hij agressief vork-opstellingen najaagt en wanhopig probeert te voorkomen dat de tegenstander er een maakt.

-Deze heuristiek betekent dat de AI nu het verschil kan zien tussen een sterke positie (veel dreigingen in opbouw) en een zwakke (de tegenstander heeft alle dreigingen), zelfs als hij geen gedwongen winst of verlies kan zien binnen zijn zoekdiepte.
+Al deze scores tellen bij elkaar op. De maximale heuristiek-score ligt ruim onder 1000, dus het verstoort nooit de echte winst/verlies-detectie — een gegarandeerde winst wint altijd van de beste heuristiek-positie.
+
+Deze heuristiek betekent dat de AI nu het verschil kan zien tussen een sterke positie (veel dreigingen in opbouw, vooral speelbare) en een zwakke (de tegenstander heeft alle dreigingen), zelfs als hij geen gedwongen winst of verlies kan zien binnen zijn zoekdiepte.

 ### Waarom de middelste kolom belangrijk is

@@ -67,17 +67,22 @@ After playing out a "what if?" scenario, the AI needs to decide: is this a good

 Instead of calling every unsolved position "neutral," the AI examines every possible group of four consecutive cells on the board (horizontal, vertical, and both diagonals — 69 groups in total). For each group, it counts pieces:

- **3 AI pieces + 1 empty:** This is a strong threat — the AI is one move away from winning here. Score: **+50**.
+- **3 AI pieces + 1 empty (playable):** The empty cell can be filled right now (it's on the bottom row or has a piece below it). This is an immediate threat. Score: **+100**.
+- **3 AI pieces + 1 empty (not yet playable):** The empty cell is floating in the air — the threat exists but can't be used yet. Score: **+40**.
 - **2 AI pieces + 2 empty:** A promising setup that could develop into a threat. Score: **+5**.
- **3 opponent pieces + 1 empty:** A dangerous opponent threat. Score: **-50**.
+- **3 opponent pieces + 1 empty (playable):** An immediate danger. Score: **-100**.
+- **3 opponent pieces + 1 empty (not yet playable):** A future danger. Score: **-40**.
 - **2 opponent pieces + 2 empty:** The opponent is building something. Score: **-5**.
 - **Mixed groups** (both players have pieces in the same group): Blocked — nobody can win here. Score: **0**.

-On top of that, the AI gives a small bonus (**+3** per piece) for controlling the center column, and a penalty (**-3** per opponent piece) there. The center column is involved in more winning lines than any other column, so controlling it is valuable.
+On top of that, the AI uses two more scoring bonuses:

-All these small scores add up. The maximum possible heuristic score is well below 1000, so it never interferes with actual win/loss detection — a guaranteed win always beats the best heuristic position.
+- **Center column control:** +3 per AI piece in the center column, -3 per opponent piece. The center column is involved in more winning lines than any other column, so controlling it is valuable.
+- **Fork detection:** If a player has **two or more** three-in-a-row threats at the same time, that's a fork — the opponent can only block one per turn, so the other wins the game. The AI adds a large bonus (**+200** or **-200**) when it detects a fork, making it aggressively pursue fork setups and desperately avoid letting the opponent create one.

-This heuristic means the AI can now tell the difference between a strong position (many threats being built) and a weak one (the opponent has all the threats), even when it can't see a forced win or loss within its search depth.
+All these scores add up. The maximum possible heuristic score is well below 1000, so it never interferes with actual win/loss detection — a guaranteed win always beats the best heuristic position.
+
+This heuristic means the AI can now tell the difference between a strong position (many threats being built, especially playable ones) and a weak one (the opponent has all the threats), even when it can't see a forced win or loss within its search depth.

 ### Why the center column matters

@@ -114,12 +114,36 @@ All configurable parameters are defined as `-D` flags in `platformio.ini`:
 | `WIFI_SSID`            | `Connect4` | SSID for the WiFi access point                     |
 | `WIFI_PASSWORD`        | `youlose4` | Password for the WiFi access point                 |

+## AI Strategy
+
+The AI uses **minimax with alpha-beta pruning** and a **heuristic evaluation function**. Moves are selected in three phases:
+
+1. **Instant win/block** — scan all columns for an immediate win first, then for an opponent threat to block.
+2. **Blunder** (optional) — random move at a configurable chance, skipping the deep search.
+3. **Deep minimax search** — full tree search with alpha-beta pruning up to the configured ply depth.
+
+The heuristic evaluates leaf nodes by scoring all 69 possible four-cell windows on the board:
+
+- **Playable threats** (3-in-a-row where the gap can be filled now): ±100
+- **Non-playable threats** (gap is floating in the air): ±40
+- **Two-in-a-row setups**: ±5
+- **Center column control**: ±3 per piece
+- **Fork bonus** (2+ simultaneous three-in-a-row threats): ±200
+
+See `Background information.md` / `Achtergrondinformatie.md` for a detailed explanation accessible to all ages.
+
 ## Project Structure

 ```
-src/main.cpp        Single-file application (all game logic, AI, LED, web server)
-platformio.ini      Build configuration, pin mappings, and tunable parameters
-README.md           This file - technical and practical information
-Background information.md   How the AI works (suitable for all ages)
-CLAUDE.md           AI assistant project context
+src/main.cpp                 ESP32 application (game logic, AI, LED, web server)
+connect_four.js              JavaScript browser edition (canvas rendering)
+connect_four.html            HTML wrapper for the JavaScript version
+connect_four.py              Python terminal edition (Rich TUI)
+platformio.ini               Build configuration, pin mappings, and tunable parameters
+README.md                    This file - technical and practical information
+Background information.md    How the AI works (English, suitable for all ages)
+Achtergrondinformatie.md     How the AI works (Dutch, suitable for all ages)
+CLAUDE.md                    AI assistant project context
 ```
+
+All three implementations (C++, JavaScript, Python) share the same AI algorithm and heuristic.
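
Review note: the scoring rules documented in this commit can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in `connect_four.py`: the names `all_windows`, `is_playable`, `evaluate`, and `empty_board` are hypothetical, the board is assumed to be a 6x7 list of lists with row 0 at the bottom, and a fork is assumed to mean two or more distinct gap cells among a player's three-in-a-row windows (the docs do not say whether shared gaps or non-playable threats count). Won positions are assumed to be caught by the search before the heuristic runs.

```python
ROWS, COLS = 6, 7
EMPTY, AI, OPP = 0, 1, 2
CENTER = COLS // 2  # middle column (index 3 on a 7-wide board)


def all_windows():
    """Return every group of four consecutive cells as a list of (row, col)."""
    windows = []
    for r in range(ROWS):
        for c in range(COLS):
            # Directions: horizontal, vertical, and the two diagonals.
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS:
                    windows.append([(r + i * dr, c + i * dc) for i in range(4)])
    return windows


WINDOWS = all_windows()


def is_playable(board, r, c):
    """A gap is playable if it sits on the bottom row or on top of a piece."""
    return r == 0 or board[r - 1][c] != EMPTY


def evaluate(board):
    """Heuristic score from the AI's point of view (assumed layout, see above)."""
    score = 0
    gaps = {AI: set(), OPP: set()}  # distinct gap cells of three-in-a-row windows
    for window in WINDOWS:
        cells = [board[r][c] for r, c in window]
        ai, opp = cells.count(AI), cells.count(OPP)
        if ai and opp:
            continue  # mixed group: blocked, scores 0
        player, sign, count = (AI, 1, ai) if ai else (OPP, -1, opp)
        if count == 3:
            r, c = window[cells.index(EMPTY)]
            score += sign * (100 if is_playable(board, r, c) else 40)
            gaps[player].add((r, c))
        elif count == 2:
            score += sign * 5
    # Center column control: +3 per AI piece, -3 per opponent piece.
    for r in range(ROWS):
        if board[r][CENTER] == AI:
            score += 3
        elif board[r][CENTER] == OPP:
            score -= 3
    # Fork bonus (assumption: two or more distinct gap cells).
    if len(gaps[AI]) >= 2:
        score += 200
    if len(gaps[OPP]) >= 2:
        score -= 200
    return score


def empty_board():
    return [[EMPTY] * COLS for _ in range(ROWS)]
```

With this sketch, `len(WINDOWS)` is 69, matching the count in the docs. Three AI pieces in the bottom row at columns 0 to 2 score 105 (one playable threat plus one two-in-a-row window), and three at columns 1 to 3 score 408 (two playable threats, one two-in-a-row window, one center piece, and the fork bonus).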