[update] Documentation.

2026-03-27 13:41:47 +01:00
parent 1c370f80a6
commit 54bae2faf5
2 changed files with 44 additions and 18 deletions
@@ -57,22 +57,33 @@ De AI kiest dan de kolom met de hoogste score die overblijft, in dit geval kolom

 ### Scoren: Hoe de AI een Bord Waardeert

-Na het doorspelen van een "wat als?"-scenario, moet de AI beslissen: is dit een goed of slecht resultaat? Hij gebruikt een eenvoudig scoringsysteem met drie mogelijke uitkomsten:
+Na het doorspelen van een "wat als?"-scenario, moet de AI beslissen: is dit een goed of slecht resultaat? Hij gebruikt een gelaagd scoringsysteem:

 - **+1000 of meer: "Ik win!"** De AI heeft een manier gevonden om vier op een rij te krijgen. Hoe sneller hij kan winnen, hoe hoger de score. Winnen in 2 zetten scoort hoger dan winnen in 6 zetten. Daarom gaat de AI altijd voor de snelste overwinning.

 - **-1000 of minder: "Ik verlies!"** De tegenstander krijgt vier op een rij. Hoe sneller hij verliest, hoe slechter de score. Dit zorgt ervoor dat de AI het hardst vecht tegen zetten die dreigen tot een direct verlies.

- **0: "Ik weet het nog niet."** De AI heeft zo ver vooruit gekeken als hij kon (hij is door zijn plies heen) en niemand heeft gewonnen. Hij noemt deze positie "neutraal" — niet goed, niet slecht.
+- **Heuristiek-score: "Ik weet het nog niet, maar ik kan zien hoe goed het eruitziet."** Als de AI zo ver vooruit heeft gekeken als hij kon (door zijn plies heen) en niemand heeft gewonnen, beoordeelt hij de positie met een heuristiek — een snelle schatting van wie er sterker voor staat.

-Dat is alles — de AI geeft geen extra punten voor drie op een rij, het controleren van het midden, of andere slimme trucs. Hij vertrouwt volledig op het ver vooruit kijken om te bepalen welke zetten tot een overwinning leiden en welke niet. Als hij binnen zijn zoekdiepte geen winst of verlies ziet, ziet elke positie er hetzelfde uit.
+### De Heuristiek: Het Bord Lezen
+
+In plaats van elke onbesliste positie "neutraal" te noemen, bekijkt de AI elke mogelijke groep van vier opeenvolgende cellen op het bord (horizontaal, verticaal en beide diagonalen — 69 groepen in totaal). Voor elke groep telt hij de schijfjes:
+
+- **3 AI-schijfjes + 1 leeg:** Dit is een sterke dreiging — de AI heeft nog maar één zet nodig om hier te winnen. Score: **+50**.
+- **2 AI-schijfjes + 2 leeg:** Een veelbelovende opbouw die zich tot een dreiging kan ontwikkelen. Score: **+5**.
+- **3 tegenstander-schijfjes + 1 leeg:** Een gevaarlijke dreiging van de tegenstander. Score: **-50**.
+- **2 tegenstander-schijfjes + 2 leeg:** De tegenstander bouwt iets op. Score: **-5**.
+- **Gemengde groepen** (beide spelers hebben schijfjes in dezelfde groep): Geblokkeerd — niemand kan hier winnen. Score: **0**.
+
+Daarbovenop geeft de AI een kleine bonus (**+3** per schijfje) voor het beheersen van de middelste kolom, en een straf (**-3** per tegenstander-schijfje) daar. De middelste kolom is betrokken bij meer winnende lijnen dan elke andere kolom, dus het beheersen ervan is waardevol.
+
+Al deze kleine scores tellen bij elkaar op. De maximale heuristiek-score ligt ruim onder 1000, dus het verstoort nooit de echte winst/verlies-detectie — een gegarandeerde winst wint altijd van de beste heuristiek-positie.
+
+Deze heuristiek betekent dat de AI nu het verschil kan zien tussen een sterke positie (veel dreigingen in opbouw) en een zwakke (de tegenstander heeft alle dreigingen), zelfs als hij geen gedwongen winst of verlies kan zien binnen zijn zoekdiepte.

 ### Waarom de middelste kolom belangrijk is

-Ook al geeft de AI geen bonuspunten voor spelen in het midden, 
-hij controleert altijd eerst de middelste kolom (kolom 3), en werkt dan naar buiten toe (2, 4, 1, 5, 0, 6). 
-De middelste kolom is betrokken bij meer mogelijke winnende lijnen dan de randen, dus door deze eerst te controleren, 
-vindt de AI sneller goede zetten en kan hij slechte zetten eerder overslaan (dankzij alpha-beta snoeien).
+De AI controleert altijd eerst de middelste kolom (kolom 3), en werkt dan naar buiten toe (2, 4, 1, 5, 0, 6). De middelste kolom is betrokken bij meer mogelijke winnende lijnen dan de randen, dus door deze eerst te controleren, vindt de AI sneller goede zetten en kan hij slechte zetten eerder overslaan (dankzij alpha-beta snoeien). De heuristiek geeft ook een kleine bonus voor controle over het midden, wat dit natuurlijke voordeel versterkt.

 ---

@@ -114,9 +125,11 @@ Goede zetten zitten vaak in het midden, dus door deze eerst te controleren, leid

 De AI doet zijn werk in drie stappen:

-1. **Kan ik nu winnen?** De AI probeert in elke kolom een schijfje te leggen. Als hij ergens vier op een rij kan maken, doet hij dat meteen. Geen verdere berekeningen nodig.
-2. **Kan de tegenstander volgende beurt winnen?** De AI controleert of jij ergens vier op een rij kunt maken. Zo ja, dan blokkeert hij die kolom. Dit overslaan zou een grote fout zijn.
-3. **Diepe zoektocht.** Als er geen directe winst of bedreiging is, voert de AI de volledige minimax-strategie uit met alpha-beta snoeien.
+1. **Kan ik nu winnen?** De AI controleert **alle** kolommen op een winnende zet. Als hij ergens vier op een rij kan maken, doet hij dat meteen. Geen verdere berekeningen nodig. Belangrijk: de AI controleert eerst elke kolom op eigen winst voordat hij naar dreigingen kijkt — zo blokkeert hij nooit per ongeluk een dreiging van de tegenstander als hij zelf het spel kan winnen.
+
+2. **Kan de tegenstander volgende beurt winnen?** Pas nadat is bevestigd dat er geen directe winst is, controleert de AI alle kolommen op dreigingen van de tegenstander. Als jij ergens vier op een rij kunt maken, blokkeert hij die kolom. Dit overslaan zou een grote fout zijn.
+
+3. **Diepe zoektocht.** Als er geen directe winst of bedreiging is, voert de AI de volledige minimax-strategie uit met alpha-beta snoeien en de heuristiek-evaluatie.

 Deze drie stappen maken de AI zowel snel (directe reacties op duidelijke zetten) als slim (diep nadenken als het nodig is).

@@ -55,20 +55,33 @@ The AI compares -3, +2, and -1, and picks column 3 because +2 is the best it can

 ### Scoring: How the AI Rates a Board

-After playing out a "what if?" scenario, the AI needs to decide: is this a good result or a bad one? It uses a very simple scoring system with only three possible outcomes:
+After playing out a "what if?" scenario, the AI needs to decide: is this a good result or a bad one? It uses a layered scoring system:

 - **+1000 or more: "I win!"** The AI found a way to get four in a row. The bonus points above 1000 depend on how quickly it can win. Winning in 2 moves scores higher than winning in 6 moves. This is why the AI always goes for the fastest victory — it never wastes time when it can finish the game.

 - **-1000 or less: "I lose!"** The opponent gets four in a row. Losing sooner gets an even worse score. This makes the AI fight hardest against moves that threaten an immediate loss.

- **0: "I don't know yet."** The AI looked as far ahead as it could (it ran out of plies) and nobody won. It simply calls this position "neutral" — not good, not bad.
+- **Heuristic score: "I don't know yet, but I can tell how good this looks."** When the AI has looked as far ahead as it can (it ran out of plies) and nobody has won, it evaluates the position using a heuristic — a quick estimate of who is in a stronger position.

-That's it — the AI does not give extra points for having three in a row, controlling the center, or any other clever trick. It relies entirely on looking many moves ahead to figure out which moves lead to wins and which ones don't. If it can't see a win or loss within its search depth, every position looks the same.
+### The Heuristic: Reading the Board
+
+Instead of calling every unsolved position "neutral," the AI examines every possible group of four consecutive cells on the board (horizontal, vertical, and both diagonals — 69 groups in total). For each group, it counts pieces:
+
+- **3 AI pieces + 1 empty:** This is a strong threat — the AI is one move away from winning here. Score: **+50**.
+- **2 AI pieces + 2 empty:** A promising setup that could develop into a threat. Score: **+5**.
+- **3 opponent pieces + 1 empty:** A dangerous opponent threat. Score: **-50**.
+- **2 opponent pieces + 2 empty:** The opponent is building something. Score: **-5**.
+- **Mixed groups** (both players have pieces in the same group): Blocked — nobody can win here. Score: **0**.
+
+On top of that, the AI gives a small bonus (**+3** per piece) for controlling the center column, and a penalty (**-3** per opponent piece) there. The center column is involved in more winning lines than any other column, so controlling it is valuable.
+
+All these small scores add up. The maximum possible heuristic score is well below 1000, so it never interferes with actual win/loss detection — a guaranteed win always beats the best heuristic position.
+
+This heuristic means the AI can now tell the difference between a strong position (many threats being built) and a weak one (the opponent has all the threats), even when it can't see a forced win or loss within its search depth.

 ### Why the center column matters

-Even though the AI doesn't give bonus points for playing in the center, it always checks the center column first (column 3), then works outward (2, 4, 1, 5, 0, 6). 
-The center column is involved in more possible winning lines than the edges, so checking it first helps the AI find good moves faster and skip bad ones sooner (thanks to alpha-beta pruning).
+The AI always checks the center column first (column 3), then works outward (2, 4, 1, 5, 0, 6). The center column is involved in more possible winning lines than the edges, so checking it first helps the AI find good moves faster and skip bad ones sooner (thanks to alpha-beta pruning). The heuristic also gives a small bonus for center control, reinforcing this natural advantage.

 ## 4. Alpha-Beta Pruning: The Smart Shortcut

@@ -100,11 +113,11 @@ In practice, pruning lets the AI skip 50-90% of the positions it would otherwise

 Before running the expensive minimax search, the AI takes two quick shortcuts:

-1. **Can I win right now?** The AI tries placing its disc in each column. If any column completes four in a row, it takes that move immediately. No need to think further.
+1. **Can I win right now?** The AI checks **all** columns for a winning move. If any column completes four in a row, it takes that move immediately. No need to think further. Importantly, the AI scans every column for its own win before checking for threats — this ensures it never accidentally blocks an opponent's threat when it could win the game outright.

-2. **Can my opponent win next turn?** The AI checks if the opponent could win by playing in any column. If so, it blocks that column. Missing this would be a fatal mistake.
+2. **Can my opponent win next turn?** Only after confirming there is no instant win, the AI checks all columns for opponent threats. If the opponent could win by playing in any column, the AI blocks it. Missing this would be a fatal mistake.

-3. **Deep search.** Only if there are no immediate wins or threats does the AI run the full minimax search with alpha-beta pruning.
+3. **Deep search.** Only if there are no immediate wins or threats does the AI run the full minimax search with alpha-beta pruning and the heuristic evaluation.

 This three-phase approach makes the AI both fast (instant reactions to obvious moves) and smart (deep strategic thinking when needed).