[update] Documentation

This commit is contained in:
2026-03-09 10:43:46 +01:00
parent 0994c11f0b
commit 0aa1802cce
2 changed files with 72 additions and 111 deletions
+46 -49
@@ -1,72 +1,69 @@
# 📑 Technical Specification: Connect 4 AI Logic
# 🕹️ Connect 4 AI: How the Brain Works
## 1. Board Representation: The 2D Grid Model
The game state is maintained in a 2D array of signed integers: `int8_t board[COLS][ROWS]`. This structure mirrors the physical dimensions of a standard Connect 4 rack.
### Data Structure
- **Dimensions:** 7 columns (X) by 6 rows (Y).
- **Mapping:**
  - `0`: Null/Empty slot.
  - `1`: Player 1 (Yellow / Human).
  - `2`: Player 2 (Red / AI).
- **Hardware Translation:** To drive the 8x8 NeoPixel matrix, the 2D coordinates are flattened into a 1D index using the specific mapping:
$$Index = (y \times 8) + x$$
_Note: The 8th column ($x=7$) is ignored by the game logic and reserved for UI borders._
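The flattening formula above can be sketched as a small helper. This is a minimal illustration that assumes plain row-major wiring, as the formula implies; the names `ledIndex` and `MATRIX_WIDTH` are hypothetical and do not come from the project source.

```cpp
#include <cassert>

// Physical LEDs per matrix row, taken from the "y * 8" term of the formula.
constexpr int MATRIX_WIDTH = 8;

// Flatten matrix coordinates (x = column, y = row) into a 1D LED index.
// Hypothetical helper: Index = (y * 8) + x.
inline int ledIndex(int x, int y) {
    assert(x >= 0 && x < MATRIX_WIDTH);
    assert(y >= 0 && y < MATRIX_WIDTH);
    return y * MATRIX_WIDTH + x;
}
```

For example, `ledIndex(3, 2)` evaluates to `2 * 8 + 3 = 19`.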
To create a competitive Connect 4 experience on a small microcontroller, the game uses a mix of mathematical strategy and "shortcuts" to play like a human master.
---
## 2. Positional Evaluation
## 1. The Virtual Board
Since Connect 4 has a state-space complexity of approximately $4.5 \times 10^{12}$, the ESP32 cannot calculate every possible outcome to the end of the game from the first move. Instead, it uses a **Heuristic Evaluation Function** to score board positions.
The computer sees the board as a grid of numbers. It uses a **7-column by 6-row** map where:
### Scoring Heuristics
- **0** = Empty space
- **1** = Yellow (Human Player)
- **2** = Red (Computer AI)
1. **Terminal Victory:** Any move that results in a 4-in-a-row is valued at $+1000$ (for AI) or $-1000$ (for Human).
2. **Temporal Weighting:** To ensure the AI chooses the _fastest_ path to victory and the _longest_ path to defeat, the score is adjusted by the search depth:
- **AI Win:** $1000 + depth$
- **Human Win:** $-1000 - depth$
3. **Column Geometry:** The AI inherently values central columns higher than edges. This is not explicitly hardcoded in the score but emerges from the search logic: a disc in column 3 can be part of horizontal, vertical, and diagonal win lines in both directions, making it mathematically more valuable.
Every time a disc is dropped, a "Scan" function checks every possible direction (horizontal, vertical, and diagonal) to see if anyone has reached four in a row.
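A scan of this kind could look roughly like the sketch below. It is an assumption-laden illustration, not the project's actual scan routine: `findWinner` and the `Board` alias are hypothetical names, and a `std::array` stands in for the raw `int8_t board[COLS][ROWS]` array so the example stays self-contained.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int COLS = 7;
constexpr int ROWS = 6;
// Same [column][row] layout as the documented board, wrapped in std::array.
using Board = std::array<std::array<int8_t, ROWS>, COLS>;

// Returns 0 if nobody has four in a row, otherwise the winner's ID (1 or 2).
int8_t findWinner(const Board& board) {
    // Four directions cover every line: right, up, diagonal up, diagonal down.
    static const int DX[4] = {1, 0, 1, 1};
    static const int DY[4] = {0, 1, 1, -1};
    for (int x = 0; x < COLS; ++x) {
        for (int y = 0; y < ROWS; ++y) {
            const int8_t player = board[x][y];
            if (player == 0) continue;  // empty slot cannot start a line
            assert(player == 1 || player == 2);
            for (int d = 0; d < 4; ++d) {
                int run = 1;  // discs matched so far, including (x, y)
                while (run < 4) {
                    const int nx = x + DX[d] * run;
                    const int ny = y + DY[d] * run;
                    if (nx < 0 || nx >= COLS || ny < 0 || ny >= ROWS) break;
                    if (board[nx][ny] != player) break;
                    ++run;
                }
                if (run == 4) return player;
            }
        }
    }
    return 0;
}
```

Scanning the whole grid is simple but does more work than necessary; checking only the lines through the most recently dropped disc would be a cheaper alternative.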
---
## 3. Determining the Value of a "Half-Move"
## 2. Thinking Ahead (The "What If?" Engine)
A "half-move" is a single disc placement by one player. Its value is determined via the **Minimax Algorithm** with **Alpha-Beta Pruning**.
The AI doesn't just look at the current board; it plays out thousands of "What if?" scenarios in its head.
### The Recursive Search Process
### The Minimax Strategy
The AI simulates a move (a "branch") and then recursively simulates the opponent's best possible responses. The value of a move is the score the AI can still guarantee from that branch after the opponent has answered with their strongest reply.
The AI uses a strategy called **Minimax**. It assumes that you will play your absolute best move, and it tries to find the move that leaves you with the worst possible outcome. It "maximizes" its own advantage while "minimizing" yours.
### Optimization: Alpha-Beta Pruning
### Alpha-Beta Pruning (The Shortcut)
To prevent the ESP32-C3 from timing out, the engine "prunes" branches that are mathematically guaranteed to be worse than previously explored paths.
- **Alpha ($\alpha$):** The best score the AI (Maximizer) can guarantee.
- **Beta ($\beta$):** The best score the Human (Minimizer) can guarantee.
- **The Cut-off:** If at any point $\beta \leq \alpha$, the branch is abandoned.
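The Alpha, Beta, and cut-off rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's engine: `depth` is taken to mean the plies still left to search (so that `1000 + depth` rewards faster wins), positions at the search horizon score a neutral `0`, and all names (`minimax`, `freeRow`, `completesFour`, `Board`) are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

constexpr int COLS = 7;
constexpr int ROWS = 6;
constexpr int WIN_SCORE = 1000;
using Board = std::array<std::array<int8_t, ROWS>, COLS>;

// Lowest empty row in a column, or -1 if the column is full.
int freeRow(const Board& b, int col) {
    for (int y = 0; y < ROWS; ++y)
        if (b[col][y] == 0) return y;
    return -1;
}

// True if the disc at (x, y) is part of a four-in-a-row.
bool completesFour(const Board& b, int x, int y) {
    static const int DX[4] = {1, 0, 1, 1};
    static const int DY[4] = {0, 1, 1, -1};
    const int8_t p = b[x][y];
    for (int d = 0; d < 4; ++d) {
        int count = 1;
        for (int s = -1; s <= 1; s += 2) {  // walk both ways along the line
            int nx = x + s * DX[d], ny = y + s * DY[d];
            while (nx >= 0 && nx < COLS && ny >= 0 && ny < ROWS && b[nx][ny] == p) {
                ++count;
                nx += s * DX[d];
                ny += s * DY[d];
            }
        }
        if (count >= 4) return true;
    }
    return false;
}

// Score from the AI's (player 2) point of view; depth = plies left to search.
int minimax(Board& b, int depth, int alpha, int beta, bool aiToMove) {
    assert(depth >= 0);
    if (depth == 0) return 0;  // search horizon: assumed neutral score
    const int8_t player = aiToMove ? 2 : 1;
    int best = aiToMove ? -(WIN_SCORE + 100) : (WIN_SCORE + 100);
    bool moved = false;
    for (int col = 0; col < COLS; ++col) {
        const int row = freeRow(b, col);
        if (row < 0) continue;
        moved = true;
        b[col][row] = player;  // simulate the half-move
        int score;
        if (completesFour(b, col, row))
            score = aiToMove ? WIN_SCORE + depth : -WIN_SCORE - depth;
        else
            score = minimax(b, depth - 1, alpha, beta, !aiToMove);
        b[col][row] = 0;  // undo the half-move
        if (aiToMove) { best = std::max(best, score); alpha = std::max(alpha, best); }
        else          { best = std::min(best, score); beta = std::min(beta, best); }
        if (beta <= alpha) break;  // the cut-off: this branch cannot matter
    }
    return moved ? best : 0;  // no legal move left: draw
}
```

A root call would pass the widest possible window, for example `minimax(board, ply, -(WIN_SCORE + 100), WIN_SCORE + 100, true)`.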
### 4. Dynamic Move Ordering
The efficiency of the value determination is heavily reliant on **Move Ordering**. By evaluating the most promising columns first (starting from the center), the AI finds a high "Alpha" value quickly.
- **Search Order:** `3 -> 2 -> 4 -> 1 -> 5 -> 0 -> 6`
This ordering allows the Alpha-Beta pruning to discard up to 90% of the possible moves in the outer columns without calculating them, significantly reducing the "Thinking" time on the microcontroller.
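The search order listed above does not need to be hardcoded; it can be generated from the board width. The sketch below is illustrative only, and `centerFirstOrder` is a hypothetical name.

```cpp
#include <array>
#include <cassert>

constexpr int COLS = 7;

// Builds the center-out column order: 3, 2, 4, 1, 5, 0, 6.
std::array<int, COLS> centerFirstOrder() {
    static_assert(COLS % 2 == 1, "sketch assumes an odd number of columns");
    std::array<int, COLS> order{};
    const int center = COLS / 2;
    int i = 0;
    order[i++] = center;
    for (int offset = 1; offset <= center; ++offset) {
        order[i++] = center - offset;  // left neighbour first, as in the list
        order[i++] = center + offset;  // then its mirror on the right
    }
    assert(i == COLS);  // every column was emitted exactly once
    return order;
}
```

A search loop would then iterate over this array instead of counting columns from `0` to `6`.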
Calculating every possible move in Connect 4 would take hours. To make the AI fast, it uses **Pruning**. If the AI starts calculating a move and realizes it's definitely worse than a move it already found, it "prunes" (deletes) that entire branch of thought and moves on. This allows it to ignore up to 90% of useless moves.
---
## 5. Summary of Logic Execution
## 3. Scoring System (The AI's Instincts)
1. **Generate** all valid moves for the current board state.
2. **Order** moves starting from the center column.
3. **Execute** Minimax recursion for each move up to the current **Ply**.
4. **Prune** branches that cannot mathematically improve the current best option.
5. **Return** the move with the highest heuristic value.
Since the computer can't always see to the very end of the game, it uses a scoring system to guess which positions are strongest:
- **Speed Matters:** The AI is rewarded more for a win that happens soon than a win that takes a long time. This gives it a "killer instinct" to end the game as quickly as possible.
- **The Center is King:** The AI is programmed to prefer the middle column. Mathematically, the center column is involved in the most possible winning combinations, so the AI fights to control it early in the game.
---
## References
## 4. Being Responsive (Interrupt Handling)
- [Information on how to analyze Connect-four](https://www.google.com/search?q=https://en.wikipedia.org/wiki/Connect_Four%23Mathematical_solution)
- [How does minimax work](https://en.wikipedia.org/wiki/Minimax)
- [What is alpha-beta pruning](https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning)
Your game runs on an **ESP32-C3**, which has a single processor core. Normally, if the AI spends 2 seconds thinking, the buttons would feel "broken" or "frozen" until it finishes.
We solve this with two clever tricks:
1. **Mid-Thought Checks:** Every few milliseconds of calculation, the AI "pauses" for a microsecond to see if you have pressed the button.
2. **Instant Exit:** If it detects you pressed the button while it was thinking, it abandons all calculations immediately and jumps back to the main menu.
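The two tricks above can be sketched with a shared flag that the recursion polls. This is a simplified illustration, not the project's code: the flag is checked at every node rather than every few milliseconds, the "search" merely walks a uniform 7-way tree, and the names `abortRequested`, `ABORTED`, and `searchNodes` are hypothetical. On the real hardware the flag would presumably be set from the button or encoder handler.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical flag; an input handler would set it to true.
std::atomic<bool> abortRequested{false};

// Sentinel return value meaning "search was cancelled".
constexpr int ABORTED = -32768;

// Placeholder search: visits a 7-way tree and bails out once the flag is set.
int searchNodes(int depth, long& visited) {
    assert(depth >= 0);
    // Mid-thought check: look at the flag before doing any work.
    if (abortRequested.load(std::memory_order_relaxed)) return ABORTED;
    ++visited;
    if (depth == 0) return 0;
    for (int col = 0; col < 7; ++col) {
        // Instant exit: propagate the cancellation up through every level.
        if (searchNodes(depth - 1, visited) == ABORTED) return ABORTED;
    }
    return 0;
}
```

Propagating a sentinel keeps the unwinding explicit; the caller sees `ABORTED` and can switch back to the menu state without using the partial result.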
---
## 5. Summary of an AI Move
When it is the computer's turn, it follows these steps in a split second:
1. **Check for Lethal:** Can I win right now? If yes, take it.
2. **Check for Danger:** Can the human win on their next move? If yes, block it.
3. **Search:** Look at the middle columns first, then the edges.
4. **Prune:** Throw away bad moves immediately to save time.
5. **Act:** Choose the move that leads to the quickest victory.
---
## 📚 References & Further Reading
- [The Mathematical Solution to Connect 4](https://en.wikipedia.org/wiki/Connect_Four#Mathematical_solution)
- [How the Minimax Algorithm Works](https://en.wikipedia.org/wiki/Minimax)
- [Understanding Alpha-Beta Pruning](https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning)
+26 -62
@@ -1,86 +1,50 @@
# 🕹️ Connect 4 AI: Master Edition (v2.0)
# 🕹️ Connect 4 AI: Grandmaster Edition (v2.5)
A high-performance, feature-rich Connect 4 implementation for the ESP32-C3. This version features a "living" AI that evolves as you play, human-like movement animations, and a robust win-detection engine.
---
A high-performance Connect 4 implementation for the ESP32-C3 (RISC-V). This version features a "Killer Instinct" AI, human-like animations, and a real-time interrupt system.
## 🛠 Hardware Configuration
### 🔌 Pin Mapping (Lolin C3 Mini)
| Component | ESP32-C3 Pin | Function |
| :------------------- | :----------- | :--------------- |
| **NeoPixel Matrix** | `GPIO 4` | Data Input (DIN) |
| **Rotary Encoder A** | `GPIO 0` | Directional CLK |
| **Rotary Encoder B** | `GPIO 1` | Directional DT |
| **Encoder Button** | `GPIO 2` | Selection (SW) |
| Component | ESP32-C3 Pin | Function |
| :------------------- | :----------- | :------------------- |
| **NeoPixel Matrix** | `GPIO 4` | Data Input (DIN) |
| **Rotary Encoder A** | `GPIO 0` | Directional CLK |
| **Rotary Encoder B** | `GPIO 1` | Directional DT |
| **Encoder Button** | `GPIO 2` | Selection/Abort (SW) |
### 📐 Physical Layout
The project is optimized for an 8x8 NeoPixel Matrix (65mm x 67mm).
- **Row 0:** Interaction & AI Decision Visualization.
- **Row 1:** Static Blue UI border.
- **Rows 2-7:** Active $7 \times 6$ game board.
- **Status Column:** Far right column (Index 7) manages UI framing and "Glow" effects.
- **Game Board:** 7 Columns x 6 Rows.
- **Top Row (Row 0):** Interaction row (Selection & AI Thinking pulse).
- **UI Border:** Row 1 and Column 7 (Blue frame, toggleable via `SHOW_BORDER`).
- **Coordinate Formula:** $Index = (y \times 8) + x$
---
## 🧠 Advanced AI & Logic Features
## 🧠 Advanced AI Features
### 1. Progressive Difficulty (Evolution Mode)
### 1. Offense-Priority Strategy
To keep the game challenging and the CPU efficient, the AI search depth (Ply) scales as the board fills.
The AI follows a strict 3-phase move evaluation:
- **Formula:** $DynamicPly = BasePly + \lfloor \frac{DiscsOnBoard}{7} \rfloor$
- **Benefit:** The AI is "casual" in the opening but becomes a "Grandmaster" in the endgame when tactical precision is vital.
1. **Lethal:** If the AI can connect four this turn, it takes the win immediately.
2. **Defensive:** If the human player has a lethal move, the AI blocks it.
3. **Strategic:** If no immediate wins exist, it runs a deep Minimax search.
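The three phases can be sketched as below. This is an illustration under assumptions: the strategic phase is passed in as a callback so the example stays short, and every name (`chooseMove`, `winningColumn`, `freeRow`, `completesFour`, `Board`) is hypothetical rather than taken from the project.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int COLS = 7;
constexpr int ROWS = 6;
using Board = std::array<std::array<int8_t, ROWS>, COLS>;

// Lowest empty row in a column, or -1 if the column is full.
int freeRow(const Board& b, int col) {
    for (int y = 0; y < ROWS; ++y)
        if (b[col][y] == 0) return y;
    return -1;
}

// True if the disc at (x, y) is part of a four-in-a-row.
bool completesFour(const Board& b, int x, int y) {
    static const int DX[4] = {1, 0, 1, 1};
    static const int DY[4] = {0, 1, 1, -1};
    const int8_t p = b[x][y];
    for (int d = 0; d < 4; ++d) {
        int count = 1;
        for (int s = -1; s <= 1; s += 2) {  // walk both ways along the line
            int nx = x + s * DX[d], ny = y + s * DY[d];
            while (nx >= 0 && nx < COLS && ny >= 0 && ny < ROWS && b[nx][ny] == p) {
                ++count;
                nx += s * DX[d];
                ny += s * DY[d];
            }
        }
        if (count >= 4) return true;
    }
    return false;
}

// Column in which `player` wins immediately, or -1 if there is none.
int winningColumn(Board& b, int8_t player) {
    for (int col = 0; col < COLS; ++col) {
        const int row = freeRow(b, col);
        if (row < 0) continue;
        b[col][row] = player;                       // try the drop
        const bool wins = completesFour(b, col, row);
        b[col][row] = 0;                            // and take it back
        if (wins) return col;
    }
    return -1;
}

// AI is player 2, human is player 1 (as in the board mapping).
int chooseMove(Board& b, int (*strategicSearch)(Board&)) {
    assert(strategicSearch != nullptr);
    int col = winningColumn(b, 2);  // Phase 1: Lethal
    if (col >= 0) return col;
    col = winningColumn(b, 1);      // Phase 2: Defensive
    if (col >= 0) return col;
    return strategicSearch(b);      // Phase 3: Strategic (e.g. Minimax)
}
```

Because phase 1 runs before phase 2, the AI takes its own win even when the human also threatens one.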
### 2. Intelligent Win Detection & Flashing
### 2. High-Priority Interrupts
The win-engine has been refactored to prevent "color ghosting."
The single-core RISC-V processor is kept responsive via an "Abort Flag." Pressing the button or turning the encoder during an AI calculation (Demo or Playing) immediately kills the recursion and returns the user to the Menu.
- **Winner Locking:** The `scanBoard()` function returns the specific ID of the winner (1 for Yellow, 2 for Red).
- **Flashing Accuracy:** The final animation uses this ID to ensure the winning 4-in-a-row flashes in the **correct player's color**, regardless of whose turn it was when the game ended.
### 3. Evolution Mode
### 3. Smart Watchdog (Tiered Timeout)
The game respects your "thinking time" by using a tiered idle-timeout system:
- **Menu/Finished State:** Standard timeout (e.g., 60s).
- **Playing State:** **Double Timeout** (e.g., 120s). This gives human players more time to analyze complex boards before the game auto-resets to Demo Mode.
### 4. Strategic Blunder Injection
To ensure Demo Mode doesn't end in an infinite loop of draws, a 20% "Blunder Chance" is injected. This forces the AI to occasionally make a human-like mistake, creating openings for a definitive winner.
AI difficulty scales dynamically: $DynamicPly = BasePly + \lfloor \frac{DiscsOnBoard}{7} \rfloor$.
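As a worked example of the formula, assume a `BasePly` of 4 (an illustrative value, not a documented default). With 21 discs on the board, the search depth becomes $4 + \lfloor 21/7 \rfloor = 7$. A minimal sketch, with `dynamicPly` as a hypothetical function name:

```cpp
#include <cassert>

// DynamicPly = BasePly + floor(DiscsOnBoard / 7)
int dynamicPly(int basePly, int discsOnBoard) {
    // A 7 x 6 board holds at most 42 discs.
    assert(discsOnBoard >= 0 && discsOnBoard <= 42);
    // Integer division already floors for non-negative operands.
    return basePly + discsOnBoard / 7;
}
```

The depth therefore grows by one ply for every seven discs played, which is one disc per column on average.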
---
## 📖 Code Architecture & Modules
## 🛠 Installation & Build
### 🔄 State Machine
The core loop manages five distinct states:
1. **MENU:** Mode selection and board reset.
2. **PLAYING:** Active turn-based logic with gravity-accelerated drop animations.
3. **FINISHED_WIN:** Locks the winner ID and flashes the winning segment.
4. **FINISHED_DRAW:** Blinks the entire board to signify a stalemate.
5. **DEMO:** Auto-plays with randomized difficulty (Ply 3-6) and mandatory blunder logic.
### 🌐 Web Administration Portal
Accessible via the **"Connect4-Config"** AP at `192.168.4.1`.
- **Base Ply:** Sets the starting difficulty level.
- **Brightness:** Global LED intensity (0-255).
- **Evolution Toggle:** Turn on/off the progressive difficulty scaling.
- **Blunder Toggle:** Allow the AI to make mistakes during Human-vs-AI matches.
---
## 🛠 Installation
1. **Environment:** Use VS Code with the **PlatformIO** extension.
1. **Environment:** VS Code with PlatformIO.
2. **Dependencies:** `FastLED`, `Encoder`, `Preferences`.
3. **Build Flag:** Define your WiFi password in `platformio.ini`: `-D WIFI_PASSWORD=\"your_password\"`.
4. **Flash:** Upload to your ESP32-C3 and enjoy the ultimate desktop Connect 4 experience.
3. **Build Flags:**
   - `-D SHOW_BORDER=1` (Enables blue frame)
   - `-D SHOW_BORDER=0` (Full-screen board mode)