From 6c8802509ffa4b155a4621f20300ea830c43cb7b Mon Sep 17 00:00:00 2001
From: Seppe De Loore
Date: Sat, 7 Mar 2026 11:15:46 +0100
Subject: [PATCH] [add] Background information on gameplay.

---
 .gitignore                | 1 +
 Background information.md | 72 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+)
 create mode 100644 Background information.md

diff --git a/.gitignore b/.gitignore
index 89cc49c..457f27a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,3 +3,4 @@
 .vscode/c_cpp_properties.json
 .vscode/launch.json
 .vscode/ipch
+.vscode/settings.json
\ No newline at end of file
diff --git a/Background information.md b/Background information.md
new file mode 100644
index 0000000..7f37826
--- /dev/null
+++ b/Background information.md
@@ -0,0 +1,72 @@
+# 📑 Technical Specification: Connect 4 AI Logic
+
+## 1. Board Representation: The 2D Grid Model
+
+The game state is held in a 2D array of signed 8-bit integers: `int8_t board[COLS][ROWS]`. This structure mirrors the physical dimensions of a standard Connect 4 rack.
+
+### Data Structure
+
+- **Dimensions:** 7 columns (X) by 6 rows (Y).
+- **Mapping:** `0` marks an empty slot; the non-zero values identify the players:
+  - `1`: Player 1 (Yellow / Human).
+  - `2`: Player 2 (Red / AI).
+- **Hardware Translation:** To drive the 8x8 NeoPixel matrix, the 2D coordinates are flattened into a 1D index using the mapping:
+  $$Index = (y \times 8) + x$$
+  _Note: The 8th column ($x=7$) is ignored by the game logic and reserved for UI borders._
+
+---
+
+## 2. Positional Evaluation
+
+Connect 4 has roughly $4.5 \times 10^{12}$ legal positions, so the ESP32 cannot search every line of play to the end of the game from the first move. Instead, it uses a **Heuristic Evaluation Function** to score board positions.
+
+### Scoring Heuristics
+
+1. **Terminal Victory:** Any move that completes a 4-in-a-row is valued at $+1000$ (for the AI) or $-1000$ (for the Human).
+2. **Temporal Weighting:** To ensure the AI chooses the _fastest_ path to victory and the _longest_ path to defeat, the score is adjusted by $depth$, the search depth still remaining at that node, so results found closer to the root weigh more:
+   - **AI Win:** $1000 + depth$
+   - **Human Win:** $-1000 - depth$
+3. **Column Geometry:** The AI inherently values central columns higher than edges. This is not hardcoded in the score but emerges from the search: a disc in column 3 (the center, zero-indexed) can take part in more potential winning lines (horizontal, vertical, and both diagonals, extending in either direction) than a disc near an edge.
+
+---
+
+## 3. Determining the Value of a "Half-Move"
+
+A "half-move" (one ply) is a single disc placement by one player. Its value is determined via the **Minimax Algorithm** with **Alpha-Beta Pruning**.
+
+### The Recursive Search Process
+
+The AI simulates a move (a "branch") and then recursively simulates the opponent's best possible responses. The value of a move is its minimax value: the best score the AI can still force from that branch, assuming the opponent always picks the reply that is worst for the AI.
+
+### Optimization: Alpha-Beta Pruning
+
+To keep the search within the ESP32-C3's time budget, the engine "prunes" branches that are mathematically guaranteed to be no better than previously explored paths.
+
+- **Alpha ($\alpha$):** The best (highest) score the AI (Maximizer) can guarantee so far.
+- **Beta ($\beta$):** The best (lowest) score the Human (Minimizer) can guarantee so far.
+- **The Cut-off:** If at any point $\beta \leq \alpha$, the rest of the branch is abandoned, because it cannot change the final choice.
+
+## 4. Dynamic Move Ordering
+
+The efficiency of alpha-beta pruning relies heavily on **Move Ordering**. By evaluating the most promising columns first (starting from the center), the AI finds a high "Alpha" value quickly.
+
+- **Search Order:** `3 -> 2 -> 4 -> 1 -> 5 -> 0 -> 6`
+  This ordering allows the Alpha-Beta pruning to discard up to 90% of the possible moves in the outer columns without calculating them, significantly reducing the "Thinking" time on the microcontroller.
+
+---
+
+## 5. Summary of Logic Execution
+
+1. **Generate** all valid moves for the current board state.
+2. **Order** moves starting from the center column.
+3. **Execute** Minimax recursion for each move up to the configured search depth (**ply**).
+4. **Prune** branches that cannot mathematically improve the current best option.
+5. **Return** the move with the highest heuristic value.
+
+---
+
+## References
+
+- [Information on how to analyze Connect Four](https://en.wikipedia.org/wiki/Connect_Four#Mathematical_solution)
+- [How does minimax work](https://en.wikipedia.org/wiki/Minimax)
+- [What is alpha-beta pruning](https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning)
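
The search that the added document describes (center-first move ordering, minimax with alpha-beta cut-offs, depth-adjusted win scores, and the LED index mapping) could be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual engine: every function name (`led_index`, `board_clear`, `drop`, `run`, `wins`, `minimax`, `best_move`), the constants, the convention that row 0 is the bottom of the rack, the reading of `depth` as the number of plies still to search, and the neutral score of 0 at the search horizon are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COLS 7
#define ROWS 6
#define WIN_SCORE 1000 /* terminal value from the spec; assumes depth < 1000 */

enum { EMPTY = 0, HUMAN = 1, AI = 2 };

/* Center-first search order from the spec. */
static const int ORDER[COLS] = {3, 2, 4, 1, 5, 0, 6};

/* Flatten (x, y) into the 8-wide LED matrix index: index = y * 8 + x. */
int led_index(int x, int y) {
    assert(x >= 0 && x < 8 && y >= 0 && y < 8);
    return y * 8 + x;
}

void board_clear(int8_t b[COLS][ROWS]) {
    memset(b, 0, COLS * ROWS * sizeof(int8_t));
}

/* Drop a disc into col. Returns the landing row, or -1 if the column is full.
   Assumption: row 0 is the bottom of the rack. */
int drop(int8_t b[COLS][ROWS], int col, int8_t player) {
    for (int y = 0; y < ROWS; y++)
        if (b[col][y] == EMPTY) { b[col][y] = player; return y; }
    return -1;
}

/* Count consecutive discs of p starting next to (x, y) in direction (dx, dy). */
int run(int8_t b[COLS][ROWS], int x, int y, int dx, int dy, int8_t p) {
    int n = 0;
    for (x += dx, y += dy;
         x >= 0 && x < COLS && y >= 0 && y < ROWS && b[x][y] == p;
         x += dx, y += dy)
        n++;
    return n;
}

/* Nonzero if the disc at (x, y) is part of a line of four or more. */
int wins(int8_t b[COLS][ROWS], int x, int y) {
    static const int D[4][2] = {{1, 0}, {0, 1}, {1, 1}, {1, -1}};
    int8_t p = b[x][y];
    for (int i = 0; i < 4; i++)
        if (1 + run(b, x, y, D[i][0], D[i][1], p)
              + run(b, x, y, -D[i][0], -D[i][1], p) >= 4)
            return 1;
    return 0;
}

/* Minimax with alpha-beta pruning. depth = plies still to search (assumption),
   so WIN_SCORE + depth is larger for wins found closer to the root. */
int minimax(int8_t b[COLS][ROWS], int depth, int alpha, int beta, int8_t player) {
    if (depth <= 0) return 0; /* horizon reached: neutral score (assumption) */
    int best = (player == AI) ? -2 * WIN_SCORE : 2 * WIN_SCORE;
    int moved = 0;
    for (int i = 0; i < COLS; i++) {
        int col = ORDER[i];
        int row = drop(b, col, player);
        if (row < 0) continue; /* column full */
        moved = 1;
        int score;
        if (wins(b, col, row))
            score = (player == AI) ? WIN_SCORE + depth : -WIN_SCORE - depth;
        else
            score = minimax(b, depth - 1, alpha, beta, player == AI ? HUMAN : AI);
        b[col][row] = EMPTY; /* undo the simulated move */
        if (player == AI) {
            if (score > best) best = score;
            if (best > alpha) alpha = best;
        } else {
            if (score < best) best = score;
            if (best < beta) beta = best;
        }
        if (beta <= alpha) break; /* cut-off: remaining columns cannot matter */
    }
    return moved ? best : 0; /* no legal move left: draw */
}

/* Pick the AI's column, or -1 if the board is full. */
int best_move(int8_t b[COLS][ROWS], int depth) {
    int best_col = -1, best = -2 * WIN_SCORE;
    for (int i = 0; i < COLS; i++) {
        int col = ORDER[i];
        int row = drop(b, col, AI);
        if (row < 0) continue;
        int score = wins(b, col, row)
                        ? WIN_SCORE + depth
                        : minimax(b, depth - 1, best, 2 * WIN_SCORE, HUMAN);
        b[col][row] = EMPTY;
        if (score > best) { best = score; best_col = col; }
    }
    return best_col;
}
```

With a cleared board on which the Human has stacked three discs in column 0, calling `best_move(board, 2)` in this sketch selects column 0, since every other reply lets the Human complete the column on the next ply.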