📑 Technical Specification: Connect 4 AI Logic

1. Board Representation: The 2D Grid Model

The game state is maintained in a 2D array of signed integers: int8_t board[COLS][ROWS]. This structure mirrors the physical dimensions of a standard Connect 4 rack.

Data Structure

Dimensions: 7 columns (X) by 6 rows (Y).
Mapping:
- 0: Null/Empty slot.
- 1: Player 1 (Yellow / Human).
- 2: Player 2 (Red / AI).
Hardware Translation: To drive the 8x8 NeoPixel matrix, the 2D coordinates are flattened into a 1D index using the mapping $\text{Index} = (y \times 8) + x$.
Note: The 8th column of the matrix (x = 7) is ignored by the game logic and reserved for UI borders.
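
The representation and index mapping above can be sketched as follows. This is a minimal illustration, not the project's actual source: the constant names, `ledIndex`, and `dropDisc` are hypothetical, and the sketch assumes that y = 0 is the bottom row of the rack, which the specification does not state.

```cpp
#include <stdint.h>

constexpr int COLS = 7;          // playable columns (x = 0..6)
constexpr int ROWS = 6;          // playable rows (y = 0..5)
constexpr int MATRIX_WIDTH = 8;  // physical width of the LED matrix

constexpr int8_t EMPTY = 0;
constexpr int8_t HUMAN = 1;      // Player 1 (Yellow)
constexpr int8_t AI    = 2;      // Player 2 (Red)

int8_t board[COLS][ROWS] = {};   // zero-initialized: every slot starts EMPTY

// Flatten board coordinates into the 1D LED index: (y * 8) + x.
constexpr int ledIndex(int x, int y) {
    return y * MATRIX_WIDTH + x;
}

// Drop a disc into column x and return the row it lands in,
// or -1 if the column is already full.
// Assumption: y = 0 is the bottom row.
int dropDisc(int x, int8_t player) {
    for (int y = 0; y < ROWS; ++y) {
        if (board[x][y] == EMPTY) {
            board[x][y] = player;
            return y;
        }
    }
    return -1;
}
```

Because the matrix is 8 pixels wide while the board is 7 columns wide, the multiplier in `ledIndex` is the matrix width, not `COLS`.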

2. Positional Evaluation

Since Connect 4 has a state-space complexity of approximately $4.5 \times 10^{12}$ positions, the ESP32 cannot search every possible continuation to the end of the game from the first move. Instead, it searches to a limited depth and uses a Heuristic Evaluation Function to score board positions.

Scoring Heuristics

Terminal Victory: Any move that results in a 4-in-a-row is valued at +1000 (for AI) or -1000 (for Human).
Temporal Weighting: To ensure the AI chooses the fastest path to victory and the longest path to defeat, the score is adjusted by the search depth. For the formulas below to have that effect, depth must be the remaining search depth, which counts down toward 0, so a win found earlier in the search carries a larger bonus:
- AI Win: 1000 + depth
- Human Win: -1000 - depth
Column Geometry: The AI tends to value central columns more than edge columns. This preference is not hardcoded in the score; it emerges from the search, because a disc in column 3 (the center column, zero-indexed) can belong to more potential four-in-a-row lines (horizontal, vertical, and both diagonals) than a disc near an edge.
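
The column-geometry claim can be checked by counting, for each column, how many four-in-a-row lines pass through its cells. The sketch below is illustrative only; `linesPerColumn` is a hypothetical helper and is not part of the engine described here.

```cpp
#include <array>

constexpr int COLS = 7;
constexpr int ROWS = 6;

// For each column, count how many (line, cell) pairs fall in it:
// every possible four-in-a-row line credits each column it touches,
// once per cell of the line that lies in that column.
std::array<int, COLS> linesPerColumn() {
    std::array<int, COLS> count{};
    // Directions: horizontal, vertical, diagonal up, diagonal down.
    const int dx[4] = {1, 0, 1, 1};
    const int dy[4] = {0, 1, 1, -1};
    for (int x = 0; x < COLS; ++x) {
        for (int y = 0; y < ROWS; ++y) {
            for (int d = 0; d < 4; ++d) {
                // Skip lines whose fourth cell would leave the board.
                int ex = x + 3 * dx[d];
                int ey = y + 3 * dy[d];
                if (ex < 0 || ex >= COLS || ey < 0 || ey >= ROWS) continue;
                // The line starting at (x, y) covers four cells.
                for (int k = 0; k < 4; ++k) ++count[x + k * dx[d]];
            }
        }
    }
    return count;
}
```

On the 7x6 board this yields 24, 36, 48, 60, 48, 36, 24 for columns 0 through 6, so a center-column cell takes part in 2.5 times as many lines on average as an edge-column cell.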

3. Determining the Value of a "Half-Move"

A "half-move" is a single disc placement by one player. Its value is determined via the Minimax Algorithm with Alpha-Beta Pruning.

The Recursive Search Process

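The AI simulates a move (a "branch") and then recursively simulates the opponent's best possible responses. The value of a move is the score the AI can still guarantee from that branch when the opponent replies optimally: the opponent minimizes the score at its turns, and the AI maximizes it at its own.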
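
A two-ply example makes the "maximum of minimums" idea concrete. In this sketch the score table is made-up illustrative data, and `moveValue` and `bestMove` are hypothetical names: `scores[m][r]` is the leaf score after AI move `m` and human reply `r`.

```cpp
#include <algorithm>
#include <vector>

// Value of one AI move: the worst (minimum) score the human can force
// by choosing the reply that is best for the human.
int moveValue(const std::vector<int>& replies) {
    return *std::min_element(replies.begin(), replies.end());
}

// The AI picks the move whose guaranteed value is highest.
// Assumes `scores` is non-empty and every row has at least one reply.
int bestMove(const std::vector<std::vector<int>>& scores) {
    int best = 0;
    for (int m = 1; m < static_cast<int>(scores.size()); ++m) {
        if (moveValue(scores[m]) > moveValue(scores[best])) best = m;
    }
    return best;
}
```

With the table `{{5, -3, 2}, {1, 0, 4}, {9, -1000, 9}}`, the move values are -3, 0, and -1000, so move 1 is chosen even though move 2 contains the highest individual leaf score.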

Optimization: Alpha-Beta Pruning

To prevent the ESP32-C3 from timing out, the engine "prunes" branches that are mathematically guaranteed to be worse than previously explored paths.

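Alpha ($\alpha$): The best (highest) score the AI (Maximizer) can guarantee so far.
Beta ($\beta$): The best (lowest) score the Human (Minimizer) can guarantee so far.
The Cut-off: If at any point $\beta \leq \alpha$, the remaining moves in that branch are skipped, because one side already has a better alternative elsewhere in the tree.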
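
The search and cut-off rule can be sketched as below. This is a minimal illustration under stated assumptions, not the project's actual code: all function names (`hasWon`, `drop`, `alphaBeta`, `chooseMove`) are hypothetical, y = 0 is assumed to be the bottom row, `depth` is the number of remaining plies, non-terminal positions at the depth limit score 0, and the window bounds of ±10000 are arbitrary values outside the score range.

```cpp
#include <stdint.h>
#include <algorithm>

constexpr int COLS = 7, ROWS = 6;
constexpr int8_t EMPTY = 0, HUMAN = 1, AI = 2;
constexpr int ORDER[COLS] = {3, 2, 4, 1, 5, 0, 6};  // center-first search order

// True if player p has four in a row anywhere on the board.
bool hasWon(int8_t b[COLS][ROWS], int8_t p) {
    const int dx[4] = {1, 0, 1, 1}, dy[4] = {0, 1, 1, -1};
    for (int x = 0; x < COLS; ++x)
        for (int y = 0; y < ROWS; ++y)
            for (int d = 0; d < 4; ++d) {
                int k = 0;
                while (k < 4) {
                    int cx = x + k * dx[d], cy = y + k * dy[d];
                    if (cx < 0 || cx >= COLS || cy < 0 || cy >= ROWS) break;
                    if (b[cx][cy] != p) break;
                    ++k;
                }
                if (k == 4) return true;
            }
    return false;
}

// Drop a disc; returns the landing row, or -1 if the column is full.
int drop(int8_t b[COLS][ROWS], int x, int8_t p) {
    for (int y = 0; y < ROWS; ++y)
        if (b[x][y] == EMPTY) { b[x][y] = p; return y; }
    return -1;
}

// Minimax with alpha-beta pruning. depth = remaining plies.
int alphaBeta(int8_t b[COLS][ROWS], int depth, int alpha, int beta, bool aiTurn) {
    if (hasWon(b, AI)) return 1000 + depth;      // sooner wins score higher
    if (hasWon(b, HUMAN)) return -1000 - depth;  // sooner losses score lower
    if (depth == 0) return 0;                    // assumed neutral horizon score
    int best = aiTurn ? -10000 : 10000;
    bool moved = false;
    for (int x : ORDER) {
        int y = drop(b, x, aiTurn ? AI : HUMAN);
        if (y < 0) continue;                     // column full
        moved = true;
        int score = alphaBeta(b, depth - 1, alpha, beta, !aiTurn);
        b[x][y] = EMPTY;                         // undo the simulated move
        if (aiTurn) { best = std::max(best, score); alpha = std::max(alpha, best); }
        else        { best = std::min(best, score); beta  = std::min(beta, best); }
        if (beta <= alpha) break;                // cut-off
    }
    return moved ? best : 0;                     // full board counts as a draw
}

// Root call: returns the AI's chosen column, or -1 if no move is possible.
int chooseMove(int8_t b[COLS][ROWS], int depth) {
    int bestCol = -1, alpha = -10000;
    for (int x : ORDER) {
        int y = drop(b, x, AI);
        if (y < 0) continue;
        int score = alphaBeta(b, depth - 1, alpha, 10000, false);
        b[x][y] = EMPTY;
        if (bestCol < 0 || score > alpha) { alpha = score; bestCol = x; }
    }
    return bestCol;
}
```

Scanning the whole board in `hasWon` at every node keeps the sketch short; checking only the lines through the last placed disc would be cheaper on a microcontroller.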

4. Dynamic Move Ordering

The efficiency of the value determination depends heavily on Move Ordering. By evaluating the most promising columns first (starting from the center), the AI finds a high "Alpha" value quickly, which lets later branches be cut off sooner.

Search Order: 3 -> 2 -> 4 -> 1 -> 5 -> 0 -> 6
This ordering allows Alpha-Beta pruning to skip up to 90% of the branches in the outer columns without evaluating them, significantly reducing the "Thinking" time on the microcontroller.
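
The search order above follows a simple center-out pattern. As a small sketch (the function name `centerOutOrder` is hypothetical, and the code assumes an odd number of columns), it can be generated rather than hardcoded:

```cpp
#include <array>

constexpr int COLS = 7;

// Build a center-out column order: the center column first,
// then alternating left and right neighbours moving outward.
// Assumes COLS is odd, as it is for the standard 7-column board.
std::array<int, COLS> centerOutOrder() {
    std::array<int, COLS> order{};
    const int center = COLS / 2;
    int n = 0;
    order[n++] = center;
    for (int offset = 1; offset <= center; ++offset) {
        order[n++] = center - offset;  // left neighbour first, matching the listed order
        order[n++] = center + offset;
    }
    return order;
}
```

On a fixed 7-column board a constant array is just as good; the generated form mainly documents where the sequence comes from.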

5. Summary of Logic Execution

Generate all valid moves for the current board state.
Order moves starting from the center column.
Execute Minimax recursion for each move, up to the current ply (search depth) limit.
Prune branches that cannot mathematically improve the current best option.
Return the move with the highest heuristic value.
