Overview
GeneralsAI is an experiment in applying large language models to real-time strategy games — specifically, building an AI agent that can reason about complex game states, form multi-step strategic plans, and explain its decisions in plain language.
Design
State encoding: The game map is serialized into a compact structured representation that fits in a context window — unit positions, resource counts, terrain, known enemy positions, and recent game history. This encoding is the hardest design problem: the same facts, presented in a different order or notation, can noticeably change the model's decisions.
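A minimal sketch of what such an encoder could look like. The `GameState`/`Unit` types and the line-oriented layout are illustrative assumptions, not the project's actual API — the point is a stable, labeled, compact text format:

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    kind: str  # e.g. "infantry"
    x: int
    y: int

@dataclass
class GameState:
    turn: int
    gold: int
    units: list[Unit] = field(default_factory=list)
    known_enemies: list[Unit] = field(default_factory=list)

def encode_state(state: GameState) -> str:
    """Flatten the map into labeled lines the LLM sees in a fixed order."""
    fmt = lambda us: "; ".join(f"{u.kind}@({u.x},{u.y})" for u in us)
    return "\n".join([
        f"TURN {state.turn} | GOLD {state.gold}",
        "UNITS: " + (fmt(state.units) or "none"),
        "ENEMY: " + (fmt(state.known_enemies) or "none sighted"),
    ])
```

Keeping field order and labels fixed across turns also makes the recent-history portion of the prompt diff-friendly.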
Hierarchical planning: The agent operates at two levels. A high-level planner (one LLM call per strategic turn) sets objectives: “expand north”, “defend base”, “attack enemy flank”. A low-level executor (one call per unit action) translates objectives into concrete moves. This two-level structure keeps context windows manageable and produces more coherent strategies than flat prompting.
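The two-level loop can be sketched as follows. `call_llm` and the prompt wording are stand-ins for whatever client and templates the project actually uses — the structure (one planner call, then one executor call per unit) is the point:

```python
def plan_turn(call_llm, state_text: str, units: list[str]) -> dict[str, str]:
    """One strategic turn: a single planner call, then one executor call per unit."""
    # High level: one call sets the objective for the whole turn.
    objective = call_llm(
        f"Game state:\n{state_text}\n"
        "Pick one objective (expand / defend / attack) and name a target."
    )
    # Low level: per-unit prompts stay small because they only carry
    # the objective plus the shared state encoding.
    moves = {}
    for unit in units:
        moves[unit] = call_llm(
            f"Objective: {objective}\nUnit: {unit}\n"
            f"State:\n{state_text}\nReply with one concrete move."
        )
    return moves
```

Because the executor never re-derives strategy, its outputs stay consistent with a single objective for the whole turn, which is where the coherence gain over flat prompting comes from.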
Chain-of-thought: Both planner and executor use CoT prompting. The planner’s reasoning trace is logged and displayed alongside each decision — you can watch the AI argue with itself about whether to rush or expand.
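Logging the trace alongside the decision only needs the reply to follow a fixed contract. A sketch, assuming a `Reasoning:` / `Decision:` format (the actual prompt contract may differ):

```python
def parse_cot(reply: str) -> tuple[str, str]:
    """Split an LLM reply into (reasoning trace, final decision)."""
    reasoning, _, decision = reply.partition("Decision:")
    return reasoning.removeprefix("Reasoning:").strip(), decision.strip()
```

The trace half is what gets displayed next to each move in the UI; the decision half is what the executor actually consumes.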
Evaluation: Agents play round-robin tournaments against each other and against scripted bots. Win rate and average game length are tracked per model/prompt variant.
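The tournament bookkeeping is simple in essence. A minimal sketch (the real harness presumably records more), where each result is `(winner_variant, loser_variant, game_length_in_turns)`:

```python
from collections import defaultdict

def tally(results):
    """Per-variant win rate and average game length from round-robin results."""
    games = defaultdict(int)
    wins = defaultdict(int)
    turns = defaultdict(int)
    for winner, loser, n_turns in results:
        for variant in (winner, loser):
            games[variant] += 1
            turns[variant] += n_turns
        wins[winner] += 1
    return {v: {"win_rate": wins[v] / games[v],
                "avg_turns": turns[v] / games[v]} for v in games}
```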
Interesting Findings
- GPT-4-class models significantly outperform smaller models, particularly in late-game resource management
- CoT prompting improves win rate by ~15% vs. direct prompting (the model reasons through tradeoffs it would otherwise miss)
- The biggest failure mode is over-commitment: LLMs are bad at recognizing when a strategy is failing and cutting their losses
Tech Stack
Python · FastAPI · LangChain · TypeScript · WebSocket (live state streaming)