Most LLM-based games hit a wall around turn 15. The context window fills up, the model starts hallucinating inventory items, and causal chains break (e.g. an NPC forgets you owe them money). I’m the dev behind a project called Altworld, and we decided to tackle this by completely decoupling the narrative generation from the canonical state. Instead of relying on a massive chat transcript and hoping the LLM remembers the plot, we built a pipeline where specialist models just mutate JSON/PostgreSQL state, and the narrative is rendered last. Here is a breakdown of our approach and the latency tradeoffs.
The Architecture: State > Text
The core rule is: narrative text is generated after state changes, not before. We use Next.js, Prisma, and PostgreSQL. The AI layer is split into specialist roles (via OpenRouter/OpenAI): world systems reasoning, NPC planning, action resolution, and narrative rendering. When a player submits a natural language move, the pipeline looks roughly like this:
// Simplified turn-advancement pipeline
async function advanceTurn(runId: string, playerAction: string) {
  // 1. Acquire processing lock & load hard state
  const state = await loadCanonicalState(runId);

  // 2. Advance world systems (economy, weather, unrest)
  const worldUpdates = await simulateWorld(state);

  // 3. NPC simulation (local knowledge only, no omniscient scripts)
  const npcActions = await simulateNPCs(state.factions, state.rumors);

  // 4. Resolve player action against stats/inventory
  const actionResult = await adjudicateAction(playerAction, state.character);

  // 5. Persist ALL structural changes transactionally
  //    (updateQueries is built from the results of steps 2-4)
  await prisma.$transaction([...updateQueries]);

  // 6. Narrative render (the only part the user actually reads)
  const narrative = await renderScene(actionResult, worldUpdates, npcActions);
  return narrative;
}
Tradeoffs: Latency vs. Coherence
The obvious limitation here is latency. Running 3-4 distinct LLM calls (adjudication, world sim, NPC sim, rendering) sequentially is slow.
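One partial mitigation, sketched below with hypothetical stub types and simulators standing in for the real LLM-backed ones: in the pipeline above, the world and NPC steps both read only the pre-loaded state, so under that assumption their network round-trips can overlap instead of running back-to-back.

```typescript
// Hypothetical minimal types standing in for the real canonical state.
interface CanonicalState {
  factions: string[];
  rumors: string[];
}

// Stubs standing in for the real LLM-backed simulators.
async function simulateWorld(state: CanonicalState) {
  return { weather: "storm" };
}
async function simulateNPCs(factions: string[], rumors: string[]) {
  return factions.map((f) => `${f}: patrol`);
}

async function simulateConcurrently(state: CanonicalState) {
  // Both simulators depend only on the already-loaded state, so the two
  // round-trips can be issued concurrently, saving one full call's latency.
  const [worldUpdates, npcActions] = await Promise.all([
    simulateWorld(state),
    simulateNPCs(state.factions, state.rumors),
  ]);
  return { worldUpdates, npcActions };
}
```

This only helps for steps that are genuinely independent; the adjudication and rendering steps still have to wait on their inputs.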
To hide this, we lean heavily on UI streaming and a phased loading state. The user sees a "World panel" that streams in environmental changes first, while the heavier NPC logic runs in the background. We also built deterministic/local fallback behavior so the core loop doesn't crash outright if an API call times out.
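A minimal sketch of that fallback behavior, using a hypothetical `withFallback` helper (not the actual Altworld implementation): race the LLM call against a timer and return a deterministic local result on timeout or API error.

```typescript
// Hypothetical helper: race an LLM call against a timeout and fall back to a
// precomputed deterministic result so the turn loop never hard-crashes.
async function withFallback<T>(
  call: () => Promise<T>,
  fallback: T,
  timeoutMs = 8000
): Promise<T> {
  const timeout = new Promise<T>((resolve) =>
    setTimeout(() => resolve(fallback), timeoutMs)
  );
  try {
    return await Promise.race([call(), timeout]);
  } catch {
    // API error (rate limit, network, etc.): keep the turn alive locally.
    return fallback;
  }
}
```

A caller might wrap the NPC step as `withFallback(() => simulateNPCs(...), defaultNpcActions)`, where the default is something cheap and safe like "NPCs continue their current routines."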
The benefit? We get actual persistent causal chains. If you steal from a merchant on turn 2, the Relationship and Faction tables update. On turn 20, that merchant's local NPC logic will flag your presence and trigger a bounty event, regardless of whether that merchant has been mentioned in the prompt context recently.
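As a rough illustration, with plain objects and made-up field names standing in for the real Relationship and Faction tables, the bounty trigger reduces to a pure function over persisted state, which is why it works regardless of what's in the prompt context.

```typescript
// Hypothetical shapes standing in for the persisted rows.
interface Relationship {
  npcId: string;
  playerStanding: number; // negative = hostile, written back each turn
}
interface Faction {
  id: string;
  memberIds: string[];
  bountyThreshold: number; // standing at or below this triggers a bounty
}

// Decide whether an NPC's local logic triggers a bounty event when the
// player re-enters their area, reading only canonical state.
function shouldTriggerBounty(npc: Relationship, faction: Faction): boolean {
  return (
    faction.memberIds.includes(npc.npcId) &&
    npc.playerStanding <= faction.bountyThreshold
  );
}
```

The theft on turn 2 only has to write `playerStanding` once; turn 20 re-derives the consequence from the database rather than from transcript memory.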
Affiliation & Demo
Per the sub rules: disclosing that I'm the builder of this system. It's still in alpha, but if you want to poke at the implementation and see how the database handles long-term memory constraints, the demo is live at
https://altworld.io/
Happy to answer questions about the prompt chaining or Prisma schema.
Originally posted by u/Dace1187 on r/ArtificialInteligence
