Original Reddit post

Well, how do I start this? I think we first need some important context.

Chai: https://preview.redd.it/qf98b20vze6h1.png?width=1356&format=png&auto=webp&s=7416ba3cca0d599a9acfcccd55c7c523097414fc

Hasbullah / Hasbi: https://preview.redd.it/4racu78zze6h1.png?width=1120&format=png&auto=webp&s=15fcae309c0b410e715cf2bdb71b712f89eddf85

Together, Chasbinder was born. OK, maybe this wasn’t important… At least you now know AI didn’t write this… I think.

However, it’s important to note that my Openclaw agent running through Codex GPT 5.5 xHigh helped enable this test. The same prompt was given to 5 different models on their highest reasoning/thinking setting via OpenRouter, with only one shot each. The test was simple: I just wanted my agent Chasbi to have its own cool interactive homepage, and I thought of a Tamagotchi game that would actually be playable. You can see the prompt and the cost breakdown below.

So here are the results. Why don’t you try to guess who made what before you reveal the answers and see if you got it right? (GPT 5.5, Opus 4.8, Fable/Mythos 5, Gemini 3.5 Flash, Deepseek V4 Pro, Qwen 3.7 Max)

  • https://chasbi.uk/t1 = Gemini 3.5 Flash
  • https://chasbi.uk/t2 = Qwen 3.7 Max
  • https://chasbi.uk/t3 = Claude Opus 4.8
  • https://chasbi.uk/t4 = Claude Fable/Mythos 5
  • https://chasbi.uk/t5 = ChatGPT 5.5
  • https://chasbi.uk/t6 = Deepseek V4 Pro

Did you get it right? They were all run through the OpenRouter API with their highest available reasoning setting and everything else at default. Here’s the breakdown of how the tokens were tokenised by each provider and the cost for each.
https://preview.redd.it/ku1gi4ad1f6h1.png?width=2432&format=png&auto=webp&s=f8896dc539582b3cf366c29e17d465395a5f7531

https://preview.redd.it/68r8wq7g1f6h1.png?width=2468&format=png&auto=webp&s=e9759cc9aace1ca3f84f176d5ab7f91bd6ae47a6

They were all done around the same time, at 8AM BST, except for Fable/Mythos 5, which I did the day before at 06:50PM BST, if that matters. As we’re about 5-6 hours ahead of the US, it could make all the difference in the world in terms of performance.

I am on the Codex Max plan and I stuck it out because GPT 5.5 xHigh has been amazing for me, except since last week. Whether it’s OpenAI reallocating resources for their launch of GPT 5.6, who knows, but it had never made mistakes for me until now, so I was surprised.

I really want to test Fable/Mythos 5 on my codebase, but honestly, it cost frikkin’ $2.47 for this stupid 1-shot Tamagotchi test! So the only way that’s feasible for me right now is to use the Claude Max plan and use it for the 2 weeks we have it, until it goes away on 22nd June.

Anyway, it would be interesting to get your views. Who do you think did it the best? If you want me to test anything else, let me know.

Each model received the same prompt template and identical task/spec, with only the lane name and target route changed. E.g.:

  • {LANE} = T1/T2/T3/T5/T6
  • {ROUTE} = /t1 /t2 /t3 /t5 /t6
  • {LANE_LOWER} = output path label like t1, t2, etc.

The Prompt:

Build Chasbinder Pet Lab {LANE} as a model-lane benchmark for chasbi.uk.

Target lane:

  • Public route: {ROUTE}/
  • Title must include Chasbinder Pet Lab {LANE}.
  • This model is competing under the same brief as the other fresh lanes. Do not mention that this is a placeholder or a previous version.

Context:
  • This is a public-safe static browser game. Do not include private/personal data, secrets, real family details, or network calls.
  • The challenge is to make a small finished indie-feeling Tamagotchi/pet-lab game, not a demo, landing page, or reskin.
  • It should be strong enough to compare fairly against the Fable/Mythos-style V4 lane and the SoRa/Codex T7 lane.

Return ONLY one complete HTML document. No markdown, no explanation.

Hard constraints:
  • Single self-contained index.html.
  • HTML, CSS, vanilla JS only.
  • No external fonts, libraries, images, audio, tracking, or network calls.
  • Mobile-first but polished on desktop.
  • Must work as a static file under https://chasbi.uk/{ROUTE}/.
  • Use localStorage, versioned save data, migration/reset if corrupt.
  • Include export/import/reset debug controls.
  • Do not use eval, alerts for normal gameplay, or browser permissions.
  • Keep total file reasonably compact; aim under 120KB if possible.
  • Use stable layout dimensions so controls do not jump on mobile.

Game direction:
  • Core fantasy: Chasbinder is a tiny digital guardian living in a warm terminal-garden. The world is losing its “memory lights”; the player raises Chasbinder, sends him on short expeditions, restores rooms, and unlocks story chapters.
  • Keep Tamagotchi care at the center, but add a real story loop and difficulty.
  • Should be playable in one sitting for 5-10 minutes and still progress over days.

Required systems:
  • Pet stats: hunger, thirst, energy, hygiene, mood, trust/bond, health, stress, discipline, curiosity, weight/fitness, illness risk, age/stage, sleep/wake state, personality, and learned preferences.
  • Offline progression: elapsed real time affects needs, events, story timers, recovery, and expedition return.
  • Actions with tradeoffs and cooldowns: feed, drink, clean, rest/sleep, comfort, train, play, explore/expedition, clinic/medicine, craft/restore.
  • Difficulty modes: Cosy, Standard, Survival. Difficulty changes stat decay, rewards, event risk, and story pressure. Let player pick at new game and show current mode.
  • Story progression:
      • Several named chapters/rooms.
      • Unlock story snippets through care plus expedition resources.
      • Provide an achievable “chapter complete” arc in one sitting and longer-term goals.
  • Expedition/minigame:
      • Lightweight interactive risk/reward loop, not just a button.
      • Should be simple on mobile: choose a route, spend energy, react to events, collect memory sparks, avoid stress/illness.
      • Difficulty should matter.
  • Consequences:
      • Neglect, dirty habitat, dehydration, overfeeding, spam-clicking, low sleep, bad expedition choices can cause illness, injury, tantrums, stress, poor rewards.
      • Good care improves trust, story outcomes, and expedition success.
  • UI:
      • Pet/room scene with canvas or SVG animation.
      • Compact stats with readable bars.
      • Tabs/segmented controls for Care, Adventure, Story, Memory.
      • Journal of important events.
      • Achievements/badges.
      • Clear cooldown/disabled states.
      • No text overflow on narrow phones.
  • Feel:
      • Warm, cosy, polished, playful Chasbi/Chasbinder personality.
      • Avoid one-note dark blue/purple gradient overload.
      • Avoid marketing/landing-page composition. First screen is the game.

Quality bar:
  • Code must be robust enough that I can save it directly as /root/Chasbi/web/public/{LANE_LOWER}/index.html.
  • Include enough comments only where helpful.
  • Make it fun to inspect visually and mechanically.
  • Do not leave placeholder labels like “model lane placeholder”.

submitted by /u/ikyz
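
For anyone wanting to reproduce the per-lane setup, here is a minimal sketch of how the {LANE}, {ROUTE} and {LANE_LOWER} placeholders could be filled in before sending the prompt. The post does not show the actual tooling, so `laneValues`, `fillTemplate` and the shortened `template` string are all assumptions made for illustration, not the author's script.

```javascript
// Hypothetical helpers: the post does not show how the prompt was actually filled in.
const LANES = ["T1", "T2", "T3", "T5", "T6"]; // lane names as listed in the post

// Derive the three placeholder values from a lane name, e.g. "T3" -> "/t3" and "t3".
function laneValues(lane) {
  const lower = lane.toLowerCase();
  return { LANE: lane, ROUTE: "/" + lower, LANE_LOWER: lower };
}

// Replace each {NAME} with its value; unknown placeholders are left untouched.
function fillTemplate(template, values) {
  return template.replace(/\{([A-Z_]+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : match
  );
}

// Shortened stand-in for the full prompt text above (assumption, not the real template).
const template =
  "Build Chasbinder Pet Lab {LANE}. Public route: {ROUTE}/ " +
  "Save as /root/Chasbi/web/public/{LANE_LOWER}/index.html.";

const prompts = LANES.map((lane) => fillTemplate(template, laneValues(lane)));
console.log(prompts[0]);
```

Keeping the substitution in one function means every lane gets a byte-identical brief apart from the three placeholders, which is the property the comparison relies on.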

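The “versioned save data, migration/reset if corrupt” constraint in the prompt is the kind of thing worth checking when inspecting each lane's output. Below is one possible shape for it, as a sketch only: `SAVE_KEY`, `SAVE_VERSION`, the save fields and the v1-to-v2 migration are invented for illustration and are not taken from any of the six submissions. In a browser you would pass `window.localStorage`; the `memoryStorage` stand-in just lets the sketch run elsewhere.

```javascript
// All names and the save shape here are illustrative assumptions.
const SAVE_KEY = "chasbinder-save";
const SAVE_VERSION = 2;

function freshSave() {
  return { version: SAVE_VERSION, hunger: 80, thirst: 80, lastSeen: Date.now() };
}

// One migration step per old version; each step returns the next version's shape.
const MIGRATIONS = {
  1: (old) => ({ ...old, version: 2, thirst: 80 }), // pretend v1 had no thirst stat
};

function loadSave(storage) {
  try {
    const raw = storage.getItem(SAVE_KEY);
    if (raw === null) return freshSave(); // nothing saved yet
    let save = JSON.parse(raw);
    if (typeof save !== "object" || save === null || !Number.isInteger(save.version)) {
      throw new Error("malformed save");
    }
    // Walk old saves forward one version at a time.
    while (save.version < SAVE_VERSION) {
      const step = MIGRATIONS[save.version];
      if (!step) throw new Error("no migration from v" + save.version);
      save = step(save);
    }
    if (save.version !== SAVE_VERSION) throw new Error("save is from a newer version");
    return save;
  } catch (err) {
    // Corrupt or unreadable data: reset to a clean slate instead of crashing.
    return freshSave();
  }
}

function writeSave(storage, save) {
  storage.setItem(SAVE_KEY, JSON.stringify(save));
}

// Stand-in with the same getItem/setItem interface as window.localStorage.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => data.set(key, String(value)),
  };
}

const storage = memoryStorage();
writeSave(storage, { version: 1, hunger: 40 });
console.log(loadSave(storage).version);
```

A quick way to exercise this on a real lane is to paste broken JSON into the game's save key via dev tools and reload; a robust build should fall back to a new game rather than a blank page.
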
Originally posted by u/ikyz on r/ArtificialInteligence