Original Reddit post

working on a sticker generation app and wanted to share the AI architecture since it’s a bit different from typical chatbot stuff

the pipeline:

1. user uploads reference photos (selfies, pets, groups)
2. claude analyzes the photos and writes creative prompts, figuring out what expressions/poses would make good stickers
3. image model generates the actual stickers based on claude’s prompts
4. background removal + format processing for messaging apps

why claude for prompt writing: originally tried to skip this step and go straight to image generation with basic prompts. results were generic. claude looking at the actual reference photos and describing what would make good stickers made a huge difference. it picks up on stuff like “this person has a distinctive hairstyle” or “this dog has floppy ears” and works those into the prompts.

the tricky parts:

- keeping the style consistent across 9 stickers per pack
- claude sometimes gets too creative with the prompts and the image model can’t follow
- background removal still isn’t perfect on complex edges
- whatsapp has strict format requirements (512x512, webp, under 100kb) so there’s a quality ladder for compression

costs: every generation is actually 4+ model calls. adds up fast. still figuring out the right pricing to make margins work.

anyone else doing multi-model pipelines for consumer products? curious what others have learned
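
for anyone wondering what a "quality ladder" for the whatsapp limit could look like, here is a minimal sketch. it is not the app's actual code: the ladder values, the function names (`compress_with_ladder`, `fake_encode`) and reading "under 100kb" as 100 KiB are all assumptions for illustration. the encoder is passed in as a callable so the ladder logic can run on its own; `fake_encode` is only a stand-in for a real WebP encoder.

```python
from typing import Callable, Iterable, Optional, Tuple

# Limit quoted in the post for WhatsApp stickers.
# Assumption: "under 100kb" is read here as strictly below 100 KiB.
MAX_BYTES = 100 * 1024

# Hypothetical ladder of WebP quality settings, tried from best to worst.
QUALITY_LADDER = (90, 80, 70, 60, 50, 40, 30)


def compress_with_ladder(
    encode: Callable[[int], bytes],
    ladder: Iterable[int] = QUALITY_LADDER,
    max_bytes: int = MAX_BYTES,
) -> Optional[Tuple[int, bytes]]:
    """Walk down the ladder and return (quality, data) for the first rung
    whose encoded size is below max_bytes, or None if no rung fits."""
    for quality in ladder:
        data = encode(quality)
        if len(data) < max_bytes:
            return quality, data
    return None


def fake_encode(quality: int) -> bytes:
    """Stand-in encoder (hypothetical): output size grows with quality."""
    return b"\0" * (quality * 1500)


if __name__ == "__main__":
    result = compress_with_ladder(fake_encode)
    if result is None:
        print("no rung fits under the size limit")
    else:
        quality, data = result
        print(f"picked quality={quality}, size={len(data)} bytes")
```

in a real pipeline the `encode` callable would wrap an actual encoder, for example resizing to 512x512 and saving to an in-memory buffer with Pillow's `Image.save(buf, format="WEBP", quality=q)`. if even the lowest rung is too big, the function returns `None`, so the caller has to decide what to do next (simplify the image, retry generation, etc.).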

Originally posted by u/dorongal1 on r/ArtificialInteligence