Original Reddit post

Hey! Just shipped a side project I’ve been working on and looking for real users to stress test it.

What it is: HabitFlow — a habit tracker where nudges are selected by a contextual multi-armed bandit that learns per-user intervention preferences in real time.

The ML side (for those interested):

- Each user has 10 bandit arms — one per intervention strategy (streaks, loss framing, dark humor, social proof, etc.)
- Thompson Sampling maintains a Beta(α, β) distribution per arm and updates on every feedback signal
- Feedback signals: completed (+1.0), engaged (+0.5), ignored (0.0), dismissed (-0.2), negative (-0.5)
- The system learns your preferred strategy without any offline training — purely online learning from production feedback
- There's also a separate MLOps dashboard with a policy registry, A/B testing framework, fairness constraints, and an automated retraining pipeline

Stack: FastAPI · PostgreSQL · Redis · React · Celery · SQLAlchemy

What I need: Real users generating real feedback signals. Even 5-10 people using it for a week gives me actual bandit convergence data to analyze. If you want to try out the app or check out the dashboard, DM me and I’ll be happy to share the links.

Happy to answer questions about the implementation — the bandit engine and policy evaluator were the most interesting parts to build.
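For anyone curious what the core loop looks like, here's a minimal sketch (simplified, not the production code; rescaling the raw signals into [0, 1] is just one way to make fractional Beta updates work, and the arm names are illustrative):

```python
import random

# One Beta(alpha, beta) posterior per intervention strategy (arm).
ARMS = ["streaks", "loss_framing", "dark_humor", "social_proof"]  # ... 10 total in the app

# Raw feedback signals from the post. They get rescaled to [0, 1]
# below so they can drive fractional Beta updates.
SIGNALS = {
    "completed": 1.0,
    "engaged": 0.5,
    "ignored": 0.0,
    "dismissed": -0.2,
    "negative": -0.5,
}

class ThompsonBandit:
    def __init__(self, arms):
        # Start every arm at Beta(1, 1), i.e. a uniform prior.
        self.params = {arm: [1.0, 1.0] for arm in arms}

    def select_arm(self):
        # Draw one sample from each arm's posterior; play the argmax.
        samples = {arm: random.betavariate(a, b)
                   for arm, (a, b) in self.params.items()}
        return max(samples, key=samples.get)

    def update(self, arm, signal):
        # Fractional Bernoulli-style update: reward r adds r to alpha
        # and (1 - r) to beta, after mapping the raw signal's
        # [-0.5, 1.0] range onto [0, 1].
        raw = SIGNALS[signal]
        r = (raw + 0.5) / 1.5
        a, b = self.params[arm]
        self.params[arm] = [a + r, b + (1.0 - r)]

# Example: pick a nudge strategy, observe feedback, update the posterior.
bandit = ThompsonBandit(ARMS)
arm = bandit.select_arm()
bandit.update(arm, "engaged")
```

A production version would persist the per-user (α, β) pairs instead of keeping them in memory, but the update math is the same idea.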

Originally posted by u/Donald-the-dramaduck on r/ArtificialInteligence