
Note: I recently shared a recollection of our journey establishing artificial intelligence enablement services at a large enterprise organization, and figured the audience here might be interested. I’m sorry this is so long, but it’s hard to be concise given the complexity of the journey. I don’t consider myself an expert at anything, but I have learned from the experience and from the knowledge of my team. Happy to answer any questions I can based on that experience.

There’s a particular kind of confidence that comes from reading your fifth whitepaper on enterprise AI adoption. You start to believe you understand the landscape. You sketch architectures on whiteboards. You use phrases like “operationalize” and “at scale” in meetings.

Then you actually try to do it.

I lead a team that has stood up enterprise AI enablement services in a large organization. We serve tens of thousands of users, carry significant regulatory obligations, and operate inside the kind of institutional complexity that makes “move fast and break things” sound less like a philosophy and more like a termination offense. Over the past several years, my team has built an enterprise AI program essentially from scratch. We’ve shipped real capabilities to real users. We’ve also made mistakes that, in hindsight, seem almost comically avoidable.

This is the story of that journey: not the sanitized version you’d find in a vendor case study, but the real one. If you’re an enterprise leader standing up your own AI capabilities, I hope our experience saves you at least a few headaches.

Before the Revolution: When “AI” Meant Regressions and Random Forests

Long before ChatGPT became a household name, we were already running machine learning workloads. But to understand where AI fits into our story, you need to understand what our office actually does.
We manage a centralized data platform with a suite of analytics and business intelligence tools built around it: Tableau for visualization, Alteryx for data preparation and workflows, ArcGIS for geospatial analysis, and Palantir Foundry for large-scale data integration and operational analytics. Together, these form an enterprise data and analytics platform that serves the entire organization.

So when machine learning started gaining traction, our office was a natural home for it. The problem was that plenty of teams across the organization had already figured that out on their own. Research enclaves were popping up everywhere. Individual offices were spinning up their own virtual machines, installing their own tools, running their own analyses. And many of them were doing good work. That made our job harder, not easier.

At the enterprise level, we hosted JupyterHub on an EMR cluster. It worked, but we struggled with package and kernel management, and the cost efficiency wasn’t great. Over time, we’ve been working to migrate our Jupyter users to notebooks on SageMaker, which has been a much better fit.

But the real challenge in those early days wasn’t technical. It was credibility. When you’re trying to centralize a capability that dispersed teams are already performing successfully on their own, you can’t just show up and tell them to stop. You have to deliver something better. Our pitch was that enterprise management offered things they couldn’t easily replicate in isolation: easy access to governed data, top-down governance and compliance, cost savings from economies of scale, and communities of practice where data scientists across the organization could actually learn from each other. It was a compelling case, but only as long as we could back it up with execution. We’re still earning that trust.
But the foundation we built during this period, the relationships with those dispersed teams and the credibility we accumulated by actually delivering, gave us a significant head start when generative AI changed everything.

“Can We Get a ChatGPT?”: The Generative AI Moment

Like every other large organization on the planet, we got the question almost immediately. Leadership wanted to know what generative AI could do for us. Individual teams were already experimenting with consumer tools in ways that kept our security team up at night. The pressure to deliver something, anything, was real.

Our answer was an internal AI assistant built on OpenAI models hosted through Microsoft Azure. Simple enough in concept. Considerably less simple in execution. What followed was weeks of unglamorous but essential work: hardening the security boundary, protecting endpoints, standing up virtual networks, configuring certificates, registering domains. This is not the kind of work that makes it into keynote presentations, but without it, nothing else was possible. We were building a secure front door before we could invite anyone inside.

Looking back, I wish I’d allocated twice the time we originally estimated for this phase. Security architecture at enterprise scale isn’t something you can shortcut, and every corner we considered cutting would have come back to haunt us.

But we got it done. And we got it done early. To my knowledge, we delivered a secure, internal ChatGPT-like service before any comparable organization had. I want to be clear: it wasn’t glamorous. There was no file upload. No multi-modal capability. It ran on an older model. But it worked. It was secure. And it opened the eyes of a workforce that had been hearing about generative AI in the news but hadn’t been able to touch it in a sanctioned environment.

The response was overwhelming. Our “AI Assistant” became one of the most talked-about services we’d ever launched.
Not because it was cutting-edge, but because it was available. For tens of thousands of users who had been told “don’t use ChatGPT at work,” we had finally given them something they could use. That mattered more than any feature list.

The Chatbot Cambrian Explosion

Once we had a working AI assistant, we turned our attention to what was, at the time, the most requested capability: Retrieval-Augmented Generation. RAG chatbots. The ability to point a language model at your own data and have a conversation with it.

First, we had to educate. Most of our users had never heard the term “RAG,” and frankly, many of them didn’t need to. What they needed to understand was the art of the possible: that they could take a library of policy documents, technical manuals, or operational guides and make them conversationally searchable. The demand was immediate and overwhelming.

Our first RAG chatbot was a custom LangChain application running on a virtual machine. It worked well enough as a proof of concept, but custom coding a new chatbot for every request was not scalable for a small team. So we got creative. We developed a framework that allowed us to deliver the same RAG environment to different users, managing access through Active Directory rather than building new solutions from scratch each time. We built a ticketing workflow to intake requests and streamlined provisioning so we could stand up new instances quickly.

And then the chatbots proliferated. Every division wanted one. Every program office saw a use case. We went from a handful to dozens in what felt like no time at all. Eventually, even our framework couldn’t keep up with the volume. That pressure is a big part of what drove our eventual migration to SimpleChat, a self-service platform I’ll get into later in this article.

Somewhere in the middle of all this, we had two ideas that I’m still a little embarrassed never materialized.
The first was an index of chatbots, a central registry so users could discover what already existed before requesting something new. The second, even more ambitious, was a chatbot of chatbots: a meta-assistant that could route queries to the right specialized bot. Both were good ideas. Neither was implemented. The pace of the underlying technology was shifting so fast that our roadmap kept getting rewritten before we could execute on it.

I suspect many enterprise AI teams will recognize that pattern. The planning horizon for generative AI capabilities is brutally short. What seemed like a six-month initiative in January was often obsolete by June.

The Policy Problem Nobody Wants to Talk About

Here’s something that doesn’t get enough attention in enterprise AI discussions: the policy gap. When generative AI arrived, most organizations, including ours, didn’t have policies that addressed it. Existing data governance frameworks weren’t designed for a technology that could ingest, synthesize, and generate text at scale. We recognized early that we needed to move on two fronts simultaneously: deploying technology and establishing the rules of engagement.

We authored our organization’s AI Strategy early in the generative AI wave, early enough that we were genuinely proud of the timing. We followed it with specific policy on the use of generative AI tools: what was authorized, what wasn’t, and under what conditions.

And then came the confusion. Despite our best efforts at communication, users struggled to understand what they were and weren’t allowed to do. The boundaries weren’t always intuitive. Was a particular third-party tool authorized? Could they paste sensitive data into an approved platform? What about a tool that was authorized for one use case but not another?

We learned an important lesson here: you cannot depend solely on policy and rules of system use as your governance mechanism.
Policy is necessary, absolutely, but in a world where AI capabilities are embedded into a constantly expanding set of tools, you need technological guardrails too. The problem is that those guardrails can’t be everywhere. You can’t put a filter on every application, every browser extension, every API call. So you end up with a hybrid approach where policy sets the boundaries, technology enforces what it can, and education fills the gaps. It’s imperfect. We’re still refining it. But I think any enterprise leader who tells you they’ve solved AI governance is either operating at a much smaller scale than they’re letting on or isn’t looking closely enough.

The Bureaucracy That Keeps You Safe (Even When It Doesn’t Feel Like It)

I’ll be honest: there were days when I resented the compliance process. Privacy threshold assessments. Security impact analyses. Authority-to-operate reviews. Each one felt like another weight on a team that was already sprinting. But over time I’ve come to appreciate that these controls exist for good reason. When you’re deploying AI capabilities to tens of thousands of users, the blast radius of a data exposure or a poorly governed model isn’t theoretical. It’s a front-page story.

One of the most impactful things we did was integrate with our organization’s software review board. We positioned our team as a service to the entire enterprise: every software request that included an AI capability came through us for data control review. This wasn’t about being gatekeepers. It was about making sure the people making procurement and deployment decisions had a clear picture of how each tool handled data: where it went, who could access it, and what happened to it after the session ended. It wasn’t always a popular role to play. But it was necessary.

Where We Are Now: Bedrock, Agents, and the Long Game

Today, our program looks dramatically different from where we started.
We’ve migrated much of our infrastructure to AWS, with Bedrock as the backbone for our most sophisticated AI capabilities. We’re supporting multi-agent architectures that are nearly autonomous. I say “nearly” because we maintain essential zero-trust and human-in-the-loop requirements. Full autonomy in our environment isn’t just inadvisable. It’s not an option.

Getting here required far more behind-the-scenes technical coordination than I expected, and this is probably my single biggest lesson learned.

The Infrastructure Iceberg

When I first took on this role, I was relatively unfamiliar with the broader IT organization in which my division resided. My instinct was to configure all of the underlying infrastructure with my own team: IAM roles, service permissions, network configurations. We were capable and motivated. We were also, it turns out, deeply naive about what “enterprise scale” actually means.

Standing up AWS services like SageMaker and Bedrock for one or two users is a fundamentally different challenge than doing it for tens or hundreds of teams, many with massive compute and throughput requirements. The IAM role configurations alone took weeks to perfect. Cost guardrails had to be architected so that individual teams couldn’t accidentally spin up resources that blew through budgets. Even the intake workflow, how teams requested access, how accounts were provisioned, how governance was applied, required its own dedicated engineering effort.

Over time, we did what we should have done from the beginning: we became close partners with our organization’s cloud services team. They had the expertise, the access, and the institutional knowledge that we lacked. This collaboration transformed our delivery capability. What had been a bottleneck became a pipeline. If I could go back and change one thing, it would be this: I would have walked down the hall and started that partnership on day one.
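To give a flavor of the IAM work described above, here is a small illustrative sketch, not our actual configuration: a helper that generates a least-privilege policy allowing a team to invoke only its approved Bedrock foundation models. The model ID is a made-up example; the policy grammar itself is standard AWS IAM.

```python
# Hypothetical sketch: per-team least-privilege policy for Bedrock model
# invocation. The model list below is illustrative, not our approved set.
import json

def bedrock_invoke_policy(model_ids: list[str]) -> dict:
    """Build an IAM policy allowing InvokeModel only on approved models."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            # Foundation-model ARNs are not account-scoped, hence the
            # empty account field in the ARN.
            "Resource": [
                f"arn:aws:bedrock:*::foundation-model/{m}" for m in model_ids
            ],
        }],
    }

policy = bedrock_invoke_policy(["anthropic.claude-3-sonnet-20240229-v1:0"])
print(json.dumps(policy, indent=2))
```

Generating policies from an approved-model list like this, rather than hand-editing JSON per team, is one way to keep dozens of team roles consistent as the list changes.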
The AI Use-Case Lifecycle

One of our most important innovations has been the development of a formal AI use-case lifecycle. We noticed a pattern early on: teams would get excited about AI, build a prototype, demo it to enthusiastic stakeholders, and then… nothing. The prototype would languish. It would never make it to production. The excitement would fade, and the investment would be wasted.

So we built a structured process to carry use cases from spark to sustainment. It starts with ideation and requirements gathering, where our team works alongside use-case developers to refine their concept and ensure it’s viable. From there, we provision technical resources, provide technical consultancy, and assist with data preparation and governance alignment. Development, testing, and validation follow, and we stay engaged through the entire arc. System rollout and transition to sustainment aren’t afterthoughts. They’re planned phases with defined handoffs. Post-deployment, we monitor model health, schedule retraining cycles, and track performance against the metrics that justified the use case in the first place.

This lifecycle changed how we operate. It’s the difference between an organization that experiments with AI and one that actually runs on it.

From Internal Assistant to Self-Service

Our tooling has evolved as well. We’ve migrated from our original internal AI assistant to the open-source “SimpleChat,” a platform that gives users self-service RAG capabilities, community workspaces, and a more flexible interaction model. The shift from centrally managed chatbots to user-empowered workspaces reflects something it took us a while to internalize: our job isn’t to build every AI solution ourselves. It’s to provide the platform, the guardrails, and the expertise so that domain experts across the organization can build their own.
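To make the sustainment phase of that lifecycle concrete, here is a minimal sketch of the kind of model-health check it implies: compare a rolling window of evaluation scores against the baseline metric that justified the use case, and flag the model for retraining when it drifts too far. The threshold, window size, and scores are hypothetical, not our production values.

```python
# Illustrative sustainment-phase check; tolerance and window are made-up
# defaults, not our production configuration.
def needs_retraining(recent_scores: list[float], baseline: float,
                     tolerance: float = 0.05, window: int = 30) -> bool:
    """Flag a model when the mean of its last `window` evaluation scores
    drops more than `tolerance` below the baseline metric."""
    recent = recent_scores[-window:]
    if not recent:
        return False  # no post-deployment data yet; nothing to flag
    return (baseline - sum(recent) / len(recent)) > tolerance

# A model holding near its 0.92 baseline is healthy...
print(needs_retraining([0.91] * 30, baseline=0.92))  # False
# ...while one that has drifted down to 0.80 gets flagged.
print(needs_retraining([0.80] * 30, baseline=0.92))  # True
```

The point is less the arithmetic than the habit: the metric that justified the use case during ideation is the same one monitored after rollout, which closes the loop between the first and last phases of the lifecycle.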
We’re also running pilot programs with AI-enabled coding platforms like GitHub Copilot, Replit, and Claude Code, and exploring the Google Workspace AI integrations. Each of these requires its own governance review, its own security assessment, and its own change management effort. The work doesn’t stop.

What I’d Tell You Over Coffee

If you’re an enterprise leader early in your AI journey, here’s what I’d want you to know.

The infrastructure is the iceberg. The AI models are the visible tip. Below the waterline is identity management, network security, cost governance, intake workflows, provisioning automation, and a dozen other things that nobody writes Medium articles about. Budget for them. Staff for them. Respect them.

Partner early with your infrastructure teams. Don’t try to build it all yourself. The cloud services team, the network team, the identity team: they’re not obstacles. The sooner you integrate with them, the faster you’ll move.

Policy alone won’t save you. You need policy and technology and education working together. Any one of those by itself is insufficient.

Build the lifecycle, not just the prototype. Anyone can demo an AI chatbot. The hard part is getting it to production, keeping it healthy, and making sure it delivers sustained value. If you don’t have a structured path from ideation to sustainment, you’re going to end up with a graveyard of proofs of concept.

Embrace the bureaucracy. Push to make it faster and better calibrated to the technology, sure. But don’t try to skip it. The controls exist because the risks are real, and at enterprise scale, the consequences of getting it wrong are significant.

And stay humble. The technology is moving faster than any of us can fully track. The strategy you write today will need to be revised in six months. The architecture you’re proud of right now will feel dated in a year. That’s not failure. That’s the nature of the work.

We’re still learning. Every week brings something we hadn’t anticipated.
But we’re learning from a position of operational maturity that we built through years of work that was rarely glamorous and almost never easy. I wouldn’t trade any of it.

Originally posted by u/whiskeyboarder on r/ArtificialInteligence