Original Reddit post

We just published our research on what we’re calling “Machine Learning as a Tool” (MLAT) - a design pattern for integrating statistical ML models directly into LLM agent workflows as callable tools.

The Problem: Traditional AI systems treat ML models as separate preprocessing steps. But what if we could make them first-class tools that LLM agents invoke contextually, just like web search or database queries?

Our Solution - PitchCraft: We built this for the Google Gemini Hackathon to solve our own problem (manually writing proposals took 3+ hours). The system:

  • Analyzes discovery call recordings
  • Runs a Research Agent that performs parallel tool calls for prospect intelligence
  • Runs a Draft Agent that invokes an XGBoost pricing model as a tool call
  • Generates complete professional proposals via structured output parsing
  • Result: 3+ hours → under 10 minutes

Technical Highlights:
  • XGBoost trained on just 70 examples (40 real + 30 synthetic) with R² = 0.807
  • 10:1 sample-to-feature ratio under extreme data scarcity
  • Group-aware cross-validation to prevent data leakage
  • Sensitivity analysis showing economically meaningful feature relationships
  • Two-agent workflow with structured JSON schema output

Why This Matters: We think MLAT has broad applicability to any domain requiring quantitative estimation + contextual reasoning. Instead of building traditional ML pipelines, you can now embed statistical models directly into conversational workflows.

Links:
  • Full paper: Zenodo, ResearchGate

Would love to hear thoughts on the pattern and potential applications!

submitted by /u/okay_whateveer
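For anyone wondering what "ML model as a callable tool" might look like in code, here is a minimal sketch of the general pattern, not the PitchCraft implementation. Everything named here is an assumption for illustration: the `Tool` dataclass, the `predict_price` function, its two features (`scope_hours`, `num_integrations`), and the placeholder linear formula that stands in for a trained XGBoost regressor. The idea is that the agent emits a JSON tool call, a dispatcher routes it to the model, and the prediction comes back as JSON the agent can reason over.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    """A callable tool plus the schema an LLM agent would see (hypothetical shape)."""
    name: str
    description: str
    parameters: Dict[str, Any]  # JSON-schema-style description of the arguments
    fn: Callable[..., Dict[str, Any]]


def predict_price(scope_hours: float, num_integrations: int) -> Dict[str, Any]:
    """Hypothetical stand-in for a trained regressor.

    A real tool would build a feature vector and call the model's predict
    method; the linear formula below is a placeholder, not a fitted model.
    """
    estimate = 100.0 * scope_hours + 500.0 * num_integrations
    return {"estimated_price": round(estimate, 2), "currency": "USD"}


PRICING_TOOL = Tool(
    name="predict_price",
    description="Estimate a project price from structured scope features.",
    parameters={
        "type": "object",
        "properties": {
            "scope_hours": {"type": "number"},
            "num_integrations": {"type": "integer"},
        },
        "required": ["scope_hours", "num_integrations"],
    },
    fn=predict_price,
)

# Registry the agent runtime would consult when the LLM requests a tool.
TOOLS = {PRICING_TOOL.name: PRICING_TOOL}


def dispatch(tool_call_json: str) -> str:
    """Route a JSON tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]
    result = tool.fn(**call["arguments"])
    return json.dumps(result)


if __name__ == "__main__":
    request = json.dumps(
        {"name": "predict_price",
         "arguments": {"scope_hours": 40, "num_integrations": 2}}
    )
    print(dispatch(request))
```

The design point is that the model sits behind the same interface as web search or a database query, so the agent decides when to call it and the numeric estimate comes from the model rather than from the LLM's own guess.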

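On the group-aware cross-validation point, here is a rough pure-Python sketch of the idea, under assumptions the post does not spell out. It assumes each example carries a group label (for instance, a synthetic example sharing a label with the real example it was derived from; that grouping is a guess, not something stated above) and that no group may appear on both sides of a split. The `group_k_fold` helper is hypothetical; scikit-learn's `GroupKFold` offers the same kind of split for real use.

```python
from collections import defaultdict
from typing import Hashable, Iterator, List, Sequence, Tuple


def group_k_fold(
    groups: Sequence[Hashable], n_splits: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) with no group shared across the two."""
    # Collect the row indices belonging to each group.
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)

    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of distinct groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest rows.
    folds: List[List[int]] = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda g: len(by_group[g]), reverse=True):
        smallest = min(folds, key=len)
        smallest.extend(by_group[g])

    # Each fold serves once as the test set; the rest form the training set.
    for k in range(n_splits):
        test_idx = sorted(folds[k])
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield train_idx, test_idx


if __name__ == "__main__":
    # Hypothetical labels: rows with the same letter belong together.
    labels = ["a", "a", "b", "b", "b", "c", "d", "d"]
    for train, test in group_k_fold(labels, n_splits=3):
        print("train:", train, "test:", test)
```

Splitting by group rather than by row matters most with tiny datasets: if near-duplicate rows land on both sides of a split, the validation score can look better than the model really is.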
Originally posted by u/okay_whateveer on r/ArtificialInteligence