Agents can spend a lot of context on raw pytest, grep, git log, kubectl, pip install output, file reads, stack traces, etc., even though usually only a small block is actually relevant. I built a benchmark for task-conditioned tool-output pruning and fine-tuned Qwen 3.5 2B for it with Unsloth. The benchmark combines real SWE-bench-derived tool observations with synthetic multi-ecosystem examples.

Held-out test results:

- 86% recall
- 92% compression
- Beats other pruners and zero-shot models (+11 recall over zero-shot Qwen 3.5 35B A3B)

You can put squeez in front of tool output before the next reasoning step, or add it to something like CLAUDE.md as a lightweight preprocessing step. You can serve it with vLLM or any other OpenAI-compatible inference stack. Everything is open source; see the links below for details:
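As a rough illustration of the "pruner in front of tool output" setup: the sketch below sends a tool observation plus the current task to a squeez model served behind an OpenAI-compatible endpoint (e.g. vLLM). The endpoint URL, system prompt, and message format here are my assumptions for illustration, not the documented interface of the model.

```python
import json
import urllib.request

def build_prune_request(task: str, tool_output: str,
                        model: str = "KRLabsOrg/squeez-2b") -> dict:
    """Build an OpenAI-style chat.completions payload asking the pruner
    to keep only the spans of the tool output relevant to the task.
    (Prompt wording is a hypothetical placeholder.)"""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Extract only the parts of the tool output "
                        "relevant to the task."},
            {"role": "user",
             "content": f"Task: {task}\n\nTool output:\n{tool_output}"},
        ],
        "temperature": 0.0,
    }

def prune(task: str, tool_output: str,
          base_url: str = "http://localhost:8000/v1") -> str:
    """Call an OpenAI-compatible /chat/completions endpoint (assumed to be
    a vLLM server hosting the pruner) and return the pruned text."""
    payload = build_prune_request(task, tool_output)
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

An agent loop would call `prune(task, raw_output)` on each large observation before appending it to the context, passing only the compressed block to the next reasoning step.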
- paper: https://arxiv.org/abs/2604.04979
- model: https://huggingface.co/KRLabsOrg/squeez-2b
- dataset: https://huggingface.co/datasets/KRLabsOrg/tool-output-extraction-swebench
- code: https://github.com/KRLabsOrg/squeez
Originally posted by u/henzy123 on r/ClaudeCode
