Shimmy is a 4.8MB single-binary that provides 100% OpenAI-compatible endpoints for GGUF models. Point your existing AI tools to Shimmy and they just work — locally, privately, and free.
LLM: "Call get_expenses(employee_id=1)" → Returns 100 expense items to context LLM: "Call get_expenses(employee_id=2)" → Returns 100 more items to context ... (20 employees later) → 2,000+ line items ...