Supported Models

Set model: "auto" to let Laghav's ML router pick the cheapest capable model for each query, or specify any model ID below directly.

Always start with model: "auto"
The auto router achieves 89% cost savings by classifying each query's complexity (simple, translation, code, or complex) and routing it accordingly. Override it only when you have a specific reason.
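
As a rough illustration of the "auto first, override only when needed" advice, the sketch below picks the value for the model field. The helper name choose_model is hypothetical and not part of any Laghav SDK; the model IDs are the ones listed on this page, and any other request fields are omitted because they are not documented here.

```python
from typing import Optional

# Model IDs listed on this page.
SUPPORTED_MODELS = frozenset({
    "claude-haiku-3", "claude-sonnet-4", "claude-opus-4",
    "gpt-4o-mini", "gpt-4o",
    "gemini-1.5-flash", "gemini-1.5-pro",
})


def choose_model(override: Optional[str] = None) -> str:
    """Return "auto" unless a supported model ID is explicitly requested.

    Hypothetical client-side helper: it only validates the value that
    would go into the request's model field.
    """
    if override is None:
        return "auto"
    if override not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model ID: {override!r}")
    return override


# Default: let the router decide.
payload = {"model": choose_model()}
# Explicit override, for example when a task is known to need deep reasoning.
pinned = {"model": choose_model("claude-opus-4")}
```

Validating the override on the client is a design choice of this sketch; it catches typos in model IDs before a request is sent.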

Anthropic

| Model ID | Tier | Input cost / 1M tokens | Best for |
| --- | --- | --- | --- |
| claude-haiku-3 | Cheapest | $0.25 | FAQ, classification, simple Q&A, support bots |
| claude-sonnet-4 | Balanced | $3.00 | Code generation, analysis, structured data |
| claude-opus-4 | Most capable | $15.00 | Complex reasoning, multi-step agents, PhD-level tasks |

OpenAI

| Model ID | Tier | Input cost / 1M tokens | Best for |
| --- | --- | --- | --- |
| gpt-4o-mini | Cheapest | $0.15 | Fast classification, simple tasks |
| gpt-4o | Balanced | $5.00 | General purpose, vision, code |

Google

| Model ID | Tier | Input cost / 1M tokens | Best for |
| --- | --- | --- | --- |
| gemini-1.5-flash | Cheapest | $0.075 | Very long contexts, document analysis |
| gemini-1.5-pro | Balanced | $3.50 | Multimodal, 1M token contexts |
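
To compare models on price, the per-million rates in the tables above can be turned into a per-request estimate: input cost = input tokens / 1,000,000 × price per 1M tokens. The following is a minimal sketch; the function name estimate_input_cost is hypothetical, and it covers input tokens only because this page lists no output pricing.

```python
from decimal import Decimal

# Input prices in USD per 1M tokens, copied from the tables on this page.
INPUT_PRICE_PER_MILLION = {
    "claude-haiku-3": Decimal("0.25"),
    "claude-sonnet-4": Decimal("3.00"),
    "claude-opus-4": Decimal("15.00"),
    "gpt-4o-mini": Decimal("0.15"),
    "gpt-4o": Decimal("5.00"),
    "gemini-1.5-flash": Decimal("0.075"),
    "gemini-1.5-pro": Decimal("3.50"),
}


def estimate_input_cost(model: str, input_tokens: int) -> Decimal:
    """Estimate the input-token cost in USD for one request."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    if model not in INPUT_PRICE_PER_MILLION:
        raise ValueError(f"unknown model ID: {model!r}")
    price = INPUT_PRICE_PER_MILLION[model]
    return price * Decimal(input_tokens) / Decimal(1_000_000)


# The same 200,000-token prompt on the cheapest and most capable Anthropic tiers.
cheap = estimate_input_cost("claude-haiku-3", 200_000)
capable = estimate_input_cost("claude-opus-4", 200_000)
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift.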

Routing categories (auto mode)

The ML router (DistilBERT ONNX, 3.4 ms CPU latency) classifies every prompt into one of four categories and routes it to that category's default model:

| Category | Default model | Triggers |
| --- | --- | --- |
| simple | claude-haiku-3 | FAQ, greetings, yes/no questions, classification |
| translation | claude-haiku-3 | Language translation tasks |
| code | claude-sonnet-4 | Code generation, debugging, technical analysis |
| complex | claude-opus-4 | Multi-step reasoning, research, legal, medical |

Fallback behavior

If the ML router's confidence score is below 0.70 (for example, on an ambiguous query), Laghav falls back to a pattern-matching rule set. If the routing service is unavailable, requests default to claude-haiku-3.
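
The routing and fallback rules described above can be sketched as follows. This is an illustration, not Laghav's implementation: the regex patterns, the rule_based_category helper, the use of ConnectionError to signal an unavailable routing service, and the assumption that the rule set yields the same four categories are all hypothetical. Only the 0.70 threshold, the category defaults, and the claude-haiku-3 default come from this page.

```python
import re
from typing import Callable, Tuple

CONFIDENCE_THRESHOLD = 0.70               # from this page
SERVICE_DOWN_DEFAULT = "claude-haiku-3"   # from this page

# Category defaults from the routing table on this page.
DEFAULT_MODEL_BY_CATEGORY = {
    "simple": "claude-haiku-3",
    "translation": "claude-haiku-3",
    "code": "claude-sonnet-4",
    "complex": "claude-opus-4",
}

# Hypothetical patterns; the real rule set is not documented here.
FALLBACK_RULES = [
    (re.compile(r"\btranslate\b", re.IGNORECASE), "translation"),
    (re.compile(r"\b(def|class|traceback|debug)\b", re.IGNORECASE), "code"),
]


def rule_based_category(prompt: str) -> str:
    """Pattern-matching fallback; returning "simple" on no match is an assumption."""
    for pattern, category in FALLBACK_RULES:
        if pattern.search(prompt):
            return category
    return "simple"


def route(prompt: str, classify: Callable[[str], Tuple[str, float]]) -> str:
    """Pick a model ID given a classifier that returns (category, confidence)."""
    try:
        category, confidence = classify(prompt)
    except ConnectionError:
        # Routing service unavailable: use the documented default.
        return SERVICE_DOWN_DEFAULT
    if confidence < CONFIDENCE_THRESHOLD:
        # Low confidence: defer to the rule set instead of the ML label.
        category = rule_based_category(prompt)
    return DEFAULT_MODEL_BY_CATEGORY[category]
```

For example, route("Translate this to French", lambda p: ("complex", 0.5)) ignores the low-confidence ML label and resolves through the rule set instead.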