Routing and Provider Policy
Plan how requests are pinned to approved providers without losing control of cost or behavior.
Why routing matters
As soon as a team uses more than one provider, routing stops being an implementation detail and becomes part of product quality. The same prompt can have different cost, speed, or modality support depending on where it lands.
Common routing goals
- keep premium models for the workloads that truly need them
- keep provider policy clear when a model is available from more than one upstream
- reduce waste on evaluation and background jobs
- keep billing easy to explain across multiple model families
Current upstream selection
Newly connected models are pinned by exact versioned model name. Cheap Model no longer exposes family shortcuts such as gpt, claude, or gemini.
For example, gpt-5.5, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, gemini-3.1-pro-preview, gemini-2.5-pro, gpt-image-2, seedance-2.0, and suno are pinned to EvoLink.
The gpt-5, gpt-5-mini, gemini-2.5-flash, Wan 2.7 image, and Wan 2.7 video routes are pinned to APIMart. For any of these mapped models, Cheap Model does not automatically retry a failed request on another provider.
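The pinning described above can be pictured as a static lookup table keyed by exact model name. The following is a minimal sketch under stated assumptions, not Cheap Model's actual implementation: `PINNED_ROUTES`, `UnknownModelError`, and `resolve_provider` are hypothetical names, and the Wan 2.7 routes are left out because the text does not give their exact identifiers.

```python
# Hypothetical routing table: exact versioned model name -> pinned provider.
# The entries mirror the examples in the text; the table name is an assumption.
PINNED_ROUTES = {
    "gpt-5.5": "EvoLink",
    "claude-sonnet-4-5-20250929": "EvoLink",
    "claude-haiku-4-5-20251001": "EvoLink",
    "gemini-3.1-pro-preview": "EvoLink",
    "gemini-2.5-pro": "EvoLink",
    "gpt-image-2": "EvoLink",
    "seedance-2.0": "EvoLink",
    "suno": "EvoLink",
    "gpt-5": "APIMart",
    "gpt-5-mini": "APIMart",
    "gemini-2.5-flash": "APIMart",
}


class UnknownModelError(KeyError):
    """Raised when a name is not an exact registered model name."""


def resolve_provider(model: str) -> str:
    """Return the pinned provider for an exact model name.

    The lookup is exact-match only, so a family shortcut such as "gpt"
    is rejected instead of being expanded to some default model.
    """
    try:
        return PINNED_ROUTES[model]
    except KeyError:
        raise UnknownModelError(model) from None
```

With a table like this, `resolve_provider("gpt-5")` returns `"APIMart"`, while `resolve_provider("gpt")` raises `UnknownModelError` because no family shortcut is registered.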
Why this is not automatic provider retry
For these exact models, the provider is chosen before a request is made, not at request time. A request either goes to the mapped upstream or fails with that upstream's error. This keeps quality, billing, and task IDs easier to explain.
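The "mapped upstream or fail" behavior can be sketched as a dispatch function with no fallback branch. This is an illustration only: `dispatch`, `UpstreamClient`, and the shape of the `routes` and `clients` arguments are assumptions, not Cheap Model's real interfaces.

```python
from typing import Callable, Dict

# Hypothetical type: an upstream client is any callable that takes a prompt
# and either returns a response string or raises an exception.
UpstreamClient = Callable[[str], str]


def dispatch(
    model: str,
    prompt: str,
    routes: Dict[str, str],
    clients: Dict[str, UpstreamClient],
) -> str:
    """Send the request to the one pinned upstream and never try another."""
    provider = routes[model]  # the mapping exists before the request arrives
    # No try/except here: an upstream error reaches the caller unchanged,
    # so the failure is attributable to exactly one provider.
    return clients[provider](prompt)
```

Leaving out the fallback branch is the design choice: each request touches one provider, so a failure, a charge, or a task ID can always be traced to a single upstream.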
If you want to add automatic provider retry later, treat it as a separate product policy and define which models are allowed to move between providers.
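If such a policy were added, one way to keep it explicit is an allowlist that names, per model, the providers a request may move to. The sketch below is hypothetical throughout: `RETRY_ALLOWED`, `candidate_providers`, and the placeholder names `example-model` and `ProviderB` are illustrations, and no such policy exists in the current behavior.

```python
from typing import Dict, List

# Hypothetical policy table: model -> providers it may fall back to, in order.
# A model that is absent from this table stays strictly pinned.
RETRY_ALLOWED: Dict[str, List[str]] = {
    "example-model": ["ProviderB"],  # placeholder entry, not a real route
}


def candidate_providers(model: str, pinned: str) -> List[str]:
    """Return the pinned provider first, then any explicitly allowed alternates."""
    return [pinned] + RETRY_ALLOWED.get(model, [])
```

Under this sketch a model without an entry yields a one-element list, so strict pinning remains the default and every exception is a visible line in the policy table.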
When routing should stay strict
Strict routing is better when you need stable quality, predictable pricing, or very specific modality support. That is the current behavior for registered models.
Operational checklist
- define the approved provider for each workload
- decide which models are allowed to use APIMart-only routes
- review logs and billing often enough to catch drift from the approved routes
- treat routing rules as product policy, not just infrastructure
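The log review item in the checklist above can be partly automated by comparing logged requests against the approved table. This is a minimal sketch that assumes each log record is a dictionary with `model` and `provider` keys; that record shape and the function name `find_drift` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def find_drift(
    records: Iterable[Dict[str, str]],
    approved: Dict[str, str],
) -> Counter:
    """Count (model, provider) pairs that do not match the approved table.

    A model missing from `approved` is also counted as drift, because it
    means traffic is reaching a route nobody signed off on.
    """
    drift: Counter = Counter()
    for rec in records:
        model, provider = rec["model"], rec["provider"]
        if approved.get(model) != provider:
            drift[(model, provider)] += 1
    return drift
```

An empty result means every logged request went to its approved provider; any non-empty result is a prompt to review the routing policy, not only the infrastructure.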