v2026.4.2: Kimi K2.5 routes to Moonshot direct
Kimi K2.5 now defaults to Moonshot direct after Fireworks capacity issues caused 429 'service is overloaded' errors. Fireworks remains available when explicitly requested.

Reliability fix
Users were seeing 429 too_many_requests and 504 timeout errors on kimi-k2.5 because the Fireworks endpoint for K2.5 has been capacity-constrained, returning "Request didn't generate first token before the given deadline, the service is overloaded".
We've marked the Fireworks mapping for kimi-k2.5 as unstable, which removes it from automatic routing. Auto-selected kimi-k2.5 requests now go to Moonshot direct (also 262,144 token context, same pricing).
Fireworks K2.5 remains usable if you explicitly pin it via ?provider=fireworks or the provider field, and we'll re-enable it in auto-routing once Fireworks capacity is restored.
Provider timeout raised to 5 minutes
The gateway's per-request provider timeout was bumped from 90s → 300s so long reasoning calls (Kimi K2.5 thinking, large qwen3p6-plus jobs, etc.) finish instead of being killed with a 504. Most requests still complete in under 30 seconds; this only changes the ceiling.