Over the last few months, we kept running into the same production issues with LLMs:
– Provider outages and partial degradation
– Silent retries multiplying cost
– Hard coupling to a single vendor
We built Perpetuo, a thin gateway that sits between your app and LLM providers.
It routes requests based on latency, cost and availability, applies automatic failover, and keeps billing predictable — all using your own API keys (no reselling, no lock-in).
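To make the routing idea concrete, here is a minimal sketch of latency/cost/availability-based selection with an ordered failover chain. The provider names, stats, and scoring weights are illustrative assumptions, not Perpetuo's actual algorithm or API.

```python
# Hypothetical provider stats; values and weights are illustrative only.
PROVIDERS = {
    "openai":    {"latency_ms": 420, "cost_per_1k": 0.010, "available": True},
    "anthropic": {"latency_ms": 380, "cost_per_1k": 0.015, "available": True},
    "mistral":   {"latency_ms": 250, "cost_per_1k": 0.004, "available": False},
}

def score(stats):
    # Lower is better: blend latency and cost (weights are arbitrary here).
    return stats["latency_ms"] * 0.5 + stats["cost_per_1k"] * 10_000

def route(providers):
    # Rank available providers by score; the rest of the list is the
    # failover chain, so an outage just drops a provider out of the ranking.
    ranked = sorted(
        (name for name, s in providers.items() if s["available"]),
        key=lambda name: score(providers[name]),
    )
    if not ranked:
        raise RuntimeError("no provider available")
    return ranked  # ranked[0] is primary, ranked[1:] are failovers

def complete(prompt, providers, send):
    # Try each provider in ranked order; fail over on any error rather
    # than silently retrying the same provider.
    last_err = None
    for name in route(providers):
        try:
            return name, send(name, prompt)
        except Exception as err:
            last_err = err
    raise last_err
```

The key design point a gateway like this enables: failover is explicit and ordered, so a provider outage degrades to the next-best option instead of multiplying retries against a dead endpoint.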
This is early, but it's already running in real workloads.
I’d really appreciate feedback from people running LLMs in production — especially what you’d expect from this kind of infrastructure layer.
Happy to answer any technical questions.