Section 3 — Latency at the Edge 07 / 12

Inference benchmark

The model is fast. The network is slow.

Source: Cirrus internal benchmark, p99 over 10K requests, 70B-parameter model, Q4 2024
