Performance

Defines the performance benchmark for an API: an overview of how performance is approached and what it means, backed by actual tests, results, and other evidence that performance is taken seriously.

Also known as: Latency, Throughput, SLO, Benchmarks

Example

- type: X-Performance
  url: https://developers.example.com/performance

Standards

Server-Timing header (W3C)
OpenTelemetry HTTP semantic conventions (OpenTelemetry / CNCF)
OpenTelemetry Metrics specification (OpenTelemetry / CNCF)
Prometheus exposition format (Prometheus / CNCF)
Apdex (Apdex Alliance)
Core Web Vitals (Google, web measurement)

HTTP Headers

Header	Direction	Spec	Description
`Server-Timing`	response	W3C Server-Timing	Communicates server-side timing metrics to clients.
`Timing-Allow-Origin`	response	W3C Resource Timing Level 2	Permits cross-origin readers to access detailed timing values.
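
To make the `Server-Timing` row concrete, here is a minimal Python sketch that builds and reads a header value in the `name;dur=...;desc="..."` form. The function names and the metric names `db` and `cache` are illustrative assumptions, not part of any library, and the parser is deliberately simplified: it does not handle quoted descriptions containing commas or semicolons.

```python
def format_server_timing(metrics):
    """Build a Server-Timing value from (name, duration_ms, description) tuples."""
    entries = []
    for name, duration_ms, description in metrics:
        entry = f"{name};dur={duration_ms:.1f}"
        if description:
            # Assumes the description contains no double quotes or backslashes.
            entry += f';desc="{description}"'
        entries.append(entry)
    return ", ".join(entries)


def parse_server_timing(header_value):
    """Return {metric_name: duration_ms or None} from a Server-Timing value.

    Simplified: splits on commas and semicolons, so quoted descriptions
    that contain those characters are not supported.
    """
    durations = {}
    for entry in header_value.split(","):
        fields = [field.strip() for field in entry.split(";")]
        name = fields[0]
        if not name:
            continue
        params = {}
        for field in fields[1:]:
            key, _, value = field.partition("=")
            params[key.strip().lower()] = value.strip().strip('"')
        durations[name] = float(params["dur"]) if "dur" in params else None
    return durations


# Hypothetical metric names, used only for illustration.
header = format_server_timing([("db", 53.2, "Database"), ("cache", 1.0, None)])
print(header)
print(parse_server_timing(header))
```

A server would attach the formatted value to its response, and a client or test harness could use the parser to record server-side durations alongside its own measured latency.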

Media Types

application/json — Benchmark result documents and SLO definitions.
text/plain — Prometheus text exposition format for scraping performance metrics (served as `text/plain; version=0.0.4`).
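
As a rough illustration of the `text/plain` case, the sketch below renders one counter in the Prometheus text exposition format. The metric name `http_requests_total`, the labels, and the function names are assumptions chosen for the example; a real service would normally use an official Prometheus client library rather than hand-rolled formatting.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name, help_text, samples):
    """Render a counter as exposition text.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_text = ",".join(
            f'{key}="{escape_label_value(str(val))}"'
            for key, val in sorted(labels.items())
        )
        if label_text:
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    # The format is line-oriented; the last line ends with a newline too.
    return "\n".join(lines) + "\n"


# Hypothetical metric and label values, for illustration only.
body = render_counter(
    "http_requests_total",
    "Total HTTP requests.",
    [({"operation": "listOrders", "status": "200"}, 1027)],
)
print(body)
```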

OpenAPI Expression

x-performance (Vendor extension) — Points to benchmark reports, SLOs, or load-test artifacts for the API.
info.x-sla (Vendor extension) — Custom marker linking the description to a published SLA/SLO document.
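
One possible shape for these extensions is sketched below in Python, with the OpenAPI document written as a dict for brevity. Everything about the shape is an assumption: that `x-performance` sits at the document root as an object with a `url` key, that `info.x-sla` is a plain URL string, and the SLA URL itself is hypothetical. The helper `performance_links` is likewise illustrative.

```python
# Hypothetical OpenAPI document; the placement and shape of both
# extensions are assumptions made for this example.
spec = {
    "openapi": "3.1.0",
    "info": {
        "title": "Example API",
        "version": "1.0.0",
        "x-sla": "https://developers.example.com/sla",  # hypothetical URL
    },
    "x-performance": {"url": "https://developers.example.com/performance"},
    "paths": {},
}


def performance_links(document):
    """Collect the performance-related links declared via vendor extensions."""
    links = {}
    performance = document.get("x-performance")
    if isinstance(performance, dict) and "url" in performance:
        links["x-performance"] = performance["url"]
    sla = document.get("info", {}).get("x-sla")
    if isinstance(sla, str):
        links["info.x-sla"] = sla
    return links


print(performance_links(spec))
```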

Governance Rules

oas-tag-description (Spectral built-in) — Requires every tag to carry a description; use those descriptions to state performance-relevant grouping (read-heavy vs. write-heavy).
operation-operationId (Spectral built-in) — Requires every operation to define an operationId; keep those IDs stable so benchmarks can be correlated with the spec over time.
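
A lint rule can confirm that an operationId exists, but not that it stays the same between releases. The sketch below shows one way that gap might be covered: it compares two versions of an OpenAPI document (as Python dicts) and reports operationIds that disappeared. The function names and sample documents are hypothetical, and this is not a Spectral feature.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def operation_ids(document):
    """Return the set of operationIds declared under paths."""
    ids = set()
    for path_item in document.get("paths", {}).values():
        for method, operation in path_item.items():
            if method in HTTP_METHODS and isinstance(operation, dict):
                if "operationId" in operation:
                    ids.add(operation["operationId"])
    return ids


def removed_operation_ids(old_document, new_document):
    """List operationIds present in the old document but missing in the new one."""
    return sorted(operation_ids(old_document) - operation_ids(new_document))


# Hypothetical before/after documents: getOrder was renamed to fetchOrder,
# which would orphan any benchmark history keyed on getOrder.
old = {"paths": {"/orders": {"get": {"operationId": "listOrders"}},
                 "/orders/{id}": {"get": {"operationId": "getOrder"}}}}
new = {"paths": {"/orders": {"get": {"operationId": "listOrders"}},
                 "/orders/{id}": {"get": {"operationId": "fetchOrder"}}}}
print(removed_operation_ids(old, new))
```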

Risk & Compliance

Security: Performance instrumentation can leak internal topology via Server-Timing or trace identifiers; scrub upstream service names before exposing externally. Unbounded request shapes (large page sizes, deep filters) are both a performance and DoS risk.
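
Both mitigations can be sketched in a few lines of Python. The allowlist of internal-to-public metric names, the page-size limits, and the function names are all hypothetical; the Server-Timing handling is simplified and assumes descriptions contain no commas or semicolons.

```python
# Hypothetical mapping from internal metric names to names safe to publish.
PUBLIC_NAMES = {"orders-db-primary": "db", "redis-cache-eu1": "cache"}

# Hypothetical limits; pick values that match the API's own SLOs.
DEFAULT_PAGE_SIZE = 25
MAX_PAGE_SIZE = 100


def scrub_server_timing(header_value):
    """Keep only allowlisted metrics, rename them, and drop desc parameters."""
    public_entries = []
    for entry in header_value.split(","):
        fields = [field.strip() for field in entry.split(";")]
        alias = PUBLIC_NAMES.get(fields[0])
        if alias is None:
            continue  # Unknown metric: drop it rather than risk a leak.
        durations = [f for f in fields[1:] if f.lower().startswith("dur=")]
        public_entries.append(";".join([alias] + durations))
    return ", ".join(public_entries)


def clamp_page_size(requested):
    """Bound a client-supplied page size to the range 1..MAX_PAGE_SIZE."""
    if requested is None:
        return DEFAULT_PAGE_SIZE
    return max(1, min(requested, MAX_PAGE_SIZE))


internal = 'orders-db-primary;dur=53.2;desc="pg-7.internal", auth-svc;dur=4'
print(scrub_server_timing(internal))
print(clamp_page_size(5000))
```

Using an allowlist rather than a blocklist means a newly added upstream service stays hidden until someone explicitly decides to expose it.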

Tools

k6 — Load testing (AGPL-3.0)
Locust — Load testing (MIT)
Apache JMeter — Load testing (Apache-2.0)
Artillery — Load testing
Prometheus — Metrics / monitoring (Apache-2.0)
Grafana — Observability dashboards (AGPL-3.0)

Suggested Metrics

request_latency_p50_ms — Median end-to-end request latency.
request_latency_p95_ms — 95th-percentile latency; common SLO anchor.
request_latency_p99_ms — 99th-percentile latency; captures tail behavior.
request_rate — Requests per second per operation (RED — Rate).
error_rate — Share of responses with a 5xx status code (RED — Errors).
saturation — Degree to which bottleneck resources have more work than they can service, such as CPU run-queue length, request queue depth, or connection-pool waits (USE — Saturation).
apdex_score — Apdex score against a defined target latency T.
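
Several of these metrics can be computed directly from raw samples. The Python sketch below uses the nearest-rank method for percentiles (one of several common definitions, chosen here as an assumption) and the standard Apdex formula: satisfied requests (latency at or below T) count fully, tolerating requests (above T, up to 4T) count half. The sample latencies and status codes are made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def apdex(latencies_ms, target_ms):
    """Apdex = (satisfied + tolerating / 2) / total, for target latency T."""
    satisfied = sum(1 for x in latencies_ms if x <= target_ms)
    tolerating = sum(1 for x in latencies_ms if target_ms < x <= 4 * target_ms)
    return (satisfied + tolerating / 2) / len(latencies_ms)


def error_rate(status_codes):
    """Share of responses whose status code is in the 5xx range."""
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


# Made-up samples, for illustration only.
latencies_ms = [80, 95, 110, 120, 150, 200, 240, 400, 900, 2100]
statuses = [200, 200, 503, 404]

print("request_latency_p50_ms", percentile(latencies_ms, 50))
print("request_latency_p95_ms", percentile(latencies_ms, 95))
print("request_latency_p99_ms", percentile(latencies_ms, 99))
print("apdex_score", apdex(latencies_ms, target_ms=200))
print("error_rate", error_rate(statuses))
```

With only ten samples, p95 and p99 both land on the slowest request, which is a reminder that tail percentiles need a large sample to be meaningful.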

Example Implementations

Cloudflare — Publishes Server-Timing values and network performance dashboards.
Fastly — Real-time stats API exposing edge performance metrics.
Stripe — Public status and historical latency reporting for API operations.
GitHub — Publishes REST and GraphQL rate limits and a status incident history.