Performance
Performance defines the benchmark for an API: it gives an overview of how performance is approached and what it means, and backs that up with actual tests, results, and other evidence that performance is taken seriously.
Also known as: Latency, Throughput, SLO, Benchmarks
Example
Standards
- Server-Timing header (W3C)
- OpenTelemetry HTTP semantic conventions (OpenTelemetry / CNCF)
- OpenTelemetry Metrics specification (OpenTelemetry / CNCF)
- Prometheus exposition format (Prometheus / CNCF)
- Apdex (Apdex Alliance)
- Core Web Vitals (Google, web measurement)
HTTP Headers
| Header | Direction | Spec | Description |
|---|---|---|---|
| Server-Timing | response | W3C Server-Timing | Communicates server-side timing metrics to clients. |
| Timing-Allow-Origin | response | W3C Resource Timing Level 2 | Permits cross-origin readers to access detailed timing values. |
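As a minimal sketch of what a Server-Timing response value can look like, the helper below joins per-metric entries using the `dur` and `desc` parameters from the W3C syntax. The function name `format_server_timing` and the metric names (`db`, `cache`, `miss`) are illustrative assumptions, not part of any library, and the sketch assumes metric names are already valid HTTP tokens.

```python
from typing import Iterable, Optional, Tuple

# (metric name, duration in milliseconds or None, description or None)
Metric = Tuple[str, Optional[float], Optional[str]]


def format_server_timing(metrics: Iterable[Metric]) -> str:
    """Render metrics as a Server-Timing header value (hypothetical helper)."""
    parts = []
    for name, dur_ms, desc in metrics:
        entry = name  # assumed to be a valid token: no spaces, quotes, or separators
        if dur_ms is not None:
            entry += f";dur={dur_ms:.1f}"
        if desc is not None:
            # Escape backslashes and quotes so desc stays a valid quoted-string.
            escaped = desc.replace("\\", "\\\\").replace('"', '\\"')
            entry += f';desc="{escaped}"'
        parts.append(entry)
    return ", ".join(parts)


header_value = format_server_timing([
    ("db", 53.2, "Database query"),
    ("cache", 0.9, None),
    ("miss", None, None),
])
print("Server-Timing:", header_value)
```

A cross-origin page can read these values through the browser timing APIs only if the response also carries a Timing-Allow-Origin header that permits its origin.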
Media Types
- application/json — Benchmark result documents and SLO definitions.
- text/plain — Prometheus exposition format for scraping performance metrics.
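To make the text/plain case concrete, here is a small sketch that renders one gauge in the Prometheus text exposition format using only the standard library. The helper `render_gauge`, the `operation` label, and the sample value are assumptions chosen for illustration; a real service would more likely rely on an official Prometheus client library.

```python
from typing import Dict, Iterable, Tuple


def _escape_label(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(
    name: str,
    help_text: str,
    samples: Iterable[Tuple[Dict[str, str], float]],
) -> str:
    """Render a gauge metric as Prometheus text exposition lines (hypothetical helper)."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(
                f'{key}="{_escape_label(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    # The format is line-oriented; the body ends with a newline.
    return "\n".join(lines) + "\n"


body = render_gauge(
    "request_latency_p95_ms",
    "95th-percentile request latency in milliseconds.",
    [({"operation": "listOrders"}, 182.5)],  # illustrative operationId and value
)
print(body, end="")
```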
OpenAPI Expression
- x-performance (Vendor extension) — Points to benchmark reports, SLOs, or load-test artifacts for the API.
- info.x-sla (Vendor extension) — Custom marker linking the description to a published SLA/SLO document.
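The two extensions above are vendor-defined, so their inner shape is up to the publisher. The sketch below assumes one possible shape (an `url` key under `info.x-sla`, and a flat mapping of artifact kind to URL under `x-performance`) and shows how tooling might collect the links from a parsed OpenAPI document. Every key inside the extensions, the example.com URLs, and the `performance_links` helper are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# A parsed OpenAPI document as a plain dict; the extension contents are assumed.
spec: Dict[str, Any] = {
    "openapi": "3.1.0",
    "info": {
        "title": "Orders API",
        "version": "1.0.0",
        "x-sla": {"url": "https://example.com/sla"},
    },
    "x-performance": {
        "slo": "https://example.com/slo.json",
        "benchmarks": "https://example.com/benchmarks/latest.json",
    },
    "paths": {},
}


def performance_links(document: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Collect (kind, url) pairs from the assumed x-sla and x-performance shapes."""
    links: List[Tuple[str, str]] = []
    sla = document.get("info", {}).get("x-sla", {})
    if "url" in sla:
        links.append(("sla", sla["url"]))
    # Sort for a stable order regardless of how the document was authored.
    for kind, url in sorted(document.get("x-performance", {}).items()):
        links.append((kind, url))
    return links


links = performance_links(spec)
for kind, url in links:
    print(f"{kind}: {url}")
```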
Governance Rules
- oas-tag-description (Spectral built-in) — Tags should describe performance-relevant grouping (read-heavy vs. write-heavy).
- operation-operationId (Spectral built-in) — Stable operationIds are required to correlate benchmarks with the spec over time.
Risk & Compliance
Security: Performance instrumentation can leak internal topology via Server-Timing or trace identifiers; scrub upstream service names before exposing externally. Unbounded request shapes (large page sizes, deep filters) are both a performance and DoS risk.
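Both mitigations can be sketched in a few lines. The example below maps internal upstream names to coarse public aliases before any timing data is exposed, and clamps a client-supplied page size to a fixed ceiling. The host names, the alias table, and the limits (25 and 100) are assumptions for illustration only.

```python
from typing import Dict, Iterable, List, Optional, Tuple

DEFAULT_PAGE_SIZE = 25   # assumed default
MAX_PAGE_SIZE = 100      # assumed ceiling

# Hypothetical internal names mapped to names that are safe to publish.
PUBLIC_NAMES: Dict[str, str] = {
    "orders-db-primary.internal": "db",
    "redis-7.cache.internal": "cache",
}


def clamp_page_size(requested: Optional[int]) -> int:
    """Bound a client-supplied page size so one request cannot ask for unbounded work."""
    if requested is None or requested < 1:
        return DEFAULT_PAGE_SIZE
    return min(requested, MAX_PAGE_SIZE)


def scrub_metrics(metrics: Iterable[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sum durations under public aliases; unknown upstreams collapse into 'upstream'."""
    totals: Dict[str, float] = {}
    for internal_name, duration_ms in metrics:
        public = PUBLIC_NAMES.get(internal_name, "upstream")
        totals[public] = totals.get(public, 0.0) + duration_ms
    return sorted(totals.items())


scrubbed = scrub_metrics([
    ("orders-db-primary.internal", 40.0),
    ("billing-svc.internal", 12.5),
    ("redis-7.cache.internal", 1.5),
])
print(scrubbed)
print(clamp_page_size(5000))
```

Using an allowlist rather than a blocklist means a newly added upstream service is hidden by default instead of leaked by default.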
Tools
- k6 — Load testing (AGPL-3.0)
- Locust — Load testing (MIT)
- Apache JMeter — Load testing (Apache-2.0)
- Artillery — Load testing
- Prometheus — Metrics / monitoring (Apache-2.0)
- Grafana — Observability dashboards (AGPL-3.0)
Suggested Metrics
- request_latency_p50_ms — Median end-to-end request latency.
- request_latency_p95_ms — 95th-percentile latency; common SLO anchor.
- request_latency_p99_ms — 99th-percentile latency; captures tail behavior.
- request_rate — Requests per second per operation (RED — Rate).
- error_rate — Share of responses with 5xx (RED — Errors).
- saturation — Utilization of bottleneck resources — CPU, queue depth, connections (USE — Saturation).
- apdex_score — Apdex score against a defined target latency T.
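As a rough illustration of how several of these metrics can be derived from raw samples, the sketch below computes nearest-rank percentiles, the 5xx error rate, and an Apdex score, where requests at or under T count as satisfied and those at or under 4T count as tolerating. The sample latencies, status codes, and the target T = 100 ms are made-up inputs, and nearest-rank is only one of several percentile definitions in use.

```python
import math
from typing import Sequence


def percentile(sorted_values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def error_rate(status_codes: Sequence[int]) -> float:
    """Share of responses with a 5xx status."""
    return sum(1 for s in status_codes if 500 <= s <= 599) / len(status_codes)


def apdex(latencies_ms: Sequence[float], target_ms: float) -> float:
    """Apdex = (satisfied + tolerating / 2) / total, with tolerating up to 4T."""
    satisfied = sum(1 for v in latencies_ms if v <= target_ms)
    tolerating = sum(1 for v in latencies_ms if target_ms < v <= 4 * target_ms)
    return (satisfied + tolerating / 2) / len(latencies_ms)


# Illustrative samples, not measurements.
latencies = sorted([80, 95, 100, 110, 120, 150, 200, 250, 400, 900])
statuses = [200] * 8 + [500, 503]

p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
p99 = percentile(latencies, 99)
errors = error_rate(statuses)
score = apdex(latencies, target_ms=100)
print(p50, p95, p99, errors, score)
```

With only ten samples, p95 and p99 both land on the slowest request, which is one reason tail percentiles need a large sample window to be meaningful.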
Example Implementations
- Cloudflare — Publishes Server-Timing values and network performance dashboards.
- Fastly — Real-time stats API exposing edge performance metrics.
- Stripe — Public status and historical latency reporting for API operations.
- GitHub — Published REST and GraphQL rate limits and status incident history.
Related Properties
Tags
- Performance
- Load Testing
- Latency