Checklist | 8 min read | Intermediate

Cloudflare Load Balancing checklist

Use this checklist to prepare Cloudflare Load Balancing before production traffic changes. It covers origin inventory, pool design, health checks, steering policy, session affinity, failover testing, alerting, analytics, rollback, and managed operations handover.

Step-by-step checklist

1. Inventory origins per application: addresses or hostnames, regions, capacity, and which are active versus standby, plus the hostnames that will sit on Load Balancing.
2. Design origin pools: group origins into pools, set per-origin weights, define minimum healthy origins per pool, and decide pool ordering for failover.
3. Configure health checks (monitors) per pool: protocol, path, expected status, interval, timeout, retries, and the check region(s). Confirm the origin firewall admits the health-check probes.
4. Choose a steering policy that matches the goal: off (failover/priority), random, dynamic (latency) steering, geo steering, or proximity steering. Document the reason for each load balancer.
5. Decide session affinity: whether requests must stick to an origin (cookie-based) and for how long. Confirm the application's session model needs it before enabling.
6. Test failover deliberately: mark an origin or pool unhealthy and confirm that traffic shifts to the next pool, healthy origins absorb the load, and recovery re-adds origins as expected.
7. Wire monitoring and alerting: pool/origin health notifications, Load Balancing analytics, and Logpush. Define who is paged when a pool goes unhealthy.
8. Plan cutover and rollback: stage with low DNS TTLs, validate steering and affinity on a test hostname, and keep a documented revert to the prior routing before shifting production traffic.
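
The pool design from steps 1 and 2 can be checked mechanically before anything is configured. The sketch below is a minimal illustration, not a Cloudflare API client: the `Origin` and `Pool` classes, the `min_healthy` field, and the assumed weight range of 0.0 to 1.0 are hypothetical names and assumptions chosen for this example, and the addresses come from the documentation range.

```python
from dataclasses import dataclass, field


@dataclass
class Origin:
    name: str
    address: str
    weight: float = 1.0   # relative share of pool traffic (assumed range 0.0 to 1.0)
    enabled: bool = True


@dataclass
class Pool:
    name: str
    origins: list = field(default_factory=list)
    min_healthy: int = 1  # pool is treated as healthy at or above this count


def validate_pool(pool):
    """Return a list of design problems; an empty list means none were found."""
    problems = []
    enabled = [o for o in pool.origins if o.enabled]
    if not enabled:
        problems.append(f"{pool.name}: no enabled origins")
    elif pool.min_healthy > len(enabled):
        problems.append(
            f"{pool.name}: min_healthy {pool.min_healthy} exceeds "
            f"{len(enabled)} enabled origins"
        )
    if pool.min_healthy < 1:
        problems.append(f"{pool.name}: min_healthy must be at least 1")
    for origin in enabled:
        if not 0.0 <= origin.weight <= 1.0:
            problems.append(f"{pool.name}/{origin.name}: weight out of range")
    return problems


primary = Pool("primary", [Origin("app-1", "192.0.2.10"), Origin("app-2", "192.0.2.11")])
print(validate_pool(primary))  # prints []
```

Running a check like this against the inventory catches thresholds that can never be met, such as a pool that requires more healthy origins than it has enabled.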

Risk register

Risks to control

Health-check probes are blocked by the origin firewall, so healthy origins are marked down.

Allow Cloudflare health-check sources at the origin and verify each monitor's path, expected status, and region report healthy before relying on them.
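
One way to spot this risk early is to compare the origin firewall's denied source addresses against the ranges the probes come from. The sketch below assumes you have a list of denied IPs from firewall logs. The ranges shown are placeholders from the documentation address space, not Cloudflare's real ranges, and the function names are hypothetical; substitute the ranges Cloudflare currently publishes.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation space), NOT real Cloudflare ranges.
# Replace with the IP ranges Cloudflare currently publishes.
ALLOWED_PROBE_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def probe_allowed(source_ip, allowed_ranges=ALLOWED_PROBE_RANGES):
    """True if source_ip falls inside any of the expected probe ranges."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


def blocked_probes(denied_ips, allowed_ranges=ALLOWED_PROBE_RANGES):
    """Denied source IPs that look like health-check probes."""
    return [ip for ip in denied_ips if probe_allowed(ip, allowed_ranges)]


# Any address returned here was denied by the firewall but sits in a probe range.
print(blocked_probes(["203.0.113.9", "198.51.100.7"]))  # prints ['198.51.100.7']
```

A non-empty result suggests the firewall is dropping health checks, which would mark healthy origins down.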

A health check passes on a shallow path while the application is actually degraded.

Point monitors at a meaningful health endpoint that exercises real dependencies, and tune interval, timeout, and retries to detect genuine failure without flapping.
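
A health endpoint that exercises real dependencies can be as small as a function that runs each dependency check and returns a failing status if any of them fails. The sketch below is framework-agnostic and hypothetical: `health_response` and the check names are illustrative, and the choice of 503 for a degraded state is an assumption that should match the monitor's expected status.

```python
def health_response(checks):
    """Run each dependency check and build a (status_code, details) pair.

    checks: dict mapping a dependency name to a zero-argument callable
    that returns True when the dependency is usable.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "fail"
        except Exception:
            # A check that raises is reported as an error, not as healthy.
            results[name] = "error"
    status = 200 if all(v == "ok" for v in results.values()) else 503
    return status, results


def broken_cache():
    raise ConnectionError("cache unreachable")


print(health_response({"database": lambda: True, "cache": broken_cache}))
# prints (503, {'database': 'ok', 'cache': 'error'})
```

Keep the individual checks fast and bounded by their own timeouts, so the endpoint responds well inside the monitor's timeout.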

The steering policy does not match intent, for example latency steering chosen where strict priority failover was wanted.

Select the steering mode deliberately per load balancer (off/priority, dynamic, geo, proximity, random) and validate the routing it actually produces before cutover.
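
To make the difference between modes concrete, the sketch below models only two of them: off (strict priority order) and random. It is a simplified illustration of the intent, not Cloudflare's implementation. The `pick_pool` function and the `fallback` parameter are hypothetical, and real random steering may also apply pool weights.

```python
import random


def pick_pool(ordered_pools, healthy, policy="off", fallback=None, rng=random):
    """Choose a pool name for one request.

    ordered_pools: pool names in failover order.
    healthy: set of pool names currently considered healthy.
    """
    candidates = [p for p in ordered_pools if p in healthy]
    if not candidates:
        return fallback            # nothing healthy: last-resort pool, if any
    if policy == "off":
        return candidates[0]       # strict priority: first healthy pool wins
    if policy == "random":
        return rng.choice(candidates)
    raise ValueError(f"policy not modelled in this sketch: {policy}")


pools = ["primary", "secondary"]
print(pick_pool(pools, {"primary", "secondary"}))  # prints primary
print(pick_pool(pools, {"secondary"}))             # prints secondary
```

With the off policy the secondary pool receives traffic only when the primary is unhealthy, which is the behaviour to confirm when strict failover is the goal.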

Session affinity is left off and stateful sessions break, or it is left on and skews load.

Decide affinity based on the application's session model, set an appropriate duration, and test sticky behavior and load distribution together.
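
Sticky behavior and load distribution can be checked from the same test run. The sketch below assumes you have collected (client, origin) pairs, for example from a response header or log field that identifies the serving origin. That data source and the `affinity_report` name are assumptions for illustration.

```python
from collections import Counter, defaultdict


def affinity_report(observations):
    """Summarise a test run of (client_id, origin) pairs.

    Returns (moved, share): clients that were served by more than one
    origin, and each origin's fraction of all observed requests.
    """
    origins_by_client = defaultdict(set)
    hits = Counter()
    for client, origin in observations:
        origins_by_client[client].add(origin)
        hits[origin] += 1
    moved = sorted(c for c, seen in origins_by_client.items() if len(seen) > 1)
    total = sum(hits.values())
    share = {origin: count / total for origin, count in hits.items()}
    return moved, share


run = [("c1", "app-1"), ("c1", "app-1"), ("c2", "app-2"), ("c3", "app-1")]
print(affinity_report(run))  # prints ([], {'app-1': 0.75, 'app-2': 0.25})
```

An empty `moved` list indicates requests stayed sticky during the run, while `share` shows whether stickiness concentrated load on a few origins.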

Failover is never tested, so the secondary pool cannot actually carry traffic.

Deliberately fail an origin and a whole pool in a test window, confirm the next pool absorbs load within capacity, and verify recovery re-adds origins.
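
During a failover drill it helps to record how long traffic took to move. The sketch below assumes a series of timestamped samples noting which pool served a test request. How those samples are collected is left open, and `failover_time` is a hypothetical helper.

```python
def failover_time(samples, failed_at, expected_pool):
    """Seconds from the injected failure until expected_pool first serves.

    samples: list of (timestamp_seconds, pool_name), sorted by time.
    Returns None if expected_pool never served after the failure.
    """
    for timestamp, pool in samples:
        if timestamp >= failed_at and pool == expected_pool:
            return timestamp - failed_at
    return None


drill = [(0, "primary"), (10, "primary"), (20, "primary"),
         (30, "secondary"), (40, "secondary")]
print(failover_time(drill, failed_at=12, expected_pool="secondary"))  # prints 18
```

The measured value is only as precise as the sampling interval, so sample more often than the failover time you need to prove.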

Pool capacity and minimum-healthy thresholds are misconfigured, draining a pool under partial failure.

Set per-origin weights, pool capacity, and minimum healthy origins to reflect real capacity so a single failure does not cascade.
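
Whether a single failure cascades is largely arithmetic: remove one origin and see if the rest can still carry peak load. The sketch below assumes capacities and peak load are expressed in the same unit, such as requests per second. The figures and the function name are illustrative only.

```python
def single_failure_risks(capacities, peak_load):
    """Origins whose loss would leave the rest unable to carry peak_load.

    capacities: dict mapping origin name to its maximum sustainable load.
    """
    at_risk = []
    for name in capacities:
        remaining = sum(c for other, c in capacities.items() if other != name)
        if remaining < peak_load:
            at_risk.append(name)
    return at_risk


# Losing app-1 leaves 300, which is below the 600 peak; losing app-2 leaves 1000.
print(single_failure_risks({"app-1": 1000, "app-2": 300}, peak_load=600))
# prints ['app-1']
```

An empty result means the pool tolerates the loss of any one origin at the stated peak, which is a useful precondition before setting the minimum-healthy threshold.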

Output

Useful deliverables

Origin inventory with addresses, regions, capacity, and active/standby roles per application.
Pool design: origin grouping, weights, minimum healthy origins, and failover ordering.
Health-check (monitor) configuration per pool with path, expected status, timing, and region.
Steering and session-affinity decisions documented per load balancer with rationale.
Failover test results covering origin-down, pool-down, and recovery behavior.
Monitoring setup: pool/origin health alerts, Load Balancing analytics, and Logpush.
Cutover and rollback plan with staged TTLs and a documented revert to prior routing.

Frequently asked questions

Common questions teams ask when putting this resource into practice.

What is the difference between pools, origins, and monitors in Cloudflare Load Balancing?

Origins are the backend servers; a pool groups origins and holds capacity, weights, and minimum-healthy settings; a monitor is the health check attached to a pool that probes origins. A load balancer ties pools together with a steering policy and failover order.

Which steering policy should I use?

It depends on the goal. Off (priority) gives strict primary/secondary failover; dynamic steering routes by measured latency; geo and proximity steering route by location; random spreads evenly. Pick deliberately per load balancer and validate the routing it produces before cutover.

Do I need session affinity?

Only if the application requires requests from a user to stay on the same origin. Cookie-based affinity supports stateful sessions, but enabling it can skew load distribution, so decide based on the session model and set an appropriate duration.

How are health checks configured so failover is reliable?

Each monitor defines protocol, path, expected status, interval, timeout, retries, and check region. Point it at a health endpoint that exercises real dependencies, ensure the origin firewall admits the probes, and tune timing to detect failure without flapping.

How is failover validated before going live?

Mark an origin and then an entire pool unhealthy in a test window. Confirm that traffic shifts to the next pool, that the remaining healthy origins absorb the load within capacity, and that origins are re-added cleanly on recovery.
