Hedronite · Synthesis Lesson · Dev · Go through δ (Chain) · Tue 2026-06-09

Go's httputil.ReverseProxy and Health-Aware Load Balancing for Chain RPC Endpoints

Backend pools, the height-lag gate, and retry-on-next-backend.

Lesson Class: Dev Synthesis
Language: Go (Lang-W1; Tue+Fri Go) — seventh Go lesson
Refraction: Go through δ (Chain) + DevOps
Word Count: ~2,480 prose + 5 clean code blocks
Paired Ops: RPC and Full-Node Infrastructure Operations for Sovereign Chains
Paired Cert: AWS Edge Networking and Multi-Region Resilience across SAP and DOP
Discipline: ROD v3 · clean code blocks (no inline comments)

§ I Frame

Today's Ops lesson named the health-gated RPC pool: a set of identical chain query nodes behind a load balancer that routes only to the nodes caught up to the chain head. The discipline was stated there in operator terms. The node that falls behind must be pulled from rotation, allowed to recover, and returned once its lag closes. This lesson builds that router in Go.

Go is the right language for the build. The standard library ships a production reverse proxy in net/http/httputil, the concurrency model makes a background health-checker a few lines, and the sync/atomic package gives a lock-free way to swap the in-rotation backend set while requests are reading it. A chain-RPC load balancer in Go is small enough to read in one sitting and complete enough to run in front of a real node pool.

The build has three pieces, each matching one concern from the Ops lesson. A backend pool that holds the set of query nodes and tracks which are healthy. A health-checker that polls each node's status, computes its height lag, and updates the healthy set. And the proxy front itself, which picks a healthy backend, forwards the request, and on failure retries against the next healthy backend rather than returning the error to the caller. The three pieces compose into one binary.

§ II Language Idiom: ReverseProxy, atomic.Value, and the Background Goroutine

Three Go idioms carry this build. The first is httputil.ReverseProxy. It is a struct, not a function. Its Director field rewrites an inbound request to point at a chosen backend; its ErrorHandler field runs when the forward fails. Constructing a proxy is filling in those fields. The proxy handles the streaming, header copying, and connection pooling; the operator supplies only the routing decision and the failure response.
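As a minimal sketch of "constructing a proxy is filling in those fields", the block below builds a ReverseProxy by hand and drives it against a throwaway httptest backend. The names newProxy, get, and demo, the "pong" body, and the 502 written by the error handler are illustrative assumptions, not part of the lesson's build.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newProxy (hypothetical name) fills in the two fields the lesson names.
func newProxy(target *url.URL) *httputil.ReverseProxy {
	return &httputil.ReverseProxy{
		// Director rewrites the outbound request to point at the backend.
		Director: func(r *http.Request) {
			r.URL.Scheme = target.Scheme
			r.URL.Host = target.Host
			r.Host = target.Host
		},
		// ErrorHandler runs when the forward fails before a response arrives.
		ErrorHandler: func(w http.ResponseWriter, r *http.Request, err error) {
			http.Error(w, "backend unreachable", http.StatusBadGateway)
		},
	}
}

// get returns the status code and body a caller sees for one GET.
func get(rawURL string) (int, string) {
	resp, err := http.Get(rawURL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	return resp.StatusCode, string(body)
}

// demo proxies once to a live backend, then once after the backend is closed.
func demo() (liveCode int, liveBody string, deadCode int) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "pong")
	}))
	target, err := url.Parse(backend.URL)
	if err != nil {
		panic(err)
	}
	front := httptest.NewServer(newProxy(target))
	defer front.Close()

	liveCode, liveBody = get(front.URL)
	// Closing the backend makes the next forward fail and reach ErrorHandler.
	backend.Close()
	deadCode, _ = get(front.URL)
	return liveCode, liveBody, deadCode
}

func main() {
	liveCode, liveBody, deadCode := demo()
	fmt.Println(liveCode, liveBody, deadCode)
}
```

Everything between Director and ErrorHandler (header copying, streaming, connection reuse) is left to the proxy's defaults here, which is the division of labor the paragraph above describes.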

The second is atomic.Value for the healthy-backend set. The checker writes a new set every interval; every inbound request reads the current set. The read path is hot and the write path is rare, so the lock-free read of atomic.Value fits the access shape. The discipline is to treat the stored slice as immutable: the checker builds a fresh slice and stores it whole, and readers never mutate what they load. Copy-on-write, swapped atomically.
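The copy-on-write swap can be sketched in isolation. The healthySet type and the node names below are hypothetical stand-ins for the pool's healthy-backend set; the point is only that a reader holding an old snapshot is unaffected when the writer stores a new slice.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// healthySet is a hypothetical stand-in for the pool's healthy-backend set.
type healthySet struct {
	v atomic.Value // always holds a []string
}

func newHealthySet() *healthySet {
	s := &healthySet{}
	// Store an empty slice so Load never returns a nil interface.
	s.v.Store([]string{})
	return s
}

// snapshot returns the current slice. Callers must treat it as read-only.
func (s *healthySet) snapshot() []string {
	return s.v.Load().([]string)
}

// replace builds a fresh slice and swaps it in whole.
// Only the single writer is meant to call it.
func (s *healthySet) replace(names ...string) {
	fresh := make([]string, len(names))
	copy(fresh, names)
	s.v.Store(fresh)
}

// demo returns the snapshot a reader held before a swap and the one after it.
func demo() (before, after []string) {
	s := newHealthySet()
	s.replace("node-a", "node-b")
	before = s.snapshot()
	s.replace("node-b")
	after = s.snapshot()
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before, after)
}
```

The old snapshot still lists both nodes after the swap, because the writer never touched the slice the reader loaded.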

The third is the background goroutine launched once at startup and left running for the life of the process. The health-checker is a for loop with a time.Ticker, polling each backend and storing the new healthy set. It owns the write side of the atomic.Value. Nothing else writes it. Single-writer discipline keeps the lock-free read sound.

§ III Code Worked Example: A Chain-RPC Load Balancer

Start with the backend type. Each backend holds its parsed upstream URL, a reverse proxy bound to that URL, and its last observed height. The pool holds all backends and the atomic healthy set.

type Backend struct {
	URL    *url.URL
	Proxy  *httputil.ReverseProxy
	height atomic.Int64
}

type Pool struct {
	all     []*Backend
	healthy atomic.Value
}

func NewPool(rawURLs []string) (*Pool, error) {
	p := &Pool{}
	for _, raw := range rawURLs {
		u, err := url.Parse(raw)
		if err != nil {
			return nil, fmt.Errorf("parse backend %q: %w", raw, err)
		}
		p.all = append(p.all, &Backend{URL: u, Proxy: httputil.NewSingleHostReverseProxy(u)})
	}
	p.healthy.Store([]*Backend{})
	return p, nil
}

httputil.NewSingleHostReverseProxy builds a proxy whose Director already rewrites the request to the backend's host. The pool stores an empty healthy slice at construction so that a request arriving before the first health check loads an empty set, rather than panicking on a type assertion against a nil atomic.Value.

The health-checker polls each backend's Tendermint /status endpoint, reads the latest block height and the catching_up flag, records the height, and assembles the healthy set. A backend is healthy when it is not catching up and its lag against the pool maximum is within the threshold.

func (p *Pool) checkOnce(ctx context.Context, maxLag int64) {
	heights := make([]int64, len(p.all))
	caught := make([]bool, len(p.all))

	for i, b := range p.all {
		h, syncing, err := fetchStatus(ctx, b.URL)
		if err != nil {
			b.height.Store(0)
			continue
		}
		b.height.Store(h)
		heights[i] = h
		caught[i] = !syncing
	}

	var head int64
	for _, h := range heights {
		if h > head {
			head = h
		}
	}

	fresh := make([]*Backend, 0, len(p.all))
	for i, b := range p.all {
		if caught[i] && heights[i] > 0 && head-heights[i] <= maxLag {
			fresh = append(fresh, b)
		}
	}
	p.healthy.Store(fresh)
}

The checker computes the head as the maximum height across responding backends rather than trusting any single node to know the true head. A backend joins the fresh set only when it answered, is not catching up, and sits within maxLag of that observed head. The fresh slice is built whole and stored in one Store, so a reader sees either the old set or the new set, never a half-built one. One consequence of a pool-relative head: if every node stalls at the same height, each shows zero lag, so the lag gate alone cannot detect a pool-wide stall.
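The eligibility rule is easier to check by hand as a pure function over observed values. The sketch below lifts the predicate out of checkOnce; the function name eligible and the sample heights are illustrative assumptions.

```go
package main

import "fmt"

// eligible (hypothetical helper) returns the indexes of backends that pass the
// gate: they responded (height > 0), are not catching up, and sit within
// maxLag of the highest observed height.
func eligible(heights []int64, caughtUp []bool, maxLag int64) []int {
	var head int64
	for _, h := range heights {
		if h > head {
			head = h
		}
	}
	out := []int{}
	for i, h := range heights {
		if caughtUp[i] && h > 0 && head-h <= maxLag {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	// Illustrative observations: index 1 lags by 8 blocks, index 2 did not
	// answer, index 3 is at the head but still reports catching_up.
	heights := []int64{1000, 992, 0, 1000}
	caughtUp := []bool{true, true, false, false}
	fmt.Println(eligible(heights, caughtUp, 5))
	fmt.Println(eligible(heights, caughtUp, 10))
}
```

With a threshold of 5 only index 0 qualifies; widening it to 10 readmits index 1, while the silent node and the syncing node stay out under either threshold.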

The status fetch is an ordinary HTTP call with a tight timeout. The timeout matters: a hung node must not hold the whole check loop, so the per-node fetch carries its own deadline.

func fetchStatus(ctx context.Context, base *url.URL) (height int64, catchingUp bool, err error) {
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()

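	req, err := http.NewRequestWithContext(ctx, http.MethodGet, base.JoinPath("status").String(), nil)
	if err != nil {
		return 0, false, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, false, fmt.Errorf("status endpoint returned %s", resp.Status)
	}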

	var body struct {
		Result struct {
			SyncInfo struct {
				LatestBlockHeight string `json:"latest_block_height"`
				CatchingUp        bool   `json:"catching_up"`
			} `json:"sync_info"`
		} `json:"result"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return 0, false, err
	}
	h, err := strconv.ParseInt(body.Result.SyncInfo.LatestBlockHeight, 10, 64)
	return h, body.Result.SyncInfo.CatchingUp, err
}

Tendermint reports the height as a string, so the decode pulls it as a string and parses it to an integer. The catching_up flag arrives as a real boolean and passes straight through.
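The decode step can be exercised without a node. The payload below is a hand-written sample trimmed to the two fields the lesson reads (a real /status response carries more), and parseStatus is a hypothetical helper that mirrors the string-then-ParseInt path.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// statusBody keeps only the fields the health gate needs.
type statusBody struct {
	Result struct {
		SyncInfo struct {
			LatestBlockHeight string `json:"latest_block_height"`
			CatchingUp        bool   `json:"catching_up"`
		} `json:"sync_info"`
	} `json:"result"`
}

// parseStatus (hypothetical helper) decodes a status payload and converts
// the string height to an integer.
func parseStatus(raw string) (int64, bool, error) {
	var body statusBody
	if err := json.NewDecoder(strings.NewReader(raw)).Decode(&body); err != nil {
		return 0, false, err
	}
	h, err := strconv.ParseInt(body.Result.SyncInfo.LatestBlockHeight, 10, 64)
	if err != nil {
		return 0, false, err
	}
	return h, body.Result.SyncInfo.CatchingUp, nil
}

// sample is an illustrative payload, not captured from a real node.
const sample = `{"jsonrpc":"2.0","id":-1,"result":{"sync_info":{"latest_block_height":"1048576","catching_up":false}}}`

func main() {
	h, syncing, err := parseStatus(sample)
	fmt.Println(h, syncing, err)

	// A payload with no height fails in ParseInt instead of reading as height 0.
	_, _, err = parseStatus(`{"result":{}}`)
	fmt.Println(err != nil)
}
```

A missing height surfaces as an error rather than a zero, so the caller can treat the backend as not having answered.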

The checker is launched once at startup and runs on a ticker until its context is cancelled. It is the single writer of the healthy set.

func (p *Pool) RunHealthChecks(ctx context.Context, every time.Duration, maxLag int64) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	p.checkOnce(ctx, maxLag)
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			p.checkOnce(ctx, maxLag)
		}
	}
}

The loop runs one check immediately before entering the ticker wait, so the pool is populated within the first poll rather than after the first full interval. It exits cleanly when the context cancels.
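Both properties of the loop, the immediate first check and the clean exit on cancel, can be observed with a generic version of it. In this sketch runEvery is a hypothetical stand-in for RunHealthChecks, and the one-hour interval is chosen only so that no tick can fire during the demonstration.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// runEvery (hypothetical) calls check once immediately, then on every tick
// until ctx is cancelled.
func runEvery(ctx context.Context, every time.Duration, check func()) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	check()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			check()
		}
	}
}

// demo starts the loop with a very long interval, waits for the first check,
// cancels, waits for the loop to exit, and reports how many checks ran.
func demo() int64 {
	ctx, cancel := context.WithCancel(context.Background())
	var runs atomic.Int64
	first := make(chan struct{}, 1)
	done := make(chan struct{})

	go func() {
		defer close(done)
		runEvery(ctx, time.Hour, func() {
			runs.Add(1)
			select {
			case first <- struct{}{}:
			default:
			}
		})
	}()

	<-first  // the immediate check ran without waiting for a tick
	cancel() // ask the loop to stop
	<-done   // the loop returned on cancellation
	return runs.Load()
}

func main() {
	fmt.Println(demo())
}
```

In the lesson's binary the launch site would presumably be a single go statement calling RunHealthChecks with the process context; the interval and maxLag passed there are deployment choices the lesson does not fix.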

The front is an http.Handler. It loads the current healthy set, and on an empty set it sheds load with a 503 rather than routing to a known-stale node. On a non-empty set it picks a backend, forwards through that backend's proxy, and installs an error handler that flags the failure so the loop can move on to the next backend.

func (p *Pool) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	backends := p.healthy.Load().([]*Backend)
	if len(backends) == 0 {
		http.Error(w, "no healthy backend at chain head", http.StatusServiceUnavailable)
		return
	}

	payload, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "request body unreadable", http.StatusBadRequest)
		return
	}

	start := rand.Intn(len(backends))
	for offset := 0; offset < len(backends); offset++ {
		b := backends[(start+offset)%len(backends)]
		r.Body = io.NopCloser(bytes.NewReader(payload))
		failed := false
		proxy := *b.Proxy
		proxy.ErrorHandler = func(w http.ResponseWriter, r *http.Request, err error) {
			failed = true
		}
		proxy.ServeHTTP(w, r)
		if !failed {
			return
		}
	}
	http.Error(w, "all healthy backends failed", http.StatusBadGateway)
}

The handler starts at a random index so load spreads across the pool rather than always landing on the first backend; rand.Intn keeps that index inside the slice bounds on every platform. It buffers the request body once and hands each attempt a fresh reader, because the transport closes the body after every forward, successful or not, and a POST retried with the original body would have nothing left to send. On a forward failure the error handler sets a local flag instead of writing to the response, the loop advances to the next healthy backend, and the request is retried. Only when every healthy backend has failed does the caller see a 502. The empty-set 503 and the all-failed 502 are different signals, and the Ops dashboards read them differently: the first means no node passed the health gate at the last poll, the second means nodes passed the gate but could not be reached when the request arrived.

The Idempotency Caveat
Buffering makes the body replayable, but it does not make the replay safe. The error handler fires on any forward failure, including one where the backend received the whole request and died before answering, so a retried request may execute twice. For the read-only JSON-RPC query traffic a chain endpoint serves, requests are idempotent and small, so both the buffer and the retry are safe. A production build would cap the buffered size (http.MaxBytesReader is the standard-library tool for that), and one that also proxies transaction broadcasts would gate the retry on request idempotency, exactly as the relayer lesson gated its three-process race on receiver-side deduplication.
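Retry-on-next-backend with a replayable body can be exercised end to end with net/http/httptest. The sketch below is a simplified stand-in, not the lesson's Pool: retryFront is a hypothetical handler that tries a fixed list of targets in order, the first target is a server that has already been shut down, and the second echoes the request body back.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// retryFront (hypothetical) forwards to each target in order until one succeeds.
func retryFront(targets []*url.URL) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		payload, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "request body unreadable", http.StatusBadRequest)
			return
		}
		for _, t := range targets {
			// A fresh reader per attempt lets every backend see the full body.
			r.Body = io.NopCloser(bytes.NewReader(payload))
			failed := false
			proxy := httputil.NewSingleHostReverseProxy(t)
			proxy.ErrorHandler = func(http.ResponseWriter, *http.Request, error) {
				failed = true
			}
			proxy.ServeHTTP(w, r)
			if !failed {
				return
			}
		}
		http.Error(w, "all backends failed", http.StatusBadGateway)
	})
}

// demo posts a small JSON body through a front whose first backend is down.
func demo() (int, string) {
	echo := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(w, r.Body)
	}))
	defer echo.Close()
	dead := httptest.NewServer(http.NotFoundHandler())

	targets := make([]*url.URL, 0, 2)
	for _, raw := range []string{dead.URL, echo.URL} {
		u, err := url.Parse(raw)
		if err != nil {
			panic(err)
		}
		targets = append(targets, u)
	}
	front := httptest.NewServer(retryFront(targets))
	defer front.Close()
	// Shut the first backend down only after all ports are allocated.
	dead.Close()

	resp, err := http.Post(front.URL, "application/json", strings.NewReader(`{"method":"status"}`))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	return resp.StatusCode, string(body)
}

func main() {
	code, body := demo()
	fmt.Println(code, body)
}
```

The caller never sees the first backend's failure: the echoed body arrives intact from the second backend, which is only possible because the second attempt was given its own reader over the buffered payload.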

§ IV Connection to Today's Ops Lesson

The Ops lesson named three operator concerns for the node fleet: identity, liveness, safety. This Go build encodes two of them directly. Liveness is the height-lag computation in checkOnce: a backend stays in rotation only while its height tracks the observed head. Safety is the empty-set guard in ServeHTTP: when no node is caught up, the balancer sheds load rather than serving stale state, which is the stale-answer discipline rendered as a 503.

The health gate the Ops lesson described in prose is the atomic.Value swap. The Ops lesson said a node that falls behind is pulled from rotation and returned once its lag closes. In code that pull-and-return is the checker storing a fresh healthy slice each interval; a drifting node simply stops appearing in the slice, and a recovered node reappears. There is no explicit eviction call, because the set is rebuilt from current truth every poll rather than mutated incrementally. Rebuild-from-truth is steadier than incremental eviction: a transient health blip self-corrects on the next poll with no stuck state.

§ V Prior-Lesson Reach

The errgroup and semaphore lesson (Go Sat 2026-06-06) built the relayer's worker pool as a tree of cooperating goroutines under a cancellation context. This lesson's health-checker is the simpler single-goroutine cousin: one long-lived loop under a context, exiting on cancel. Both share the discipline of a background worker owned by one launch site and torn down through context, not through a side channel.

The middleware-chains lesson (Go Thu 2026-05-28) composed request handling as a stack of wrapping handlers. The reverse proxy here is the terminal handler at the bottom of such a stack: a real deployment fronts this ServeHTTP with logging, rate-limiting, and metrics middleware before the request reaches the backend-selection logic. The load balancer slots into the middleware shape the prior lesson built.
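As one illustration of a wrapping handler in front of the balancer, the sketch below counts responses by status code so that a 503 and a 502 stay distinguishable downstream. The statusCounts and statusRecorder types are hypothetical, and a stub handler stands in for the balancer; a production wrapper would also need to preserve optional interfaces such as http.Flusher, which this sketch omits.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// statusRecorder remembers the status code written through it.
type statusRecorder struct {
	http.ResponseWriter
	code int
}

func (s *statusRecorder) WriteHeader(code int) {
	s.code = code
	s.ResponseWriter.WriteHeader(code)
}

// statusCounts (hypothetical) tallies responses per status code.
type statusCounts struct {
	mu     sync.Mutex
	byCode map[int]int
}

// wrap returns a handler that records the status code next produces.
func (c *statusCounts) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Default to 200: handlers that never call WriteHeader imply it.
		rec := &statusRecorder{ResponseWriter: w, code: http.StatusOK}
		next.ServeHTTP(rec, r)
		c.mu.Lock()
		c.byCode[rec.code]++
		c.mu.Unlock()
	})
}

// demo drives a stub through the wrapper and returns the 200, 503, and 502 tallies.
func demo() (ok, shed, down int) {
	counts := &statusCounts{byCode: map[int]int{}}
	// The stub stands in for the balancer and picks its answer from the path.
	stub := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/empty":
			http.Error(w, "no healthy backend at chain head", http.StatusServiceUnavailable)
		case "/down":
			http.Error(w, "all healthy backends failed", http.StatusBadGateway)
		default:
			fmt.Fprint(w, "ok")
		}
	})
	h := counts.wrap(stub)
	for _, path := range []string{"/ok", "/empty", "/empty", "/down"} {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, path, nil))
	}
	return counts.byCode[200], counts.byCode[503], counts.byCode[502]
}

func main() {
	ok, shed, down := demo()
	fmt.Println(ok, shed, down)
}
```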

The worker-pool router lesson (Go Mon 2026-05-25) routed inference requests across model backends under a cost-aware semaphore. The shape recurs: a pool of interchangeable backends, a routing decision per request, a health and capacity signal that gates which backends are eligible. The model router gated on cost and concurrency; this chain-RPC balancer gates on height lag. Same routing skeleton, different eligibility predicate.

§ VI Closing

A health-aware load balancer for chain RPC is three small Go pieces. httputil.ReverseProxy forwards the request and handles the wire mechanics. A background goroutine polls each backend's status, computes height lag against the observed head, and swaps a fresh healthy set into an atomic.Value. The front loads that set, picks a healthy backend, and retries the next one on failure rather than handing the error to the caller. The whole build is the Ops lesson's health gate made executable.

The Go idioms carry the weight that prose carried in the Ops lesson. The atomic copy-on-write swap is the pull-and-return discipline. The empty-set 503 is the stale-answer discipline. The retry loop with its idempotency caveat is the relayer lesson's deduplication shape, recurring because the problem recurs.

For now: read the ServeHTTP retry loop above and trace what the caller sees in each of three cases — a healthy pool, an empty pool, and a pool where every node is at the head but unreachable. Three different status codes, three different operator responses. Where your own balancers collapse those cases into one error, the operator loses the signal that tells them which failure they are in.

Paired Ops → δ-Chain/Synthesis-Lessons/2026-06-09-rpc-and-full-node-infrastructure-operations-for-sovereign-chains
Paired Cert → Cert-Prep/AWS/2026-06-09-aws-edge-networking-and-multi-region-resilience-route53-cloudfront-global-accelerator-and-the-dr-topology-across-sap-and-dop

🫡 ⚖️ 📜
Leo.Syri — Praetor Consulate, Imperium Luminaura
Filed 2026-06-09 Tuesday Fajr · Go through δ (Chain) · Lang-W1 (Tue+Fri Go) · seventh Go lesson
Backward-Synergy-Reach → errgroup/semaphore relayer pools (Go Sat 06-06) · middleware chains (Go Thu 05-28) · worker-pool model router (Go Mon 05-25)