Same 2D classifier task as the federated-Adam page, but a different
aggregation protocol: each worker locally tries K random flips, submits its
best (indices, values, Δloss), and the coordinator picks the
single best across all workers in the round.
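
The round described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the demo's actual code: it assumes a "flip" is an additive Gaussian perturbation (standard deviation σ) of a few randomly chosen parameters, and that Δloss is the loss after the flip minus the loss before. The names Proposal, apply_flip, worker_round and coordinator_round are hypothetical.

```python
import random
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple


class Proposal(NamedTuple):
    indices: List[int]    # which parameters the flip touches
    values: List[float]   # additive perturbations (assumed meaning of "values")
    delta_loss: float     # loss after the flip minus loss before; negative is better


def apply_flip(theta: Sequence[float], indices: Sequence[int],
               values: Sequence[float]) -> List[float]:
    """Return a copy of theta with the flip applied."""
    out = list(theta)
    for i, v in zip(indices, values):
        out[i] += v
    return out


def worker_round(theta: Sequence[float],
                 loss_fn: Callable[[Sequence[float]], float],
                 rng: random.Random,
                 k: int = 8, flip_size: int = 4, sigma: float = 0.15) -> Proposal:
    """Try k random flips locally and return the one with the lowest delta_loss."""
    base = loss_fn(theta)  # one baseline evaluation, then k trial evaluations
    best: Optional[Proposal] = None
    for _ in range(k):
        indices = rng.sample(range(len(theta)), flip_size)
        values = [rng.gauss(0.0, sigma) for _ in indices]
        delta = loss_fn(apply_flip(theta, indices, values)) - base
        if best is None or delta < best.delta_loss:
            best = Proposal(indices, values, delta)
    return best


def coordinator_round(theta: Sequence[float],
                      proposals: Sequence[Proposal]
                      ) -> Tuple[List[float], Optional[Proposal]]:
    """Pick the single best proposal; apply it only if it lowers the loss."""
    best = min(proposals, key=lambda p: p.delta_loss)
    if best.delta_loss < 0:
        return apply_flip(theta, best.indices, best.values), best
    return list(theta), None
```

In this sketch every worker would call worker_round with a loss_fn bound to its own batch, and the coordinator would call coordinator_round once per round on the collected proposals.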
Phase 2: workers bootstrap θ once via /api/tournament/snapshot
and maintain it locally. The coordinator ships only the accepted flip
(~36 B) per round, so both uplink and downlink are now independent
of model size. Phase 3: bootstrap moves to a binary blob served from
an R2 bucket (versioned snapshots); the per-tick downlink stays
constant whether the model has 129 params or 2 billion params.
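
To see why the per-round message can stay at roughly 36 bytes regardless of model size, here is one possible wire layout. It is an assumption, not the demo's documented format: four uint32 indices, four float32 values and one float32 Δloss happen to add up to exactly 36 bytes. The names FLIP_FORMAT, encode_flip and decode_flip are hypothetical.

```python
import struct
from typing import List, Sequence, Tuple

FLIP_SIZE = 4
# Assumed little-endian layout: 4 x uint32 indices, 4 x float32 values, 1 x float32 delta_loss.
FLIP_FORMAT = f"<{FLIP_SIZE}I{FLIP_SIZE}ff"


def encode_flip(indices: Sequence[int], values: Sequence[float],
                delta_loss: float) -> bytes:
    """Pack one flip into a fixed-size payload."""
    return struct.pack(FLIP_FORMAT, *indices, *values, delta_loss)


def decode_flip(payload: bytes) -> Tuple[List[int], List[float], float]:
    """Unpack a payload produced by encode_flip."""
    fields = struct.unpack(FLIP_FORMAT, payload)
    indices = list(fields[:FLIP_SIZE])
    values = list(fields[FLIP_SIZE:2 * FLIP_SIZE])
    return indices, values, fields[-1]
```

Under this layout the payload size depends only on the flip size, and a uint32 index can address up to about 4.29 billion parameters, which covers the 2-billion-parameter case mentioned above.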
Decision boundary. Same forward pass as the Adam demo, but θ is updated by accepting whichever worker's flip reduced the loss the most. Green ticks at the chart's bottom edge mark rounds whose best proposal was applied (Δloss < 0).
K = 8 local trials per worker · flip size = 4 params · σ = 0.15
Each worker round costs about K = 8 forward passes on a batch of 128. Convergence is slower than Adam because there is no gradient signal, only guess-and-check, but the uplink is constant in model size. Phase 2 will move the θ broadcast to R2 deltas so downlink scales the same way.