Edge-LLM pilot to fleet rollout across 1,200 sites.

Sector: Industrial / Edge AI

Timeframe: 22 weeks · Q2–Q3 2024

Team: 6 operators

Scale: 1,200 sites

§ 01 Challenge

The pilot worked on 8 sites. It failed on 1,200.

Northwind operates 1,200 industrial sites — manufacturing facilities, logistics nodes, maintenance depots — with heterogeneous hardware, inconsistent connectivity, and legacy SCADA systems running on infrastructure that cannot be replaced on a deployment schedule.

The original pilot had run on 8 hand-selected sites with reliable connectivity and compatible device classes. It worked. The expansion to 1,200 revealed three blockers that the pilot hadn't surfaced. First, the pilot model (8.4 GB) exceeded memory limits on 40% of devices in the wider fleet. Second, there was no offline fallback when connectivity dropped, which happened daily at roughly 20% of sites. Third, there was no fleet management plane, so every model update required a physical site visit.

The mandate was to solve all three, in sequence, before any further rollout.

§ 02 Approach

Rebuild for the worst site, not the best.

The first six weeks were model and architecture work. We distilled the original 8.4 GB model to a 2.1 GB variant, a 75% reduction in memory footprint with the same task accuracy on the primary use cases. We also redesigned the inference loop for offline-first operation: a local queue, a cache of recent context, and a sync protocol that reconciled with the central system whenever connectivity was available.
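As a rough illustration of the offline-first loop described here, the sketch below keeps inference entirely local, queues each result, and drains the queue in order once a link is available. The class name `OfflineFirstRuntime`, the `infer` and `upload` callables, and the `context_size` default are all hypothetical; the actual runtime and its sync protocol are not specified in this write-up.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class OfflineFirstRuntime:
    """Hypothetical offline-first wrapper: infer locally, queue results, sync later."""

    infer: Callable[[str, List[str]], str]  # local model call (assumed signature)
    upload: Callable[[dict], bool]          # True if the central system accepted the record
    context_size: int = 4                   # assumed number of recent requests kept as context
    pending: Deque[dict] = field(default_factory=deque)
    context: Deque[str] = field(default_factory=deque)

    def handle(self, request: str) -> str:
        # Inference uses only local state, so it never blocks on connectivity.
        result = self.infer(request, list(self.context))
        self.context.append(request)
        while len(self.context) > self.context_size:
            self.context.popleft()
        # Every result is queued locally first; sync() ships it when possible.
        self.pending.append({"request": request, "result": result})
        return result

    def sync(self, online: bool) -> int:
        """Drain the queue in order while the link is up; stop at the first failed upload."""
        sent = 0
        while online and self.pending:
            if not self.upload(self.pending[0]):
                break  # keep the record queued and retry on the next sync
            self.pending.popleft()
            sent += 1
        return sent
```

Removing a record from the queue only after a successful upload means a dropped connection mid-sync leaves unsent records in place rather than losing them, at the cost of possible duplicate uploads that the central side would need to tolerate.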

Weeks 7–10 were fleet infrastructure: a control plane for remote model versioning, rollout gates (a site had to pass a health check before receiving the new version), and per-site telemetry piped into a central dashboard. Site visits were no longer required for updates.
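A per-site rollout gate of this kind could look roughly like the following. This is a minimal sketch under stated assumptions: the `SiteHealth` fields, the 24-hour sync-age limit, and the rule that free memory must be at least the model size are illustrative stand-ins, not the health check Northwind's control plane actually runs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SiteHealth:
    """Hypothetical telemetry snapshot for one site."""

    site_id: str
    free_memory_gb: float
    runtime_ok: bool
    last_sync_age_hours: float


def passes_health_check(
    health: SiteHealth,
    model_size_gb: float,
    max_sync_age_hours: float = 24.0,  # assumed limit, not from the engagement
) -> bool:
    # A site qualifies only if the runtime is up, the model fits in memory,
    # and the site has synced recently enough for its telemetry to be trusted.
    return (
        health.runtime_ok
        and health.free_memory_gb >= model_size_gb
        and health.last_sync_age_hours <= max_sync_age_hours
    )


def plan_update(
    fleet: List[SiteHealth],
    current: Dict[str, str],
    new_version: str,
    model_size_gb: float,
) -> Dict[str, str]:
    """Return the target version per site; sites that fail the gate keep their version."""
    return {
        h.site_id: new_version
        if passes_health_check(h, model_size_gb)
        else current[h.site_id]
        for h in fleet
    }
```

Returning a full site-to-version map, rather than just a list of sites to update, makes the plan easy to diff against the current state and to audit before anything is pushed.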

The rollout itself was staged and gated: 50 sites, then 200, then 600, then full fleet. Each gate required the previous cohort to hit defined success metrics — inference uptime, sync reliability, accuracy on a held-out eval set — before the next cohort received the deployment.
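The staged gating can be sketched as a small function that releases the next cohort only when the previous one has met its metrics. Two assumptions are baked in: the stage sizes (50, 200, 600, full fleet) are read as cumulative totals, and the threshold values in `GATE_THRESHOLDS` are placeholders, since the write-up names the metrics but not their targets.

```python
from typing import Dict, List, Sequence

# Cumulative site counts per stage (reading the stages as cumulative is an assumption).
STAGE_TOTALS = (50, 200, 600, 1200)

# Hypothetical thresholds: the metrics are named in the text, the values are not.
GATE_THRESHOLDS: Dict[str, float] = {
    "inference_uptime": 0.99,
    "sync_reliability": 0.98,
    "eval_accuracy": 0.90,
}


def cohort_passes(metrics: Dict[str, float]) -> bool:
    # A missing metric counts as a failure rather than a pass.
    return all(
        metrics.get(name, 0.0) >= minimum
        for name, minimum in GATE_THRESHOLDS.items()
    )


def next_cohort(
    sites: Sequence[str],
    deployed: int,
    last_metrics: Dict[str, float],
) -> List[str]:
    """Return the sites to deploy next, or [] if the gate is closed or rollout is done."""
    # The first cohort has no predecessor, so it is not gated on metrics.
    if deployed > 0 and not cohort_passes(last_metrics):
        return []
    for total in STAGE_TOTALS:
        if total > deployed:
            return list(sites[deployed:total])
    return []
```

Treating a missing metric as a failure keeps the gate closed when telemetry is incomplete, which matches the intent of not expanding until the previous cohort has demonstrably met its criteria.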

§ 03 What we shipped

A fleet-native edge deployment system.

The deliverables were three interlocking systems: the distilled model and offline-first inference runtime; the fleet control plane (versioning, rollout gates, remote health checks, site telemetry); and the staged deployment protocol with documented gate criteria for each rollout phase.

The control plane was built to be operated by Northwind's own infrastructure team — no dependency on Qualiax tooling or continued engagement for routine operations. Full documentation and a two-week handover period were included in the engagement scope.

§ 04 Outcome

1,200 sites live. 17% operational efficiency gain.

The full fleet reached production within the 22-week engagement window. Mean inference time was under 80 ms across all site classes, including the lowest-spec devices in the fleet. Inference uptime across the fleet averaged 99.1%, with the offline-first architecture absorbing connectivity gaps without user-visible interruption.

The 17% operational efficiency improvement measured at the original 8 pilot sites replicated at fleet scale. Northwind's operations team now manages model updates remotely via the control plane, with a typical site cohort receiving a new model version within 4 hours of release — down from weeks when physical visits were required.

Composite engagement based on representative project structures. Client and outcome details are illustrative.


