The pilot worked on 8 sites. It failed on 1,200.
Northwind operates 1,200 industrial sites — manufacturing facilities, logistics nodes, maintenance depots — with heterogeneous hardware, inconsistent connectivity, and legacy SCADA systems running on infrastructure that cannot be replaced on a deployment schedule.
The original pilot had run on 8 hand-selected sites with reliable connectivity and compatible device classes. It worked. The expansion to 1,200 revealed three blockers that the pilot hadn't surfaced: the pilot model (8.4GB) exceeded memory limits on 40% of devices in the wider fleet; there was no offline fallback when connectivity dropped, which happened daily at roughly 20% of sites; and there was no fleet management plane — every model update required a physical site visit.
The mandate was to solve all three, in sequence, before any further rollout.
Rebuild for the worst site, not the best.
The first six weeks were model and architecture work. We distilled the original model to a 2.1GB variant, which matched the original's task accuracy on the primary use cases while cutting the memory footprint by 75%, and redesigned the inference loop for offline-first operation: a local queue, a cache of recent context, and a sync protocol that reconciled with the central system whenever connectivity was available.
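As a rough illustration of the offline-first pattern described above, the sketch below serves every request locally, keeps a bounded cache of recent context, and holds results in a local queue until an upload is acknowledged. It is a minimal sketch, not the runtime that was shipped: the class name OfflineFirstRunner, the infer and upload callables, and the context_size default are all hypothetical, and a real sync protocol would also need durable storage and conflict handling.

```python
from collections import deque


class OfflineFirstRunner:
    """Hypothetical offline-first loop: infer locally, sync when the link allows."""

    def __init__(self, infer, upload, context_size=4):
        self.infer = infer        # local model call (assumed): str -> str
        self.upload = upload      # assumed: returns True once the central system acks
        self.context = deque(maxlen=context_size)  # bounded cache of recent requests
        self.outbox = deque()     # results waiting to be reconciled centrally

    def handle(self, request):
        """Serve a request locally; never block on connectivity."""
        result = self.infer(request)
        self.context.append(request)
        self.outbox.append({"request": request, "result": result})
        return result

    def sync(self):
        """Drain the outbox in order; stop at the first failed upload."""
        sent = 0
        while self.outbox:
            if not self.upload(self.outbox[0]):
                break             # link is down: keep the record and retry later
            self.outbox.popleft() # remove only after a confirmed upload
            sent += 1
        return sent
```

A record leaves the queue only after the upload callable confirms it, so a dropped connection mid-sync leaves unsent results in place and in order for the next attempt.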
Weeks 7–10 were fleet infrastructure: a control plane for remote model versioning, rollout gates (a site had to pass a health check before receiving the new version), per-site telemetry piped into a central dashboard. Site visits were no longer required for updates.
The rollout itself was staged and gated: 50 sites, then 200, then 600, then full fleet. Each gate required the previous cohort to hit defined success metrics — inference uptime, sync reliability, accuracy on a held-out eval set — before the next cohort received the deployment.
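The gating logic can be sketched as follows. This is an illustrative outline under stated assumptions rather than the control plane itself: the threshold values used in the usage note, the metric key names, and the deploy, measure, and health_check callables are hypothetical, and the stage sizes (50, 200, 600, full fleet) are read here as cumulative totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateCriteria:
    """Minimum metrics a cohort must hit before the next cohort is released."""
    min_uptime: float
    min_sync_reliability: float
    min_eval_accuracy: float

    def passed(self, metrics):
        return (
            metrics["uptime"] >= self.min_uptime
            and metrics["sync_reliability"] >= self.min_sync_reliability
            and metrics["eval_accuracy"] >= self.min_eval_accuracy
        )


def plan_cohorts(sites, cumulative_sizes):
    """Split the fleet into cohorts from cumulative stage sizes."""
    cohorts, start = [], 0
    for end in cumulative_sizes:
        cohorts.append(list(sites[start:end]))
        start = end
    return cohorts


def staged_rollout(sites, cumulative_sizes, deploy, measure, criteria,
                   health_check=lambda site: True):
    """Deploy cohort by cohort; halt as soon as a cohort misses its gate."""
    deployed = []
    for cohort in plan_cohorts(sites, cumulative_sizes):
        # Per-site gate: only sites passing the health check get the new version.
        ready = [site for site in cohort if health_check(site)]
        deploy(ready)
        deployed.extend(ready)
        # Cohort gate: later cohorts wait until this one meets the criteria.
        if not criteria.passed(measure(ready)):
            break
    return deployed
```

Called with cumulative sizes [50, 200, 600, 1200] on a 1,200-site list, plan_cohorts yields cohorts of 50, 150, 400, and 600 sites. If the second cohort misses a threshold, staged_rollout stops with 200 sites deployed and the remaining 1,000 untouched.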
A fleet-native edge deployment system.
The deliverables were three interlocking systems: the distilled model and offline-first inference runtime; the fleet control plane (versioning, rollout gates, remote health checks, site telemetry); and the staged deployment protocol with documented gate criteria for each rollout phase.
The control plane was built to be operated by Northwind's own infrastructure team — no dependency on Qualiax tooling or continued engagement for routine operations. Full documentation and a two-week handover period were included in the engagement scope.
1,200 sites live. 17% operational efficiency gain.
The full fleet reached production within the 22-week engagement window. Mean inference latency across all site classes was under 80ms, including on the lowest-spec devices in the fleet. Inference uptime across the fleet averaged 99.1%, with the offline-first architecture absorbing connectivity gaps without user-visible interruption.
The 17% operational efficiency improvement measured at the original 8 pilot sites replicated at fleet scale. Northwind's operations team now manages model updates remotely via the control plane, with a typical site cohort receiving a new model version within 4 hours of release — down from weeks when physical visits were required.
Composite engagement based on representative project structures. Client and outcome details are illustrative.