We designed, implemented, and validated security protection measures across hardware, firmware, software, ML, the data layer, and communication. As a result, the Hive and the Queen perform essential missions in the field, react to potential compromise, and work only with authenticated personnel.
Note: as the project has a certain level of sensitivity, some technical details are purposely omitted, but the high-level design is presented as is.
The Hive of devices communicates with the Queen server: sends telemetry data, receives firmware and software updates. The Queen stores information about devices, their activity (last time online), their status (active, passive, compromised), usage patterns (audit logs).
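As an illustration only, the per-device record kept by the Queen could look like the sketch below. The three status values come from the description above; the class names, field names, and the rule that a compromised device is never re-activated by incoming telemetry are assumptions made for this example, not the project's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class DeviceStatus(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"
    COMPROMISED = "compromised"


@dataclass
class DeviceRecord:
    """Hypothetical fleet-registry entry for one Hive device."""
    device_id: str
    status: DeviceStatus = DeviceStatus.PASSIVE
    last_seen: Optional[datetime] = None
    audit_log: List[str] = field(default_factory=list)

    def record_telemetry(self, event: str, now: Optional[datetime] = None) -> None:
        """Update the last-online time and append the event to the audit log."""
        if self.status is DeviceStatus.COMPROMISED:
            # Assumed policy: keep an audit trail, but never re-activate.
            self.audit_log.append(f"ignored:{event}")
            return
        self.last_seen = now or datetime.now(timezone.utc)
        self.status = DeviceStatus.ACTIVE
        self.audit_log.append(event)
```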
Designing security controls starts with threat modelling, especially for a large project like this one. During threat modelling, we focused on UX: how can we distinguish an actual operator from a competitor? This is how we arrived at multi-factor authentication with an external “activation” device.
Threat modelling showed risky areas:
- a device-as-a-blackbox (its hardware, firmware, software, data);
- a communication channel between the Hive and the Queen;
- receiving updates of data and software (which could be intercepted, stolen, or corrupted).
As the ecosystem will be further updated and improved, we pushed security practices to each step of the hardware and software life cycle. SSDLC includes security design documentation, threat modelling and intelligence, secure coding and CICD pipeline, and generally pushing security into developers’ minds.
We took extra care with CICD security and automated security testing: scanners and linters, dependency and vulnerability management tools, memory and fuzzing tests on each pull request, automatic packaging and cross-version tests, and so on. CICD reduces mistakes by reducing the number of manual steps in the pipeline.
IIoT device provisioning pipeline
Devices come pre-assembled (which is out of the scope of this story), and the only things left are: to install and harden their firmware, install and configure software, download the latest ML models, and configure an external activation device (let’s say a USB stick).
The device provisioning pipeline (a set of scripts, to put it simply) takes care of each step.
The Linux hardening process is part of device provisioning. We configured the OS, hardened it, removed unused packages, created and configured LUKS partitions, designed the key lifecycle, and implemented thorough logging and monitoring, anti-brute-force measures, restricted access, and honeypots to raise the bar for tampering.
The goal is to have a device that resists reverse engineering, works correctly day-by-day, and is easy to maintain by the [REDACTED] engineering team.
The Hive communicates with its Queen only when an internet connection is present. The main purpose of fleet management is to always know the latest status of devices, have access to their last logs, and react to unauthorised usage.
Application security measures cover device and fleet management software. The primary purpose of appsec in this project is to reduce the attack surface and ensure that access control measures cannot be easily bypassed.
Besides dozens of typical appsec measures (see OWASP ASVS), we focused on the security of downloading new ML models (to prevent tampering) and interactions with human operators (to prevent abuse).
Operating the Hive should be protected by multi-factor authentication: having physical access to the device alone should not be enough to use it. The users must present other authentication factors: passwords, PIN codes, or USB sticks.
Some events, like data decryption or triggering compromise sequences, are linked to the (in)correct user authentication.
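A minimal sketch of how failed authentication could be tied to a compromise state is shown below. It assumes two factors (a PIN and a secret read from the USB stick) and a threshold of three consecutive failures; `Authenticator`, `MAX_FAILURES`, and the `compromised` flag are hypothetical names for illustration, not the project's actual design.

```python
import hashlib
import hmac
import os

MAX_FAILURES = 3  # assumed threshold, chosen only for this example


class Authenticator:
    """Requires a correct PIN and a correct token secret from the USB stick."""

    def __init__(self, pin: str, token_secret: bytes):
        self._salt = os.urandom(16)
        self._pin_hash = self._hash_pin(pin)
        self._token_secret = token_secret
        self.failures = 0
        self.compromised = False

    def _hash_pin(self, pin: str) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", pin.encode(), self._salt, 100_000)

    def authenticate(self, pin: str, token_secret: bytes) -> bool:
        if self.compromised:
            return False  # once compromised, no credentials are accepted
        # Evaluate both factors before deciding, using constant-time comparison.
        pin_ok = hmac.compare_digest(self._hash_pin(pin), self._pin_hash)
        token_ok = hmac.compare_digest(token_secret, self._token_secret)
        if pin_ok and token_ok:
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= MAX_FAILURES:
            self.compromised = True  # a real device would start its compromise sequence here
        return False
```

In this sketch, a successful `authenticate` call is the point where data decryption would be allowed to proceed, and setting `compromised` stands in for triggering the compromise sequence.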
Machine Learning security
ML model protection starts on a backend, where models are trained. Before ML models appear on devices during provisioning or updates, they are encrypted on a backend using unique encryption keys per device.
On the device, each ML model is re-encrypted for storage using separate device storage keys. Thus, reverse engineering one device won’t give access to keys or models used on other devices.
In addition, the ML model’s weights are encrypted and obfuscated, which makes interception pointless: the model is “distorted” and requires decryption and deobfuscation before usage. This approach “assembles” an actual ML model in device memory right before execution.
Data at rest security
We configured LUKS for data-at-rest encryption; the partition is decrypted only after successful authentication. We added application-level encryption for application data: stored telemetry data, logs, and ML models. To prevent tampering, we added integrity checks for all data and signing for data that is critical from a trust and origin perspective.
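The integrity-check idea can be sketched with a keyed BLAKE2b tag appended to each stored record. This is an assumption-laden example: the function names and the 32-byte tag size are invented here, and a symmetric tag only detects tampering by parties who do not hold the key. Proving origin, as the signing mentioned above does, would need asymmetric signatures, which this sketch does not cover.

```python
import hashlib
import hmac

TAG_SIZE = 32  # bytes; assumed size for this example


def seal(record: bytes, key: bytes) -> bytes:
    """Return the record with a keyed BLAKE2b integrity tag appended."""
    tag = hashlib.blake2b(record, key=key, digest_size=TAG_SIZE).digest()
    return record + tag


def open_sealed(blob: bytes, key: bytes) -> bytes:
    """Return the original record if the tag verifies, otherwise raise ValueError."""
    if len(blob) < TAG_SIZE:
        raise ValueError("blob too short")
    record, tag = blob[:-TAG_SIZE], blob[-TAG_SIZE:]
    expected = hashlib.blake2b(record, key=key, digest_size=TAG_SIZE).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return record
```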
Encryption and key management
While using a single design, the encryption system uses unique, long, random keys for each device. The key management scheme is built so that a functional encryption key appears only in device memory and only for a short time; the rest of the time it is split into pieces kept in different locations.
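One simple way to split a key into pieces is XOR-based secret splitting, where every piece is needed to rebuild the key and any incomplete subset reveals nothing about it. The sketch below illustrates that general technique; it is an assumption for this example and not necessarily the scheme used on the devices, and the function names are hypothetical.

```python
import secrets
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, parts: int) -> List[bytes]:
    """Split key into `parts` shares; all of them are required to rebuild it."""
    if parts < 2:
        raise ValueError("need at least two shares")
    shares = [secrets.token_bytes(len(key)) for _ in range(parts - 1)]
    # The last share is the key XORed with every random share.
    last = reduce(xor_bytes, shares, key)
    return shares + [last]


def combine_shares(shares: List[bytes]) -> bytes:
    """Rebuild the key by XORing all shares together."""
    return reduce(xor_bytes, shares)
```

Python cannot reliably erase the rebuilt key from memory, so this only shows the splitting logic; a device implementation would also need to control how long the combined key lives.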
Hive devices require lightweight cryptography, like AES-SIV, Super ChaCha or BLAKE2, suited for low-power devices. We build a cryptographic layer using a mix of cryptographic functions from Themis and LibSodium. We keep cryptography straightforward, using slightly different parameters on the device and the Queen backend.
We aim to reduce the risks of supply chain attacks, insider threats, and reverse engineering. The issue with IIoT devices is that it’s hard to patch them quickly—thus, the encryption scheme should be resilient and work for years.
Communication security ranges from “just TLS” to mutual authentication, TLS over VPN, and application-level encryption of packets with sensitive data. These particular devices communicate over Wi-Fi or cellular networks.
Devices receive control commands and firmware / data updates from the Queen and send telemetry back. It was important to harden the protocol against active and passive MitM attacks (encryption and service authentication), replay attacks (sequences and sessions), and unauthorised “initiate self-destruct sequence” commands, so that a malicious actor cannot brick devices.
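The replay-protection part can be sketched as follows: each command carries a session identifier and a strictly increasing sequence number, and the whole packet is authenticated with HMAC-SHA256. The packet layout, the fixed-length session identifier, and all names are assumptions for illustration; encryption and service authentication are left out of this sketch.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # HMAC-SHA256 output size
SEQ_LEN = 8   # sequence number packed as an unsigned 64-bit integer


def sign_command(key: bytes, session_id: bytes, seq: int, command: bytes) -> bytes:
    """Build a packet: session_id || seq || command || HMAC tag."""
    body = session_id + struct.pack(">Q", seq) + command
    return body + hmac.new(key, body, hashlib.sha256).digest()


class CommandReceiver:
    """Device-side check: valid tag, expected session, strictly increasing seq."""

    def __init__(self, key: bytes, session_id: bytes):
        self._key = key
        self._session_id = session_id
        self._last_seq = 0

    def accept(self, packet: bytes) -> bytes:
        sid_len = len(self._session_id)
        if len(packet) < sid_len + SEQ_LEN + TAG_LEN:
            raise ValueError("packet too short")
        body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(self._key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("authentication failed")
        if body[:sid_len] != self._session_id:
            raise ValueError("wrong session")
        (seq,) = struct.unpack(">Q", body[sid_len:sid_len + SEQ_LEN])
        if seq <= self._last_seq:
            raise ValueError("replayed or stale command")
        self._last_seq = seq
        return body[sid_len + SEQ_LEN:]
```

With this layout, a captured “self-destruct” packet cannot be replayed later in the same session, and it fails the session check in a new one.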
Reverse engineering protections and self-destruction
We’ve equipped devices with the ability to detect tampering and execute a self-destruction mechanism. Detection controls are present on several levels:
- Hardware level: triggers that fire when someone opens the device box.
- OS level: honeypots and brute-force protections.
- Software level: obfuscation, debugging detection, and honeypots.
The numerous reverse engineering defences are linked to a single reaction control. If a device notices compromise events, it triggers a self-destruct sequence that wipes data and models, overwrites them with zeros, and sends the “I’m dead” beacon to the Queen.
A compromise event can also be triggered automatically based on device self-health checks, or remotely from the Queen.
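The single reaction control can be pictured as one function that every detector calls. The sketch below overwrites files with zeros, removes them, and then sends the beacon; the function names and the `send_beacon` callback are hypothetical. Note that on flash storage with wear levelling, overwriting a file does not guarantee the old blocks are gone, so a real device would likely also destroy the encryption keys.

```python
import os
from pathlib import Path
from typing import Callable, Iterable


def zero_fill(path: Path) -> None:
    """Overwrite the file's contents in place with zero bytes."""
    size = path.stat().st_size
    with open(path, "r+b") as f:
        f.write(b"\x00" * size)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the zeros to storage


def self_destruct(paths: Iterable[Path], send_beacon: Callable[[str], None]) -> None:
    """Zero-fill and delete each existing file, then notify the Queen."""
    for path in paths:
        if path.exists():
            zero_fill(path)
            path.unlink()
    send_beacon("I'm dead")
```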
Protection against side-channel attacks
Every working device leaks information about its behaviour: heat, power usage, connection signals, etc. While it’s difficult to eliminate side-channel attacks entirely, we took specific hardware and software measures to make them less successful.
One example is using constant-time operations and adding noise during cryptographic computations.
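To show what “constant-time” means in practice, the sketch below contrasts a comparison that returns at the first mismatching byte, so its running time reveals how much of a secret was guessed correctly, with one that always examines every byte. The function names are made up for this example. In Python, the standard library's `hmac.compare_digest` is the recommended tool; the manual loop is there only to illustrate the idea, since an interpreter gives no strict timing guarantees.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: timing depends on the position of the first mismatch."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def accumulate_equal(a: bytes, b: bytes) -> bool:
    """Examine every byte and accumulate the differences; no early exit."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Preferred: the standard-library comparison designed to resist timing analysis."""
    return hmac.compare_digest(a, b)
```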
Protecting technology relies on real-world operational controls: we've designed processes for operators to use devices without exposing them to unnecessary risks in the field, because, you know, crop management and soil enrichment are hostile environments in some regions.