Cryptographic IP protection for AI/ML product
Once their product came into the limelight, the team behind the [REDACTED]'s application started looking for strong, in-depth protection for their technology. The product uses AI/ML technology that alters media data (photos and videos). The technology is based on artificial neural networks, which are heavily optimised and improved.
The [REDACTED] wanted to safeguard their unique IP and asked Cossack Labs to improve AI/ML security and build ML protection technology. They achieved this by applying the defence in depth approach to protect their IP and the sensitive parts of the dataflow.
iOS native, Android apps
Python, Go backend
ML / TensorFlow
CCPA, GDPR, local privacy regulations
Encryption Export Regulations
Protecting unique IP (ML models) against leakage and misuse.
Soon after the launch, this highly sophisticated and powerful machine learning technology enjoyed tremendous viral growth and popularity. The team had to optimise the system design under load. Meanwhile, their app became a target for attackers and plagiarists.
Overnight success turned into a challenge: how to secure this state-of-the-art tech without affecting the team or app performance, while still meeting data security requirements for IP and PII. The team needed help building specialized security defences to protect their ML models and APIs, and to cover the sensitive data life cycle across their apps, services, databases, and data lakes.
After careful study of companies that design security systems and work with cryptography, they asked Cossack Labs' engineers for security advisory and engineering.
IP protection system
The protection system for TensorFlow ML models should minimize their lifetime and make them difficult to misuse. This includes on-device protection and an API anti-fraud system.
Security that doesn't ruin UI/UX
Security measures should be seamlessly integrated across mobile apps, API, and backend infrastructure. End users shouldn't feel the struggle.
Flexible cryptographic layer
The cryptographic layer should work across platforms and be easy to maintain, giving the Customer's team the necessary flexibility for improving their product.
Security as business value
Snap judgments and hasty decisions only do harm when solving novel, sophisticated problems. To ensure that we were focusing on issues of real relevance and priority to the Customer's business model, we started with risk assessment and threat modelling.
At this stage, the Customer's team received a risk analysis tailored to their applications and infrastructure, as well as a security strategy, which together allowed them to prioritize security measures.
Moving hand in hand with dev team
Then, together with the app team, we focused on incorporating security into all steps of the SSDLC: designing a well-rounded set of security controls and processes that enable IP protection, PII protection, and application security.
We designed and built defence in depth security measures focused on protecting ML models against IP leakage and reverse engineering techniques.
Elegant cryptographic scheme to link ML models with exact users:
- We built a cryptographic system with multi-layered encryption and a tailored key scheme that encrypts each ML model for one specific user.
- The cryptographic layer uses a combination of symmetric and asymmetric primitives (HPKE-like scheme), as well as a number of supporting crypto-schemes for various parts of the ML flow.
- To decrease server-side load and prevent building a complicated PKI architecture, we designed a key management scheme using ephemeral keys (no need to protect key storage if you don't store keys).
- ML models are encrypted on the server side using unique keys per usage. ML models stay encrypted until they reach the mobile app. The mobile app decrypts the ML model and re-encrypts it for storage using the hardware-backed cryptographic module.
- Each ML model is encrypted as a file and also contains layers of encrypted weights inside, so an intercepted model is useless to the attacker.
- The cryptographic system is based on a free open-source cryptographic library Themis that provides a single API across programming languages while hiding cryptographic details under the hood.
- The resulting scheme encrypts each ML model with DRM-like access control, with encryption keys linked to a specific user on a specific device.
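The idea of linking keys to a user and a device can be sketched with a standard key derivation function. The example below is only an illustration under our own assumptions, not the scheme built for the Customer (which is based on Themis and an HPKE-like construction): it uses HKDF (RFC 5869) from the Python standard library to derive a per-usage model key from an ephemeral secret. The function names, the `model-key-v1` label and the identifier format are hypothetical.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def _field(value: str) -> bytes:
    # Length-prefix each field so ("ab", "c") and ("a", "bc") never collide.
    raw = value.encode("utf-8")
    return len(raw).to_bytes(2, "big") + raw


def derive_model_key(ephemeral_secret: bytes, user_id: str,
                     device_id: str, model_id: str) -> bytes:
    """Derive a 32-byte key bound to one user, one device and one model.

    Hypothetical helper: the identifiers and the label are assumptions.
    """
    info = b"model-key-v1" + _field(user_id) + _field(device_id) + _field(model_id)
    return hkdf_sha256(ephemeral_secret, salt=b"\x00" * 32, info=info)


# The ephemeral secret is generated per usage and never written to storage.
ephemeral_secret = secrets.token_bytes(32)
key = derive_model_key(ephemeral_secret, "user-1", "device-A", "model-42")
```

Because the user and device identifiers are mixed into the derivation, a key derived for one user on one device does not decrypt a model prepared for another. A real system would pass the derived key to an authenticated encryption primitive rather than use it directly.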
Reverse engineering protections on mobile devices:
- Built-in protections prevent running the app on a jailbroken or rooted device, or in debug mode. The reverse engineering protections are based on the latest iOS/Android behaviour and strike a balance between paranoia and false positives.
- In addition to Keychain/Keystore, hardware-backed cryptographic modules are used: Secure Enclave and hardware-backed Keystore. They provide stronger security guarantees for the stored encryption keys used for model re-encryption.
- The mobile application uses device attestation APIs (SafetyNet for Android and App Attest for iOS) to prevent installing the application from untrusted sources.
API protection and anti-fraud system:
- Since nothing exists in a vacuum, we made sure that new IP protection controls rely on a solid foundation.
- We improved API security and user authentication, ensured that every API call is authenticated, introduced API limits, throttling, firewalling and monitoring.
- To minimize chances that ML models could be leaked from cloud storage, their TTL is set to minutes.
- We suggested and co-designed the anti-fraud system, which both addresses security concerns and prevents spending resources on malicious users.
- The anti-fraud system analyses user behaviour and stops serving ML models to malicious users. It accepts and analyses events coming from mobile apps and the backend.
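Two of the controls above, the short model TTL and the refusal to serve flagged users, can be illustrated with a short-lived signed download token. This is a minimal sketch under our own assumptions, not the Customer's implementation: the token format, the function names, the blocked-user set and the 300-second TTL are hypothetical (the case study only states that the TTL is set to minutes), and identifiers are assumed not to contain the `|` character.

```python
import hashlib
import hmac
import time
from typing import Optional, Set

MODEL_TTL_SECONDS = 300  # assumed value; the source only says "minutes"


def _sign(secret: bytes, payload: str) -> str:
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(secret: bytes, user_id: str, model_id: str,
                now: Optional[float] = None) -> str:
    """Issue a token that allows one user to fetch one model until it expires."""
    issued = time.time() if now is None else now
    expires = int(issued) + MODEL_TTL_SECONDS
    payload = f"{user_id}|{model_id}|{expires}"
    return f"{payload}|{_sign(secret, payload)}"


def verify_token(secret: bytes, token: str, blocked_users: Set[str],
                 now: Optional[float] = None) -> bool:
    """Accept only authentic, unexpired tokens of users who are not flagged."""
    current = time.time() if now is None else now
    payload, sep, signature = token.rpartition("|")
    expected = _sign(secret, payload)
    # Constant-time comparison of the signature.
    if not sep or not hmac.compare_digest(signature.encode("utf-8"),
                                          expected.encode("utf-8")):
        return False
    try:
        user_id, _model_id, expires = payload.split("|")
        expires_at = int(expires)
    except ValueError:
        return False
    if user_id in blocked_users:  # e.g. flagged by the anti-fraud system
        return False
    return current < expires_at
```

In this sketch the backend calls `issue_token` when it prepares an encrypted model, and the storage gateway calls `verify_token` before serving the file, so a leaked link stops working within minutes and a flagged user is refused immediately.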
Defence in depth security measures:
- We provided security recommendations that help to follow the defence in depth approach for sensitive parts of the dataflow, to cover the applicable security standards (like OWASP MASVS 1.3 L2 & R and OWASP ASVS 4.0), and to prioritise security work based on the company's risk and threat profile.
Additional relevant materials
These conference slides explain more business and engineering details about this case of protecting ML models. Anastasiia Voitova presented the talk at the OWASP London meetup. The YouTube video is available as well.
Products and services involved
Themis, a cross-platform crypto library
Themis is a cross-platform high-level open-source cryptographic library. We used Themis as a building block for the cryptographic protocol, focusing on the data flow and performance while having cryptography covered.
We've built risk, threat and trust models, analysed and prioritised attack vectors, planned security controls, and assisted with implementation and verification of controls.
We've designed the cryptographic protocol and key management layout for ML model encryption, and assisted with implementation and verification.
We've recommended numerous platform-specific security controls for mobile apps, assisted in improving backend API security and designing the anti-fraud system for protection against malicious users.
The resulting data security solution makes it possible to prevent theft and misuse of ML models (unique business IP) and lowers operational costs by stopping malicious users from abusing the API and paid functionality.
Results and outcomes
The resulting IP protection system is multi-layered and runs through applications and gateways of the Customer's systems. It is designed to stay out of sight and not introduce any unnecessary discomfort for developers and end users. It combines cryptography, mobile application security, API security and anti-fraud modules.
Unlike many features, security is a context-dependent non-functional requirement that can never be "finally done". Integrating security into an existing system often leads to re-engineering and optimizing some modules, improving them from both the security and the UX point of view.
As an additional benefit of our engagement, the Customer's engineering team gained experience in building sophisticated security controls that run through mobile, backend and cloud.
Developers should not struggle with security
It's possible to build secure and usable systems without frustrating developers at every step. Introducing a data security layer is more than just deploying a Docker container; it's shifting the engineering culture inside the company. Talk to us if you are looking to take your data security to the next level.