5 Sep 2019

Implementing End-to-End encryption in Bear App

Bear with us! 🐻

The latest release of the popular note-taking app Bear contains a new feature: end-to-end encryption of user notes. The Cossack Labs team worked closely with the amazing Bear team to help deliver this feature. We are rarely allowed to disclose the details of our custom engineering work, but the Bear team was awesome enough to let us highlight some important aspects of the work done for them.

This is a technical article about the encryption scheme, key management process, and usability features behind this update. Yes, in this blog we often explain incredibly complicated high-level encryption schemes, but even a seemingly straightforward encryption of user notes, synchronisation, and key storage for a note-taking app is a serious undertaking if you want to do it properly.

If you’ve never tried Bear app, we recommend that you do. Bear app is featured by Apple as one of the competitors to Apple’s native Notes app and has won an Apple Design Award.

The Bear app team values the privacy of their users and the information that belongs to them: they use CloudKit for synchronising notes between users’ devices, have no access to users’ notes, and don’t run their own backend servers. To provide better protection for the users’ notes, we’ve implemented an end-to-end encryption scheme together with Bear engineers. The trick was to build reliable note encryption into the application flow in a way that doesn’t ruin the user experience.

Highlights of the encryption flow

  • Seamless integration in the user flow, including fast synchronisation between multiple devices (when users switch devices, they continue to work with their notes as usual within the same account);
  • Smooth UX is a priority. Bear app doesn’t stress out the users by asking them to type their passwords every time. It uses different techniques to provide a smooth experience (e.g. Touch ID / Face ID, multiple cache/keychain levels, silent auto-decryption, etc.);
  • Deep support of Apple platform security features (biometric authentication, integration with SecureEnclave and iCloudKeychain, CloudKit security practices);
  • Strong cryptography under the hood: the app uses the Themis library (AES-GCM-256 with KDF, where each note of each user is encrypted with a unique encryption key);
  • Defense in depth approach: using multiple overlapping security practices to protect users’ data (cryptography, application security, and platform-specific features);
  • Encryption engine is easy to maintain and support by non-cryptographers (choosing Themis lets developers forget about handling cryptographic details, minimizes the chance of making a mistake, and makes it easy to reuse the current solution on other platforms like a Web/Electron app).

Bear the Guardian!

Starting point, goals, and concerns

Our goal was to protect user notes by making it impossible for Bear engineers, Apple, or a man-in-the-middle to read the notes. As Bear app was by no means created as a military/banking app, it didn’t have strict security requirements from the get-go and neither was it designed to be a “super secure data container”. It is an easy-to-use note-taking app with state-of-the-art data protection.

That’s why — instead of bullying users with ridiculous password rules (“12 characters, 1 upper case, 1 lower case, 1 emoji, 1 Chinese proverb”, etc.) — we’ve assumed that users will come up with bad passwords and/or reuse passwords with other services.

UX is important so we created a security scheme that is more complex from an engineering perspective, but less stressful for the app users.

We dedicated a lot of time and effort to the synchronisation process. Imagine that the user is writing a note on their iPhone, but wants to continue writing on their Mac. Bear syncs the encrypted note content and info about note’s encryption key, so the user can continue editing a note on their Mac right after proving their identity (by entering a password or using Touch ID / Face ID).

Risks to data, trust model, non-trusted environment

Considering the usability restrictions and what we know about the Apple stack, let’s define the trust model and risks to data. As we can’t protect everything everywhere, we need to define the most critical bits and most trusted environments.

| Objects / Threats | Access | Disclosure | Modification | Access denial |
| --- | --- | --- | --- | --- |
| note_text, plaintext | Moderate (note is inevitably displayed on the device screen at some point) | Critical (this is exactly why we’ve built this thing) | Critical | High (losing note text, making users angry) |
| user_passphrase, password that user inputs | Moderate (having password alone won’t help to decrypt notes) | Critical (users tend to reuse passwords, we should avoid having them in plaintext) | Critical (can’t decrypt notes linked to this password) | Critical (losing password means losing all encrypted notes) |
| note_encryption_key, unique encryption key per each note | Moderate (used for encryption of one note) | Low (used for encryption of one note) | Low (wrong key: decryption of a specific note is impossible) | Moderate (lost key: decryption of a specific note is impossible) |

user_passphrase is essentially a password that the user inputs inside Bear app to start using the e2ee functionality. Unfortunately, developers often tend to confuse the notions of “password” and “encryption key”, so for clarity’s sake, we use the term “passphrase” indicating that this is a raw string received from the user.

note_text and user_passphrase are pieces with the highest risk. Losing passphrase means we can’t encrypt/decrypt notes linked to this passphrase. That’s why, instead of using/storing original user passphrase, we derive unique encryption keys for each note and work with them.

| | Device storage | Device process memory | Device keychain & secure enclave | Transport, iCloud database | iCloud Keychain |
| --- | --- | --- | --- | --- | --- |
| Trust model | Medium (less trust to jailbroken devices) | High (we minimise the amount of time when plaintext notes, encryption key, and password are stored in memory) | High (less trust to jailbroken devices) | Medium (cloud environment we can’t control) | Medium (an attacker can get access to iCloud Keychain by sending one convincing phishing email) |

Let’s rephrase this in plain language: we trust the data stored on the device more than the data stored in a cloud.

While jailbroken devices are still a risk, the likelihood of this attack is low: it’s unlikely (although possible) that someone will jailbreak the target’s device just to steal their Bear notes. However, there are different ways to break into the target’s iCloud account or iCloud Keychain (using a bug in iOS 13 beta or cracking macOS Keychain), so all the data leaving the device must be encrypted.

There are also ways to access the data of a local keychain, e.g. when users sell their phones without a proper cleanup. That’s why we won’t store any sensitive data in plaintext in Keychain ;)

Encryption vs locking. User passphrase vs encryption key

There’s a difference between encrypting notes and locking notes. Most applications only implement the locking feature because it’s simpler.

Locking a note means that an app doesn’t show a note until the user passes authentication (using Touch ID / Face ID, entering a pin code, etc.). Many apps lock their screen and ask the user for passphrase or biometrics to proceed.

Encrypting a note means that the app stores an encrypted note and the user needs to enter the correct password to decrypt this note (or to unlock the decryption key for decrypting the note, or even to decrypt the decryption key that decrypts the note).

Bear app implements both encryption and locking: selected notes are encrypted and users need to unlock the decryption key in order to decrypt the note content.

Hide Mom's best recipe from prying eyes!

User passphrase is not used as an encryption key for many reasons, mostly because any user-generated input won’t be “good enough” from the cryptographic point of view.

The app uses key stretching functions, mathematical functions, static keys, and unique per-note data to derive different encryption keys for each note.

More details below.

Cryptographic library: Themis

We use Themis as a cryptographic library because it follows Daniel Bernstein’s “boring crypto” concept: it is easy to use and hard to misuse, has secure-by-default settings (built-in KDFs, AES-GCM support, no way for developers to select the key length or cipher mode), uses traditional cryptographic primitives from Open/Boring/LibreSSL, and offers a similar high-level API in 11 languages (which makes it very easy to maintain a project on various platforms).

Side note: When we were working on this project, CryptoKit was not released yet. We like CryptoKit for its API and ARM-specific optimizations and will take a closer look at it for implementing cryptographic engines in Swift-only projects.

Key material

Now, let’s define the key material: the user inputs a user_passphrase, which is then used to derive a symmetric key that encrypts and decrypts the note. Each note has a reference to the SFPassword object that stores meta-information about how the note was encrypted. The SFPassword object unambiguously indicates which user_passphrase was used to encrypt that exact note. It stores the encryption version, creation date, and passphrase hint, and makes it possible to quickly check whether the user-entered passphrase is correct.

Note and SFPassword data model (simplified)

Pay attention to the fact that SFPassword doesn’t know the exact user_passphrase. Moreover, Bear app doesn’t store user_passphrase in plaintext even in Keychain because users tend to reuse their passphrases and because obtaining access to user_passphrase introduces a risk of decryption of all notes from this user on their every device.

Also, we don’t want to store the note_encryption_key for a long time because getting hold of the note encryption keys would allow decrypting many notes. That’s why we need an intermediate key, called app_encryption_key: the app calculates the note encryption keys from it right before using them and drops them immediately after use.

Encrypting notes

Each note has its own unique random note_encryption_id (generated as [[NSUUID UUID] UUIDString]) and it knows the ID of note_encryption_key that is used for encrypting it. note_encryption_id is used as “encryption context” (unique for each note) so the attackers need to know both note_encryption_key and note_encryption_id to decrypt the note. If the user marks the note as “encrypted”, Bear will encrypt the note right away and clean its plaintext.

Bear app uses Themis Secure Cell in Seal mode to encrypt/decrypt the note content: AES-GCM-256 with a random IV and a NIST SP 800-108-based KDF under the hood (see ZRTP/RFC6189 and read this thread to learn why). The note_encryption_key is used as the encryption key, and note_encryption_id is passed as associated data (AD) for encrypting/decrypting the note text.


encrypted_note = SecureCellSeal(data: note_text, context: note_encryption_id, key: note_encryption_key)
decrypted_note = SecureCellSeal(data: encrypted_note, context: note_encryption_id, key: note_encryption_key)

As random IV is used in Themis under the hood, the result of encrypting the same note is different for each encryption (meaning, if the user edits some note often, the encrypted bytes will be different after each edit).

Deriving note encryption key

The app minimizes the amount of time when note_encryption_key is stored in memory and never saves it. That’s why the app needs to calculate an intermediate key app_encryption_key first and then use it to derive the note_encryption_key.

app_encryption_key is derived from user_passphrase and some static non-secret data that the app knows. Since users will surely come up with short and bad passphrases, the app uses Themis Secure Cell Context Imprint (AES-CTR-256 with a KDF that is specified by NIST SP 800-108 and well described in the ZRTP RFC) to create a deterministic encrypted key representation with key stretching.

Side note: Themis’ API doesn’t allow developers to call the KDF explicitly because the maintainers believe that doing so requires an understanding of what a KDF is and contradicts the “hard-to-misuse” principle. However, the maintainers might add an API for a modern KDF like Argon2 in the next versions of Themis, which would remove the need for the AES-CTR workaround.


long_data = user_passphrase + generated_passphrase_password + generated_app_context
app_encryption_key = SecureCellContextImprint(data: long_data, context: generated_app_context, key: user_passphrase)

note_encryption_key is different for each note even if users use the same user_passphrase:


long_data = app_encryption_key + generated_passphrase_password + generated_app_context
note_encryption_key = SecureCellContextImprint(data: long_data, context: note_encryption_id, key: app_encryption_key)
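
The two derivation steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not Bear's implementation: the standard library has no Themis bindings, so PBKDF2-HMAC-SHA256 stands in for the passphrase stretching and HMAC-SHA256 stands in for the per-note derivation that the article performs with Secure Cell Context Imprint. `APP_CONTEXT`, the iteration count, and the function names are hypothetical.

```python
import hashlib
import hmac
import uuid

# Hypothetical stand-in for the app-generated, non-secret context.
APP_CONTEXT = b"example-app-context-v1"


def derive_app_encryption_key(user_passphrase: str) -> bytes:
    """Stretch the passphrase into a 32-byte intermediate key.

    PBKDF2 is an assumption; the article uses Themis Context Imprint.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", user_passphrase.encode("utf-8"), APP_CONTEXT, 100_000
    )


def derive_note_encryption_key(app_key: bytes, note_encryption_id: str) -> bytes:
    """Derive a per-note key by binding the intermediate key to the note ID."""
    message = APP_CONTEXT + note_encryption_id.encode("utf-8")
    return hmac.new(app_key, message, hashlib.sha256).digest()


# The same passphrase yields one intermediate key but a different key per note.
app_key = derive_app_encryption_key("qwertyqwerty")
note_a = str(uuid.uuid4())
note_b = str(uuid.uuid4())
key_a = derive_note_encryption_key(app_key, note_a)
key_b = derive_note_encryption_key(app_key, note_b)
```

Because both steps are deterministic, any device that knows the passphrase and the note's ID can recompute the same note key without the key itself ever being synced.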

Side note: The app uses generated strings as secret keys built into the app. To make debugging and reversing more complicated, the app generates those strings right before usage and zeroes them right after. This approach is described by Anastasiia Vixentael in her workshop for iOS developers.

Generating app-specific context:


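- (NSData *)applicationSpecificContextForVersion:(NSNumber *)version {
    // The context is assembled at runtime instead of being stored as one literal,
    // which makes it harder to spot in the binary.
    switch (version.integerValue) {
        case 1: {
            float res = (float) (1.21 / 62.3);
            // Note: the format string has four specifiers, so the trailing `res` argument is ignored.
            return [[[NSString alloc] initWithFormat:@"%.3f%s%.5f%s", res, "i37fd29=", res, "|d45#jD", res] dataUsingEncoding:NSUTF8StringEncoding];
        }
        default:
            break;
    }
    // Unknown version: no context available.
    return nil;
}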

Storing user passphrase

As mentioned above, Bear app doesn’t store user_passphrase in plaintext: it encrypts the passphrase using Themis Secure Cell Seal (AES-GCM-256) with generated keys (random for each user) before saving it to Keychain.


encrypted_passphrase = SecureCellSeal(data: user_passphrase, context: nil, key: generated_passphrase_key)
decrypted_passphrase = SecureCellSeal(data: encrypted_passphrase, context: nil, key: generated_passphrase_key)

Storing passphrase hint

Users tend to forget their passwords, so the app lets the user set a hint that helps to remember the passphrase. Users often create very obvious hints, and since a hint is usually considered non-secret, it becomes an easy-to-exploit vector for a possible attacker. To counter this, we add a new layer of defense and encrypt the hint.

Side note: “Defense in depth” is a security design approach in which sensitive assets are protected with multiple defenses. (Prize alert and activity suggestion — try to count how many defenses we’ve built around users’ notes in Bear app and ping @vixentael with your answer for a special something from us). You can read more about the defense in depth for data protection in our blog post or watch a talk by Anastasiia Vixentael.

As we’ve already mentioned above, each SFPassword stores an encrypted_hint. The app encrypts the hint after the user enters user_passphrase and hint, and only decrypts it when the user forgets the passphrase and wants to see the hint. The app uses Themis Secure Cell Seal (AES-GCM-256) and generated application-specific keys, which allows decrypting the hint at any moment of the app flow.


encrypted_hint = SecureCellSeal(data: hint, context: nil, key: generated_hint_key)
decrypted_hint = SecureCellSeal(data: encrypted_hint, context: nil, key: generated_hint_key)

Syncing SFPassword

Each SFPassword contains no sensitive data except for the (encrypted) passphrase hint. Bear app stores SFPasswords and syncs them between devices using CloudKit. Each SFPassword is immutable, with a creation date and a unique ID, so when users update their passphrase, a new SFPassword is created. Having a reference tying each note to the corresponding SFPassword makes it easy to solve synchronisation conflicts when users update their passphrases.

Side note: You might wonder what the “encrypted gibberish data” in SFPassword in the data model scheme above is. Great engineers at ShinyFrog came up with a cool scheme for quickly telling a user whether their passphrase is correct without decrypting notes (which is handy as some notes can be large). An SFPassword is created every time a user creates or changes their passphrase. Bear app generates a small (32 bytes) blob of random data and encrypts it with the app_encryption_key derived from the fresh passphrase. The next time a user enters a passphrase, the app derives app_encryption_key using the same rules and tries to decrypt the “encrypted gibberish data” to check if the passphrase is correct. The app doesn’t check the content of the decrypted data but catches the decryption errors that might come from Themis.
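
The check described in this side note relies on authenticated decryption failing under a wrong key. The Python standard library has no AES-GCM, so the sketch below swaps in a different but related technique: it stores the random blob together with an HMAC-SHA256 tag computed under app_encryption_key, and a tag mismatch plays the role of the Themis decryption error. `PasswordCheck` and both function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class PasswordCheck:
    """Hypothetical stand-in for the verification data kept in SFPassword."""
    blob: bytes  # 32 random bytes
    tag: bytes   # HMAC-SHA256 of blob under app_encryption_key


def create_check(app_encryption_key: bytes) -> PasswordCheck:
    """Called when the user creates or changes the passphrase."""
    blob = secrets.token_bytes(32)
    tag = hmac.new(app_encryption_key, blob, hashlib.sha256).digest()
    return PasswordCheck(blob=blob, tag=tag)


def passphrase_is_correct(check: PasswordCheck, candidate_key: bytes) -> bool:
    """Verify a key derived from the entered passphrase without touching notes."""
    expected = hmac.new(candidate_key, check.blob, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, check.tag)
```

As in the article's scheme, the content of the blob is never inspected: only success or failure of the verification matters.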

Security vs usability challenge: multiple cache levels

When users start working on their encrypted notes, we don’t want to bother them with “enter your passphrase” prompt every time, therefore we try to minimise distractions for a user.

| What | user_passphrase | encrypted user_passphrase | app_encryption_key | note_encryption_key |
| --- | --- | --- | --- | --- |
| Stored in | user’s mind or iOS Keychain, or password manager app | SecureEnclave protected by biometrics and in iCloudKeychain if enabled | short-term memory key cache while notes are unlocked | calculated before usage, zeroed after usage |
| Derived from | user’s mind or suggestion by iOS Keychain, or password manager | symmetric encryption with app-generated keys | key stretching from user_passphrase and app-generated keys | key stretching from app_encryption_key and app-generated keys |

Table of keys and caches (simplified version, the original one also includes generated keys)

Remember, Bear app doesn’t store the real plaintext user_passphrase because we know that users are quite happy to use the same “qwertyqwerty” password for many different services. So, first of all, the app uses built-in iOS Keychain functionality and suggests creating a strong passphrase and saving it in iOS Keychain or in password managing apps.

Next, after the user inputs their passphrase for the first time, the app derives app_encryption_key and caches it, together with its corresponding SFPassword object, in memory (a mutable array called the “encryption keys cache”). The cache lives for a short time: only while the notes are unlocked and the user session is valid. When a user wants to edit another encrypted note, the app takes app_encryption_key from the cache, derives note_encryption_key from it, and encrypts/decrypts the note.
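
The lifecycle of such a cache can be sketched as follows. This is an illustration only: the class, its method names, and the choice of a dictionary keyed by SFPassword ID are assumptions (the article mentions a mutable array), and Python gives no guarantee that copies of a key do not linger in memory.

```python
from typing import Dict, Optional


class EncryptionKeysCache:
    """In-memory cache: SFPassword ID -> app_encryption_key (hypothetical API)."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytearray] = {}

    def store(self, password_id: str, app_encryption_key: bytes) -> None:
        # Keep a mutable copy so it can be overwritten when notes are locked.
        self._keys[password_id] = bytearray(app_encryption_key)

    def get(self, password_id: str) -> Optional[bytes]:
        key = self._keys.get(password_id)
        return bytes(key) if key is not None else None

    def lock(self) -> None:
        """Zero every cached key, then drop the entries."""
        for key in self._keys.values():
            key[:] = bytes(len(key))
        self._keys.clear()
```

A cache miss (`get` returning `None`) is the point where the app would fall back to the next level: biometric authentication or a passphrase prompt.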

The next level of cache stores encrypted user_passphrase in key storage (SecureEnclave protected by biometrics and iCloudKeychain). After biometric authentication, the app generates a static key to decrypt user_passphrase. Next, it derives app_encryption_key and note_encryption_key and decrypts the note. If a user doesn’t allow the use of biometric data or doesn’t use iCloudKeychain, then the app doesn’t store encrypted user_passphrase and only relies on the “encryption keys cache”.

Cleaning up secret data

“Encryption keys cache” is cleaned up when the notes are locked or when the app is removed from memory (by a user or by iOS). As a developer, you can’t clean up secrets from SecureEnclave/Keychain when an app is uninstalled, but you can use a “trick” to remove the data when the app is re-installed: on every launch, check a flag stored in UserDefaults; if it is missing, this is the first launch after an install, so clean up any keys left in SecureEnclave/Keychain and then set the flag (as described in OWASP MSTG).


NSMutableData * encryptionKey = ...
[encryptionKey resetBytesInRange:NSMakeRange(0, [encryptionKey length])];

Zeroing secret data in the memory, not very helpful if you use strings
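
The re-install trick mentioned above works because UserDefaults is removed together with the app, while Keychain items survive an uninstall. A minimal sketch of the launch-time check, with plain dictionaries standing in for UserDefaults and Keychain (the flag name and function name are hypothetical):

```python
FIRST_RUN_FLAG = "has_run_before"  # hypothetical key name


def clean_keychain_after_reinstall(user_defaults: dict, keychain: dict) -> bool:
    """Wipe stale keychain items on the first launch after a (re)install.

    Returns True if a cleanup happened.
    """
    if user_defaults.get(FIRST_RUN_FLAG):
        return False  # Not the first launch: keep the stored secrets.
    keychain.clear()  # Items left over from a previous installation.
    user_defaults[FIRST_RUN_FLAG] = True
    return True
```

The function is meant to run once at every launch, before anything reads from the keychain.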

If a user has multiple devices and allows using iCloudKeychain, the encrypted user_passphrase is never cleared (if you know a way to clean up data from iCloudKeychain if the app was removed, please share this secret knowledge with us).

Auto-locking timer

Caching encryption keys in the app memory makes them potentially accessible to attackers. That’s why the app has a “lock” button to manually lock the notes and invalidate caches. After a certain amount of time, it locks them automatically (locking also happens after user quits or removes the app from memory).

Defining the correct time interval is crucial for the balance between security and usability: the more keys are stored in memory, the easier it is to locate them. At the same time, we shouldn’t distract users too often.


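#include <sys/sysctl.h>
#import "NSDate+Kernel.h"

@implementation NSDate (Kernel)

// Returns the time elapsed since the kernel booted, wrapped in an NSDate
// (uptime in seconds counted from the 1970 epoch), for use as a monotonic clock.
+ (NSDate *)currentKernelBootTime {
    struct timeval boottime;
    int mib[2] = {CTL_KERN, KERN_BOOTTIME};
    size_t size = sizeof(boottime);
    time_t now;
    time_t uptime = -1; // stays -1 if the boot time cannot be read
    (void)time(&now);
    if (sysctl(mib, 2, &boottime, &size, NULL, 0) != -1 && boottime.tv_sec != 0) {
        uptime = now - (boottime.tv_sec);
    }
    return [NSDate dateWithTimeIntervalSince1970:uptime];
}
@end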

Using monotonic clock. Using Swift? See this code. By the way, this Twitter discussion revealed that Telegram iOS app uses real date for locking, therefore users can backdate the system settings to unlock the app.

To calculate time when the notes should be locked, the app compares dates (current date >= locking date + locking interval), but the trick is to use monotonic clock instead of real time. The monotonic clock is a timer with constantly increasing values. It counts seconds after a device reboot so it is not affected by time zones and manual time change.
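
The comparison above can be sketched in Python with `time.monotonic`, which is not affected by changes to the system date. The class and its method names are hypothetical, and the clock is injectable so the logic can be tested without waiting.

```python
import time
from typing import Callable, Optional


class AutoLockTimer:
    """Decides when notes must be re-locked, using a monotonic clock."""

    def __init__(self, lock_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lock_interval = lock_interval
        self._clock = clock
        self._unlocked_at: Optional[float] = None  # None means "locked"

    def unlock(self) -> None:
        self._unlocked_at = self._clock()

    def lock(self) -> None:
        self._unlocked_at = None

    def should_lock(self) -> bool:
        if self._unlocked_at is None:
            return True
        # Same check as in the text: now >= unlock time + locking interval.
        return self._clock() >= self._unlocked_at + self._lock_interval
```

Whether `time.monotonic` keeps advancing while a device sleeps depends on the platform, so treat this as an illustration of the comparison rather than a drop-in replacement for the kernel-boot-time approach.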

Bear app users can control the locking interval, which introduces a risk that an attacker will change the interval to have more time for reversing. So, instead of saving auto-lock settings in UserDefaults, the app saves them in the local Keychain and protects by biometrics (repeat after us — this is called “defense in depth approach”, not “paranoia”).

Failed attempts’ counter

Each time a user enters the user_passphrase incorrectly, the app increases the value on the failed attempts’ counter. When failed attempts counter >= max attempts counter (currently set to “3”), app blocks the ability for user to input user_passphrase for T seconds (currently T == 5). This is a typical approach for prevention of manual passphrase brute forcing.

When user inputs wrong passwords too many times, Bear disables password input for 5 seconds.
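
A minimal sketch of such a counter, using the limits quoted above (3 attempts, 5 seconds). The class and method names are hypothetical, and resetting the counter after a block or after a successful entry is an assumption: the article does not say when the counter is reset.

```python
import time
from typing import Callable

MAX_ATTEMPTS = 3      # value quoted in the article
BLOCK_SECONDS = 5.0   # value quoted in the article


class FailedAttemptsCounter:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._failed = 0
        self._blocked_until = float("-inf")

    def input_allowed(self) -> bool:
        return self._clock() >= self._blocked_until

    def record_failure(self) -> None:
        self._failed += 1
        if self._failed >= MAX_ATTEMPTS:
            self._blocked_until = self._clock() + BLOCK_SECONDS
            self._failed = 0  # assumption: counting restarts after a block

    def record_success(self) -> None:
        self._failed = 0  # assumption: a correct passphrase resets the counter
```

Using a monotonic clock here matters for the same reason as in the auto-lock timer: backdating the system time should not shorten the block.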

Compatibility & incident response

Imagine that a vulnerability or bug is discovered in the encryption library or in the app: we’d need to update the application and migrate the users to a new cryptographic core really quickly. Sometimes errors sit unnoticed for months (e.g. as in the AFNetworking story), so it’s better to prepare in advance.


- (NSString *)decryptedStringForData:(NSData *)data
                           secretKey:(NSData *)secretKey
                             context:(NSData *)context
                   encryptionVersion:(NSNumber *)version
                               error:(NSError **)error {
    NSString *decryptedString = nil;
    switch (version.integerValue) {
        case 1: {
            TSCellSeal *cellSeal = [[TSCellSeal alloc] initWithKey:secretKey];
            NSData * decryptedStringData = [cellSeal unwrapData:data
                                                        context:context
                                                          error:error];
            
            if (decryptedStringData) {
                decryptedString = [[NSString alloc] initWithData:decryptedStringData
                                                        encoding:NSUTF8StringEncoding];
            }
            break;
        }
        default: {
            // Guard against a NULL error pointer before writing to it.
            if (error != NULL) {
                *error = [self errorWithDescription:SFWrongEncryptionVersionDescription
                                               code:SFNoteEncryptionErrorWrongEncryptionVersion];
            }
        }
    }
    
    if (!decryptedString)  {
        if (error != NULL && *error == nil) {
            *error = [self errorWithDescription:SFImpossibleDecryptWithNoErrorDescription
                                           code:SFNoteEncryptionErrorDecryptionFailure];
        }
    }
    return decryptedString;
}

Version-specific decryption

That’s why each SFPassword object has a reference to a particular encryption version, and the app checks the encryption version before trying to encrypt/decrypt the data. When Bear engineers decide to update the encryption scheme, they will introduce a new encryption version, write the code that implements the changes, and push the app update. The updated app can support both encryption versions and can silently re-encrypt the users’ notes with the latest encryption flow.

In case of an urgent update, Bear app could ask the users to update to the latest app version and run the data re-encryption procedure explicitly on launch.

Applying the "Defence in depth" approach, Bear app suggests that users set up app locking. Bear app protects users’ privacy by asking for Touch ID after the application launch, which is completely independent of notes encryption and locking.

Testing & testing & testing

How to find out if encryption is working well? Easy — create a bunch of unit tests with different kinds of data (including empty, malformed, and large), create tests with different user passwords. If you use a well-tested library, the library maintainers are likely to be supporting a large test suite already.

How to find out if the application handles encryption well? That’s a bit more complicated: simulate user behaviour, simulate errors when reading the password from Keychain, and simulate synchronization problems between devices. For example, we created integration tests to check multi-device behaviour: encrypt some data on the iOS app and save it into a file, then read the file from the macOS app and decrypt the data.

Don’t forget to test the app behaviour in corner cases: different platforms (32-bit and 64-bit environments), different iOS/macOS versions, different encryption versions (what happens if a Bear app with an older encryption version receives a note encrypted by a Bear app with a newer encryption version?), connectivity issues, and the password changing flow (a typical conflict: the user has two devices disconnected from the internet and changes the note content on the first device and the note password on the second one. What happens when both devices go online?).

Summary and outro

If you are lost in all the changes and details — go back to the "Highlights of the encryption flow" chapter. This cryptographic engine prioritises users’ comfort and also makes the app easy to maintain and support for developers.

We enjoy working with the Bear app team: they are dedicated to their project and care deeply about their users, their privacy, and data security. When building the encryption solution, we took their future plans into consideration, e.g. creating a Bear web app and sharing encrypted notes between users.

Modern libraries hide the complexity of ciphers quite well and make encryption accessible to developers. The actual encryption of notes in this scheme takes less than 10 lines of code (calling Themis encryption/decryption functions and passing 3 arguments to each).

The other ~3000 lines are spent on key management, locking of notes, multiple caches’ structure, accessing SecureEnclave/LocalKeychain/iCloudKeychain, biometric authentication, locking timers, failed attempts’ counter, proper synchronization, and error handling.

As cryptographers say — “Encryption is easy, key management is hard” — every specific use case requires a specific solution that takes into account data model, risks, potential attack vectors, platform features, future plans, etc.

Learn more

No matter how complex from the outside looking in, cryptography is just a chapter in OWASP MASVS (#3 out of 8 chapters total). Even the best cryptography can fail if the basic security controls are misimplemented ;)

We focus on data protection to help companies mitigate their business risks. Our software tools, security engineering services, and secure software development training aim at enabling strong security for companies without a dedicated security team and with limited infrastructural ops resources.

With or without a goal to achieve some specific security/privacy compliance, we aim to minimize your security expenses and lift most of the application and data security burden off your team and make your users confident that their data is safe with you.

P.S. We can’t disclose all the details of the solution we’ve implemented for Bear app, but if you have questions or believe that you’ve found a security issue, please write to us or contact the Bear team.