Classic backend security design patterns

In modern client-server applications, most of the sensitive data is stored on the backend and, consequently, leaked from it. At Cossack Labs, we’re working on novel techniques to help protect data within modern infrastructures, and we talk to engineers across industries about these techniques quite a lot. However, it is still not uncommon to see infrastructures that lack even basic classic database defense patterns.

In the next few posts, we’ll go through classic and modern ideas of defensive database design. 

Why do we protect data on backend systems?

Web services and mobile applications provide convenient front-end mechanisms to access and manipulate data stored in backend systems. Among that data are sensitive data assets (such as customers’ personal information, identity or access credentials), which typically represent the greatest value for potential attackers. So, unsurprisingly, front-end applications become a target for adversaries seeking a way to gain access to back-end systems and the databases held there.

It doesn't matter whether you're using a fancy NoSQL database with a NodeJS front-end, an old-school LAMP stack, or a corporate Oracle setup with Java: this article talks about design patterns and security decisions. Most modern client-server applications (web, mobile, or any other user-facing apps) can be described by a similar architecture, where the front-end app could be an API server for a mobile app or Perl code rendering a web page.

Possible vectors of attack

For an attacker, there are 4 general ways to gain access to data:

1. Alter front-end behavior in a way that allows data to be extracted via the front-end application itself. This can be done via:

  • SQL injections (in combination with the third attack vector, see below);
  • getting control of the app execution flow;
  • enumerating records in requests.

2. Sniff traffic between database and front-end app to:

  • collect data requested by legitimate users;
  • steal credentials to access the database pretending to be a legitimate application.

This is typically done by getting into the internal network infrastructure and/or rooting one of the two hosts, and then silently listening to the traffic if it’s not encrypted properly.

3. Alter database behavior to bypass access control in any way:

  • pretending to be legitimate application/user
  • forcing the system to change access privileges
  • sending malformed requests from the app (in combination with method 1 above).

4. Steal assets from files by accessing the database host directly, getting the database files at rest, and extracting meaningful data out of them.
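To make the first vector more concrete, here is a minimal sketch of how an SQL injection alters front-end behavior, and how a parameterized query avoids it. It uses Python's built-in sqlite3 module purely for illustration; the `users` table, its columns, and the function names are hypothetical and not taken from any real application.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical user records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is pasted into the SQL text itself,
    # so a crafted value can change the meaning of the query.
    query = "SELECT name, email FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safer: the value is passed separately from the SQL text,
    # so it is always treated as data, never as SQL.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()
```

Calling `find_user_unsafe(conn, "x' OR '1'='1")` returns every row in the table, while `find_user_safe` with the same input returns nothing, because no user has that literal name.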

Classic tools to mitigate the risk

To protect against these threats, there are four typical defenses:

  • Use a firewall to restrict access to the database server.
  • Use authentication to restrict access to data, and compartmentalize databases within the DBMS to minimize the risk of lost credentials impacting every database.
  • Encrypt critical columns/rows with a unified symmetric key.
  • Encrypt the whole partition containing the database files with a unique symmetric key.

Each of the approaches is strong against some attacks, but has a number of problems:

Firewall:

  • Pros: A firewall helps to limit the proliferation of access within a network: it makes sure only trusted addresses gain access to the database ports. 
  • Cons: However, the path to compromising the system frequently runs through front-facing code that has legitimate rights to access the database. If attackers are able to alter the behavior of a legitimate app host (either by forcing it to execute something malicious or by gaining shell access with sufficient privileges), they gain access to the database anyway.

Login/password authentication: 

  • Pros: Authentication helps to protect against unauthorized access from parties that don’t have proper credentials. Authentication also allows enforcing a certain access granularity: ensuring that only specific users can access a particular database.
  • Cons: Frequently, the credentials to access the database are stored somewhere in the configuration files of the web app or middleware, so they themselves become a target for an attacker.

Selective row/column encryption:

  • Pros: Data is protected, but the encryption keys must be held either on the backend or on the front-end.
  • Cons: The keys become a target. If an attacker gains access to the front-end host and the keys, mounting an attack from there is unproblematic.

Partition encryption: 

  • Pros: If the storage devices are unmounted, there is no way to read the data. This covers, for example, stolen drives or servers and unauthorized access to the database server that involves a system restart.
  • Cons: Supplying credentials when mounting the device can add maintenance/system administration complexity, and clearly this does not provide any additional defense against attacks on the device once it is mounted.

These techniques all provide useful defenses against particular types of attacks. But, as we’ve seen, they open up the risk of different types of attack. Let's consider the types of security instrument that might address these risks.

If we accept that rows/cells/records containing sensitive (or any) data should be encrypted, then the challenge becomes how the associated encryption keys are generated and managed securely. There are classic ways to do that, including HSMs and dedicated trust nodes holding the keys, and there are some novel techniques; we will talk about them in greater depth in the next article.

Looking again at the attack vectors and the classic defense strategies, we see:

| Infrastructure component | Attack | Classic defences |
|---|---|---|
| Front-end application | Alter app behavior to extract the data from the database. | Database compartmentation: isolate scope of visible data to minimal functionally required amount; use authentication to minimize leakage scope. |
| Front-end application | Alter app behavior to run the code. | Keep keys away from app code. |
| Front-end app host | Steal DB credentials and execute code on consumer app host to access the main database. | Database compartmentation; Database authentication with fine-grained access rights. |
| Database host | Physical access to the filesystem at rest. | Encrypt partition. |
| Database host | Physical access to DB files. | Encrypt cells w/ external key, stored somewhere else. |
| Database software | SQL injection. | Encrypt cells w/ external key, accessible via compartmented functionality with strong input sanitisation. |
| Target network | Unauthorized access to database daemon. | Firewalls; Passwords. |

Most of the tools used are part of core database, application and OS infrastructure. By carefully composing the defense systems from these tools, you can eliminate some of the most common risks. 

But what can go wrong?

As we’ve already noted, deploying these defenses prevents many attacks, yet they are barely a problem for a sophisticated attacker with a few drops of luck. For example:

| Available defence | Attack trajectory |
|---|---|
| Cell encryption w/ key on DB host | 1. compromise DB host; 2. seize DB files + key store |
| Cell encryption w/ key supplied in SQL | 1. compromise DB host; 2. seize key from network traffic (fake proxy listener) and DB files from disk |
| Cell encryption w/ key supplied in SQL | 1. compromise app host; 2. seize key from network traffic or config; 3. download encrypted data from DB and decrypt OR utilize the legitimate decryption code on the app side |
| Partition encryption | Just ignore the physical server |
| Login/password authentication | 1. compromise app host or alter app behavior to seize the authentication data; 2. use it from app host, bypass any firewalls on the way |
| Firewall | Just attack the database from hosts which have legitimate rights to access the database |

Classic methods: hardcore edition

While this repertoire of classic defense techniques leaves open a range of theoretical attacks, it nonetheless limits the opportunities for an attacker. This is even more so if we harden our defenses by:

Compartmentalizing the data via database isolation and fine-grained rights

First and foremost, we need data isolation: limiting table/database access via fine-grained rights on databases and tablespaces. Sometimes it’s not the answer: database-level isolation hurts auto-sharding, DB management automation, and other modern scalability demands.
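In a real DBMS, this kind of isolation is configured with separate accounts and per-database or per-table grants. As a rough, runnable analogy only, the sketch below uses SQLite's authorizer hook (via Python's built-in sqlite3 module) to refuse reads from a table the application has no business touching. SQLite has no user accounts, so the callback merely stands in for fine-grained rights; the table names `orders` and `payment_cards` are hypothetical.

```python
import sqlite3

# Hypothetical layout: the app needs 'orders' but never 'payment_cards'.
FORBIDDEN_TABLES = {"payment_cards"}


def deny_sensitive_reads(action, arg1, arg2, db_name, source):
    """Authorizer callback: refuse any column read from a forbidden table."""
    # For SQLITE_READ, arg1 is the table name and arg2 the column name.
    if action == sqlite3.SQLITE_READ and arg1 in FORBIDDEN_TABLES:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


def make_app_connection() -> sqlite3.Connection:
    """Build a demo database, then restrict the connection like an app account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    conn.execute("CREATE TABLE payment_cards (id INTEGER PRIMARY KEY, pan TEXT)")
    conn.execute("INSERT INTO orders (item) VALUES ('book')")
    conn.execute("INSERT INTO payment_cards (pan) VALUES ('4111111111111111')")
    # From here on, this connection plays the role of the restricted app account.
    conn.set_authorizer(deny_sensitive_reads)
    return conn
```

With this connection, `SELECT item FROM orders` works as usual, while `SELECT pan FROM payment_cards` raises `sqlite3.DatabaseError`. Even if the app is tricked into issuing such a query, the scope of visible data stays limited to what the app functionally requires.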

Fine-tuning access rights: more tips

  • Limiting privileged database access (DBA roles) to addresses unrelated to production servers and requiring a separate authentication mechanism (port-knocking is a good choice for some).
  • Controlling, per application, which addresses it may open outbound connections to.
  • Adding an extra verification step in the DB driver/pooler.

Adding IDS, HIDS and monitoring

Monitor filesystem changes and suspiciously fast-growing files, and analyze logs for activity that may look like producing a DB dump. Monitor outgoing traffic for obvious signs of DB dumps. This helps to detect an ongoing leak if the attacker is lazy enough to create a dump with commodity tools and try to download it as-is.
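As a toy illustration of the "fast-growing files" check, the following sketch compares two snapshots of file sizes under a directory and reports files that are new or grew by more than a threshold. It is not a replacement for a real HIDS; the function names and the threshold are assumptions made for this example.

```python
import os


def snapshot_sizes(directory: str) -> dict:
    """Map every regular file under `directory` to its current size in bytes."""
    sizes = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes[path] = os.path.getsize(path)
            except OSError:
                continue  # the file vanished between listing and stat
    return sizes


def fast_growing(before: dict, after: dict, threshold_bytes: int) -> list:
    """Return paths that are new or grew by at least `threshold_bytes`."""
    return sorted(
        path
        for path, size in after.items()
        if size - before.get(path, 0) >= threshold_bytes
    )
```

In practice, the two snapshots would be taken a fixed interval apart (for example, by a scheduled job), and any path returned by `fast_growing` would be reported to whatever alerting system is already in place.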

Encrypting the data

But, above all, data encryption is still possibly the most important instrument for preventing database leaks. If everything else is broken and compromised, yet the keys are kept safe, the stolen data is of no use to attackers.

Storage encryption

There are a number of tools that enable database file protection, either at the file or at the filesystem level. They prevent attackers from getting the data from outside the DBMS. Each approach has its own drawbacks, such as performance penalties or maintenance inefficiency, but they help.

There’s plenty of tooling within the corporate database sector that relies on classical trust isolation in a key management node or HSM, and there are interesting novel tools we’ll talk about in the next article.

In-database record encryption

Every database has its own means of encrypting sensitive data, either by including a statement in SQL or by pointing at sensitive fields somewhere in the configuration. Typical examples include:

App-level records encryption

We believe that in most cases database encryption should occur at the application level: data is encrypted before it is sent to the database and decrypted after it is received. There are several reasons for that:

  • It minimizes the attack surface: minus one place with unencrypted data.
  • Storing keys together with data gives attackers less work to do.
  • Frequently, deriving trust from user-known secrets is a good pattern: the password a user enters on your website is a good secret for protecting that user’s data, because without the user’s password your backend/front-end won’t be able to decrypt it at all.
  • App-level encryption gives you more flexibility in cipher choice and the ability to pick strong and efficient ciphers/cipher modes.
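The "user-known secret" pattern from the list above can be sketched with the Python standard library: a key is derived from the user's password and a per-user random salt, and only the salt is stored. The iteration count and function names below are assumptions for this example, and the derived key would still have to be fed into an authenticated cipher from a cryptographic library, which is not shown here.

```python
import hashlib
import os

# Assumed work factor for this sketch; tune it for your own hardware.
PBKDF2_ITERATIONS = 600_000


def new_salt() -> bytes:
    """Generate a random per-user salt; it is stored next to the user record."""
    return os.urandom(16)


def derive_user_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte data-encryption key from the user's password."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt,
        PBKDF2_ITERATIONS,
        dklen=32,
    )
```

Because the key is recomputed from the password at login and never stored, a stolen database dump plus the stored salt is not enough to decrypt that user's records.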

Apart from “just encrypting” the data, there are some important techniques and considerations:

Context-aware encryption

One of the easiest ways to complicate decryption of stolen records is to bind their secret material to context: chunks of data that are easily derivable from the environment but hard to reproduce once the data is stolen. For example, row numbers, automatically assigned keys, or other kinds of data used as encryption context could theoretically be recovered from a database dump, but doing so would take much more time and effort, and would in some cases seriously complicate attacks that rely on altering app behavior.
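One simple way to picture context binding is per-record key derivation: instead of encrypting every record with the same key, the application derives a record-specific key from a master key and the record's context. The sketch below uses HMAC-SHA256 from the Python standard library as the derivation function; the choice of table name and row id as context, and the function name, are assumptions for illustration rather than a description of any particular library.

```python
import hashlib
import hmac


def derive_record_key(master_key: bytes, table: str, row_id: int) -> bytes:
    """Derive a 32-byte per-record key from a master key and record context."""
    # A NUL separator keeps ("ab", 1) and ("a", ...) style contexts distinct,
    # since table names cannot contain a NUL byte.
    context = table.encode("utf-8") + b"\x00" + str(row_id).encode("ascii")
    return hmac.new(master_key, context, hashlib.sha256).digest()
```

A record copied to a different row, or decrypted without knowledge of its original context, yields a different key and therefore fails to decrypt, which is the extra effort for the attacker that this technique aims for.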

Split auth token scheme

Sometimes, encrypting data doesn’t fit into the existing database schema. Quite frequently, field length is the biggest problem: either the length is predetermined and there is a lot of code to fix, or the data already fills the maximum field length and there is a lot of database layout to change.

The authentication tag (control information that ensures the record was not tampered with) adds extra length to the encrypted string, thus creating a problem. It might make sense to store the tags separately, and the cryptographic design must enable that.
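The idea of keeping the tag apart from the ciphertext can be sketched as follows. The example assumes that the field has already been encrypted elsewhere with a length-preserving cipher mode, so the ciphertext fits the original column; an HMAC-SHA256 tag over the record id and the ciphertext is then computed and stored in a separate column or table. The function names and the record id layout are hypothetical.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # HMAC-SHA256 output size in bytes


def make_tag(mac_key: bytes, record_id: int, ciphertext: bytes) -> bytes:
    """Compute an authentication tag to be stored apart from the ciphertext."""
    # Including the record id ties the tag to one specific record.
    message = record_id.to_bytes(8, "big") + ciphertext
    return hmac.new(mac_key, message, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, record_id: int, ciphertext: bytes, tag: bytes) -> bool:
    """Check the stored tag in constant time before decrypting the field."""
    expected = make_tag(mac_key, record_id, ciphertext)
    return hmac.compare_digest(expected, tag)
```

The application verifies the tag first and only decrypts the field if verification succeeds, so a tampered ciphertext or a tag moved to another record is rejected.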

Not incidentally, Themis Secure Cell allows implementing these techniques easily.

It is useful to remember that the goal of security is for the attacker to give up the attack, not to 'achieve theoretical security in all cases' (https://twitter.com/mubix/status/745403991475904513).

Summary

Backend protection is very important; even more so, it is crucial for your sensitive data. Using classic techniques, we can prevent the most typical risks and start building a really solid security foundation in your product. To implement security measures well, you need to:

  1. Understand the risks and map them to your architecture
  2. Understand which control mechanisms you've got and how reliable they are
  3. Configure them to provide best security guarantees for your data
  4. Implement additional mechanisms, if what's available out of the box is insufficient for you to sleep well.

By the way, we can help you understand which security measures you need and how best to implement them: just drop us a message.

Coming next:

Managing secrets

Encryption and authentication both rely on secrets: passwords, keys, access tokens. If the system is implemented properly, it is only as good as its key protection scheme. In the next article in the series, we'll talk in depth about various strategies for managing your secrets and using them.

Modern techniques

Today, more and more database protection tools are emerging, based both on enhanced math and on insights into applied cryptography. Later in this series, we will talk about modern approaches and a few technologies we’re developing at Cossack Labs.

Copyright © 2014-2017 Cossack Labs Limited
Cossack Labs is a privately-held British company with a team of data security experts based in Kyiv, Ukraine.