This article was revisited and updated in August 2018.
In modern client-server applications, most of the sensitive data is stored (and consequently leaked) on the backend. At Cossack Labs, we’re working on novel techniques to protect the data within modern infrastructures. We talk to engineers across industries about these techniques quite a lot too. However, it is still not uncommon to see infrastructures without even the basic classic database defence patterns.
In the next few posts, we’ll go through the classic and modern ideas of defensive database design.
Why do you need backend security?
Web services and mobile applications provide convenient front-end mechanisms to access and manipulate the data stored in backend systems. Among that data are sensitive data assets (such as customers' personal information, identity, or access credentials), which typically constitute the greatest value for potential attackers.
So, unsurprisingly, front-end applications become a target for adversaries seeking a way to gain access to backend systems and the databases they contain.
It doesn't matter whether you're using a fancy NoSQL database with a NodeJS front-end, an old-school LAMP stack, or a corporate Oracle database with Java: this article talks about design patterns and security decisions.
Most modern client-server applications (web, mobile, or any user-facing apps) can be reduced to a similar architecture, where the front-end app could be an API server for a mobile app or Perl code rendering a web page.
Possible vectors of attack
An attacker can access the data in four common ways:
1. Altering the front-end behaviour to extract the data via the front-end application itself. This can be done via:
- SQL injections (in combination with the third attack vector, see below);
- Gaining control of the app's execution flow;
- Enumerating records in requests.
2. Sniffing traffic between the database and the front-end app to:
- Collect the data requested by legitimate users;
- Steal the database credentials and then access the database while pretending to be a legitimate application.
This is typically done by getting into the internal network infrastructure and / or "rooting" one of the two hosts, then silently listening in on the traffic if it's not properly encrypted.
3. Altering the behaviour of the database to bypass the access control in some way:
- Pretending to be a legitimate application / user;
- Forcing the system to change access privileges;
- Sending malformed requests from the app (in combination with the first described method).
4. Stealing the assets from the files by accessing the database host directly and getting the database files at rest, then extracting the meaningful data from them.
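To make the first vector concrete, here is a minimal sketch of a SQL injection against a front-end query, using Python's built-in `sqlite3` module. The `customers` table, its columns, and both function names are hypothetical and exist only for this illustration. The sketch contrasts a query built by string concatenation with a parameterised one.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, owner TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (owner, email) VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_emails_vulnerable(conn, owner):
    # User input is pasted into the SQL text, so it can change the query itself.
    query = "SELECT email FROM customers WHERE owner = '" + owner + "'"
    return [row[0] for row in conn.execute(query)]


def find_emails_parameterised(conn, owner):
    # User input is passed as a bound parameter and is only ever treated as data.
    query = "SELECT email FROM customers WHERE owner = ?"
    return [row[0] for row in conn.execute(query, (owner,))]


conn = make_db()
payload = "nobody' OR '1'='1"  # attacker-controlled "owner" value

leaked = find_emails_vulnerable(conn, payload)   # every row matches the OR clause
safe = find_emails_parameterised(conn, payload)  # no owner has this literal name
```

In the concatenated version the payload rewrites the `WHERE` clause and the query returns every customer's e-mail. In the parameterised version the same payload is compared as a plain string and matches nothing.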
Classic tools for backend security
You can address the described threats with four typical types of defence:
- Use a firewall to restrict access to the database server.
- Use authentication to restrict access to data and compartmentalise databases within the DBMS to minimise the risk of lost credentials impacting every database.
- Encrypt the critical columns / rows with a unified symmetric key.
- Encrypt the whole partition containing the database files with a unique symmetric key.
Each of these defence approaches is strong against some of the attacks, and each has some problems:
Firewall:
Pros: A firewall helps to limit the proliferation of access within a network: only the trusted addresses gain access to the database ports.
Cons: However, the path to system compromise often goes through some front-facing code with legitimate rights to access the database. If the attackers can alter the behaviour of a legitimate app host (either by forcing it to execute something malicious or by gaining shell access with sufficient privileges), they gain access to the database anyway.
Login / password authentication:
Pros: Authentication helps to protect against unauthorised access from parties without proper credentials. Authentication also allows enforcing certain access granularity: ensuring that only specific users can access a particular database.
Cons: Frequently, the credentials for accessing the database are stored somewhere in the web app’s / middleware configuration files, so they can become a target for an attacker.
Selective row / column encryption:
Pros: The data stays protected even if the database contents leak, as long as the encryption keys do not leak with them.
Cons: The encryption keys must be stored either in the backend or in the front-end, so the keys become a target. If an attacker gains access to the front-end host and the keys, mounting an attack from there is straightforward.
Partition encryption:
Pros: While the storage device is unmounted, there is no way to read the data. This covers, for example, stolen drives / servers and unauthorised access to the database server that involves a system restart.
Cons: Supplying credentials when mounting the device can add maintenance / system administration complexity. Clearly, this does not provide any additional defence against attacks on the device once it is mounted.
All these techniques provide relevant defence methods against particular types of attacks. But, as we’ve seen, they also open up the risk of different types of attack.
Let's consider the types of security instrument that might address these risks.
If we accept that rows / cells / records containing sensitive (or any other kind of) data should be encrypted, the challenge turns into how to generate and securely manage the associated encryption keys. There are classic ways of doing that, including with HSMs, dedicated trust nodes with keys, and also some novel techniques that we will cover in detail in the next article.
Looking again at the attack vectors for classic defence strategies, we see:
Alters the app's behavior to extract the data from the database.
Database compartmentation: isolates the scope of the visible data down to the minimal amount required by functionality; uses authentication to minimise the leakage scope.
Alters the app's behavior to run the code.
Keeps keys away from the app's code.
Front-end app host
Steals DB credentials and executes code on the consumer's app host to access the main database.
Database compartmentation; Database authentication with fine-grained access rights.
Physical access to the filesystem at rest.
Physical access to DB files.
Encrypts cells w/ an external key stored elsewhere.
Encrypts cells w/ external key, accessible via compartmented functionality with strong input sanitisation.
Unauthorised access to the database daemon.
Most of the tools used are parts of the core database, application, and the OS infrastructure. By carefully creating the defence systems from these tools you can eliminate some of the most common risks.
But what can go wrong?
As we’ve already noted, deploying these defences prevents many attacks, but they are hardly an obstacle for a sophisticated attacker with a bit of luck. For example, against each of the defences below, an attacker:
Cell encryption w/ key on DB host
- compromises DB host.
- seizes DB files + key store.
Cell encryption w/ key supplied in SQL
- compromises DB host.
- seizes the key from the network traffic (fake proxy listener) and DB files from disk.
Cell encryption w/ key supplied in SQL
- compromises app host.
- seizes the key from the network traffic or config.
- downloads the encrypted data from DB and decrypts OR utilises the legitimate decryption code on the app's side.
Partition encryption
- just ignores the physical server.
Login / password authentication
- compromises the app's host or alters the app behavior to seize the authentication data.
- uses it from the app's host, bypasses any firewalls on the way.
Firewall
- just attacks the database through hosts with legitimate rights to access the database.
Advanced backend security methods
While this repertoire of classic defence techniques leaves open a range of theoretical attacks, it nonetheless limits the opportunities for an attacker.
This is even more true if we harden our defences through:
Compartmentalising the data via database isolation and fine-grained rights
First and foremost, we need data isolation: limiting the table / database access via fine-grained distribution of rights on databases and table spaces. But sometimes it’s not the answer: database-level isolation hurts auto-sharding, DB management automation, and other modern scalability demands.
Here are more tips on how to fine-tune access rights:
- Limit the privileged database access (DBA roles) to addresses unrelated to the production servers and use a separate authentication mechanism (port-knocking is a good choice for some).
- Control per-application access to open outbound connections to certain addresses.
- Add additional verification on the DB driver / pooler.
Adding IDS, HIDS, and monitoring
Monitor filesystem changes and suspiciously fast-growing files, and analyse logs for activity that looks like producing a DB dump. Monitor outgoing traffic for obvious signs of DB dumps.
This helps to detect an ongoing leak if an attacker is lazy enough to create a dump with commodity tools and tries to download it as-is.
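As a rough illustration of the "fast-growing files" check, the sketch below compares two size snapshots of a directory and reports files that appeared or grew by more than a threshold. The function names, the threshold, and the file names are assumptions made for this example. A real deployment would rely on a proper HIDS rather than a script like this.

```python
import os
import tempfile


def snapshot_sizes(directory):
    """Return {path: size_in_bytes} for every regular file under directory."""
    sizes = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes[path] = os.path.getsize(path)
            except OSError:
                continue  # the file vanished between listing and stat
    return sizes


def fast_growing(before, after, threshold_bytes):
    """List (path, growth) for files that are new or grew by >= threshold_bytes."""
    suspects = []
    for path, size in after.items():
        growth = size - before.get(path, 0)  # a new file counts from zero
        if growth >= threshold_bytes:
            suspects.append((path, growth))
    return sorted(suspects, key=lambda item: item[1], reverse=True)


# Demo in a temporary directory: a log grows slightly, a "dump" appears.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    with open(log_path, "wb") as f:
        f.write(b"x" * 100)
    before = snapshot_sizes(tmp)

    with open(os.path.join(tmp, "dump.sql"), "wb") as f:
        f.write(b"x" * 50_000)
    with open(log_path, "ab") as f:
        f.write(b"x" * 10)
    after = snapshot_sizes(tmp)

    suspects = fast_growing(before, after, threshold_bytes=10_000)
    suspect_names = [os.path.basename(path) for path, _growth in suspects]
```

Only the newly created dump file crosses the threshold here. The slowly growing log is ignored, which is the kind of signal-to-noise trade-off the threshold is meant to control.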
Encrypting the data
But, above all, data encryption is still possibly the most important instrument for preventing database leaks. If everything else is broken and compromised, yet the keys are kept safe, the stolen data will be of no use to attackers.
There are a number of tools that enable database file protection, either on the file or on the filesystem level. They prevent attackers from getting the data from outside the DBMS.
Each approach certainly has its own drawbacks, like performance penalties or maintenance inefficiency, but they help.
There’s also plenty of instrumentation within the corporate database sector, with classic trust isolation in some key management node or HSM. We’ll talk about some other interesting tools in the next article, too.
In-database record encryption
Every database has its own means of encrypting sensitive data, either by including a statement in SQL or by pointing at sensitive fields somewhere in the configuration. Typical examples include:
App-level records encryption
We believe that in most cases database encryption should occur on the application level, encrypting the data before sending it to the database and decrypting after receiving it.
There are some reasons for that:
- Minimisation of attack surface: minus one place with unencrypted data;
- Separation of keys and data: storing keys together with the data gives attackers less work to do, while app-level encryption keeps the keys off the database host;
- Frequently, deriving trust from user-known secrets is a good pattern: the password a user enters on your website is a good secret to protect that user’s data with, and without the user’s password your backend / front-end won’t be able to decrypt it at all;
- App-level encryption gives you more flexibility in the choice of cipher and the ability to pick strong and efficient ciphers / cipher modes.
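The "user-known secret" pattern from the list above can be sketched with the standard library's PBKDF2 implementation. The iteration count, salt length, and function name are illustrative assumptions, not recommendations from this article. The derived key would then be handed to whatever encryption library the application uses.

```python
import hashlib
import hmac
import os

# Assumption: an illustrative work factor; tune it to your hardware and
# current guidance before using anything like this.
PBKDF2_ITERATIONS = 200_000


def derive_user_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte data-encryption key from a user's password.

    The salt is not secret and is stored next to the user's record;
    the password (and therefore the key) is never stored.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=32
    )


salt = os.urandom(16)  # generated once per user, at sign-up

key_at_signup = derive_user_key("correct horse battery staple", salt)
key_at_login = derive_user_key("correct horse battery staple", salt)
key_from_guess = derive_user_key("wrong guess", salt)

same_key = hmac.compare_digest(key_at_signup, key_at_login)
```

The same password and salt always yield the same key, so the data can be decrypted when the user logs in. Someone holding only the database (salt and ciphertext) has to guess the password to reproduce the key.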
One of the simplest ways to complicate the decryption of stolen records is to bind their secret material to the context, i.e. to data chunks that are easily derivable from the environment but hard to reproduce if the data is stolen.
For example, row numbers, automatically assigned keys, or other kinds of data used in the encryption context could be recovered from a database dump, but figuring them out would take much more time and effort, and it would seriously complicate attacks that alter the app's behaviour.
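One possible way to bind secret material to the context is to derive a separate key for every record from a master key and the record's location. The sketch below does this with HMAC-SHA256 from the standard library. The `record_key` helper, the table / column names, and the fixed demo master key are all hypothetical. This is a generic construction for illustration, not the mechanism of any particular library.

```python
import hashlib
import hmac


def record_key(master_key: bytes, table: str, column: str, row_id: int) -> bytes:
    """Derive a per-record key bound to where the record lives.

    Assumption: table and column names never contain a NUL byte, so the
    NUL-separated context string is unambiguous.
    """
    context = f"{table}\x00{column}\x00{row_id}".encode("utf-8")
    return hmac.new(master_key, context, hashlib.sha256).digest()


# Demo only: a fixed master key so that the example is reproducible.
# A real master key would come from a key store, not from the source code.
master_key = bytes(range(32))

key_row_1 = record_key(master_key, "customers", "email", 1)
key_row_2 = record_key(master_key, "customers", "email", 2)
key_row_1_again = record_key(master_key, "customers", "email", 1)
```

A ciphertext copied from row 1 into row 2 no longer decrypts, because the application derives a different key for row 2. An attacker holding a dump needs both the master key and the exact context of each record.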
Split auth token scheme
Sometimes, encrypting the data doesn’t fit into the existing database schema. Quite frequently, the field length is the biggest problem: either it is predetermined and there is a lot of code to fix, or the maximum field length gets in the way and there is a lot of database layout to change.
The authentication tag (control information that ensures the record was not tampered with) adds extra length to the encrypted string, which creates this problem. It might make sense to store the tag separately from the ciphertext, and the cryptographic design must allow that.
Not incidentally, Themis' Secure Cell allows implementing these techniques easily.
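The layout idea behind the split scheme can be sketched with the standard library alone. The code below is not the Themis API and not production cryptography: it uses a SHAKE-256 keystream plus a truncated HMAC as a stand-in for a real authenticated cipher, and all function names are hypothetical. What it demonstrates is that the ciphertext keeps the plaintext's length, so it fits the original field, while the nonce and authentication tag travel as a separate token that can live in another column or table.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 16


def _subkey(key: bytes, label: bytes) -> bytes:
    # Separate keys for encryption and authentication, derived from one key.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Stand-in stream cipher (assumption: illustration only, not for production).
    return hashlib.shake_256(_subkey(key, b"enc") + nonce).digest(length)


def _tag(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    mac = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256)
    return mac.digest()[:TAG_LEN]


def encrypt_detached(key: bytes, plaintext: bytes):
    """Return (ciphertext, token); the ciphertext is as long as the plaintext."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return ciphertext, nonce + _tag(key, nonce, ciphertext)


def decrypt_detached(key: bytes, ciphertext: bytes, token: bytes) -> bytes:
    """Verify the detached token first, then decrypt."""
    nonce, tag = token[:NONCE_LEN], token[NONCE_LEN:]
    if not hmac.compare_digest(tag, _tag(key, nonce, ciphertext)):
        raise ValueError("record was tampered with or the wrong key was used")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


key = os.urandom(32)
plaintext = b"4111 1111 1111 1111"
ciphertext, token = encrypt_detached(key, plaintext)  # store in separate fields
recovered = decrypt_detached(key, ciphertext, token)
```

The existing field only has to hold the same number of bytes as before. The extra `NONCE_LEN + TAG_LEN` bytes go wherever the schema has room, and decryption refuses to proceed if either part has been modified.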
The goal of security is to make an attacker give up on the attack, not to 'achieve theoretical security in all possible cases' (https://twitter.com/mubix/status/745403991475904513).
Backend protection is very important, and it is crucial for your sensitive data. Through the use of classic techniques, we can prevent the most typical risks and start building a solid security foundation for your product.
To implement the security measures right, you need to:
- Understand the risks and map them to your architecture;
- Understand which control mechanisms you've got and how reliable they are;
- Configure them to provide the best security guarantees for your data;
- Implement additional mechanisms if the ones out of the box are insufficient for you to sleep well.
Encryption and authentication both rely on secrets: passwords, keys, access tokens. If the system is implemented properly, it is only as good as its key protection scheme. In the next article in the series, we'll talk in depth about various strategies for managing your secrets and using them.
Today, more and more database protection tools are emerging, based both on enhanced math and on insights into applied cryptography. Further along in this series, we will talk about modern approaches and a few technologies we’re developing at Cossack Labs.