Introduction to automated security testing | Cossack Labs

Introduction to automated security testing

Dangerous security bugs can sit in code until someone finds them and turns them into vulnerabilities that cost peace of mind, budget, or lives. To avoid disaster, security engineers and DevSecOps engineers do their best to find and prevent weaknesses in software at the earliest stages of development.

Separate security testing tools and processes ensure that new commits and builds don’t introduce new security problems or bring back old ones as security regressions. It’s an essential practice for software with high security requirements.

Over the years, we have developed an arsenal of approaches and tools to find and prevent defects early in the secure software development process. In this article, we give an overview of the approaches and tooling worth looking at.

If you are interested in learning how we test our software, Themis and Acra, read the Themis security testing doc and the Acra security testing doc.

  1. Getting into automated security testing
  2. Testing security controls using DAST
  3. SAST/SCA: finding bugs in code
  4. Dependency scanning
  5. Using fuzzing to find vulnerabilities
  6. Performance testing for security purposes
  7. Incident recovery testing
  8. Challenges of automated security testing
  9. How we test
  10. Summing things up

Getting into automated security testing #

Every component of software contributes to its security or the lack thereof.

Most known security issues arise from security bugs that go undetected because they do not affect software functionality and are not visible to the users. Everything seems to be working as intended, and buggy software gets shipped and used.

A good place to start with automated security testing is to find out what security controls are already present in your product (application, website, etc.). The next step is to understand and test how these controls are affected by execution flow and input data, whether they can be bypassed or fail unexpectedly.

Security tests aim to find gaps and check for the absence of certain behaviours and weaknesses known to lead to security risks.

The process of security testing can be divided into 4 categories:

  • Functional security tests verify that the security controls of your software work as expected (e.g. that a JWT verification function returns an error if the token has expired).
  • Non-functional security tests look for known weaknesses and faulty component configurations (e.g. usage of weak cryptography, source code analysis for memory leaks and undefined behaviour, etc.).
  • Holistic security scanning looks for vulnerabilities across the entire attack surface, when an app or infrastructure is tested as a whole by offensive security testers and their tools.
  • Manual testing, architecture and code review—sophisticated work that still cannot be quite algorithmised and delegated to machines as it requires human attention.
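To make the first category concrete, here is a minimal sketch of a functional security test for the JWT expiry case mentioned above. It uses only the Python standard library. The helpers `issue_token`, `verify_token` and `is_rejected` are hypothetical names written for this illustration, not the API of any real JWT library, and the verifier is deliberately simplified (HS256 only, fixed clock passed in as `now`).

```python
import base64
import hashlib
import hmac
import json


class TokenError(Exception):
    """Raised when a token fails verification."""


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the compact token format strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, key: bytes) -> str:
    """Hypothetical helper: build an HS256-signed token for the tests."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(claims).encode())
    signature = hmac.new(key, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(signature)}"


def verify_token(token: str, key: bytes, now: float) -> dict:
    """Return the claims, or raise TokenError on a bad signature or expiry."""
    try:
        header, body, signature = token.split(".")
    except ValueError:
        raise TokenError("malformed token") from None
    expected = hmac.new(key, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise TokenError("bad signature")
    claims = json.loads(_b64url_decode(body))
    # Fail closed: a token without an "exp" claim is treated as expired.
    if now >= claims.get("exp", 0):
        raise TokenError("token expired")
    return claims


def is_rejected(token: str, key: bytes, now: float) -> bool:
    try:
        verify_token(token, key, now)
    except TokenError:
        return True
    return False


KEY = b"test-only-key"
NOW = 1_700_000_000  # fixed clock keeps the tests deterministic


def test_valid_token_is_accepted():
    token = issue_token({"sub": "alice", "exp": NOW + 60}, KEY)
    assert verify_token(token, KEY, NOW)["sub"] == "alice"


def test_expired_token_is_rejected():
    token = issue_token({"sub": "alice", "exp": NOW - 1}, KEY)
    assert is_rejected(token, KEY, NOW)


def test_wrong_key_and_missing_expiry_are_rejected():
    token = issue_token({"sub": "alice", "exp": NOW + 60}, KEY)
    assert is_rejected(token, b"another-key", NOW)
    assert is_rejected(issue_token({"sub": "alice"}, KEY), KEY, NOW)
```

The point of such tests is the negative cases: the control must refuse bad input, not just accept good input. A test runner such as pytest would pick up the `test_*` functions as they are.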

Run repeatedly as part of continuous integration (CI), automated security tests make it possible to identify and fix issues such as memory bugs, input bugs, performance-hindering issues, insecure behaviour, and undefined behaviour.

Slide from @vixentael’s talk about maintaining the cryptographic library Themis and its automated security testing layers.

Testing security controls using DAST #

The goal of dynamic application security testing (DAST) is to emulate attacks and identify potential vulnerabilities, treating the system as a whole. Vulnerability scanners can help with automating security testing by checking your apps and network for a huge number of already known risks. As a result, you’ll get a list of detected vulnerabilities and recommendations on how they can be patched or otherwise secured.

This is especially relevant for software that is composed of many services, libraries, and chunks of code rather than written top-down. For maximum impact, the infrastructure should be checked when it is complete and functional (live or near-live).

Examples include active/passive attacks on API calls wrapped in HTTPS, passing SQL injection patterns into user input, manipulating parameters to mount path traversal attacks, triggering security events, etc. The OWASP Cheat Sheet Series gives great prompts for this kind of research.
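As a small illustration of "passing SQL injection patterns into user input", the sketch below probes two login functions with a handful of classic payloads. Everything here is made up for the example: the in-memory SQLite database, the `login_vulnerable` and `login_safe` functions, and the short payload list. Real scanners do this against a running application over the network and use far larger payload sets.

```python
import sqlite3

# A few classic injection patterns; a real scanner ships thousands.
INJECTION_PAYLOADS = [
    "' OR '1'='1",
    "' OR 1=1 --",
    "admin' --",
]


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (name TEXT, password TEXT)")
    db.execute("INSERT INTO users VALUES ('admin', 's3cret')")
    return db


def login_vulnerable(db, name: str, password: str) -> bool:
    # Builds SQL by string concatenation: user input reaches the SQL parser.
    query = (
        f"SELECT 1 FROM users WHERE name = '{name}' AND password = '{password}'"
    )
    return db.execute(query).fetchone() is not None


def login_safe(db, name: str, password: str) -> bool:
    # Parameterised query: user input is passed as data, never as SQL.
    query = "SELECT 1 FROM users WHERE name = ? AND password = ?"
    return db.execute(query, (name, password)).fetchone() is not None


def probe(login, db) -> list:
    """Return the (name, password) pairs that log in without the real password."""
    bypasses = []
    for payload in INJECTION_PAYLOADS:
        for name, password in ((payload, "x"), ("admin", payload)):
            if login(db, name, password):
                bypasses.append((name, password))
    return bypasses


if __name__ == "__main__":
    print("vulnerable:", probe(login_vulnerable, make_db()))
    print("safe:", probe(login_safe, make_db()))
```

The probe treats the login function as a black box, which is the essence of the dynamic approach: it only observes whether the security control can be bypassed, not how the code is written.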

Dynamic application security testing tools #

You can get started with DAST using the following open-source tools:

These tools are not a substitute for careful inspection by a security professional, but at least they ensure a certain level of formalizable verification for security controls.

DAST is a good trade-off between time consumption and the severity of discovered vulnerabilities: tools pick most of the low-hanging fruit, while security engineers can focus on more complex and multistep issues.

SAST/SCA: finding bugs in code #

Buffer overflows and remote code execution are among the most dangerous and damaging security issues that code can create. Detecting memory problems, undefined behaviour and other glitches that lead to attacks on execution flow can be automated through source code analysis (SCA) or static application security testing (SAST).

Modern high-level languages and platforms mostly shield developers from memory issues. However, libraries often stand on the shoulders of giants and reuse code written in unsafe, low-level languages.

Static application security testing tools and resources #

Your starting points to automate this process should be: GitHub code scanning and GitLab code scanning, OWASP list of some SAST tools, and a NIST list of classic tools.

Secrets detection tools #

Detecting and removing secrets before the code goes public is important to avoid sensitive data leaks. In essence, the goal is to make sure that no IP address, token, password or key gets leaked when the code is pushed into a public repository.

To automate the process and add it to CI, these tools can help: Yelp’s detect-secrets, Gitleaks, SecretScanner, or git-hound.
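To show the idea behind such tools, here is a toy line-by-line scanner. It is a sketch under loud assumptions: the three regular expressions and the `scan_text` function are written for this example and are nowhere near the rule sets that dedicated tools ship. The sample key is the placeholder value AWS uses in its own documentation, not a real credential.

```python
import re

# Illustrative patterns only; real tools combine many more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hardcoded_password": re.compile(
        r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def scan_text(text: str) -> list:
    """Return (line number, rule name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


SAMPLE = (
    'aws_key = "AKIAIOSFODNN7EXAMPLE"\n'
    'password = "hunter2!"\n'
    "timeout = 30\n"
)

if __name__ == "__main__":
    for lineno, rule in scan_text(SAMPLE):
        print(f"line {lineno}: possible {rule}")
```

In CI, a check like this would run on the diff of each commit and fail the build on any finding, so the secret never reaches the shared repository history.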

Themis is covered with multiple layers of tests, but different tests run on different CI systems: short tests run during PRs, long tests like fuzzing and integration run nightly.

Dependency scanning #

Often, modern software uses a lot of dependencies that use dependencies, and no one knows what exactly is hidden under 5 layers of dependency hell. Making sure that no vulnerable dependency sits somewhere down the line is essential.
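The "dependencies that use dependencies" problem is a graph walk at heart. The sketch below is a simplified illustration with entirely made-up data: the package names, the `DEPENDENCIES` graph and the `KNOWN_VULNERABLE` set are hypothetical, and real scanners resolve versions and query advisory databases instead of a hard-coded set.

```python
# Hypothetical dependency graph: package -> direct dependencies.
DEPENDENCIES = {
    "my-app": ["web-framework", "crypto-wrapper"],
    "web-framework": ["http-parser", "template-engine"],
    "crypto-wrapper": ["old-crypto-lib"],
    "http-parser": [],
    "template-engine": ["string-utils"],
    "old-crypto-lib": [],
    "string-utils": [],
}

# Hypothetical advisory data, standing in for a vulnerability database.
KNOWN_VULNERABLE = {"old-crypto-lib", "string-utils"}


def find_vulnerable_paths(root: str, graph: dict, vulnerable: set) -> list:
    """Walk the graph depth-first and return one chain per vulnerable package."""
    paths = []
    seen = set()
    stack = [(root, [root])]
    while stack:
        package, path = stack.pop()
        if package in seen:
            continue  # already visited, also protects against cycles
        seen.add(package)
        if package in vulnerable:
            paths.append(path)
        for dependency in graph.get(package, []):
            stack.append((dependency, path + [dependency]))
    return sorted(paths)


if __name__ == "__main__":
    for path in find_vulnerable_paths("my-app", DEPENDENCIES, KNOWN_VULNERABLE):
        print(" -> ".join(path))
```

Reporting the whole chain rather than just the package name matters in practice: it tells you which direct dependency you need to upgrade or replace to get rid of the vulnerable one buried below it.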

We strongly recommend setting up dependency management and vulnerability management processes to prevent, or quickly detect and mitigate, potential security weaknesses. Vulnerable and Outdated Components is one of the OWASP Top 10 security risks for 2021.

“Dependency management is critical to the safe operation of any application of any type. Failure to keep up to date with outdated or insecure dependencies is the root cause of the largest and most expensive attacks to date.” (OWASP ASVS V14.2)

NIST SP 800-218 SSDF v1.1 PW.4.1 emphasises the need to use well-secured libraries and periodically review them as a part of the SSDLC process. A dependency management process brings peace of mind in constantly shifting ecosystems like iOS and Android, Node.js and Python.

Manual research of cryptographic libraries used in one fintech application: 25% are abandoned, 37.9% contain already known vulnerabilities, and 12.5% count more than 100 reported and unresolved issues.

Dependency scanning tools #

You can get some dependency checking for free via GitHub’s automated dependency checker Dependabot, using OWASP Dependency-Check, or trying out Snyk and Mend (ex WhiteSourceSoftware).

Using fuzzing to find vulnerabilities #

“Never trust your input” is one of the cardinal rules in computer programming.

Fuzzing is an automated process in software testing that takes advantage of this rule and searches for exploitable bugs through feeding random, invalid, and unexpected inputs to the tested software.

Fuzzing helps surface the vulnerabilities that would be undetectable otherwise. Compared with static analyzers, fuzzing is almost fully devoid of false positives as it executes actual code.

Fuzzing is especially useful for long-living and frequently used code modules—like libraries, reusable modules, system apps, security controls in your apps.

Security-wise, fuzzing is both about testing the security controls and testing memory behaviour, as bugs like the famous Heartbleed (which combines both poor security controls and unexpected memory behaviour) could have been found by fuzzing.

For security testing, fuzzing is especially useful when it comes to feeding the app pseudo-valid inputs that cross the trust boundary.

A trust boundary violation takes place when the tested app is made to trust the unvalidated data fed into it. This approach mimics an adversary trying to feed malicious content into the app to achieve privilege escalation or plain malfunction, crash, etc.
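Here is a minimal sketch of that idea: a toy echo handler that trusts a length field supplied by the caller, loosely inspired by the Heartbleed class of bug, and a naive random fuzzer that checks one invariant. All of it is hypothetical illustration code: `heartbeat_buggy`, `heartbeat_fixed`, the message format and the `SECRET` stand-in for adjacent memory are invented here, and real fuzzers are coverage-guided rather than purely random.

```python
import random

SECRET = b"-----PRIVATE KEY-----"  # stand-in for adjacent process memory


def heartbeat_buggy(message: bytes) -> bytes:
    """Echo the payload back, trusting the length claimed in the first byte."""
    claimed_len, payload = message[0], message[1:]
    memory = payload + SECRET
    return memory[:claimed_len]  # over-reads when claimed_len > len(payload)


def heartbeat_fixed(message: bytes) -> bytes:
    """Same handler, but the claimed length is validated before use."""
    claimed_len, payload = message[0], message[1:]
    if claimed_len > len(payload):
        raise ValueError("claimed length exceeds payload")
    return payload[:claimed_len]


def fuzz(target, iterations: int = 1000, seed: int = 0) -> list:
    """Feed random messages to target; collect inputs that break the invariant."""
    rng = random.Random(seed)  # seeded so that failures are reproducible
    failures = []
    for _ in range(iterations):
        message = bytes(rng.randrange(256) for _ in range(rng.randint(1, 16)))
        try:
            response = target(message)
        except ValueError:
            continue  # a clean rejection is acceptable behaviour
        # Invariant: the response may only contain bytes the caller sent.
        if not message[1:].startswith(response):
            failures.append(message)
    return failures


if __name__ == "__main__":
    print("buggy handler, failing inputs:", len(fuzz(heartbeat_buggy)))
    print("fixed handler, failing inputs:", len(fuzz(heartbeat_fixed)))
```

Note that the fuzzer does not look for crashes here. The buggy handler never crashes, it silently leaks, which is why a well-chosen invariant is what makes the fuzzing run useful.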

For example, check this Google repo where they posted fuzzing dictionaries for some popular tools.

Fuzzing allows finding even tricky vulnerabilities, but to get fuzzing done right, you need to understand what you’re trying to accomplish. Fuzzing tests must be well thought out, well planned, and well written.

Keep in mind that fuzzing is not a magic bullet for all your automated security testing, but it will give you a lot. To get started with fuzzing, you may want to visit this curated list of fuzzing resources.

Performance testing for security purposes #

Performance testing is something that doesn’t really spring to mind in the context of “security testing”, but performance reliability is really the first step towards ensuring a safe and secure functioning of a system.

One of the risks to consider is a denial of service (DoS) caused by overload or by an attack. Regardless of the cause, you need an estimate, made while the system is being designed, of the expected load level and of the point beyond which the DoS happens.

Testing results, recorded and published along with the characteristics of the testing platform, will be of help for the system architects.

Running performance tests on the target equipment after the installation and configuration of the software will also yield more precise threshold levels. This, in turn, will help to configure the load-limiting and alert systems accordingly.

Quantitative results of performance testing #

Quantitative results of performance testing can show you the number of operations performed per unit of time, the number of errors during the execution of an operation, and the amount of resources necessary for each testing mode.
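A minimal sketch of collecting two of these numbers, operations per unit of time and the error count, could look like this. The `measure` function and the `parse_port` operation under test are hypothetical examples, and the sketch leaves out resource usage, warm-up runs and repeated measurements that a real benchmark needs.

```python
import time


def parse_port(text: str) -> int:
    """Hypothetical operation under test: validate a TCP port number."""
    port = int(text)
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return port


def measure(operation, inputs: list) -> dict:
    """Run operation over inputs and report throughput and error count."""
    errors = 0
    started = time.perf_counter()
    for item in inputs:
        try:
            operation(item)
        except Exception:
            errors += 1
    elapsed = time.perf_counter() - started
    return {
        "operations": len(inputs),
        "errors": errors,
        "ops_per_second": len(inputs) / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    workload = ["443"] * 1000 + ["99999", "http", ""]
    print(measure(parse_port, workload))
```

Recording the error count next to the throughput is deliberate: a run that gets faster while its error count rises is not an improvement.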

Tests could be run to find the “bottlenecks” and measure performance of particular modules, or to evaluate the interaction between separate modules.

Qualitative results of performance testing #

Note the changes in performance: a sudden spike in the app’s performance can serve as an indicator of an error.

A sudden decrease in the performance stats can be caused by errors leading to performance degradation and overload failures. A sudden increase in performance can indicate changes in the logic of the working pieces of code, e.g. an erroneous exclusion of the incoming data validation step.
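A check for this in CI can be as simple as comparing the current throughput against a stored baseline and flagging a large change in either direction. The function name `classify_change` and the 25% tolerance below are assumptions picked for the example; a sensible threshold depends on how noisy your measurements are.

```python
def classify_change(baseline_ops: float, current_ops: float,
                    tolerance: float = 0.25) -> str:
    """Compare throughput to a baseline; flag big moves in both directions."""
    if baseline_ops <= 0:
        raise ValueError("baseline must be positive")
    change = (current_ops - baseline_ops) / baseline_ops
    if change < -tolerance:
        return "slower"  # possible degradation or overload problem
    if change > tolerance:
        return "faster"  # possibly skipped work, e.g. missing validation
    return "stable"


if __name__ == "__main__":
    print(classify_change(1000, 600))
    print(classify_change(1000, 1900))
    print(classify_change(1000, 1100))
```

Treating "faster" as a finding that needs a human look, rather than as good news, is the security-specific part of this check.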

A good example of how performance tests uncovered memory leaks. An excerpt from the book Building Secure and Reliable Systems.

Performance testing patterns #

Among the numerous testing patterns, these three are the most relevant to the subject:

  • Stress testing. A step-by-step load increase helps to identify the performance limits. It also allows evaluating the estimated nominal load level.
  • Endurance testing. This is a long-term software performance evaluation that helps find memory leaks and cumulative errors.
  • Spike testing. Testing with sudden spikes in the load helps surface the problems that can arise during the breakdowns in the normal functioning of balancing systems, routing, and during (D)DoS attacks.

We recommend expanding the classic performance testing methods by combining them with other kinds of testing to get a more complete picture of how the system works.

Popular attack methods often try to cause (D)DoS by invoking non-standard operation modes. The attackers rightly assume that in such modes most products suffer a striking performance drop, and that deeply branched program logic may be untested for such cases and, as a result, vulnerable.

Incident recovery testing #

Most likely, you do backups for the data inside your system, but don’t forget to test that backups have worked by restoring data.

Backup testing should include testing of physical recovery, virtual recovery, data recovery, and full application recovery.

In a perfect world, every backup should be tested after it’s created, but a more practical approach would be to include backup testing into the regular backup cycle or perform it after significant changes in the application or application data.
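Including the restore step in the backup cycle can be automated. The sketch below uses SQLite from the Python standard library as a stand-in for your real datastore: it takes a backup with `sqlite3.Connection.backup`, then checks that the copy passes an integrity check and contains the same rows. The `records` table and the helper names are hypothetical, and a real restore test would restore onto separate hardware or a separate environment.

```python
import sqlite3


def snapshot(db: sqlite3.Connection) -> list:
    """Read the data in a stable order so two databases can be compared."""
    return db.execute("SELECT id, note FROM records ORDER BY id").fetchall()


def backup_and_verify(source: sqlite3.Connection) -> bool:
    """Back up source, then prove the copy is usable by reading it back."""
    restored = sqlite3.connect(":memory:")
    source.backup(restored)  # online backup API, available since Python 3.7
    integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
    return integrity == "ok" and snapshot(restored) == snapshot(source)


def make_source() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, note TEXT)")
    db.executemany(
        "INSERT INTO records (note) VALUES (?)",
        [("first",), ("second",), ("third",)],
    )
    db.commit()
    return db


if __name__ == "__main__":
    print("backup restorable:", backup_and_verify(make_source()))
```

The important design choice is that the check reads the restored copy rather than inspecting the backup job's exit code: a backup counts as working only once data has come back out of it.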

You shouldn’t blindly assume that backups “are working well”. A cautionary tale of struggle and loss (of several hours’ worth of data) is the time when untested backups failed at GitLab.

Script-generated image of Themis testing steps: installation, running tests for each language, running security tests, fuzzers, building and installing rpm/deb packages.

Challenges of automated security testing #

Running security tests has its own issues and pitfalls.

Embrace the chaos #

While testing for vulnerabilities in your product, automated vulnerability tests basically try to wreak as much havoc and do as much damage as possible. Still, it is better to see everything messed up and broken once, intentionally, and to fix it knowing that the worst has already happened, without terrible consequences and with your total control and blessing.

If the testing is carried out on a live infrastructure, it can break your application: your email can be flooded, logs can overflow, sensitive links can be crawled and exposed for the whole world to see, and the server can go down.

The Principles of Chaos Engineering strongly prefer experimenting directly in production, while minimising and containing the impact of the experiments.

Combine automated and manual testing #

Security testing is not a fully automated endeavour. Similarly to code reviews, having a human eye on security tests and code changes is a must: some behaviours just cannot be detected automatically.

Take your time #

Often, security tests take a long time to run, and sometimes, they take considerable time to accumulate the data. Sometimes, a security testing process leads to the detection of vulnerabilities not directly related to your code.

For example, we once had to deal with a change in a compiler’s behaviour towards external dependencies, which could have turned into a serious security issue had our testing and benchmarking not included volume tests on input. Had we not tested this beforehand, it would have been a ticking time bomb, even though Go is known to be an extremely memory-safe language.

How we test #

At Cossack Labs, working to build security products ourselves, we carry out automated security testing wherever we can, using both ready-made third-party testing suites and our own heavily customised solutions.

For example, we run automated tests on Themis, our multi-platform cryptographic services library, which is the foundation of many of our products. We rely heavily on memory tests, fuzzing, integration and compatibility tests. The Themis product docs have a nice explanation of which tools we use.

Want to learn more about maintaining and testing cryptographic libraries? Check the video or slides from the talk by Anastasiia Voitova, “Maintaining cryptographic library for 12 languages”.

We take a different approach with the Acra database security suite. As Acra works with numerous databases and supports several SQL protocols, we focus on behaviour and integration testing. Read the Acra product docs to learn the details of Acra security testing.

Summing things up #

For the larger part of our work, security issues are the first thing that comes to mind when we develop and test software. This is kind of backwards compared to the non-security-related developer community, which is focused on shipping fast (always), consistently (sometimes), and reliably (rarely).

When we read the news about yet another breach, the maxim “everything will be broken” rings as true as ever, with no chance of change in the foreseeable future, given the insulting carelessness about security that is practised everywhere.

Apart from usability trade-offs and plain sloppiness, it’s always a question of knowing how and what should be tested. Testing security controls and security components boils down to making sure that security controls behave as expected under chosen circumstances.

Well, now you’ve got a few reference points. You might also like our previous blog posts: classic backend security design patterns and 13 tips to enhance database and infrastructure security.

Also, check out Acra if you are interested in a data security tool that easily integrates with your product and encrypts sensitive data fields.
