Performance Testing

Performance testing measures how an application behaves under realistic conditions, simulating high-volume traffic and concurrent requests where required. This sets it apart from other types of testing, which do not necessarily take traffic volume or concurrency into account.

Figure: An example of Load Testing in BlazeMeter. Source: https://www.blazemeter.com/blog/how-analyze-results-load-test-using-blazemeter-0

Performance Testing is only one type of testing. Many other types of testing should generally take place before Performance Testing is even considered. Examples include unit testing, integration testing, functional testing, end-to-end testing and acceptance testing.

Unit testing is one of the most common types of testing, where segments of code are tested locally or within a repository. A benefit is that these tests can be run quickly and typically ensure that the logic of the code is sound. A downside is that you’re testing the code directly, which may not reflect real-world performance.
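As a minimal sketch of the idea, the example below unit-tests a small pricing function with Python's built-in unittest module. The function apply_discount and its rules are hypothetical, invented purely for illustration.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        # 25% off 200.0 should leave 150.0
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_invalid_percent_is_rejected(self):
        # Out-of-range input should raise rather than return a bad price
        with self.assertRaises(ValueError):
            apply_discount(200.0, 150)
```

Saved to a file, tests like these can be run with `python -m unittest`. They check the logic quickly, but say nothing about how the function behaves under traffic.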

Integration testing is another common kind of testing. It is somewhat similar to unit testing, but instead of testing individual fragments of code, you are testing a combination of software modules. Typically, this should take place after unit testing, to ensure that previously tested code works in tandem with other modules.

Functional testing checks the application against its functional requirements rather than testing individual parts of the code. Unlike the previous types of testing, this can be done by more than just a developer. Functional testing is often carried out by QA (Quality Assurance) and by others such as business analysts and product owners. Testing against requirements is intended to ensure that the features a user would expect to work, do work.

Like functional testing, end-to-end testing can be done using the application directly. Instead of testing specific requirements, it entails testing the entire application from beginning to end to ensure that it acts and flows as expected. This kind of testing can identify issues that were not discovered during functional testing, as you may be testing functionality that encompasses many requirements at once.

Acceptance testing, such as User Acceptance Testing (UAT), is typically undertaken by someone who is not directly involved in creating the application, for example an end user or someone on the client side of the product. UAT is generally done in the final phase of testing before a production release. Ideally, any bugs should have been identified and fixed before this phase, but this is not always the case.

Some common types of performance testing are the following:

Capacity Testing

Capacity testing checks whether the application can handle the amount of traffic that it was designed to handle. It is typically used to benchmark the number of users or actions that can be handled under a given set of circumstances. It also allows you to scale the application up gradually, before user experience is negatively affected.

Capacity testing could be used to identify bad code, or code that could be better optimised to run faster or handle a large amount of traffic or actions more efficiently. Capacity testing is typically done within the design phase. Stress Testing could be considered a more intense version of this.

Example: Ensuring that a web page can handle hundreds or thousands of users at once without affecting overall site performance.
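The ramp-up idea behind capacity testing can be sketched as follows. This is an illustration under stated assumptions, not a real benchmark: find_capacity is a hypothetical helper, and toy_latency is a made-up model standing in for an actual measurement against a running system.

```python
def find_capacity(measure_latency, user_levels, latency_limit):
    """Return the highest user level whose latency stays within the limit.

    Levels are tried in ascending order and the ramp stops at the first
    level that breaches the limit. Returns None if even the lowest fails.
    """
    supported = None
    for users in sorted(user_levels):
        if measure_latency(users) > latency_limit:
            break
        supported = users
    return supported


def toy_latency(users):
    """Made-up model: 50 ms base latency plus 1 ms per concurrent user."""
    return 0.050 + 0.001 * users


capacity = find_capacity(toy_latency, [100, 200, 400, 800], latency_limit=0.5)
print(capacity)  # prints 400 with this toy model
```

In a real test, measure_latency would drive the application at the given number of users and return an observed figure such as the 95th percentile response time.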

Load Testing

Load Testing checks the application’s ability to perform under anticipated user loads. The objective is to identify performance bottlenecks before the software application goes live. Load testing is different from Stress Testing, as instead of purposely overloading the system, you are ensuring that it performs well under an expected load. Load testing is also different from capacity testing, as it is testing the code itself in order to find bottlenecks and badly written code.

Doing this may identify flaws and weak spots in the code that prevent the application from adapting to changes in load.
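A load test at its simplest runs the expected number of users concurrently and summarises the response times. The sketch below assumes a stand-in handle_request function that just sleeps for 10 ms; in practice this would be a call to the application under test. All function names here are illustrative.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request():
    """Stand-in for a real request; sleeps to simulate 10 ms of work."""
    time.sleep(0.01)


def simulate_user(requests_per_user):
    """One virtual user sends requests back to back and times each one."""
    latencies = []
    for _ in range(requests_per_user):
        start = time.perf_counter()
        handle_request()
        latencies.append(time.perf_counter() - start)
    return latencies


def run_load_test(users, requests_per_user):
    """Run the expected number of users concurrently and summarise latency."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(simulate_user, [requests_per_user] * users))
    latencies = [t for user in per_user for t in user]
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile
    percentiles = statistics.quantiles(latencies, n=100)
    return {
        "requests": len(latencies),
        "median": statistics.median(latencies),
        "p95": percentiles[94],
    }
```

Calling `run_load_test(users=50, requests_per_user=20)` would simulate 50 users sending 20 requests each. Percentiles are reported rather than only an average because a few slow requests can hide behind a healthy mean.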

Volume Testing

Volume testing is software testing performed to check the performance or behaviour of a system or application under a large amount of data. Rather than the number of users, volume testing focusses on the data itself.

Large amounts of data may be populated in the database and the overall software system’s behaviour is monitored. The goal of volume testing is to check the application’s performance under varying database volumes.

Many applications and websites rely on databases for querying and storing data, and a large application may receive many requests at any given time. Any data loss or mix-up of data could cause serious issues for the performance and integrity of an application. For this reason, volume testing is important to ensure that database read and write actions perform as expected, and that the application itself reflects this correctly.

Volume Testing may identify many performance related weaknesses or strengths in an application. E.g., does the system have enough resources to perform data related transactions as expected? Are related timestamps recorded accurately? Is the application attempting to read data before it is written?

Example: How efficiently does the application handle multiple requests to a database, how quickly is this data reflected on the UI, and does this cause delays?
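The sketch below illustrates the idea using Python's built-in sqlite3 module with an in-memory database. The orders table, its columns and the query are hypothetical; a real volume test would target the application's own database and schema.

```python
import sqlite3
import time


def time_query_at_volume(row_count):
    """Load row_count rows, then time one query and verify nothing was lost."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    # Spread the rows evenly over 100 hypothetical customers
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        ((i % 100, float(i)) for i in range(row_count)),
    )
    conn.commit()

    start = time.perf_counter()
    matched = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (7,)
    ).fetchone()[0]
    elapsed = time.perf_counter() - start

    stored = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return stored, matched, elapsed


for volume in (1_000, 10_000, 100_000):
    stored, matched, elapsed = time_query_at_volume(volume)
    print(f"{stored} rows stored, {matched} matched, query took {elapsed:.6f}s")
```

Comparing the timings across volumes shows how query performance changes as the data grows, while the stored count confirms that every write actually landed.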

Stress Testing

Stress testing is software testing that verifies the stability and reliability of a software application.

In stress testing, you deliberately try to overload an application, either to breaking point or close to it. The load may be far beyond what would be expected in an actual production environment, but the results provide a safety net against an unexpected number of users or requests in future. It is also common to overload an application and then observe how it behaves as it returns to normal load. Stress testing can also be used to ensure that a system fails gracefully when pushed past its limit.

If the application copes with a heavy load, this demonstrates how robust it is and gives a benchmark for how much it can really handle. The results should be observed throughout the test.
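The overload-then-recover pattern can be sketched with a toy service. BoundedService is a hypothetical stand-in that accepts a fixed number of concurrent requests and rejects the rest, which is one simple way for a system to fail gracefully. The burst is held open until every excess request has been rejected, so the outcome is repeatable.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BoundedService:
    """Toy service that accepts at most `capacity` requests at a time."""

    def __init__(self, capacity):
        self._slots = threading.Semaphore(capacity)
        self.release = threading.Event()

    def handle(self):
        # Fail gracefully: reject immediately instead of queueing or crashing
        if not self._slots.acquire(blocking=False):
            return "rejected"
        try:
            self.release.wait()  # simulate slow work while holding a slot
            return "ok"
        finally:
            self._slots.release()


def stress_then_recover(capacity=5, burst=20, normal=3):
    service = BoundedService(capacity)
    with ThreadPoolExecutor(max_workers=burst) as pool:
        futures = [pool.submit(service.handle) for _ in range(burst)]
        # Keep the service saturated until all excess requests are rejected
        while sum(f.done() for f in futures) < burst - capacity:
            time.sleep(0.001)
        service.release.set()  # let the accepted requests finish
        overload = [f.result() for f in futures]
    # Back to normal load: every request should succeed again
    recovery = [service.handle() for _ in range(normal)]
    return overload, recovery


overload, recovery = stress_then_recover()
print(overload.count("ok"), overload.count("rejected"), recovery)
```

With the default values, 5 of the 20 burst requests are accepted, 15 are rejected, and all 3 follow-up requests succeed, showing that the toy service recovers once the overload has passed.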

Automation in Performance Testing

Many of the tasks above can be automated, such as emulating hundreds or thousands of users to push the application close to its limit. Without automation, doing this within QA would be virtually impossible unless you had a ridiculous number of testers available at one time. Yet this level of traffic is a scenario that may well occur in a production environment. For this reason, a level of automation is required to cover as many scenarios as possible.

Tools

There are a few different tools available that can be used for performance testing. Some of these tools can be used for other types of testing too.

JMeter

Apache JMeter is one of the most popular tools for load testing. This isn’t surprising, as the Apache Software Foundation is behind a lot of other popular software such as Apache Hadoop and Apache HTTP Server. JMeter is also popular because it is open source and versatile. It supports many protocols, including HTTP/HTTPS, FTP, LDAP and SOAP.

JMeter is written in Java, which gives it an advantage in familiarity over tools built on lesser-known languages and allows developers who already know Java to pick it up faster than other testing tools. JMeter can be used to test both dynamic and static resources.

JMeter is also known for having an easy-to-use GUI, a popular feature that a lot of other testing tools lack.

LoadRunner

LoadRunner is another tool that can be used for performance testing. It is mainly focussed on web testing, but can also be used to test legacy system software and other applications. It is used for detecting and preventing performance issues within web applications, typically during development of the application.

An advantage LoadRunner has over some other testing suites is that it includes advanced scalability forecasting features. These allow users to predict the costs of scaling up, in relation to both software and hardware.

Unlike JMeter, however, LoadRunner is not open source. It is a commercial product that was owned by HP (Hewlett Packard) and has since passed to Micro Focus and then OpenText, and the code behind its inner workings cannot be inspected. This may not be an issue for some users, though, as it offers a lot of high-level testing options.

Gatling

Gatling is another open-source performance testing framework. It is written in Scala, a powerful language, but not one with as large an audience as a language such as Java. This may mean a steeper learning curve than with other testing frameworks, but there is a lot of documentation available, along with answered questions on Stack Overflow.

Like Java and Kotlin, Scala runs on any machine that can run a JVM (Java Virtual Machine). This allows a developer to move easily from one machine to another and continue writing tests with little issue.

Gatling has metrics out of the box, without requiring any additional plugins or add-ons. After each performance test, an HTML report is generated that gives an overview of the results. These reports can be saved easily and compared with previous ones. This is one of the most popular features of Gatling.

Figure: A view of various test metrics built into Gatling. Image from: https://www.blazemeter.com/blog/eight-reasons-you-should-use-gatling-for-your-load-testing

As well as out-of-the-box metrics, Gatling also has built-in integration with CI (Continuous Integration) pipelines. This is a very useful feature, especially now that more and more projects are adopting CI. Jenkins supports Gatling through a Jenkins Gatling plugin.
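To illustrate why saved reports and CI integration work well together, the sketch below compares the metrics of a current run against a previous baseline and flags anything that has become noticeably worse. This is not Gatling's API; find_regressions, the metric names and the 10% tolerance are all assumptions made for the example.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics that got worse by more than `tolerance` (a fraction).

    Both arguments map a metric name to a value where lower is better,
    for example response times in milliseconds.
    """
    regressions = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is not None and new > old * (1 + tolerance):
            regressions[name] = (old, new)
    return regressions


baseline = {"p95_ms": 400, "mean_ms": 120}
current = {"p95_ms": 480, "mean_ms": 125}
print(find_regressions(baseline, current))  # prints {'p95_ms': (400, 480)}
```

A CI step could fail the build whenever the returned dictionary is not empty, so that a performance regression is caught in the same way as a failing unit test.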

LoadNinja

LoadNinja is a relatively new testing framework. Unlike some of the testing frameworks mentioned above, it does not test at the protocol level, but instead spins up instances of web browsers for testing.

This has its own advantages and disadvantages. A big advantage of spinning up a browser instance is that it can accurately simulate an end-user experience instead of just making web requests. Each of these is recorded in a report too, making it possible for a developer to see potential issues or bottlenecks. However, spinning up multiple web browser sessions like this can cause bottlenecks and performance issues, especially if using resources that are limited or not very powerful.

Figure: LoadNinja’s record and replay screen. Image from: https://loadninja.com/articles/loadninja-vs-jmeter-when-to-use-each-of-them/

Rather than being an alternative to a testing suite such as JMeter, LoadNinja could instead be used alongside it to increase test coverage. Browser tests may be ideal for less frequent runs, while protocol tests can be run many times with a lower strain on resources. It may be ideal to sign off on testing only after confirming both protocol and browser test plans.

Conclusion

Many issues may be discovered by performance testing, and these should be dealt with accordingly. For example, issues identified during volume testing may point to slow servers or hardware in relation to database querying. An issue like this may even be caused by poorly optimised queries in whatever query language is being used. In this case, the queries would need to be updated before testing again. An important thing to consider is that issues identified by performance testing are not always code-related, but can instead be caused by the infrastructure itself. This is a very big advantage of performance testing, as many other kinds of testing only test the quality of the code.

There are many different types of performance tests, and many different frameworks and tools which can be used, some of which can complement each other. Performance testing may not be necessary for small applications but can be incredibly useful for large-scale applications such as web-based applications which may be used by many users at the same time and process a lot of data.

When designing and building upon a large application, it may be worth considering researching and building a testing suite to cover performance tests, instead of just simple test cases. Test cases alone cannot typically account for large scale use in the way performance testing can.

Four relatively popular performance testing tools were covered above, but you may find other tools that suit your application better. It’s always worth researching and planning beforehand to see which tools may work best, including the support and knowledge available for them.