Outside of the QA world (and unfortunately, sometimes in the QA world), I’ve heard people toss around ‘Performance Testing’, ‘Load Testing’, ‘Scalability Testing’, and ‘Stress Testing’, yet always mean the same thing. My clients do this. My project managers do this. My fellow developers do this. It doesn’t bother me–I’m not some QA psycho that harasses anyone that doesn’t use exactly the correct term–but I do smirk on the inside whenever one of these offenses occurs.
Performance testing is not load testing is not scalability testing is not stress testing. They are not the same thing. They closely relate, but they are not the same thing.
- Load testing is testing that involves applying a load to the system.
- Performance testing evaluates how well the system performs.
- Stress testing looks at how the system behaves under a heavy load.
- Scalability testing investigates how well the system scales as the load and/or resources are increased.
Alexander Podelko, Load Testing in a Diverse Environment, Software Test & Performance, October 2005.
Performance Testing
Any type of testing–and I mean any type–that measures the performance (essentially, speed) of the system in question. Measuring the speed at which your database cluster switches from the primary to secondary database server when the primary is unplugged is a performance test and has nothing to do with the load on the system.
Load Testing
Any type of test that is dependent upon load or a specific load being placed on the system. Load testing is not always a performance test. When 25 transactions per second (tps) are placed on a web site, and the load balancer is monitored to ensure that traffic is being properly distributed to the farm, you are load testing without a care for performance.
Stress Testing
Here is where I disagree with Alexander: stress testing places some sort of unexpected stress on the system, but does not have to be a heavy load. Stress testing could include testing a web server where one of its two processors have failed, a load-balanced farm with some if its servers dropped from the cluster, a wireless system with a weak signal or increased signal noise, or a laptop outside in below-freezing temperatures.
Scalability Testing
Testing how well a system scales also is independent of load or resources, but still relies on load or resources. Does a system produce timeout errors when you increase the load from 20tps to 40tps? At 40tps, does the system produce less timeout errors as the number of web servers in the farm is increased from 2 servers to 4? Or when the Dell PowerEdge 2300s are replaced with PE2500s?
Any type of testing in QA is vague. This includes the countless types of functional testing, reliability testing, performance testing, and so on. Often time a single test can fit into a handful of testing categories. Testing how fast the login page loads after three days of 20tps traffic can be a load test, a performance test, and a reliability test. The type of testing that it should be categorized as is dependent upon what you are trying to do or achieve. Under this example, it is a performance testing, since the goal is to measure ‘how fast’. If you change the question to ‘is it slower after three days’, then it is a reliability test. The point is that no matter where the test fits in your “Venn Diagram of QA,” the true identify of a test is based on what you are trying to get out of it. The rest is just a means to an end.