I was recently asked by one of our clients to perform a large performance test on their web service. They would like to simulate traffic of around 150k users that would log into their application and perform a few simple operations.
After reviewing my options, I found k6 to be the most recommended tool for this, but I have a few questions I would like clarified.
As I understand it, a VU (virtual user) simulates a single user performing some scenario. If so, does that mean that to simulate traffic from 150k users I would have to create a test scenario with 150k VUs? From the Running large tests guide I understand that a single VU can generate multiple requests per second, but does that mean it can perform operations on behalf of multiple users? Can a single VU log in as multiple users efficiently?
Are there any useful resources with example test definitions for a simulation this large? Ideally, I would like to see how complicated the setup is if I were to run it in a distributed fashion and in the cloud.
You can think of a VU as a worker that will execute the default function in your k6 script many times for the duration of the test run. What you decide to do in the default function is really up to you. So you could log in multiple users there if you wanted to, but most k6 users find it convenient to model the behavior of a VU as a single real-world user of the web service.
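To make the "one VU behaves like one real user" idea concrete, here is a minimal sketch of such a script. The base URL, the endpoints, the JSON login payload, the `token` field, and the `user_N` naming scheme are all assumptions made up for illustration; your service will differ.

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

// Hypothetical base URL and endpoints, replace with your own.
const BASE_URL = 'https://test.example.com';

export default function () {
  // __VU is the number of the current VU (starting at 1), so each VU
  // logs in as its own user. Assumes test accounts user_1, user_2, ... exist.
  const credentials = JSON.stringify({
    username: `user_${__VU}`,
    password: 'secret', // hypothetical shared test password
  });

  const loginRes = http.post(`${BASE_URL}/login`, credentials, {
    headers: { 'Content-Type': 'application/json' },
  });
  const loggedIn = check(loginRes, {
    'login succeeded': (r) => r.status === 200,
  });
  if (!loggedIn) {
    return; // skip the rest of this iteration if login failed
  }

  // Assumes the login response is JSON with a "token" field.
  const token = loginRes.json('token');

  // One of the "few simple operations" the user performs after login.
  const itemsRes = http.get(`${BASE_URL}/api/items`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  check(itemsRes, { 'items loaded': (r) => r.status === 200 });

  // Think time between iterations, as a real user would pause.
  sleep(1);
}
```

Each VU repeats this function for as long as the test runs, so the number of VUs controls how many of these simulated users are active at once.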
In the context of a web site, this generally means trying to simulate what the browser would do: interpret the HTML to load multiple assets in parallel (using http.batch()), and then interpret the JS assets to make backend requests.
Since you don’t need to actually interpret HTML and JS, if you know the flow of backend requests, you can write these from scratch using the k6/http module. Otherwise, it might be helpful to record the user journey using a browser recorder. You can use the generated script as a base, but you will probably need to edit it to fit your testing needs (e.g. to remove the downloading of 3rd-party assets, which are not relevant for testing the performance of your site).
The “huge simulation” part of it is simply a matter of increasing the number of VUs and the test duration. The complexity of writing a script will depend on what you want to test, so someone else’s scripts will be much different from whatever you need to write for your site, and those examples wouldn’t be very useful.
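As a rough illustration of what "increasing the number of VUs and the test duration" looks like, here is a sketch using the stages option. The targets, durations, and URL are hypothetical placeholders, not recommendations for your test.

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

// Hypothetical load profile: only these numbers need to change as the
// test grows, the default function below stays the same.
export const options = {
  stages: [
    { duration: '5m', target: 1000 },  // ramp up to 1000 VUs
    { duration: '20m', target: 1000 }, // hold at 1000 VUs
    { duration: '5m', target: 0 },     // ramp back down
  ],
};

export default function () {
  // Placeholder for the single-user journey; the URL is an assumption.
  http.get('https://test.example.com/');
  sleep(1);
}
```

Starting with small targets and raising them step by step makes it easier to notice when either the script or the load generator machine becomes the bottleneck.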
We do have plenty of simple examples in our documentation. Particularly see the How to Load Test a Website guide and Test Types to get familiar.
If you want to see what you can do in the default function, I suggest you record a user journey using the browser recorder, and go from there. Then if you find you’ve reached the limits of optimizing your script and exhausted the resources of a single machine, you can try distributed execution as documented here, using the Kubernetes k6-operator, or trying our k6 Cloud product. The beauty of k6 is that you wouldn’t need to change your k6 script regardless of the scale you run it at, so you can start small and the same script would continue to work as you scale up.
Thank you for taking the time to write such an extensive explanation @imiric, really appreciate that.
If I understand correctly, a single VU is unable to make multiple requests within the same second unless the responses come back very quickly, since the requests are performed in sequence anyway, right? The only workaround would be to use http.batch() to simulate multiple users performing operations at the same time. Is that even a sufficient option?
Thank you for taking the time to write such an extensive explanation
Sure thing
If I understand correctly, a single VU is unable to make multiple requests within the same second unless the responses come back very quickly, since the requests are performed in sequence anyway, right?
No, a single VU can send several requests at once: http.batch() performs the requests in parallel if you’re on a multi-core system, or at the very least concurrently if not. Go handles that transparently for us. This is not unlike how browsers send requests, though k6 doesn’t have the limit on simultaneous HTTP/1 connections that browsers do (ranging from 2 to 8 depending on the browser), and this is less of an issue for HTTP/2, which can multiplex requests.
http.batch() isn’t a workaround to simulate multiple users, but a feature to simulate a single virtual user sending more than one request at a time. Increasing the number of VUs would also send more than one request at a time to your backend, so http.batch() is a second layer to control this per user.
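For illustration, here is a minimal sketch of a single VU using http.batch() the way a browser would fetch several resources at once after loading a page. The base URL and the asset and API paths are made up for the example.

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

// Hypothetical base URL and paths, replace with your own.
const BASE_URL = 'https://test.example.com';

export default function () {
  // First load the page itself, as a browser would.
  const page = http.get(`${BASE_URL}/`);
  check(page, { 'page loaded': (r) => r.status === 200 });

  // Then send several requests at once on behalf of this one user.
  // http.batch() returns the responses in the same order as the requests.
  const responses = http.batch([
    ['GET', `${BASE_URL}/static/app.css`],
    ['GET', `${BASE_URL}/static/app.js`],
    ['GET', `${BASE_URL}/api/profile`],
  ]);

  check(responses[0], { 'css loaded': (r) => r.status === 200 });
  check(responses[1], { 'js loaded': (r) => r.status === 200 });
  check(responses[2], { 'profile loaded': (r) => r.status === 200 });

  // Think time before the next iteration.
  sleep(1);
}
```

The batch here still belongs to one simulated user; the number of simultaneous users is controlled separately by the number of VUs.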
I think you’d find it easier to model the test and your default function as a single user, and then increase the number of VUs as needed, instead of trying to model multiple users in the default function. This is generally the workflow we suggest, and what you’ll find in the documentation. If you use the browser recorder, you’ll get a better sense of how this is scripted.