Open model spike test - help needed

Hey! So I have this basic spike test:

export const options = {
    scenarios: {
        contacts: {
            executor: 'ramping-vus',
            startVUs: 0,
            stages: [
                { duration: '10s', target: 100 },
                { duration: '1m', target: 100 },
                { duration: '10s', target: 500 },
                { duration: '3m', target: 500 },
                { duration: '10s', target: 100 },
                { duration: '3m', target: 100 },
                { duration: '10s', target: 0 },
            ],
            gracefulStop: '2m',
            gracefulRampDown: '2m',
        },
    },
};

However, it is not working quite the way I want. Right now, whenever the response time increases, the requests/sec drop. I read that this is because the test follows a “closed model” when executed, but that kinda defeats the purpose of a spike test (correct me if I’m wrong). Either way, my goal is to have my test kill my application and then see if the application can recover by itself.

I’ve read that an open model would allow requests/sec to stay constant or increase regardless of how high the response time is. But how do I tell k6 to run my test as an open model?
I tried to use another executor, i.e. 'ramping-arrival-rate', however I can’t seem to wrap my head around how it works, and I still see requests/sec drop whenever response time increases, so I am really at a loss right now. :man_shrugging:

All I really want is to make the above test follow the open model, so the question is: is that possible, and if so, how do I do it?

Hi @agan,

I guess you have already read this explanation of arrival rate and the open and closed models.

Basic k6 architecture in three sentences: k6 has VUs, which are basically JS VMs able to execute JavaScript code - the code you write as the script. A VU can only execute one JavaScript statement at a time … because that is how JavaScript works :wink:
Each time you call http.get, it blocks the whole VU, making it wait until the call returns - which takes (more or less) the time it takes to complete the HTTP request.
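
To make that concrete, here is a minimal sketch (test.k6.io is just a placeholder target) of why a single VU’s throughput is bound by response time:

import http from 'k6/http';

export default function () {
    // The VU is blocked here until the response arrives (or times out).
    // If the server takes 2s to answer, this VU can start at most
    // ~0.5 iterations per second, no matter how many VUs run alongside it.
    http.get('https://test.k6.io/');
}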

Ramping-vus increases or decreases the number of VUs executing code at any given time. Those VUs, though, still need to execute one line at a time and will wait for requests to finish before moving on.
Even if you have 1 million VUs, if your server starts responding really slowly they won’t be making 1 million requests/s, as the load generator waits for the response.

Because k6 in practice works by executing code, and one execution of the default function is called an iteration, the way to say you want to start iterations at some rate - which then translates to starting requests at some rate - is to use the arrival-rate executors. Those executors start iterations either at a constant rate or at a changing (usually ramping) one.
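
For example, a rough open-model version of your original spike test could look like this sketch - the numbers are only illustrative, and note that target now means iterations started per timeUnit, not VUs:

export const options = {
    scenarios: {
        contacts: {
            executor: 'ramping-arrival-rate',
            startRate: 0,
            timeUnit: '1s',        // the targets below are iterations started per second
            preAllocatedVUs: 500,  // VUs initialized up front to run the iterations
            stages: [
                { duration: '10s', target: 100 },
                { duration: '1m', target: 100 },
                { duration: '10s', target: 500 },  // the spike
                { duration: '3m', target: 500 },
                { duration: '10s', target: 100 },
                { duration: '3m', target: 100 },
                { duration: '10s', target: 0 },
            ],
        },
    },
};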

Those executors still need VUs/JS VMs to actually execute the code, so if you were not getting anywhere near the RPS you want at 500 VUs, you will likely need to set preAllocatedVUs to something higher than that.

This is likely why you were still seeing dropped iterations - k6 needed a VU to run code, but no VU was free.
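
A rough way to size it (this rule of thumb is my approximation, not an exact formula): the number of busy VUs is about the arrival rate times the average iteration duration.

// Approximation: busy VUs ≈ arrival rate × average iteration duration.
// E.g. 500 iterations/s × 2s average iteration time ≈ 1000 busy VUs.
const rate = 500;           // iterations per timeUnit ('1s')
const avgIterSeconds = 2;   // measured or expected iteration duration
const preAllocatedVUs = Math.ceil(rate * avgIterSeconds * 1.2); // ~20% headroom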

I would also recommend giving the guide on running large tests a read - for large tests specifically you might need to tune your OS settings.

however I can’t seem to wrap my head around how it works

k6 will try to hit the target rate of iterations per timeUnit (another configuration option) by starting each new iteration on a free VU. If no free VU is available, it will drop the iteration and emit a metric saying it did so. If you have configured maxVUs, it will also start initializing a new VU in the background. I would recommend against setting maxVUs - in practice it has rarely turned out to help. If you initialize VUs mid-test, they still take the same memory/CPU resources, but now they also use up resources while the test is running instead of at the very beginning, potentially changing the results.
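
In config terms the recommendation boils down to something like this (a sketch with illustrative numbers):

export const options = {
    scenarios: {
        contacts: {
            executor: 'constant-arrival-rate',
            rate: 200,             // start 200 iterations per timeUnit
            timeUnit: '1s',
            duration: '5m',
            preAllocatedVUs: 500,  // sized up front, with headroom
            // maxVUs deliberately left unset: initializing extra VUs
            // mid-test costs CPU/memory while the test is running and
            // can skew the results.
        },
    },
};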

I still see requests/sec drop whenever response time increases, so I am really at a loss right now. :man_shrugging:

At some point any tool will hit a limit, depending on how it is designed. In k6 the limit you are most likely to hit is not having a free VU to run the code, but even apart from that you will:

  1. hit the limit on the number of open files
  2. run out of network ports to make requests from
  3. hit CPU/memory limits
  4. saturate the network bandwidth between the two systems

I guess as a workaround you can set a timeout, at which point k6 will abort the given request :person_shrugging: so a new one can start.
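
In k6 that is the timeout request parameter - a minimal sketch, with the URL and the 5s value as placeholders:

import http from 'k6/http';

export default function () {
    // Abort the request if no response arrives within 5 seconds,
    // freeing the VU for the next iteration.
    http.get('https://test.k6.io/', { timeout: '5s' });
}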

In general, because the system under test (SUT) is doing a lot more than the load generator, you will have a hard time hitting those limits, as the SUT will hit its CPU/memory ones way before that. This might not be the case for you, depending on how your SUT is designed and what it does.

Hope this helps you!

edit: I have opened an issue to update the documentation so it doesn’t use ramping-vus for the spike testing example

Heya @mstoykov,

Thanks a lot for your reply and for taking the time to explain all this, I really appreciate it :slight_smile:

I’ve tried some of the things you said, i.e. cranked up preAllocatedVUs to 1000 and removed maxVUs, but unfortunately I still see the same “spiky” behaviour in my RPS, and on top of that I am still not getting anywhere near the RPS I want. What I don’t get is that I used to be able to reach +/- 2k RPS with no problem using ramping-vus with far fewer active VUs, even when running locally. But with ramping-arrival-rate I can hardly reach 1.5k RPS with 1000 VUs. Why is it so different, and why is my RPS still dropping when using ramping-arrival-rate? It’s driving me crazy. :tired_face:

By the way, I should probably mention that I run my scripts in k6 Cloud (I have the Team package).

I apologize for my lack of understanding.

@mstoykov
export const options = {
    insecureSkipTLSVerify: true,
    noConnectionReuse: false,
    discardResponseBodies: true,
    setupTimeout: '180s',

    scenarios: {
        contacts: {
            executor: 'ramping-arrival-rate',
            preAllocatedVUs: 1000,
            timeUnit: '1s',
            stages: [
                { duration: '10s', target: 5 },
                { duration: '1m', target: 5 },
                { duration: '10s', target: 22 },
                { duration: '3m', target: 22 },
                { duration: '10s', target: 5 },
                { duration: '3m', target: 5 },
                { duration: '10s', target: 0 },
            ],
            gracefulStop: '2m',
        },
    },
};

This is what my ramping-arrival-rate test looks like currently, and the following is the output:

I just feel like this ain’t right. I don’t want the RPS to drop like that whenever response time increases - it should (at the very least) remain the same or continue increasing. Is this not possible or what? :no_mouth:

Without a copy of the script it will be hard to guess, but:

  1. it is common to add sleep in non-arrival-rate executors to pace them better - but here the arrival rate already does the pacing, so adding sleep just makes the VU sleep instead of being available
  2. those targets you show are … low … for 1000 VUs, unless you have a lot of requests or ones that take longer
  3. on a related note - do you have dropped iterations? You can see them in the analytics tab and graph them. If they are missing, it means the arrival rate started all iterations at the times it needed to - so your targets were hit and you likely misinterpreted them. (There is a threshold sketch right after this list.)
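
If you want a run to flag this by itself, you could put a threshold on the built-in dropped_iterations metric - a sketch, with the limit up to you:

export const options = {
    thresholds: {
        // Fail the test if k6 had to drop any iterations because
        // no pre-allocated VU was free to run them.
        dropped_iterations: ['count==0'],
    },
};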

Again, the target is how many iterations (executions of the default function) will be started per timeUnit (by default, and as you have configured it above, 1 second). So you want to start 22 iterations per second at the spike and keep that for 3 minutes, which means the spike will do 3 * 60 * 22 iterations - just under 4k by my “quick” calculation. If each of those makes 1000 requests … that is not a small amount, but if it’s 1 request it is a very small amount, and if it is 20-30 it is likely still on the smallish side.

Hope this helps you