My load test is designed to send millions of unique messages to the SUT. Each message is a JSON object at least 500 bytes in size.
Loading these millions of messages into the data passed to the default function will almost certainly require more memory than the system has. It seems to me the best approach would be to keep the messages in files on the system for each VU iteration to read. Each file would hold a chunk of n messages, and the default function would determine which file an iteration needs to read and then parse it.
Question 1: Is this the best approach? I’m trying to minimise the work in the test so I’d rather not fetch it from an API endpoint.
Question 2: Assuming this approach has merit, what options do I have with respect to modules to read the files, parse the JSON, and select the message that this iteration of the test should send to the SUT?
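For reference, the chunk lookup described in the question comes down to simple arithmetic on the iteration number. Here is a minimal sketch in plain JavaScript, assuming zero-based iteration numbers and a fixed chunk size; the locateMessage helper and the messages-N.json naming scheme are hypothetical and only illustrate the idea. It does not read any files.

```javascript
// Hypothetical helper: maps a zero-based global iteration number to the
// chunk file that holds its message and the position inside that chunk.
// The "messages-<n>.json" naming scheme is an assumption for illustration.
function locateMessage(iteration, chunkSize) {
  if (!Number.isInteger(iteration) || iteration < 0) {
    throw new RangeError('iteration must be a non-negative integer');
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  // Which chunk the message lives in.
  const fileIndex = Math.floor(iteration / chunkSize);
  // Position of the message within that chunk.
  const offset = iteration % chunkSize;
  return { file: `messages-${fileIndex}.json`, fileIndex, offset };
}
```

With a chunk size of 1000, iteration 2500 would map to the third file (index 2) at position 500.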
This is a common question, and your instincts are correct. There are a couple of features in k6 that can help you with this: SharedArray, which will minimise the memory needed to load the data, and the k6/execution module, which will give you a counter to help you pick unique records from the data. See this post for an example, and the rest of the thread for other, now deprecated, examples.
Have a search on the forum as well. There are other useful threads around this topic, though I think the one above is the most relevant.
I did find the scenario.iterationInTest property and was using it to determine which file to load. The problem I’m facing is that I can’t open/read a file in the default function. It sounds like you’re suggesting I load all of the messages into a SharedArray (in memory) for the test. Would that not lead to issues with thrashing if it exceeds physical RAM?
The problem I’m facing is that I can’t open/read a file in the default function.
Yes, this is not allowed: open() is only available in the init context. The default function is executed many times during the test, so reading or writing files from there would be prohibitively expensive.
It sounds like you’re suggesting I load all of the messages into a SharedArray (in memory) for the test. Would that not lead to issues with thrashing if it exceeds physical RAM?
Yes, that’s the suggested approach, but note that the file would only be loaded once, and the data will be shared by all VUs, which drastically reduces the memory requirements. Of course, if the file is large enough and resources are scarce you’d still run into memory limitations, but this is the most efficient way of doing what you want to do.
Also take a look at some of the documentation, such as this article, which might help you understand better how k6 manages memory.
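To make the suggestion concrete, here is a minimal sketch of a k6 script that combines SharedArray with scenario.iterationInTest. It is not a definitive implementation: the file name ./messages.json, the assumption that it holds a single JSON array, and the target URL are all placeholders, so adjust them to your setup.

```javascript
import http from 'k6/http';
import { SharedArray } from 'k6/data';
import { scenario } from 'k6/execution';

// The callback runs once in the init context; the parsed array is then
// shared by all VUs instead of being copied into each one.
// './messages.json' is an assumed file containing one JSON array of messages.
const messages = new SharedArray('messages', function () {
  return JSON.parse(open('./messages.json'));
});

export default function () {
  // iterationInTest is a zero-based counter that is unique across all VUs
  // in the scenario. The modulo wraps around if the test runs more
  // iterations than there are messages.
  const message = messages[scenario.iterationInTest % messages.length];

  // Placeholder endpoint standing in for the SUT.
  http.post('https://sut.example.com/messages', JSON.stringify(message), {
    headers: { 'Content-Type': 'application/json' },
  });
}
```

If every message has to be sent exactly once, keep the total number of iterations at or below the number of messages so the modulo never wraps.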