Hello, we are load testing our document upload with a 5 MB file.
The test works fine with around 20 VUs, but already breaks at 100 VUs because it then needs 60 GB of RAM. Our original goal was much higher, something like 500 or 1000 VUs.
We’ve read several discussions and GitHub issues on the topic, some newer, some older, but can’t really figure out what the current state is and what the recommended approach would be. The topics covered ArrayBuffers, Proxies, prebuilt bodies, HTTP batch, streaming, the experimental File API, and so on.
It would be great if you had a recommendation on what to use, and maybe even an example?
Also, if there is currently no way to run tests at that scale, which feature should we wait for?
One thing we noticed while reading all that was the additional copy of the data in response.request.body; maybe there’s a way to overload the assignment operator and prevent that copy? We don’t know too much about JavaScript internals, though.
Our use case:
For simplicity, we use exactly the same file in all uploads.
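To make the setup concrete, here is a minimal sketch of the kind of script we are running (the endpoint, file name, and content type are placeholders):

```javascript
import http from 'k6/http';

// open() runs in the init context; every VU ends up with its own
// in-memory copy of the 5 MB file, so memory grows with the VU count.
const data = open('./document.pdf', 'b'); // 'b' reads the file as binary

export default function () {
  http.post('https://example.com/upload', data, {
    headers: { 'Content-Type': 'application/octet-stream' },
  });
}
```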
I’m afraid you’ve indeed hit one of the main limitations of k6 at the moment. Your analysis is correct: when k6 works with files, especially binary ones, they are copied all over the place, mainly when they are used in HTTP requests.
The good news is that we are actively working on it, but it will take some time and patience. The two main topics we’re addressing at the moment are:
Providing a File API that limits how many copies of a file are held in memory. The basic idea is to supersede the current open function, which under the hood copies the file content into memory once per VU, and to offer an API that keeps memory consumption as low as possible. We’ve had promising results so far and expect to deliver it soonish; see the sketch after this list.
Providing a new HTTP API: as you have found out, a big part of the issue you’re encountering is due to the HTTP module copying and holding on to data whenever you use it in POST requests. We have been planning a new HTTP API which, among other things, would address that issue. We expect that project to take quite some time, though, and don’t have a clear release date in mind yet.
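To illustrate the direction of the File API work: the idea is to read files through a shared handle and a small buffer instead of keeping one full copy per VU. The module path and exact semantics may still change, so treat this strictly as a sketch:

```javascript
import { open, SeekMode } from 'k6/experimental/fs';

// A single file handle is shared across VUs, instead of one full
// in-memory copy of the file per VU as with the classic open().
let file;
(async function () {
  file = await open('./document.pdf');
})();

export default async function () {
  // Rewind to the beginning of the file at each iteration.
  await file.seek(0, SeekMode.Start);

  const info = await file.stat();

  // Read the file in small chunks rather than loading it whole.
  const buffer = new Uint8Array(4096);
  let totalBytesRead = 0;
  while (totalBytesRead < info.size) {
    const bytesRead = await file.read(buffer);
    if (bytesRead === null) break; // end of file
    totalBytesRead += bytesRead;
  }
}
```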
Now, in terms of workarounds, I’m afraid there aren’t many at the moment:
It is possible to encode and store binary data in Redis, for which we have a module. In theory, reading the file from there just in time for the request (in the VU function) should help reduce your memory usage; a sketch of what that could look like follows.
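Assuming the 5 MB file has been stored base64-encoded under a key ahead of the test (the key name and address below are placeholders), it could look roughly like this with the k6/experimental/redis module:

```javascript
import redis from 'k6/experimental/redis';
import { b64decode } from 'k6/encoding';
import http from 'k6/http';

// Placeholder address; the file is assumed to have been stored
// base64-encoded under the 'upload-document' key before the test.
const client = new redis.Client('redis://localhost:6379');

export default async function () {
  // Fetch and decode the file just in time for the request,
  // instead of holding a copy in memory for the whole test run.
  const encoded = await client.get('upload-document');
  const body = b64decode(encoded); // ArrayBuffer

  http.post('https://example.com/upload', body, {
    headers: { 'Content-Type': 'application/octet-stream' },
  });
}
```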
Another possibility I can think of, which might be worth experimenting with, would be to use the aws library to fetch the file content directly from S3, just in time; see the sketch below.
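A rough sketch of that approach (the bucket name, object key, and upload endpoint are placeholders, and the jslib version pin is illustrative):

```javascript
import http from 'k6/http';
import { AWSConfig, S3Client } from 'https://jslib.k6.io/aws/0.12.3/s3.js';

// Credentials are read from environment variables; the bucket and
// object key are placeholders for wherever the test file lives.
const awsConfig = new AWSConfig({
  region: __ENV.AWS_REGION,
  accessKeyId: __ENV.AWS_ACCESS_KEY_ID,
  secretAccessKey: __ENV.AWS_SECRET_ACCESS_KEY,
});
const s3 = new S3Client(awsConfig);

export default async function () {
  // Download the file just in time for the upload request.
  const obj = await s3.getObject('load-test-assets', 'document.pdf');

  http.post('https://example.com/upload', obj.data, {
    headers: { 'Content-Type': 'application/octet-stream' },
  });
}
```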
I’m not certain either of those would help much, considering the HTTP module will keep copying data around, but they should alleviate some of the memory usage.
Let me know if that answers your question and if I can help further.