Accumulate ms to seconds in K6 detailed logs

Hello,

We want detailed logs in seconds - it helps us analyze RPS across a period of time and compare it across a bunch of interconnected services. Here’s what we did: we built a collector that streams data into a high-throughput system. The samples that we get in the collector look like this:

{"type":"Point","data":{"time":"**2021-03-05T18:47:13.059652-08:00**","value":0,"tags":{"group":"","method":"GET","name":"https://test-api.k6.io/","proto":"HTTP/2.0","status":"200","tls_version":"tls1.2","url":"https://test-api.k6.io/"}},"metric":"http_req_blocked"}
{"type":"Point","data":{"time":"**2021-03-05T18:47:13.059652-08:00**","value":0,"tags":{"group":"","method":"GET","name":"https://test-api.k6.io/","proto":"HTTP/2.0","status":"200","tls_version":"tls1.2","url":"https://test-api.k6.io/"}},"metric":"http_req_connecting"

For a specific metric, say “http_reqs”, what’s the best way to accumulate the samples to seconds? Say I had Point JSON lines and 2 entries for http_reqs like this:

{"type":"Point","data":{"time":"**2021-03-05T18:47:13.059652-08:00**",**"value":20,**"metric":"**http_reqs**"}
{"type":"Point","data":{"time":"**2021-03-05T18:47:13.059660-08:00**",**"value":23**,"metric":"**http_reqs**"}

As you can see in the example above, there are 2 lines for http_reqs within the same second (differing only at the millisecond level) -
2021-03-05T18:47:13 - millisecond A
2021-03-05T18:47:13 - millisecond B

Let us assume 2 different instances are running k6 and they hit the endpoint at 2 different rates at the same time, and there is only a small millisecond difference between their samples, or no difference at all.

I want to accumulate this data and say that at this second, 2021-03-05T18:47:13, the total number of http_reqs is 20 + 23 = 43.
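In Go pseudocode, the kind of accumulation we are after would look roughly like this (the struct, field and function names are just illustrative, not actual k6 code): truncate each sample’s timestamp to the second and sum the values per bucket.

```go
package main

import (
	"fmt"
	"time"
)

// point mirrors the fields we care about from one k6 JSON "Point" line
// (struct and field names are made up for this example).
type point struct {
	Metric string
	Time   time.Time
	Value  float64
}

// accumulatePerSecond sums the values of one metric per wall-clock second,
// keyed by the timestamp truncated to second precision.
func accumulatePerSecond(points []point, metric string) map[time.Time]float64 {
	perSecond := make(map[time.Time]float64)
	for _, p := range points {
		if p.Metric != metric {
			continue
		}
		perSecond[p.Time.Truncate(time.Second)] += p.Value
	}
	return perSecond
}

func main() {
	t1, _ := time.Parse(time.RFC3339Nano, "2021-03-05T18:47:13.059652-08:00")
	t2, _ := time.Parse(time.RFC3339Nano, "2021-03-05T18:47:13.059660-08:00")
	samples := []point{
		{Metric: "http_reqs", Time: t1, Value: 20},
		{Metric: "http_reqs", Time: t2, Value: 23},
	}
	for sec, total := range accumulatePerSecond(samples, "http_reqs") {
		fmt.Println(sec.Format(time.RFC3339), total) // 2021-03-05T18:47:13-08:00 43
	}
}
```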

Where is the best place to do this in the collector? Is there anything k6 can do to help us get second-wise samples?

End goal: for a test of, say, 10 minutes, where the first 5 minutes ramp up to 1000 RPS and the next 5 minutes hold 1000 RPS, we want to see that trend with timestamps from k6 through its logs. This has been very beneficial for us.

Currently we are writing the detailed log as JSON output - but when we run large tests this JSON is on the order of 25GB and above, and k6 does not support a log rotation mechanism.

Please suggest what’s the best approach.

Hi @Priya,

What you are describing is aggregation, which unfortunately is only partially supported, and only for the cloud output - HTTP metrics are aggregated there. There is an issue to expand it to support general Counter, Gauge and Rate metrics, and in general the idea is to move it out of the cloud output so it is actually usable by other outputs.

Unfortunately this has some problems(in no particular order):

  1. Trend metrics (of which k6 emits 7 per HTTP request) are harder to aggregate, both because of what they are and because the output needs to have a way to write an aggregated trend … somehow. Which is why only the cloud output has HTTP metric aggregation, and moving it to all outputs will probably be … hard.
  2. From our experience with the cloud output, the grouping of metrics before you aggregate them (for Trend metrics, the grouping is almost all of the aggregation) is currently dominated by the unfortunately very CPU-heavy tag set comparison. In the case of the HTTP metrics aggregated in the cloud, we could optimize it heavily, as we know a subset of the tags that are both always present and very likely to be the differing ones. This is not true in general, though, which means that just adding aggregation will likely have very bad performance implications, especially in cases where you have multiple scenarios, groups and other sources of high-level tags. We have some ideas on how to address this, but until that is addressed … aggregation for every output is unlikely to happen.

So currently my suggestion is to add .gz to the end of the JSON filename so k6 gzips it on the fly, which in general compresses … quite well, and then use something to aggregate it on your end outside of k6. If you do that, please share it, as we have had people requesting something similar in different forms, and it will be nice to be able to just link to something working next time ;).
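For example, something like `k6 run --out json=test_results.json.gz script.js` should write the gzipped JSON directly (the file and script names here are just placeholders).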

Yes, we already have the JSON zipped (not via k6, but separately), and we put it through a data warehouse system in the cloud and are able to see the publish rate - this is super useful. The biggest disadvantage of this is that we cannot run tests longer than 10 minutes, because the trend logs get really big.
So we implemented a streaming collector that can stream these logs into an Event Hub - but that streaming is quite slow. We analyzed it and it uses less than 20% CPU, so we are wondering whether there is a way to optimize the collector: https://github.com/HadwaAbdelhalem/k6/tree/feature/adding-event-hubs-output-plugin/stats/eventhubs

Can you give us some tips to optimize it? For starters, we only wanted the http_reqs logs, so we filtered for those.

Also, we noticed that the collector keeps running after k6 completes its testing - so after k6 completes 100% of its load generation, there is a pause where the collector is still busy streaming, and this is quite slow. How can we improve the performance of the collector?

As for the JSON stuff, I can write a personal blog post and share it with you here for your review :slight_smile: - the publish rate trend from k6 has been quite beneficial for us.

I am mostly pointing out that gzipping through k6 does actually perform better, as k6 needs to do smaller writes to disk, and the compression was somewhere around 20x if I remember correctly, which means that 35GB will become around 2GB :wink:

From a quick look at the code:

  1. you should probably move to output extensions when we release v0.31.0 … this will likely help you with development. Hopefully we will release this week or early next week, with some documentation on how output extensions should be made, but you can continue as-is for now.
  2. If all you need is http_reqs and http_req_duration, you can keep only the httpext.Trail samples in Collect, akin to how they are filtered in the cloud output (where they are actually aggregated; also note that the refactoring renamed Collect to AddMetricSamples). See the sketch after this list.
  3. http_reqs is literally 1 for each http_req_duration, so I would argue you can drop it as well :wink:
  4. you also wrap all samples before checking if you are going to emit them :wink:
  5. logging on the hot path for each sample is also not a good idea (even in debug)
  6. send in batches - this will likely help a lot more than anything else (also sketched after this list) :wink:
  7. I would recommend going with easyjson, as it really does help a lot, but our internal JSON output will probably need that treatment as well :thinking:, so maybe leave that as the last thing :wink:, since if you only emit one metric, this should be a pretty small bonus.
  8. Do you mean that as it stops, it keeps streaming for quite some time? I would recommend you just start logging one line on how long the metric marshaling and sending takes. Maybe the ticker doesn’t tick periodically and you only push at the end … but it doesn’t seem like it :thinking:
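To make points 2 and 6 a bit more concrete, here is a rough sketch, not a drop-in implementation: it assumes the pre-v0.31 Collector interface where Collect receives stats.SampleContainer values (later renamed to AddMetricSamples) and that httpext.Trail is the container carrying the HTTP metrics, as in the cloud output. The Collector fields and the sendBatch helper are hypothetical placeholders for your actual struct and the Event Hubs SDK call.

```go
package eventhubs

import (
	"sync"

	"github.com/loadimpact/k6/lib/netext/httpext"
	"github.com/loadimpact/k6/stats"
	"github.com/sirupsen/logrus"
)

// Collector is a trimmed-down illustration; the real output has more fields
// (config, Event Hubs client, etc.).
type Collector struct {
	bufferLock sync.Mutex
	buffer     []*httpext.Trail
	logger     logrus.FieldLogger
}

// Collect keeps only HTTP request trails (point 2); all other sample
// containers are dropped, since only http_reqs/http_req_duration are needed.
func (c *Collector) Collect(sampleContainers []stats.SampleContainer) {
	c.bufferLock.Lock()
	defer c.bufferLock.Unlock()
	for _, sc := range sampleContainers {
		if trail, ok := sc.(*httpext.Trail); ok {
			c.buffer = append(c.buffer, trail)
		}
	}
}

// flush is meant to be called from the collector's ticker loop: it drains
// the buffer and sends everything as one batch (point 6) instead of one
// request per sample.
func (c *Collector) flush() {
	c.bufferLock.Lock()
	batch := c.buffer
	c.buffer = nil
	c.bufferLock.Unlock()

	if len(batch) == 0 {
		return
	}
	if err := c.sendBatch(batch); err != nil { // hypothetical helper
		c.logger.WithError(err).Error("eventhubs: failed to send batch")
	}
}

// sendBatch is a placeholder for the actual Event Hubs SDK call; in the
// real output it would marshal the batch and send it as a single event.
func (c *Collector) sendBatch(batch []*httpext.Trail) error {
	// ... marshal the batch (e.g. with easyjson, point 7) and send it here ...
	return nil
}
```

The important part is that the hot path (Collect) only appends to a buffer, and the actual network calls happen in bulk from the periodic flush.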

As for the JSON stuff, I can write a personal blog post and share it with you here for your review :slight_smile: - the publish rate trend from k6 has been quite beneficial for us.

I would really like that :smiley:

Thank you for all of the above tips - we applied many of them. We also looked into batched sends, and some of the SDKs on our end need improvement there. In the end we achieved very fast, almost immediate streaming - if you check the latest code at that link, what worked for us was parallelizing the sending of the publish rate and changing the ticker to seconds.
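Roughly, the shape of what ended up working for us looks like this - a simplified sketch rather than the exact code from the branch, reusing the same Collector type as in the earlier sketch (context and time come from the standard library; drainBuffer and pushToEventHub are stand-ins for our helpers around the buffer and the Event Hubs SDK):

```go
// Simplified sketch of the approach described above: a 1-second ticker
// drains the buffer, and the actual marshaling/sending to Event Hubs runs
// in its own goroutine so it never blocks the next tick.
func (c *Collector) Run(ctx context.Context) {
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			batch := c.drainBuffer() // stand-in: returns and resets the buffered samples
			if len(batch) == 0 {
				continue
			}
			go func(batch []*httpext.Trail) {
				if err := c.pushToEventHub(ctx, batch); err != nil {
					c.logger.WithError(err).Error("eventhubs: push failed")
				}
			}(batch)
		case <-ctx.Done():
			return
		}
	}
}
```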

So we now have a collector that streams almost immediately all the k6 http_reqs and http_req_duration logs, and we are able to see them in Event Hubs and ingest and aggregate them, etc. I will separately try the gz compression, but with this collector we eliminate any kind of disk space constraints.

Can you please let us know what is required from us to get this PR merged into k6? If accepted, Event Hubs will also become a collector/sink alongside Kafka etc., and I believe this would be super useful. Can you please let us know? :slight_smile:

Hi @Priya, sorry for the late reply :frowning:

I looked at the latest code and IMO this will generate a lot more goroutines than if you went the batch route. But if it works for you, you can leave it as is :wink:

Can you please let us know what is required from us to get this PR merged into k6?

There is already a PR to add Event Hubs support. But given that we now have output extension support (which still needs documentation), the requirements for merging are now … even higher, IMO. And we are currently not certain Event Hubs will ever be merged into the k6 code; given the existence of output extensions, it will certainly need to live as one for a while first.

Your code in particular has the problems I listed above, which will probably need to be fixed before we can even decide to merge it, and it also specifically drops all but a very few metrics, which is another issue.

Given both points above, I don’t think it would be a good use of your time to try to get it into k6 core at the moment. Making it into an output extension will definitely help with its maintainability, and I will link you to the documentation/blog post when it is done.