Hello,
(I posted an earlier version of this question to the Loki mailing list about a week ago. No one replied so I am trying here with a v2 of the question. Please let me know if anything is unclear.)
I’ve been playing around with the Loki logging driver for Docker. One thing I’ve noticed is that the driver by default adds the labels container_name and filename to each log entry. Both of these labels have a high cardinality and are thus not suitable to be used for indexing.
Thus my line of thought is that they need to be dropped in the pipeline stages before sending the log entry to Loki. However, the container_name label is useful to have so I would like to preserve it. There doesn’t seem to be support for annotation/metadata/non-indexable fields so the only way I can see that happening today is by rewriting the log message.
If I go with rewriting the message, I can drop the labels and add container_name to the message with the following tidbit of logging driver configuration:
loki-pipeline-stages: |
  - template:
      source: output
      template: '{{ .Entry }} container_name={{ .container_name }}'
  - labeldrop:
      - container_name
      - filename
  - output:
      source: output
If .Entry is a typical Golang log line such as the line below, then it can be parsed using the logfmt LogQL parser.
level=info ts=2021-01-05T08:23:07.811Z caller=main.go:429 msg=Listening address=:9093
However, that doesn’t work if .Entry is in JSON format, since the appended text would make the line invalid JSON and ruin the chance of using the json LogQL parser for label extraction. I’m thinking that in this case the container_name label should instead be added as an additional field to the JSON log line. I’m not sure how to do that with a Golang template, or whether it really should be done like that.
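To make the JSON idea concrete, here is a minimal sketch of one possible template, run as a standalone Go program with text/template rather than inside Loki. It strips the closing brace of the entry and appends container_name as a final field. The names jsonTmpl, render and addContainerName are hypothetical, and the TrimSuffix helper is registered locally from strings.TrimSuffix; whether the driver's template stage exposes a helper with the same name and argument order is an assumption to check against the Loki documentation. The sketch also assumes the entry is a single non-empty JSON object and that the container name needs no JSON escaping.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// jsonTmpl (hypothetical) drops the trailing "}" of the entry and appends
// container_name as one more field before closing the object again.
const jsonTmpl = `{{ TrimSuffix .Entry "}" }},"container_name":"{{ .container_name }}"}`

// render executes tmplText against the two values used in the post's
// configuration, .Entry and .container_name.
func render(tmplText, entry, containerName string) (string, error) {
	// Assumption: TrimSuffix is registered here by hand for this sketch.
	funcs := template.FuncMap{"TrimSuffix": strings.TrimSuffix}
	t, err := template.New("line").Funcs(funcs).Parse(tmplText)
	if err != nil {
		return "", err
	}
	data := map[string]string{"Entry": entry, "container_name": containerName}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// addContainerName renders jsonTmpl and rejects results that are no longer
// valid JSON, for example when the entry was "{}" or not JSON at all.
func addContainerName(entry, containerName string) (string, error) {
	out, err := render(jsonTmpl, entry, containerName)
	if err != nil {
		return "", err
	}
	if !json.Valid([]byte(out)) {
		return "", fmt.Errorf("result is not valid JSON: %s", out)
	}
	return out, nil
}

func main() {
	out, err := addContainerName(`{"level":"info","msg":"Listening"}`, "web-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The validity check is only there to show where the string-based approach breaks down: an empty object, trailing whitespace after the closing brace, or a plain-text line all produce output that the json parser would reject, so a pipeline using this idea would still need a way to tell JSON entries from logfmt ones.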
Am I on the wrong path in my line of thinking?
Can this issue of preserving additional labels without paying the penalty of indexing them be achieved today in a good, generic way?
Would it make sense for Loki to have a concept of non-indexable fields, i.e. fields that are not available to the log stream selector but are available to the rest of the log pipeline?