Hi @kai,
recently I was solving a problem similar to yours: using Promtail on Docker Swarm with Docker metadata. At first I tried the Loki logging plugin, but I soon ran into problems. If Loki is unreachable, the logging plugin keeps trying to send logs to Loki and, depending on your configuration, one of two things can happen:
- The plugin will drop logs after X retries
- The plugin will keep trying forever
In both cases I could not see the logs in docker logs during the period when the plugin was trying to send them, and I was not even able to kill the container. This is a known issue of the Loki plugin:
The driver keeps all logs in memory and will drop log entries if Loki is not reachable and if the quantity of max_retries has been exceeded. To avoid the dropping of log entries, setting max_retries to zero allows unlimited retries; the driver will continue trying forever until Loki is again reachable. Trying forever may have undesired consequences, because the Docker daemon will wait for the Loki driver to process all logs of a container, until the container is removed. Thus, the Docker daemon might wait forever if the container is stuck.
In my opinion, the best solution is to use Promtail with Docker service discovery. It can read metadata from Docker such as the container name, ID, network, labels, etc. To use it, you have to add relabel_configs. An example from the docs:
scrape_configs:
  - job_name: flog_scrape
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
        filters:
          - name: name
            values: [flog]
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
Be careful with this configuration: the container label, even with a single app, can contain a lot of unique values because of the hashes Docker appends to container names, for example:
- nginx_nginx.1.0zwo879s92fk38o07a65uzolc - replicated deployment
- minio_minio.wy6c6pk1arod6vbqeem8iyosj.t9mdy2hgnv2upqwmrh0fo2c9p - global deployment
This is not recommended according to the Grafana Loki label best practices. I fixed this with regex relabel configs that extract the name of the container (nginx_nginx or minio_minio in the example above) and the replica number (if it exists):
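scrape_configs:
  - job_name: flog_scrape  # same job as in the example above; docker_sd_configs omitted here
    relabel_configs:
      # Replicated service with a single-digit replica: <name>.<replica>.<task id>
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)\.[0-9]\..*'
        target_label: 'name'
      # Any service: <name>.<replica or node id>.<task id>
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)\.[0-9a-z]*\..*'
        target_label: 'name'
      # Replica number, set only for replicated services
      - source_labels: ['__meta_docker_container_name']
        regex: '/.*\.([0-9]{1,2})\..*'
        target_label: 'replica'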
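
To check what these rules produce before deploying them, here is a rough Python sketch that approximates the default replace action (the regex must match the whole value, and the target label is set to the first capture group). The relabel function and the RULES list are my own illustration and not part of Promtail, and I am assuming Docker reports the names with a leading "/", as the regexes above expect.

```python
import re

# (regex, target_label) pairs copied from the relabel_configs above.
RULES = [
    (r"/(.*)\.[0-9]\..*", "name"),
    (r"/(.*)\.[0-9a-z]*\..*", "name"),
    (r"/.*\.([0-9]{1,2})\..*", "replica"),
]


def relabel(container_name):
    """Approximate the default 'replace' action for a single source label.

    The regex has to match the whole value; on a match the target label
    is set to capture group 1, otherwise the label is left untouched.
    """
    labels = {}
    for pattern, target in RULES:
        match = re.fullmatch(pattern, container_name)
        if match:
            labels[target] = match.group(1)
    return labels


if __name__ == "__main__":
    # Container names from the examples above, with the assumed leading "/".
    for name in (
        "/nginx_nginx.1.0zwo879s92fk38o07a65uzolc",
        "/minio_minio.wy6c6pk1arod6vbqeem8iyosj.t9mdy2hgnv2upqwmrh0fo2c9p",
    ):
        print(name, "->", relabel(name))
```

With these two names, the replicated container gets both name and replica, while the global one gets only name, because its middle segment is a node ID and not a number. The second name rule also covers replicas with two digits, which the first rule alone would miss.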