*Resolved* - Promtail "runs away" until system lockup within seconds on a near-stock Debian Stable (bookworm) install

Hello everyone, first post. I am trying to build an IDS panel leveraging an on-prem (no, I won't pay for the cloud, ever) Grafana + Loki + Promtail + Snort3 stack.

I have all of these working EXCEPT Promtail, because it just loops and chokes itself to death.

The VM is hosted on Proxmox VE with 4 vCPUs and 24GB of RAM.

Version:

promtail, version 2.8.2 (branch: HEAD, revision: 9f809eda7)
  build user:       root@b7e9ca0bf6e0
  build date:       2023-05-03T11:13:57Z
  go version:       go1.20.4
  platform:         linux/amd64

The Issue:

  • Regardless of whether the VM has 8GB or 64GB of RAM, as soon as the promtail service starts, memory usage leaks/runs away until complete lockup (under 20 seconds)

What I have tried:

  • Different vCPU types (now using host CPU passthrough, an EPYC 7551P, which supports AVX2, to be safe)
  • Different RAM
  • Various bandwidth limits in the .yaml file (no change in the end result)

Current Configs:

  • Promtail .service file
[Unit]
Description=Promtail Service
After=network.target

[Service]
Type=simple
User=promtail
ExecStart=/opt/loki/promtail-linux-amd64 -config.file=/opt/loki/promtail-local-config.yaml

[Install]
WantedBy=multi-user.target
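
For anyone following along, these are the standard systemd commands I use to load and start the unit (assuming the file above is saved as /etc/systemd/system/promtail.service):

# Pick up the new/changed unit file
sudo systemctl daemon-reload

# Enable at boot and start immediately
sudo systemctl enable --now promtail.service

# Follow the service output live while reproducing the runaway
sudo journalctl -u promtail.service -f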
  • Current YAML file for Promtail
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push

limits_config:
  readline_rate_enabled: true
  readline_rate: 10
  readline_burst: 20



scrape_configs:
- job_name: system
  static_configs:
  - targets:
      - localhost
    labels:
      job: varlogs
      __path__: /var/log/*log
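
One sanity check worth running while Promtail is up (however briefly): the HTTP port from the server block above (9080) exposes Promtail's stock readiness and metrics endpoints, which show which files it is tailing and how much it has read from each. The endpoint and metric names below are standard Promtail, but double-check them against your version:

# Is Promtail up and ready?
curl -s http://localhost:9080/ready

# How many files is it tailing, and how many bytes has it read from each path?
curl -s http://localhost:9080/metrics | grep -E 'promtail_files_active_total|promtail_read_bytes_total'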

Just for reference:

  • Current Loki YAML file
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: 127.0.0.1
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093
  • All the other services
user@host:/opt/opensearch$ systemctl status snort3
● snort3.service - Snort Daemon
     Loaded: loaded (/etc/systemd/system/snort3.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-07-06 23:43:59 EDT; 43min ago
   Main PID: 636 (snort)
      Tasks: 2 (limit: 28769)
     Memory: 295.5M
        CPU: 33.325s
     CGroup: /system.slice/snort3.service
             └─636 /usr/local/bin/snort -c /usr/local/etc/snort/snort.lua -s 65535 -k none -l /var/log/snort -D -i ens18 -m 0x1b -u snort ->

user@host:/opt/opensearch$ systemctl status grafana-server.service 
● grafana-server.service - Grafana instance
     Loaded: loaded (/lib/systemd/system/grafana-server.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-07-06 23:44:02 EDT; 43min ago
       Docs: http://docs.grafana.org
   Main PID: 890 (grafana)
      Tasks: 20 (limit: 28769)
     Memory: 172.6M
        CPU: 5.336s
     CGroup: /system.slice/grafana-server.service
             └─890 /usr/share/grafana/bin/grafana server --config=/etc/grafana/grafana.ini --pidfile=/run/grafana/grafana-server.pid --pack>

user@host:/opt/opensearch$ systemctl status loki.service
● loki.service - Loki logging daemon
     Loaded: loaded (/etc/systemd/system/loki.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-07-06 23:43:59 EDT; 43min ago
   Main PID: 632 (loki-linux-amd6)
      Tasks: 9 (limit: 28769)
     Memory: 93.5M
        CPU: 5.291s
     CGroup: /system.slice/loki.service
             └─632 /opt/loki/loki-linux-amd64 -config.file=/opt/loki/loki-local-config.yaml

Thank you for your time,
Scott

RESOLVED!!

There was a corrupt system log file in /var/log

I ran ls -lshat /var/log and noticed that /var/log/lastlog was 531 GIGABYTES.

For a system with less than 100GB of storage, that seemed impossible. (It turns out lastlog is a sparse file indexed by UID, so its apparent size can vastly exceed the space actually used on disk, but Promtail still tries to read all 531GB of it.)
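
If you want to check for the same thing on your box, comparing a file's apparent size against its actual allocated blocks makes a sparse file obvious (GNU coreutils du has an --apparent-size flag):

# Apparent size, i.e. what a reader like Promtail tries to chew through
du -h --apparent-size /var/log/lastlog

# Actual blocks allocated on disk
du -h /var/log/lastlog

# Or list the whole directory like I did; with -s, ls prints the
# allocated size next to the apparent size for each file
ls -lshat /var/log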

I cleared the log file by running (as root) >/var/log/lastlog, and that emptied it out. Be aware that this file tracks who has signed into the machine, so there may be operational consequences for you if you remove or truncate it.

I restarted the Promtail service, and this time CPU and RAM usage didn't run away and lock the VM up.
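
To keep Promtail away from lastlog (and the other binary utmp-style files under /var/log) for good, the scrape config can exclude it. Promtail has supported a __path_exclude__ label since 2.3, so it should work on 2.8.2, but verify against the docs for your exact version; a sketch:

scrape_configs:
- job_name: system
  static_configs:
  - targets:
      - localhost
    labels:
      job: varlogs
      __path__: /var/log/*log
      # lastlog is a binary, potentially huge sparse file, not a line-oriented log
      __path_exclude__: /var/log/lastlog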
