Tempo Distributed Helm chart: increasing counter for "tempo_distributor_ingester_append_failures_total" metric

Hey Team,

I'm new to Tempo. I tried installing it but got stuck at this…

We need help debugging why the counter for the “tempo_distributor_ingester_append_failures_total” metric keeps increasing.

We looked through the Tempo ingester and distributor logs, but we only saw the following messages:
level=warn ts=2023-07-19T20:15:30.129704983Z caller=tcp_transport.go:252 component="memberlist TCPTransport" msg="failed to read message type" err=EOF remote=

The ratio of failed appends to total appends sent is around 20%.
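
As a rough illustration of how a ratio like this can be derived, here is a minimal Python sketch that takes two samples of a total-appends counter and two samples of the failures counter and divides the increases. The sample values are made up, the function name `failure_ratio` is hypothetical, and the assumption is that the distributor exposes a matching total counter (likely `tempo_distributor_ingester_appends_total`) alongside the failures counter.

```python
def failure_ratio(appends_start, appends_end, failures_start, failures_end):
    """Fraction of appends that failed between two counter samples."""
    sent = appends_end - appends_start        # appends sent in the window
    failed = failures_end - failures_start    # appends that failed in the window
    if sent <= 0:
        # Nothing was sent in the window, or the counter was reset.
        return 0.0
    return failed / sent


# Made-up sample values, for illustration only.
ratio = failure_ratio(
    appends_start=1000, appends_end=2000,
    failures_start=100, failures_end=300,
)
print(f"{ratio:.0%}")  # prints "20%"
```

Working from increases over a window rather than the raw counter values matters here, because both counters only ever grow and the raw totals include everything since the process started.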

I could not find, via Google or in the Grafana docs, what this metric means or how we can dig deeper to figure out what is going wrong.

We are currently using the tempo-distributed Helm chart (version 2.1.1) to deploy Tempo to our EKS cluster.

Logs for distributor and metrics:

We did find something about overrides and tried that too:

Overrides as per /status/config page:

Overrides configured in helm
