Hello! I can’t query traces from the Tempo datasource (storage is configured as an S3 bucket). Traces are available only for 48h.
Why can’t Tempo query my traces from the S3 bucket? In our production infrastructure, we need to be able to query traces for 30 days.
Should I set block_retention: 30 days to solve the problem?
Compactor config:
Hi, yes, block_retention controls how long blocks are kept in the S3 bucket. The compactor is responsible for enforcing it. Set block_retention: 720h for 30 days of retention. See the Tempo documentation for details.
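For anyone landing here later, a minimal sketch of where this setting usually sits, assuming the standard `compactor.compaction` layout of the Tempo config file (your file may contain other keys around it). The value is written in hours because Go-style durations have no day unit, so 30 days becomes 720h.

```yaml
# Sketch only: merge into your existing Tempo config rather than replacing it.
compactor:
  compaction:
    # Keep blocks in object storage for 30 days (30 * 24h = 720h).
    block_retention: 720h
```

Note that raising retention only affects blocks that still exist; blocks already deleted under the old 48h setting cannot be recovered.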
@mdisibio Hello! Could you please help me understand this metric: tempo_distributor_ingester_append_failures_total (panel "Failed batch sent to ingesters")?
I couldn’t figure out what kind of failures this metric counts.
I see it every time we send traces to Tempo. I also checked the metrics tempo_receiver_refused_spans and tempo_discarded_spans_total, and both are 0.
Hi, the metric tempo_distributor_ingester_append_failures_total means the distributor component had trouble forwarding traffic to the ingesters. More detail will be in the distributor logs, possibly the error "pusher failed to consume trace data". Based on your screenshot, it looks like some traffic was OK, because the bottom-left panel "Ingester Traces Created" has data.
compacted_block_retention configures how long compacted blocks are kept in storage before deletion. When the compactor compacts blocks, it doesn’t delete them right away, but marks them as compacted. Compacted blocks are deleted afterwards asynchronously.
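As a rough illustration of how the two retention settings sit side by side, here is a sketch assuming the same `compactor.compaction` section of the Tempo config. The values are examples only, not recommendations; check the defaults for your Tempo version before changing them.

```yaml
# Sketch only: illustrative values, assuming the compactor.compaction layout.
compactor:
  compaction:
    # How long regular blocks live in object storage before deletion.
    block_retention: 720h
    # How long blocks already marked as compacted are kept before
    # the asynchronous cleanup removes them.
    compacted_block_retention: 1h
```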
@mariorodriguez, thanks a lot for the reply. It is helpful to understand that there is a separate garbage collection step for this. But why keep a retention period on these blocks? Is it to protect the data in case of transient failures or crashes?
Mainly to help with block list maintenance. Once a new block is created by compacting others, it can take a while for all queriers to find it and add it to their block lists. Not deleting compacted blocks right away allows queriers to fall back to them and adds resilience to the read path.