Hello,
I’m evaluating Grafana Tempo as our tracing solution. So far I’ve installed it in distributed mode and am experimenting with full backend search.
I ran the same search query over a 2-hour time range and tried adjusting some parameters.
The api_search latency on the query_frontend and querier is quite high:
query_frontend:
querier:
At the same time, backend latency is much lower:
I tried increasing the number of queriers (from 2 to 3), but it didn’t help at all.
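A rough back-of-envelope (my own sketch, not Tempo's actual scheduler) of why one extra querier barely moves the needle: with `max_concurrent_queries: 10`, total parallelism only goes from 20 to 30, while the frontend can keep up to `concurrent_jobs: 300` search jobs outstanding:

```python
# Sketch of backend-search fan-out: the query-frontend splits a search into
# jobs, and the queriers drain them in parallel, so wall time scales roughly
# with jobs / total_parallelism. The 0.5s per-job latency is a made-up figure.

def estimated_search_seconds(jobs, queriers, concurrent_per_querier, per_job_seconds):
    """Rough lower bound on search wall time for a fixed job count."""
    parallelism = queriers * concurrent_per_querier
    # Ceiling division: how many sequential "waves" of jobs are needed.
    waves = -(-jobs // parallelism)
    return waves * per_job_seconds

print(estimated_search_seconds(300, 2, 10, 0.5))  # → 7.5
print(estimated_search_seconds(300, 3, 10, 0.5))  # → 5.0
```

Under these assumptions, adding a third querier cuts the estimate by only a third; raising per-querier concurrency would matter just as much.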
I also played with the configuration following the recommendations here: https://github.com/grafana/tempo/blob/main/docs/tempo/website/operations/backend_search.md, but with zero improvement.
Can someone advise where to look to reduce the search time?
Here is a diff of my config:
GET /status/config

```yaml
---
compactor:
  compaction:
    block_retention: 168h0m0s
    max_block_bytes: 3221225472
  ring:
    kvstore:
      store: memberlist
distributor:
  receivers:
    kafka:
      auth:
        tls:
          ca_file: /tmp/ca.crt
          insecure: true
      brokers: kafka-cluster-kafka-bootstrap.kafka.svc.cluster.local:9093
      client_id: tempo-ingester
      encoding: otlp_proto
      group_id: tempo-ingester
      message_marking:
        after: true
        on_error: false
      protocol_version: 2.8.0
      topic: otlp-tracing
ingester:
  lifecycler:
    readiness_check_ring_health: false
    tokens_file_path: /var/tempo/tokens.json
memberlist:
  abort_if_cluster_join_fails: false
  dead_node_reclaim_time: 10s
  join_members:
  - grafana-tempo-tempo-distributed-gossip-ring
overrides:
  per_tenant_override_config: /conf/overrides.yaml
querier:
  frontend_worker:
    frontend_address: grafana-tempo-tempo-distributed-query-frontend-discovery:9095
    parallelism: 10
  max_concurrent_queries: 10
  search_query_timeout: 1m30s
query_frontend:
  query_shards: 30
  search:
    concurrent_jobs: 300
    max_duration: 2h0m0s
search_enabled: true
server:
  http_listen_port: 3100
  http_server_read_timeout: 2m0s
storage:
  trace:
    backend: s3
    cache: memcached
    local:
      path: /var/tempo/traces
    memcached:
      addresses: dns+memcached:11211
      circuit_breaker_consecutive_failures: 0
      circuit_breaker_interval: 0s
      circuit_breaker_timeout: 0s
      consistent_hash: true
      host: ""
      max_idle_conns: 16
      max_item_size: 0
      service: ""
      timeout: 1s
      ttl: 0s
      update_interval: 1m0s
    pool:
      max_workers: 100
      queue_depth: 10000
    s3:
      bucket: kubernetes-tracing-us-east-1
      endpoint: s3.amazonaws.com
    search:
      prefetch_trace_count: 20000
    wal:
      blocksfilepath: /var/tempo/wal/blocks
      completedfilepath: /var/tempo/wal/completed
target: querier
```
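For anyone comparing: as I understand the backend_search doc, these are the knobs that govern search parallelism, pulled together in one fragment. The bumped values are illustrative guesses of what I'd try next, not tested recommendations:

```yaml
querier:
  max_concurrent_queries: 20      # was 10; jobs each querier pulls concurrently
  frontend_worker:
    parallelism: 20               # was 10; worker streams to the query-frontend
query_frontend:
  search:
    concurrent_jobs: 500          # was 300; jobs the frontend keeps in flight
storage:
  trace:
    pool:
      max_workers: 200            # was 100; per-querier backend read workers
      queue_depth: 10000
```

My understanding is that these have to move together: raising `concurrent_jobs` alone just deepens the queue if the queriers can't pull jobs any faster.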