- What Grafana version and what operating system are you using?
The standard images from docker.io:
Grafana 9.5.2
Grafana Loki 2.8.0
Grafana Mimir 2.8.0
Host is Fedora Server 38. The images are being run under podman.
Linux services02 6.2.15-300.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 11 17:37:39 UTC 2023 x86_64 GNU/Linux
- What are you trying to achieve?
I’m trying to display the rate of various metrics. This works when the data source is Prometheus,
but does not work when the data source is Mimir.
- How are you trying to achieve it?
Using the standard queries suggested by the Time Series panel. For example, here’s a simple
query picking a counter metric from Prometheus at random:
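The query itself is just a bare counter selector, roughly like this (the metric name here is only a stand-in for whatever counter I happened to pick):

# Plain counter selector against the Prometheus data source.
prometheus_http_requests_total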
And here’s that same query wrapped in rate():
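That is, the same selector passed through rate(), along these lines (using the range Grafana suggests):

# The same counter wrapped in rate(), as suggested by the panel.
rate(prometheus_http_requests_total[$__rate_interval])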
The resulting graph is as expected.
Now, here’s a counter metric picked at random from a Mimir instance that’s exposing a Prometheus
API:
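The shape of the query is the same, just pointed at the Mimir data source; the metric name below is illustrative, standing in for the real counter from the instrumented application:

# Counter written into Mimir via the OTel Collector (illustrative name).
http_server_requests_total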
The resulting graph already looks kind of weird to me, to be honest. The metric in question comes from an HTTP server under a mild load test, so the total request count is increasing steadily at somewhere between 1 and 10 requests per second. The graph instead shows an odd shelf effect that I’m sure is not present in the actual data.
Wrapping this in a rate() fails completely:
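That is, the same kind of rate() expression that works fine against Prometheus (again with an illustrative metric name):

# rate() over the Mimir-backed counter with the suggested range vector.
rate(http_server_requests_total[$__rate_interval])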
It only really starts to work when I increase the range vector to 120s:
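In other words, only an explicit, wide range vector gives a continuous graph, roughly:

# Only a 120s (or larger) range vector produces a usable rate graph.
rate(http_server_requests_total[120s])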
Going down to 110s displays odd gaps in the data. I would display an image but I’ve hit the limit for embedded media in posts.
- What did you expect to happen?
I expect rate queries to work as well as they do with a real Prometheus instance.
- Can you copy/paste the configuration(s) that you are having problems with?
In this configuration, I’m using a Java application manually instrumented with
the OpenTelemetry Java SDK. I’m using the OpenTelemetry Collector to write metrics
to Mimir.
Otel collector:
#----------------------------------------------------------------------
# Receivers.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

#----------------------------------------------------------------------
# Processors.
processors:
  batch:
    send_batch_max_size: 10000
    timeout: 0s
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 10

#----------------------------------------------------------------------
# Exporters.
exporters:
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
  logging:
    verbosity: basic

#----------------------------------------------------------------------
# The service pipelines connecting receivers -> processors -> exporters.
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [logging, loki]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [logging]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [logging, prometheusremotewrite]
Mimir:
multitenancy_enabled: false

blocks_storage:
  backend: filesystem
  bucket_store:
    sync_dir: /mimir/data/tsdb-sync
  filesystem:
    dir: /mimir/data/tsdb/fs
  tsdb:
    dir: /mimir/data/tsdb/db

compactor:
  data_dir: /mimir/data/compactor
  sharding_ring:
    kvstore:
      store: memberlist

distributor:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: memberlist

ingester:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: memberlist
    replication_factor: 1

ruler:
  rule_path: /tmp/ruler

ruler_storage:
  backend: filesystem
  filesystem:
    dir: /mimir/data/rules

server:
  http_listen_port: 9009
  log_level: error

store_gateway:
  sharding_ring:
    replication_factor: 1

activity_tracker:
  filepath: "/tmp/metrics-activity.log"
For Grafana itself, I’m using an empty grafana.ini. All servers are running in podman containers on the same physical host.
- Did you receive any errors in the Grafana UI or in related logs? If so, please tell us exactly what they were.
None that I could see.
- Did you follow any online instructions? If so, what is the URL?
Only the official Grafana documentation for the setup of the various containers.