I’m using Grafana v9.2.5 and have set up some alerting rules via Prometheus. This actually works great, and I receive alerts as expected.
Some of the alerts I want to “acknowledge” once I have taken a look at them. So what I am doing is creating a specific silence for the alert.
E.g. I have a session with a faulty value, so I receive an alert carrying the specific session id that triggered it. I then go to the Silences page in Grafana and set up a silence, with the label matcher being the affected session id.
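(As far as I understand, this should be roughly equivalent to creating the silence directly against the Alertmanager v2 API. A sketch of what I mean is below; the label name session_id, its value, and the URL are just placeholders for my actual setup.)

```python
# Sketch only: create a silence via the Alertmanager v2 API instead of the
# Grafana Silences page. URL and label name/value are placeholders.
from datetime import datetime, timedelta, timezone

import requests

ALERTMANAGER_URL = "https://alertmanager.example.com:9093"  # placeholder

now = datetime.now(timezone.utc)
silence = {
    # Match only the alert instance for the affected session
    "matchers": [{"name": "session_id", "value": "abc123", "isRegex": False}],
    "startsAt": now.isoformat(),
    "endsAt": (now + timedelta(hours=24)).isoformat(),
    "createdBy": "jf0",
    "comment": "Acknowledged: session had a faulty value",
}

resp = requests.post(f"{ALERTMANAGER_URL}/api/v2/silences", json=silence, timeout=10)
resp.raise_for_status()
print(resp.json())  # contains the ID of the new silence
```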
In “Affected alert instances” I can see that the alert was apparently found, so I press submit. So far so good. (For “Choose Alertmanager” I have chosen “Alertmanager”.)
Unfortunately, this is where my trouble starts. The list of active silences has now been extended with my new silence, but its number of alerts is 0 and not 1 as just shown in the affected instances preview. As a consequence, I get notified about the alert over and over again.
Does anybody know what is going wrong here and how I can fix it?
Thanks for the support.
Hi @jf0! Is this alert rule Grafana Managed or Prometheus managed?
The silences need to be created on the alertmanager that will be receiving the alerts from your rules. If it’s a Grafana Managed alert rule, then this will be the built-in “Grafana” alertmanager by default. If it’s a Prometheus (or Mimir / Cortex / Loki) managed alert rule then it will be whichever alertmanager is configured on that datasource.
Currently, the silence preview can be a bit misleading. Just because it shows up in “Affected alert instances” does not guarantee that you’ve selected the correct alertmanager to create the silence on. It simply matches against all of your alerts and assumes you’ve chosen the alertmanager that will be receiving them.
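If you want to double-check, you can also query the Alertmanager’s own API and see whether your alert instance actually shows up there before creating the silence. A rough Python sketch, with the URL as a placeholder for your Alertmanager (/api/v2/alerts is its standard endpoint for listing currently firing alerts):

```python
# Rough sketch: list the alerts a given Alertmanager currently knows about,
# to confirm you are creating the silence on the instance that receives them.
import requests

ALERTMANAGER_URL = "https://alertmanager.example.com:9093"  # placeholder

alerts = requests.get(f"{ALERTMANAGER_URL}/api/v2/alerts", timeout=10).json()
for alert in alerts:
    print(alert["labels"].get("alertname"), alert["labels"])
```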
I think you pointed me in the perfect direction. I was/am probably using the wrong alertmanager. I tried to set up my own, which runs on a different server than the Grafana host. Unfortunately, I now receive a “Health Check failed” message when I try to set it up.
My Setup is:
URL: https://{MY external DOMAIN}:9093
Access: Server (default)
My alerts are coming from Prometheus (port 9090) with Loki (port 3100).
When I try to debug why the health check (whatever it is actually checking) fails, I can see the following in the browser inspector: Access to fetch at 'https://{MY external DOMAIN}:3100/api/v2/status' (redirected from 'https://{GRAFANA HOST DOMAIN}/api/datasources/proxy/uid/**********/api/v2/status') from origin 'https://{GRAFANA HOST DOMAIN}' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
Unfortunately, the Docker container hosting Grafana doesn’t show any log messages.
When I curl https://{MY external DOMAIN}:9093 I receive a response, so apparently it is reachable.
Maybe you know where the root of the problem is here as well?
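For what it’s worth, I also tried to reproduce outside the browser what the health check seems to request (judging from the URL in the CORS error, it is /api/v2/status), hitting the Alertmanager directly instead of going through the Grafana proxy. Again only a sketch, with the URL as a placeholder for my external domain:

```python
# Sketch: call the status endpoint the failing health check appears to hit,
# but directly against the Alertmanager rather than via the Grafana proxy.
import requests

ALERTMANAGER_URL = "https://alertmanager.example.com:9093"  # placeholder

resp = requests.get(f"{ALERTMANAGER_URL}/api/v2/status", timeout=10)
print(resp.status_code)
print(resp.json().get("versionInfo"))  # Alertmanager reports its version here
```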
I am able to reach the endpoint (tested via telnet and curl). I was also able to save the data source even though the health check fails on the Grafana side, and I can now silence the alerts when choosing my newly set up Alertmanager data source. It seems I can simply ignore the failed health check on the Grafana side, though I would still be interested in what the check is looking for and why it fails.
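To convince myself that the silences created through Grafana really end up on the external Alertmanager, I also listed them directly via its API (again just a sketch, with the URL as a placeholder):

```python
# Sketch: list active silences on the external Alertmanager to confirm the
# ones created through Grafana actually arrive there.
import requests

ALERTMANAGER_URL = "https://alertmanager.example.com:9093"  # placeholder

for s in requests.get(f"{ALERTMANAGER_URL}/api/v2/silences", timeout=10).json():
    if s["status"]["state"] == "active":
        print(s["id"], [f'{m["name"]}={m["value"]}' for m in s["matchers"]])
```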
Thanks @mjacobson for pointing me in the right direction!