I’m running grafana in a docker swarm. The swarm is using traefik to route incoming traffic to the appropriate containers and traefik is providing ssl termination for incoming https. Grafana inside the swarm is only serving http, not https. Everything works fine up until grafana makes an api call to itself, and then it either gets a 404 or a 401 error.
For instance, the logs I get when trying to update a datasource:
t=2018-10-25T21:23:25+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/datasources/proxy/1/api/v1/query status=404 remote_addr=10.255.0.35 time_ms=59 size=19 referer=https://grafana-qa.mydomain.com/datasources/edit/1,
t=2018-10-25T21:23:25+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/frontend/settings status=200 remote_addr=10.255.0.35 time_ms=9 size=14490 referer=https://grafana-qa.mydomain.com/datasources/edit/1,
t=2018-10-25T21:23:25+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=PUT path=/api/datasources/1 status=200 remote_addr=10.255.0.35 time_ms=20 size=462 referer=https://grafana-qa.mydomain.com/datasources/edit/1,
t=2018-10-25T21:23:22+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/plugins/prometheus/settings status=200 remote_addr=10.255.0.35 time_ms=6 size=1237 referer=https://grafana-qa.mydomain.com/datasources/edit/1
I’m getting the same sorts of errors in dashboards. What was supposed to be a graph of container memory usage by image is instead a blank graph with a red triangle in the corner. When I open the error details I find this:
xhrStatus:"complete"
request:Object
method:"GET"
url:"api/datasources/proxy/2/api/v1/query_range?query=sum%20(%20container_memory_usage_bytes%20%7Bid%3D~%22%2Fdocker%2F.%22%2Ccontainer_label_com_docker_swarm_service_name%3D~%22ourservice_uat.%22%7D)%20by%20(container_label_com_docker_swarm_service_name)%0A&start=1540501980&end=1540503795&step=15"
response:"404 page not found "
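The request URL in those error details is hard to read in its percent-encoded form. As a rough sketch (plain Python standard library, nothing Grafana-specific; the variable names are just for illustration), it can be split into the proxy path and the decoded Prometheus query like this:

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the error details above.
REQUEST_URL = (
    "api/datasources/proxy/2/api/v1/query_range"
    "?query=sum%20(%20container_memory_usage_bytes%20%7Bid%3D~%22%2Fdocker%2F.%22"
    "%2Ccontainer_label_com_docker_swarm_service_name%3D~%22ourservice_uat.%22%7D)"
    "%20by%20(container_label_com_docker_swarm_service_name)%0A"
    "&start=1540501980&end=1540503795&step=15"
)

parts = urlsplit(REQUEST_URL)
params = parse_qs(parts.query)  # percent-decodes each parameter value

# The query ends with an encoded newline (%0A); strip it for display.
promql = params["query"][0].strip()

print("path: ", parts.path)
print("query:", promql)
print("start:", params["start"][0], "end:", params["end"][0], "step:", params["step"][0])
```

The path is relative (no leading slash), so the browser resolves it against whatever base URL the page is served from; the 404 is returned for that resolved address.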
It was working fine when I set up the container on my laptop; it’s only when I run the container in our swarm environments that this is a problem. It seems pretty clear that this is due to a configuration problem with the proxying, but the documentation around proxy configuration is not helpful.
I’ve tried any number of settings for the [server] config block; here is what I am currently using:
[server]
; The public facing domain name used to access grafana from a browser
domain = grafana-qa.mydomain.com
; Redirect to correct domain if host header does not match domain
; Prevents DNS rebinding attacks
enforce_domain = false
protocol = http
; http_port = 443
root_url = https://%(domain)s/
router_logging = true
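To double-check what root_url expands to with these settings, here is a minimal sketch using Python's configparser as a stand-in. This assumes its %(name)s interpolation behaves the same as Grafana's own ini parser for this simple case, which I have not verified against Grafana itself:

```python
import configparser

# The [server] block from above, one setting per line.
SERVER_BLOCK = """
[server]
; The public facing domain name used to access grafana from a browser
domain = grafana-qa.mydomain.com
; Redirect to correct domain if host header does not match domain
; Prevents DNS rebinding attacks
enforce_domain = false
protocol = http
; http_port = 443
root_url = https://%(domain)s/
router_logging = true
"""

parser = configparser.ConfigParser()  # default interpolation expands %(domain)s
parser.read_string(SERVER_BLOCK)
server = parser["server"]

resolved_root_url = server["root_url"]
print("root_url resolves to:", resolved_root_url)
print("protocol served by grafana:", server["protocol"])
print("http_port set explicitly:", "http_port" in server)  # commented out above
```

So the public URL is https while the protocol Grafana itself serves is http, and http_port is left at its default because the line is commented out.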
Part of my confusion is that the proxy and grafana are using different ports (80 vs 3000) and protocols (https vs http) but the documentation offers no suggestion as to what to do in those cases.
Does anyone have any ideas?
This is using the grafana 5.3.1 container image with my own grafana.ini added along with some certs for ssl to the database and three plugins pre-installed (grafana-azure-monitor-datasource, grafana-clock-panel, grafana-simple-json-datasource). The swarms are all docker ce 18.0.3.1. Traefik is running their latest container.