Hi,
I am using Grafana 8.0.3 (upgraded from 7.x last week), deployed with the official Docker image and an external data volume on an EXT4 filesystem.
I created alerts (ngalert) for our system. Everything looked fine at first, but then I started receiving alerts triggered by execution errors, as shown in the log below.
Counting the rules in the order I created them, the “database is locked” error first appeared with the 8th rule. Since then, every new rule I create starts hitting this error after running for a while.
The rules themselves still work properly: I can still see state changes and receive the alerts. If I set the state to OK on execution error or timeout, the rule keeps working.
t=2021-07-05T11:55:25+0800 lvl=eror msg="failed to fetch alert rule" logger=ngalert key="{orgID: 1, UID: 3YW4mjz7k}"
t=2021-07-05T11:55:26+0800 lvl=eror msg="failed to fetch alert rule" logger=ngalert key="{orgID: 1, UID: ESFGWCknk}"
t=2021-07-05T11:55:27+0800 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/live/ws status=400 remote_addr=1.163.110.115 time_ms=7065 size=12 referer=
t=2021-07-05T11:55:29+0800 lvl=eror msg="Anonymous access organization error: 'company': database is locked"
t=2021-07-05T11:55:29+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/api/live/ws status=401 remote_addr=220.133.186.239 time_ms=5005 size=26 referer=
t=2021-07-05T11:55:32+0800 lvl=eror msg="Failed to look up user based on cookie" logger=context error="database is locked"
t=2021-07-05T11:55:34+0800 lvl=eror msg="Anonymous access organization error: 'company': database is locked"
t=2021-07-05T11:55:34+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/api/live/ws status=401 remote_addr=220.133.186.239 time_ms=5005 size=26 referer=
t=2021-07-05T11:55:34+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=1 uname= method=GET path=/api/live/ws status=400 remote_addr=220.133.186.239 time_ms=830 size=12 referer=
t=2021-07-05T11:55:35+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=1 uname= method=GET path=/api/live/ws status=400 remote_addr=220.133.186.239 time_ms=0 size=12 referer=
The log also shows some “failed to fetch alert rule” errors. When I use the Grafana UI, it occasionally shows errors as well, but it normally works fine after a reload.
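In case it helps anyone trying to see how often this happens, here is a minimal sketch of a script that tallies the error lines in a log like the one above. It is only an illustration, not part of Grafana: the names `parse_line`, `summarize`, and `SAMPLE` are made up for this post, and it assumes every line is in the `key=value` format shown in the excerpt.

```python
import re
from collections import Counter

# One key=value pair; the value is either a double-quoted string or a bare
# token (possibly empty, as in "uname=" or "referer=").
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_line(line):
    """Parse one key=value log line into a dict of key -> value."""
    fields = {}
    for key, value in PAIR.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        fields[key] = value
    return fields


def summarize(lines):
    """Return (count of error lines per msg, count of error lines that
    mention "database is locked" anywhere on the line)."""
    by_msg = Counter()
    locked = 0
    for line in lines:
        fields = parse_line(line)
        if fields.get("lvl") != "eror":
            continue
        by_msg[fields.get("msg", "")] += 1
        if "database is locked" in line:
            locked += 1
    return by_msg, locked


# Three lines copied from the log excerpt in this post.
SAMPLE = [
    't=2021-07-05T11:55:26+0800 lvl=eror msg="failed to fetch alert rule" '
    'logger=ngalert key="{orgID: 1, UID: ESFGWCknk}"',
    't=2021-07-05T11:55:32+0800 lvl=eror msg="Failed to look up user based '
    'on cookie" logger=context error="database is locked"',
    't=2021-07-05T11:55:34+0800 lvl=info msg="Request Completed" '
    'logger=context userId=0 orgId=0 uname= method=GET path=/api/live/ws '
    'status=401 remote_addr=220.133.186.239 time_ms=5005 size=26 referer=',
]

if __name__ == "__main__":
    by_msg, locked = summarize(SAMPLE)
    for msg, count in by_msg.most_common():
        print(count, msg)
    print("error lines mentioning a locked database:", locked)
```

To run it against a real log, the lines could be read from a saved copy of the container output and passed to `summarize` instead of `SAMPLE`.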