We had a roughly 2-hour outage (about 1 hour 46 minutes) in processing new events; see the “updatedAt” and “createdAt” fields jumping from 14:41:19 UTC to 16:27:44 UTC.
The issue shows up on all synced events for the exact same timeframe: they just weren’t processed, and then they show up. Processing doesn’t even appear to have caught up on the backlog by the time it starts handling new events again.
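For anyone who wants to check their own data for the same gap, here is a minimal sketch of how I'd measure it from the event timestamps. It assumes the events can be exported as a list of ISO-8601 “createdAt” strings; the `find_gaps` helper, the 10-minute threshold, the date (taken from the log lines in this post), and the two extra sample timestamps are all made up for illustration. Only the 14:41:19 and 16:27:44 values come from our events.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, threshold=timedelta(minutes=10)):
    """Return (start, end, duration) for each pair of consecutive
    timestamps that are further apart than `threshold`."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    gaps = []
    for earlier, later in zip(times, times[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later, later - earlier))
    return gaps


# Sample data: the middle two values are the boundary timestamps from this
# post (date assumed to be 2021-11-17); the first and last are hypothetical.
created_at = [
    "2021-11-17T14:40:50+00:00",  # hypothetical
    "2021-11-17T14:41:19+00:00",
    "2021-11-17T16:27:44+00:00",
    "2021-11-17T16:28:01+00:00",  # hypothetical
]

for start, end, duration in find_gaps(created_at):
    print(f"{start.isoformat()} -> {end.isoformat()} ({duration})")
```

With the sample data this reports a single gap of 1:46:25 between the two boundary timestamps.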
There are no useful logs that I can view, other than entries showing that the server was responding to client read requests just fine:
- 2021-11-17T17:06:23.163Z - Can not find client undefined on disconnect
- 2021-11-17T15:43:06.899Z - Can not find client undefined on disconnect
- 2021-11-17T14:41:33.798Z - Can not find client undefined on disconnect
- 2021-11-17T13:45:19.949Z - Can not find client undefined on disconnect
We also have a second server (with a mostly identical setup, but some different calculations) processing the same events. It had the same outage.
My guess is some kind of node issue with syncing events? I’d like to understand the mitigations you’ve got in place here so we can prevent this in the future.