As of v1.9 Tyk supports a kind of built-in “uptime awareness” of the underlying services and hosts that it is managing traffic for by actively polling user-defined endpoints at set intervals.
Tyk uptime awareness is not meant to replace traditional uptime monitoring tools, in fact, it is designed to supplement them by offering a way to bypass unhealthy nodes when they are down as art of Tyk’s role as an API Gateway.
How do the uptime tests work?
When uptime tests are added into a Tyk cluster, a single node will elect itself as master, masters stay in this state using a dead-mans-switch, by keeping a key active in Redis, masters are re-elected or confirmed every few seconds, if one node stops or fails, another can detect the failure an elect itself master.
The master node will then run the uptime tests allocated to the cluster (shard group).
The node running the uptime test will have a worker pool defined so that it can execute tests simultaneously every few seconds determined by a node-configurable interval loop. Depending on how many uptime tests are being run, this worker pool should be increased or decreased as needed.