The biggest mistake Kubernetes admins make with health probes is configuring them the same way for every app. A payment app does not need to meet the same standards as a log-collecting app, so why should their probes be configured the same way?
For this reason, we recently added to Datree a policy rule that verifies you are customizing the health probes’ parameters rather than using the default values. You can view the full list of policy rules here.
Still, you might wonder which values to give those parameters. If so, read on to learn 6 best practices for configuring your liveness and readiness probes.
1. Configure the probes based on how long it takes your app to load
If your app has a long startup sequence, give it some time before the first probe runs. Otherwise, Kubernetes might deem your application inoperable when it is simply still starting. You can do this by increasing the initialDelaySeconds parameter, which defaults to 0.
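As a minimal sketch of this tip, the manifest below delays the first liveness check. The pod name, image, port, /healthz path, and the 60-second figure are all assumptions for illustration; pick the delay from your app's measured startup time.

```yaml
# Hypothetical Pod: name, image, port, and path are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: slow-starter
spec:
  containers:
    - name: app
      image: registry.example.com/slow-starter:1.0
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        # Wait 60s after the container starts before the first probe
        # (assumed value; the default is 0).
        initialDelaySeconds: 60
```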
2. And also based on how long it takes your app to respond
Similar to #1, some applications take a while to respond. If this is the case with your application, give it time to do so by increasing the timeoutSeconds parameter, which defaults to 1 second.
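A container-level fragment along these lines would allow a slow endpoint more time per check. The path, port, and the 5-second value are assumptions, not recommendations.

```yaml
# Fragment of a container spec; path and port are placeholders.
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  # Allow each probe up to 5s to answer before it counts as a failure
  # (assumed value; the default is 1).
  timeoutSeconds: 5
```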
3. Raise the bar for your critical apps
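Mission-critical apps, like a payment application, should be probed frequently. You can achieve this by decreasing the periodSeconds parameter. Additionally, you can increase the successThreshold of the readiness probe and decrease the failureThreshold, so that a pod must pass several consecutive checks before it receives traffic and is taken out of rotation after fewer failures. Note that successThreshold must stay at 1 for liveness probes, so this part applies to readiness probes only.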
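One way a strict readiness probe for a critical app might look is sketched below. The path, port, and every number are assumptions chosen to illustrate the direction of each change.

```yaml
# Fragment of a container spec for a hypothetical critical app.
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 2      # probe every 2s instead of the default 10s
  successThreshold: 3   # require 3 consecutive passes before routing traffic
  failureThreshold: 1   # mark the pod unready after a single failure
```

A failureThreshold of 1 reacts quickly but also removes the pod on a single transient error, so it is worth weighing against how noisy the endpoint is.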
4. And lower it for your non-critical apps
The opposite is true as well. If your app is not critical, you do not need to probe it as frequently or be as strict with the success and failure definitions. Practically, this means increasing the periodSeconds, increasing the failureThreshold, and keeping the successThreshold at its minimum (and default) value of 1.
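For a non-critical workload, a relaxed probe could be sketched as follows. The path, port, and values are assumptions for illustration.

```yaml
# Fragment of a container spec for a hypothetical non-critical app.
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 30     # probe less often than the default 10s
  successThreshold: 1   # a single pass is enough (minimum and default)
  failureThreshold: 5   # tolerate more failures than the default 3
```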
5. Give your flaky apps some wiggle room
If you know your app suffers from occasional timeouts or networking issues, give it more chances to respond before Kubernetes decides to kill it. Do this by increasing the failureThreshold as well as the periodSeconds.
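A liveness probe with extra slack might look like the sketch below. The path, port, and values are assumptions. With these numbers the container has to keep failing for about two minutes before it is restarted.

```yaml
# Fragment of a container spec for a hypothetical flaky app.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 20     # space the checks further apart
  failureThreshold: 6   # roughly 6 x 20s of consecutive failures before a restart
```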
6. Check your entire application to determine its real state
This one is not really about the parameters, but it is important enough to be mentioned anyway:
Don’t point a high-level HTTP check at an endpoint that always returns a generic 200 response.
This won’t tell you the real state of your app. Instead, set the probe to check all the dependencies of the application. For example, if your application talks to your database and a cache, have the probe do the same.
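On the probe side, this could be sketched as pointing the check at a dedicated endpoint. The /ready path is hypothetical: it assumes your application exposes a handler that queries the database and the cache and returns a non-2xx status when either is unreachable. The timeout is raised on the assumption that such a check takes longer than a static response.

```yaml
# Fragment of a container spec; /ready is a hypothetical deep-check endpoint
# that the application itself must implement.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  timeoutSeconds: 3     # dependency checks take longer than a static 200
  periodSeconds: 10
```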