How to schedule pod restarts
There's a specific resource for that: CronJob.
Here's an example:
apiVersion: batch/v1   # batch/v1beta1 was removed in Kubernetes 1.25
kind: CronJob
metadata:
  name: your-cron
spec:
  schedule: "*/20 8-19 * * 1-5"   # every 20 minutes, 08:00-19:59, Monday-Friday
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: your-periodic-batch-job
        spec:
          containers:
            - name: your-container
              image: your-image
              imagePullPolicy: IfNotPresent
          restartPolicy: OnFailure
Change spec.concurrencyPolicy to Replace if you want the new pod to replace the old one when it starts. With Forbid, creation of the new pod is skipped if the old pod is still running.
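Assuming the manifest above is saved as your-cron.yaml (a filename chosen here for illustration), you can apply it and watch the Jobs it spawns:

kubectl apply -f your-cron.yaml
kubectl get cronjob your-cron   # shows the schedule and the last run
kubectl get jobs --watch        # each scheduled run creates a Job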
Use a CronJob, but not to run your pods; instead, have it issue a Kubernetes API command that restarts the deployment every day (kubectl rollout restart). That way, if something goes wrong, the old pods will not be taken down or removed.
Rollouts create new ReplicaSets and wait for them to be up before killing off the old pods and rerouting the traffic, so the service continues uninterrupted.
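Under the hood, kubectl rollout restart just patches the kubectl.kubernetes.io/restartedAt annotation into the deployment's pod template, which changes the template and triggers an ordinary rolling update. A roughly equivalent manual patch (the timestamp value is purely illustrative) would be:

kubectl patch deployment <YOUR DEPLOYMENT NAME> -n <YOUR NAMESPACE> \
  -p '{"spec":{"template":{"metadata":{"annotations":{"kubectl.kubernetes.io/restartedAt":"2024-01-01T08:00:00Z"}}}}}'

This is why patch permission on the deployment is all the restart itself needs in the Role below.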
You have to set up RBAC so that the Kubernetes client running inside the cluster has permission to make the needed calls to the Kubernetes API.
---
# Service account the client will use to restart the deployment;
# by default, pods running inside the cluster can do no such thing.
kind: ServiceAccount
apiVersion: v1
metadata:
  name: deployment-restart
  namespace: <YOUR NAMESPACE>
---
# Allow getting status of and patching only the one deployment you want
# to restart.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-restart
  namespace: <YOUR NAMESPACE>
rules:
  - apiGroups: ["apps", "extensions"]
    resources: ["deployments"]
    resourceNames: ["<YOUR DEPLOYMENT NAME>"]
    verbs: ["get", "patch", "list", "watch"] # "list" and "watch" are only
                                             # needed for `rollout status`
---
# Bind the role to the service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployment-restart
  namespace: <YOUR NAMESPACE>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployment-restart
subjects:
  - kind: ServiceAccount
    name: deployment-restart
    namespace: <YOUR NAMESPACE>
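You can verify the binding before wiring it into the CronJob; kubectl auth can-i supports impersonating the service account:

kubectl auth can-i patch deployments/<YOUR DEPLOYMENT NAME> \
  --as=system:serviceaccount:<YOUR NAMESPACE>:deployment-restart \
  -n <YOUR NAMESPACE>
# expected output: yes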
And the CronJob specification itself:
apiVersion: batch/v1   # batch/v1beta1 was removed in Kubernetes 1.25
kind: CronJob
metadata:
  name: deployment-restart
  namespace: <YOUR NAMESPACE>
spec:
  concurrencyPolicy: Forbid
  schedule: '0 8 * * *' # cron spec of time; here, 8 o'clock every day
  jobTemplate:
    spec:
      backoffLimit: 2 # this has a very low chance of failing, as all it
                      # does is prompt Kubernetes to schedule a new
                      # ReplicaSet for the deployment
      activeDeadlineSeconds: 600 # timeout; makes most sense with the
                                 # "wait for rollout" variant specified below
      template:
        spec:
          serviceAccountName: deployment-restart # name of the service
                                                 # account configured above
          restartPolicy: Never
          containers:
            - name: kubectl
              image: bitnami/kubectl # probably any kubectl image will do;
                                     # optionally pin a version, but that
                                     # should not be necessary as long as
                                     # kubectl is new enough to have
                                     # `rollout restart`
              command:
                - 'kubectl'
                - 'rollout'
                - 'restart'
                - 'deployment/<YOUR DEPLOYMENT NAME>'
Optionally, if you want the CronJob to wait for the deployment to roll out, change the CronJob command to:
command:
  - bash
  - -c
  - >-
    kubectl rollout restart deployment/<YOUR DEPLOYMENT NAME> &&
    kubectl rollout status deployment/<YOUR DEPLOYMENT NAME>
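To test the whole setup without waiting for the schedule, you can trigger a one-off run from the CronJob (the job name test-restart here is arbitrary):

kubectl create job --from=cronjob/deployment-restart test-restart -n <YOUR NAMESPACE>
kubectl logs job/test-restart -n <YOUR NAMESPACE>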
I borrowed the idea from @Ryan Lowe but modified it a bit. It will restart any pod older than 24 hours:
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - "end=$(date -u +%s); start=$(stat -c %Z /proc/1 | awk '{print int($1)}'); test $(($end-$start)) -lt 86400"
Another quick and dirty option for a pod with a restart policy of Always (which CronJobs are not supposed to manage; see the Kubernetes docs on writing a CronJob spec pod template) is a livenessProbe that simply tests the time and restarts the pod on a specified schedule.
For example: after startup, wait an hour, then check the hour every minute; if the hour is 3 (AM), fail the probe and restart the pod, otherwise pass.
livenessProbe:
  exec:
    command:              # exec probes don't run in a shell, so wrap in one
      - /bin/sh
      - -c
      - exit $(test $(date +%H) -eq 3 && echo 1 || echo 0)
  failureThreshold: 1
  initialDelaySeconds: 3600
  periodSeconds: 60
Time granularity is up to how you format the date and what you test ;)
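For example, a minute-level variant (a sketch: with periodSeconds: 60 and failureThreshold: 1 as above, the probe should land on the target minute) that restarts around 03:30 instead of on the hour:

command:
  - /bin/sh
  - -c
  - exit $(test "$(date +%H%M)" = "0330" && echo 1 || echo 0)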
Of course, this does not work if you are already using the liveness probe as an actual liveness probe. ¯\_(ツ)_/¯