Resize CPU Limit To Speed Up Java Startup on Kubernetes
In this article, you will learn how to solve the problem of slow Java app startup on Kubernetes caused by the CPU limit. We will use a new Kubernetes feature called “In-place Pod Vertical Scaling”. It allows resizing the resources (CPU or memory) assigned to containers without a pod restart. It has been available since Kubernetes 1.27. However, it is still an alpha feature that has to be explicitly enabled. In order to test it, we will run a simple Spring Boot Java app on Kubernetes.
Motivation
If you are running Java apps on Kubernetes, you have probably already encountered slow startup after setting a CPU limit that is too low. It occurs because Java apps usually need significantly more CPU during initialization than during normal operation. If such applications specify requests and limits suited for regular operation, they may suffer from very long startup times. On the other hand, setting a high CPU limit just to start fast may not be the optimal approach for managing resource limits on Kubernetes. You can find some considerations in this area in my article about best practices for Java apps on Kubernetes.
Thanks to the new feature, such pods can request a higher CPU at the time of pod creation and can be resized down to normal running needs once the application has finished initializing. We will also consider how to apply such changes automatically on the cluster once the pod is ready. In order to do that, we will use Kyverno. Kyverno policies can mutate Kubernetes resources in reaction to admission callbacks, which perfectly matches our needs in this exercise.
You may associate “In-place Pod Vertical Scaling” with the Vertical Pod Autoscaler tool. The Kubernetes Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory reservations for pods to do the “right-sizing” for your applications. However, these are two different things. The VPA project is currently working on out-of-the-box support for in-place pod vertical scaling. Even if you don’t use VPA, this article still provides a valuable solution to your problems with CPU limits and Java startup.
I think our goal is pretty clear. Let’s begin!
Enable In-place Pod Vertical Scaling
Since the “in-place pod vertical scaling” feature is still in the alpha state, we need to explicitly enable it on Kubernetes. I’m testing that feature on Minikube. Here’s my minikube start command (you can try with lower memory if you wish):
$ minikube start --memory='8g' \
--feature-gates=InPlacePodVerticalScaling=true
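The feature requires Kubernetes 1.27 or newer, so it is worth double-checking which version your cluster is actually running. A quick way to do that:
$ kubectl version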
Install Kyverno on Kubernetes
Before we deploy the app, we need to install Kyverno and create its policy. However, our scenario is not very standard for Kyverno. Let’s take some time to analyze it. When creating a new Kubernetes Deployment, we should set a CPU limit high enough to allow fast startup of our Java app. Once our app has started and is ready to work, we will resize the limit to match the standard app requirements. We cannot do it while the app startup procedure is still in progress. In other words, we are not waiting for the pod Running status, but for app container readiness inside the pod.
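To make the distinction concrete, a pod may already be in the Running phase while its container is not yet ready. Assuming a placeholder pod name, we could read the container readiness flag directly from the pod status:
$ kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].ready}'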
Here’s a picture that illustrates our scenario. We will set the CPU limit to 2 cores during startup. Once our app has started, we decrease it to 500 millicores.
Now, let’s go back to Kyverno. We will install it on Kubernetes using the official Helm chart. In the first step we need to add the following Helm repository:
$ helm repo add kyverno https://kyverno.github.io/kyverno/
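If you have added that repository before, you may also want to refresh the local chart index to make sure a recent chart version gets installed:
$ helm repo update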
During the installation, we need to customize a single property. By default, Kyverno filters out updates made on Kubernetes by the members of the system:nodes group. One of those members is kubelet, which is responsible for updating the state of containers running on the node. So, if we want to catch the container-ready event from kubelet, we need to override that behavior. That’s why we set the config.excludeGroups property to an empty array. Here’s our values.yaml file:
config:
  excludeGroups: []
Finally, we can install Kyverno on Kubernetes using the following Helm command:
$ helm install kyverno kyverno/kyverno -n kyverno \
--create-namespace -f values.yaml
Kyverno has been installed in the kyverno namespace. Just to verify that everything works fine, we can display the list of running pods:
$ kubectl get po -n kyverno
NAME READY STATUS RESTARTS AGE
kyverno-admission-controller-79dcbc777c-8pbg2 1/1 Running 0 55s
kyverno-background-controller-67f4b647d7-kp5zr 1/1 Running 0 55s
kyverno-cleanup-controller-566f7bc8c-w5q72 1/1 Running 0 55s
kyverno-reports-controller-6f96648477-k6dcj 1/1 Running 0 55s
Create a Policy for Resizing the CPU Limit
We want to trigger our Kyverno policy on pod creation and on its status updates (1). We will apply the change to the resource only if the current readiness state is true (2). It is possible to select the target container using a special element called an “anchor” (3). Finally, we can define a new CPU limit for the container inside the target pod with the patchStrategicMerge section (4).
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: resize-pod-policy
spec:
  mutateExistingOnPolicyUpdate: false
  rules:
    - name: resize-pod-policy
      match:
        any:
          - resources: # (1)
              kinds:
                - Pod/status
                - Pod
      preconditions:
        all: # (2)
          - key: "{{request.object.status.containerStatuses[0].ready}}"
            operator: Equals
            value: true
      mutate:
        targets:
          - apiVersion: v1
            kind: Pod
            name: "{{request.object.metadata.name}}"
        patchStrategicMerge:
          spec:
            containers:
              - (name): sample-kotlin-spring # (3)
                resources:
                  limits:
                    cpu: 0.5 # (4)
Let’s apply the policy.
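Assuming the manifest is saved as resize-pod-policy.yaml (an illustrative filename, not one taken from the article), we apply it in the usual way:
$ kubectl apply -f resize-pod-policy.yaml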
The first attempt fails, because the Kyverno background controller does not yet have the permissions required to update pods. We need to add some additional privileges that allow it to do so. We don’t need to create a ClusterRoleBinding, just a ClusterRole with the right aggregation labels, in order for those permissions to take effect.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kyverno:update-pods
  labels:
    app.kubernetes.io/component: background-controller
    app.kubernetes.io/instance: kyverno
    app.kubernetes.io/part-of: kyverno
rules:
  - verbs:
      - patch
      - update
    apiGroups:
      - ''
    resources:
      - pods
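Assuming the ClusterRole manifest above is saved as kyverno-update-pods.yaml (again, an illustrative filename), we apply it just like any other resource:
$ kubectl apply -f kyverno-update-pods.yaml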
After that, we can try to create the policy once again. As you can see, this time it is applied without any problems.
Deploy the Java App and Resize CPU Limit After Startup
Let’s take a look at the Deployment manifest of our Java app. The name of the app container is sample-kotlin-spring, which matches the conditional “anchor” in the Kyverno policy (1). As you can see, I’m setting the CPU limit to 2 cores (2). There’s also a new field used here, resizePolicy (3). I would not have to set it, since the default value is NotRequired, which means that changing the resource limit or request will not result in a pod restart. The Deployment object also contains a readiness probe that calls the GET /actuator/health/readiness endpoint exposed by the Spring Boot Actuator (4).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-kotlin-spring
  namespace: demo
  labels:
    app: sample-kotlin-spring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-kotlin-spring
  template:
    metadata:
      labels:
        app: sample-kotlin-spring
    spec:
      containers:
        - name: sample-kotlin-spring # (1)
          image: quay.io/pminkows/sample-kotlin-spring:1.5.1.1
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 2 # (2)
              memory: "1Gi"
            requests:
              cpu: 0.1
              memory: "256Mi"
          resizePolicy: # (3)
            - resourceName: "cpu"
              restartPolicy: "NotRequired"
          readinessProbe: # (4)
            httpGet:
              path: /actuator/health/readiness
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 15
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
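Assuming the manifest above is saved as deployment.yaml (an illustrative filename), we first make sure the demo namespace exists and then apply the Deployment:
$ kubectl create namespace demo
$ kubectl apply -f deployment.yaml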
Once we deploy the app, a new pod starts. We can verify its current resource limits. As you can see, the limit is still 2 CPUs.
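One way to check that (assuming the label selector from our Deployment) is to read the limits directly from the pod spec:
$ kubectl get pods -n demo -l app=sample-kotlin-spring \
  -o jsonpath='{.items[0].spec.containers[0].resources.limits}'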
Our app starts in around 10-15 seconds. Therefore, the readiness check also waits 15 seconds before it begins to call the Actuator endpoint (the initialDelaySeconds parameter). After that, it finishes with success and our container switches to the ready state.
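If you want to see what the probe actually checks, you can port-forward to the app and call the readiness endpoint yourself. With a healthy app, the Spring Boot Actuator readiness group should respond with an UP status:
$ kubectl port-forward -n demo deploy/sample-kotlin-spring 8080:8080
$ curl http://localhost:8080/actuator/health/readiness
{"status":"UP"}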
Then, Kyverno detects the container status change and triggers the policy. The policy precondition is met since the container is ready. Now, we can verify the current CPU limit on the same pod. It is 500 millicores. You can also take a look at the Annotations field, which indicates that the mutation from our Kyverno policy has been applied.
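We can repeat the earlier check to confirm the in-place resize, and use kubectl describe to inspect the pod annotations:
$ kubectl get pods -n demo -l app=sample-kotlin-spring \
  -o jsonpath='{.items[0].spec.containers[0].resources.limits}'
$ kubectl describe pod -n demo -l app=sample-kotlin-spring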
That’s exactly what we wanted to achieve. Now, we can scale up the number of running instances of our app just to continue testing. Then, you can verify for yourself that the new pod will also have its CPU limit modified by Kyverno to 0.5 core after startup.
$ kubectl scale --replicas=2 deployment sample-kotlin-spring -n demo
And one last thing: how long would it take to start our app if we set 500 millicores as the CPU limit from the beginning? For my app and such a CPU limit, it is around 40 seconds. So the difference is significant.
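If you want to measure it yourself, one simple option (assuming the standard Spring Boot startup log line) is to grep the application log for it:
$ kubectl logs -n demo deploy/sample-kotlin-spring | grep "Started"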
Final Thoughts
Finally, there is a solution for Java apps on Kubernetes to dynamically resize the CPU limit after startup. In this article, you can find my proposal for managing it automatically using a Kyverno policy. In our example, the pods end up with a different CPU limit than the one declared in the Deployment object. However, I can imagine a policy that consists of two rules and modifies the limit only for the duration of startup.
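As a rough, untested sketch of that idea (not part of the original setup), the first rule below would raise the CPU limit at pod admission time, so the Deployment itself could declare only the steady-state limit, while the second rule is the same readiness-triggered in-place resize shown earlier:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: resize-pod-on-lifecycle
spec:
  mutateExistingOnPolicyUpdate: false
  rules:
    # Rule 1: raise the CPU limit when the pod is admitted (startup boost).
    - name: boost-cpu-on-create
      match:
        any:
          - resources:
              kinds:
                - Pod
      preconditions:
        all:
          - key: "{{request.operation}}"
            operator: Equals
            value: CREATE
      mutate:
        patchStrategicMerge:
          spec:
            containers:
              - (name): sample-kotlin-spring
                resources:
                  limits:
                    cpu: 2
    # Rule 2: lower the limit in place once the container reports readiness,
    # the same rule as in the policy shown earlier in the article.
    - name: resize-cpu-on-ready
      match:
        any:
          - resources:
              kinds:
                - Pod/status
                - Pod
      preconditions:
        all:
          - key: "{{request.object.status.containerStatuses[0].ready}}"
            operator: Equals
            value: true
      mutate:
        targets:
          - apiVersion: v1
            kind: Pod
            name: "{{request.object.metadata.name}}"
        patchStrategicMerge:
          spec:
            containers:
              - (name): sample-kotlin-spring
                resources:
                  limits:
                    cpu: 0.5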