Resize CPU Limit To Speed Up Java Startup on Kubernetes

In this article, you will learn how to solve the problem of slow startup of Java apps on Kubernetes caused by the CPU limit. We will use a new Kubernetes feature called "In-place Pod Vertical Scaling". It allows resizing the resources (CPU or memory) assigned to containers without a pod restart. It has been available since Kubernetes 1.27. However, it is still an alpha feature that has to be explicitly enabled. In order to test it, we will run a simple Spring Boot Java app on Kubernetes.

Motivation

If you are running Java apps on Kubernetes, you have probably already encountered slow startup after setting a CPU limit that is too low. It occurs because Java apps usually need significantly more CPU during initialization than during normal operation. If such applications specify requests and limits suited for regular operation, they may suffer from very long startup times. On the other hand, specifying a high CPU limit just to start fast may not be the optimal approach for managing resource limits on Kubernetes. You can find some considerations in this area in my article about best practices for Java apps on Kubernetes.

Thanks to the new feature, such pods can request a higher CPU limit at the time of pod creation and can be resized down to normal running needs once the application has finished initializing. We will also consider how to apply such changes on the cluster automatically once the pod is ready. In order to do that, we will use Kyverno. Kyverno policies can mutate Kubernetes resources in reaction to admission callbacks, which perfectly matches our needs in this exercise.

You may associate "In-place Pod Vertical Scaling" with the Vertical Pod Autoscaler tool. The Kubernetes Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory reservations of pods to "right-size" your applications. However, these are two different things. Currently, VPA is working on out-of-the-box support for in-place pod vertical scaling. If you don't use VPA, this article still provides a valuable solution to your problems with CPU limits and Java startup.

I think our goal is pretty clear. Let’s begin!

Enable In-place Pod Vertical Scaling

Since the "in-place pod vertical scaling" feature is still in the alpha state, we need to explicitly enable it on Kubernetes. I'm testing that feature on Minikube. Here's my minikube start command (you can try with less memory if you wish):

$ minikube start --memory='8g' \
  --feature-gates=InPlacePodVerticalScaling=true
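
To double-check that the feature gate has reached the API server, we can inspect the kube-apiserver static pod (assuming the default minikube profile name):

$ kubectl describe pod kube-apiserver-minikube -n kube-system \
  | grep feature-gates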

Install Kyverno on Kubernetes

Before we deploy the app, we need to install Kyverno and create its policy. However, our scenario is not a very standard one for Kyverno. Let's take some time to analyze it. When creating a new Kubernetes Deployment, we should set a CPU limit high enough to allow fast startup of our Java app. Once our app has started and is ready to work, we will resize the limit to match the standard app requirements. We cannot do that while the app startup procedure is still in progress. In other words, we are not waiting for the pod Running status…

… but for app container readiness inside the pod.

[Image: kubernetes-cpu-java-pod]
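
Just to illustrate the distinction, the condition we care about can be expressed with plain kubectl as shown below (using the namespace and label of the app we will deploy later in this article, as an example). Kyverno will react to the same readiness event for us automatically:

$ kubectl wait --for=condition=ready pod \
  -l app=sample-kotlin-spring -n demo --timeout=120s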

Here's a picture that illustrates our scenario. We will set the CPU limit to 2 cores during startup. Once our app has started, we decrease it to 500 millicores.

[Image: kubernetes-cpu-java-limits]

Now, let’s go back to Kyverno. We will install it on Kubernetes using the official Helm chart. In the first step we need to add the following Helm repository:

$ helm repo add kyverno https://kyverno.github.io/kyverno/

During the installation, we need to customize a single property. By default, Kyverno filters out updates made on Kubernetes by members of the system:nodes group. One of those members is the kubelet, which is responsible for updating the state of containers running on the node. So, if we want to catch the container-ready event from the kubelet, we need to override that behavior. That's why we set the config.excludeGroups property to an empty array. Here's our values.yaml file:

config:
  excludeGroups: []

Finally, we can install Kyverno on Kubernetes using the following Helm command:

$ helm install kyverno kyverno/kyverno -n kyverno \
  --create-namespace -f values.yaml

Kyverno has been installed in the kyverno namespace. Just to verify that everything works fine, we can display the list of running pods:

$ kubectl get po -n kyverno
NAME                                             READY   STATUS    RESTARTS   AGE
kyverno-admission-controller-79dcbc777c-8pbg2    1/1     Running   0          55s
kyverno-background-controller-67f4b647d7-kp5zr   1/1     Running   0          55s
kyverno-cleanup-controller-566f7bc8c-w5q72       1/1     Running   0          55s
kyverno-reports-controller-6f96648477-k6dcj      1/1     Running   0          55s

Create a Policy for Resizing the CPU Limit

We want to trigger our Kyverno policy on pod creation and on pod status updates (1). We will apply the change to the resource only if the current readiness state is true (2). It is possible to select the target container using a special element called an "anchor" (3). Finally, we can define a new CPU limit for that container inside the target pod in the patchStrategicMerge section (4).

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: resize-pod-policy
spec:
  mutateExistingOnPolicyUpdate: false
  rules:
    - name: resize-pod-policy
      match:
        any:
          - resources: # (1)
              kinds:
                - Pod/status
                - Pod
      preconditions: 
        all: # (2)
          - key: "{{request.object.status.containerStatuses[0].ready}}"
            operator: Equals
            value: true
      mutate:
        targets:
          - apiVersion: v1
            kind: Pod
            name: "{{request.object.metadata.name}}"
        patchStrategicMerge:
          spec:
            containers:
              - (name): sample-kotlin-spring # (3)
                resources:
                  limits:
                    cpu: 0.5 # (4)

Let’s apply the policy.
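
Assuming the manifest above has been saved as resize-pod-policy.yaml (the file name is just an example), the command is:

$ kubectl apply -f resize-pod-policy.yaml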

The first attempt will be rejected, because by default the Kyverno background controller is not allowed to update pods. We need to grant it some additional privileges. We don't need to create a ClusterRoleBinding, just a ClusterRole with the right aggregation labels, for those permissions to take effect.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kyverno:update-pods
  labels:
    app.kubernetes.io/component: background-controller
    app.kubernetes.io/instance: kyverno
    app.kubernetes.io/part-of: kyverno
rules:
  - verbs:
      - patch
      - update
    apiGroups:
      - ''
    resources:
      - pods
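
Assuming the ClusterRole has been saved as kyverno-update-pods.yaml (again, an example file name), we apply it in the same way:

$ kubectl apply -f kyverno-update-pods.yaml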

After that, we can try to create the policy once again. As you will see, this time it is accepted without any problems.

Deploy the Java App and Resize CPU Limit After Startup

Let's take a look at the Deployment manifest of our Java app. The name of the app container is sample-kotlin-spring, which matches the conditional "anchor" in the Kyverno policy (1). As you see, I'm setting the CPU limit to 2 cores (2). There's also a new field used here: resizePolicy (3). I wouldn't have to set it, since the default value is NotRequired, which means that changing the resource limit or request will not result in a container restart. The Deployment object also contains a readiness probe that calls the GET /actuator/health/readiness endpoint exposed by the Spring Boot Actuator (4).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-kotlin-spring
  namespace: demo
  labels:
    app: sample-kotlin-spring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-kotlin-spring
  template:
    metadata:
      labels:
        app: sample-kotlin-spring
    spec:
      containers:
      - name: sample-kotlin-spring # (1)
        image: quay.io/pminkows/sample-kotlin-spring:1.5.1.1
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 2 # (2)
            memory: "1Gi"
          requests:
            cpu: 0.1
            memory: "256Mi"
        resizePolicy: # (3)
        - resourceName: "cpu"
          restartPolicy: "NotRequired"
        readinessProbe: # (4)
          httpGet:
            path: /actuator/health/readiness
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 3

Once we deploy the app, a new pod starts. We can verify its current resource limits. As you can see, the limit is still 2 CPUs.
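
For example, we can read the limit directly from the pod spec with a jsonpath query (selecting the pod by label, since its exact name is generated). It should print 2:

$ kubectl get pod -n demo -l app=sample-kotlin-spring \
  -o jsonpath='{.items[0].spec.containers[0].resources.limits.cpu}'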

Our app starts in around 10-15 seconds. Therefore, the readiness probe also waits 15 seconds before it begins to call the Actuator endpoint (the initialDelaySeconds parameter). After that, it finishes with success and our container switches to the ready state.

Then Kyverno detects the container status change and triggers the policy. The policy precondition is met since the container is ready. Now, we can verify the current CPU limit on the same pod. It is 500 millicores. You can also take a look at the Annotations field, which indicates that the change was applied by our Kyverno policy.
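
One way to check it from the command line (selecting the pod by label, since the exact name differs) is sketched below. The last command prints the pod annotations, where the details of the applied mutation should be visible:

$ POD=$(kubectl get pod -n demo -l app=sample-kotlin-spring -o name | head -n 1)
$ kubectl describe -n demo $POD | grep -A 3 Limits:
$ kubectl get -n demo $POD -o jsonpath='{.metadata.annotations}'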

[Image: kubernetes-cpu-java-limit-changed]

That's exactly what we wanted to achieve. Now, we can scale out the number of running instances of our app just to continue testing. You can then verify by yourself that the new pod will also have its CPU limit modified by Kyverno to 0.5 core after startup.

$ kubectl scale --replicas=2 deployment sample-kotlin-spring -n demo
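
To quickly compare the limits of all app pods, we can use a custom-columns query (once the new pod reports readiness):

$ kubectl get pods -n demo -l app=sample-kotlin-spring \
  -o custom-columns=NAME:.metadata.name,CPU_LIMIT:.spec.containers[0].resources.limits.cpu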

And the last thing: how long would it take to start our app if we set 500 millicores as the CPU limit from the beginning? For my app and such a CPU limit, it is around 40 seconds. So the difference is significant.
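
If you want to measure it yourself, one simple option (assuming the default Spring Boot startup log line) is to grep the application log for the time reported by the framework:

$ kubectl logs deployment/sample-kotlin-spring -n demo | grep "Started"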

Final Thoughts

Finally, there is a solution for Java apps on Kubernetes to dynamically resize the CPU limit after startup. In this article, you can find my proposal for managing it automatically with a Kyverno policy. In our example, the pods end up with a different CPU limit than the one declared in the Deployment object. However, I can also imagine a policy that consists of two rules and modifies the limit only for the duration of startup.

Comments

Mahatma Fatal

AFAIK the number of CPU cores has an effect on the default GC that is chosen at startup. https://stackoverflow.com/a/70665158/1239904
I guess the GC should be configured manually in your setup

    piotr.minkowski

    Hi, it's about startup time, not GC. I'm not sure what you mean.

elia rohana

Hi Piotr,
why set CPU limits in the first place?
CPU limits only degrade performance; the Linux operating system guarantees the CPU request.

If you ask for 0.5 core, you are guaranteed to get it even if other pods are trying to ask for more CPU.

    piotr.minkowski

    Hi,
    I agree with you: in my opinion, setting the CPU request is enough. But many people set CPU limits for their apps on k8s. This article is for them.

rocketraman

Very interesting article, thank you. Regarding @Mahatma Fatal's comment above, he has a very valid point. The JVM chooses a garbage collector, for the entire life of the application, based on the CPU cores available at startup. By allocating more cores at startup, you are subverting this choice. It may or may not be suboptimal for whatever workload is running in the pod, but it is something worth considering when using this technique.

    piotr.minkowski

    Yes, I need to take a look at it (this case with CPUs and GC).

Phanideep

I have tried the same but I am getting the error below. Can someone help with this?
background/resize-pod-policy “msg”=”failed to update target resource” “error”=”Pod \”sample-kotlin-spring-784b96f988-tzzm2\” is invalid: spec: Forbidden: pod updates may not change fields

    piotr.minkowski

    Please double-check that you have started minikube properly:
    minikube start --memory='8g' --feature-gates=InPlacePodVerticalScaling=true

Ahmet

Hi Piotr,
This is a great way for fast Java starts.
Is there a more lightweight alternative to use instead of Kyverno?
Thanks

    piotr.minkowski

    Hi,
    Good question. For sure you can do that in several different ways. For example, with ArgoCD Events. But are those things more lightweight than Kyverno? Hard to say.

Mikołaj Stefaniak

I'm working on a similar approach to the one proposed by Piotr, but with a dedicated k8s operator rather than Kyverno. Such an approach gives more flexibility for addressing different aspects of this particular problem, IMO. We have a PoC of a working Kube Startup CPU Boost operator so far (https://github.com/google/kube-startup-cpu-boost)

    piotr.minkowski

    Cool idea! Thanks for that tip. I'll take a look at it.
