Startup CPU Boost in Kubernetes with In-Place Pod Resize

This article explains how to use the In-Place Pod Resize feature in Kubernetes, combined with Kube Startup CPU Boost, to speed up Java application startup. The In-Place Update of Pod Resources feature was initially introduced in Kubernetes 1.27 as an alpha release. With version 1.35, it has reached GA stability. One potential use case for this feature is to set a high CPU limit only during application startup, which helps Java applications launch quickly. I have already described such a scenario in my previous article. The example implemented there used the Kyverno tool. However, it is based on the alpha version of the in-place pod resize feature, so it requires a minor tweak to the Kyverno policy to align with the GA release.

The other potential solution in that context is the Vertical Pod Autoscaler (VPA). In its latest version, it supports in-place pod resize. VPA automatically adjusts CPU and memory requests/limits for pods based on their actual usage, ensuring containers receive the appropriate resources. Unlike the Horizontal Pod Autoscaler (HPA), which scales the number of replicas, VPA changes pod resources, and it may restart pods to apply those changes. For now, VPA does not support the startup boost use case, but once that feature is implemented, the situation will change.

On the other hand, Kube Startup CPU Boost is a dedicated feature for scenarios with high CPU requirements during app startup. It is a controller that increases CPU resource requests and limits during Kubernetes workload startup. Once the workload is up and running, the resources are set back to their original values. Let’s see how this solution works in practice!

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository. The sample application is based on Spring Boot and exposes several REST endpoints. However, in this exercise, we will use a ready-made image published in my Quay.io registry: quay.io/pminkows/sample-kotlin-spring:1.5.1.1.

Install Kube Startup CPU Boost

The Kubernetes cluster you are using must have the In-Place Pod Resize feature enabled. The activation method may vary by Kubernetes distribution. For the Minikube instance I am using in today’s example, it looks like this:

minikube start --memory='8gb' --cpus='6' --feature-gates=InPlacePodVerticalScaling=true
ShellSession

After that, we can proceed with installing the Kube Startup CPU Boost controller. There are several ways to do this; the easiest is with a Helm chart. Let’s add the following Helm repository:

helm repo add kube-startup-cpu-boost https://google.github.io/kube-startup-cpu-boost
ShellSession

Then, we can install the kube-startup-cpu-boost chart in the dedicated kube-startup-cpu-boost-system namespace using the following command:

helm install -n kube-startup-cpu-boost-system kube-startup-cpu-boost \
  kube-startup-cpu-boost/kube-startup-cpu-boost --create-namespace
ShellSession

If the installation was successful, you should see the following pod running in the kube-startup-cpu-boost-system namespace:

$ kubectl get pod -n kube-startup-cpu-boost-system
NAME                                                         READY   STATUS    RESTARTS   AGE
kube-startup-cpu-boost-controller-manager-75f95d5fb6-692s6   1/1     Running   0          36s
ShellSession
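The chart also installs the StartupCPUBoost CRD. As a quick sanity check (the CRD name below is inferred from the apiVersion and kind used later in this article), we can verify it is registered:

```shell
# Verify the StartupCPUBoost CRD registered by the Helm chart
kubectl get crd startupcpuboosts.autoscaling.x-k8s.io
```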

Install Monitoring Stack (optional)

Then, we can install the Prometheus monitoring stack. This step is optional, allowing us to verify pod resource usage in graphical form. First, let’s add the following Helm repository:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
ShellSession

After that, we can install the latest version of the kube-prometheus-stack chart in the monitoring namespace.

helm install my-kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace
ShellSession

Let’s verify the installation succeeded by listing the pods running in the monitoring namespace.

$ kubectl get pods -n monitoring
NAME                                                         READY   STATUS    RESTARTS   AGE
alertmanager-my-kube-prometheus-stack-alertmanager-0         2/2     Running   0          38s
my-kube-prometheus-stack-grafana-f8bb6b8b8-mzt4l             3/3     Running   0          48s
my-kube-prometheus-stack-kube-state-metrics-99f4574c-bf5ln   1/1     Running   0          48s
my-kube-prometheus-stack-operator-6d58dd9d6c-6srtg           1/1     Running   0          48s
my-kube-prometheus-stack-prometheus-node-exporter-tdwmr      1/1     Running   0          48s
prometheus-my-kube-prometheus-stack-prometheus-0             2/2     Running   0          38s
ShellSession

Finally, we can expose the Prometheus console over localhost using the port forwarding feature:

kubectl port-forward svc/my-kube-prometheus-stack-prometheus 9090:9090 -n monitoring
ShellSession
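If you prefer Grafana dashboards, the same chart installs Grafana as well. Assuming the default service name and port used by the kube-prometheus-stack chart, it can be exposed in a similar way:

```shell
# Expose Grafana on http://localhost:3000 (the service listens on port 80 by default)
kubectl port-forward svc/my-kube-prometheus-stack-grafana 3000:80 -n monitoring
```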

Configure Kube Startup CPU Boost

The Kube Startup CPU Boost configuration is pretty intuitive. We need to create a StartupCPUBoost resource. It can manage multiple applications based on a given selector. In our case, it is a single sample-kotlin-spring Deployment, matched by the app.kubernetes.io/name label (1). The next step is to define the resource management policy (2). Here, Kube Startup CPU Boost increases both the request and the limit by 50%. Resources should be increased only for the duration of the startup (3). Therefore, once the readiness probe succeeds, the resources return to their initial values. Of course, everything happens in-place, without restarting the container.

apiVersion: autoscaling.x-k8s.io/v1alpha1
kind: StartupCPUBoost
metadata:
  name: sample-kotlin-spring
  namespace: demo
selector:
  matchExpressions: # (1)
  - key: app.kubernetes.io/name
    operator: In
    values: ["sample-kotlin-spring"]
spec:
  resourcePolicy: # (2)
    containerPolicies:
    - containerName: sample-kotlin-spring
      percentageIncrease:
        value: 50
  durationPolicy: # (3)
    podCondition:
      type: Ready
      status: "True"
YAML
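Assuming the manifest above is saved as startup-cpu-boost.yaml (a hypothetical file name), we can apply it and confirm the resource was created:

```shell
# The demo namespace must exist before applying the manifest
kubectl create namespace demo
kubectl apply -f startup-cpu-boost.yaml
kubectl get startupcpuboost -n demo
```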

Next, we will deploy our sample application. Here’s the Deployment manifest of our Spring Boot app. The name of the app container is sample-kotlin-spring, which matches the containerName defined inside the StartupCPUBoost object (1). Then, we set the CPU limit to 500 millicores (2). There’s also a relatively new field, resizePolicy. It tells Kubernetes whether a change to CPU or memory can be applied in-place or requires a pod restart (3). The NotRequired value means that changing the resource limit or request will not trigger a container restart. The Deployment object also contains a readiness probe that calls the GET /actuator/health/readiness endpoint exposed by the Spring Boot Actuator (4).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-kotlin-spring
  namespace: demo
  labels:
    app: sample-kotlin-spring
    app.kubernetes.io/name: sample-kotlin-spring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-kotlin-spring
  template:
    metadata:
      labels:
        app: sample-kotlin-spring
        app.kubernetes.io/name: sample-kotlin-spring
    spec:
      containers:
      - name: sample-kotlin-spring # (1)
        image: quay.io/pminkows/sample-kotlin-spring:1.5.1.1
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 500m # (2)
            memory: "1Gi"
          requests:
            cpu: 200m
            memory: "256Mi"
        resizePolicy: # (3)
        - resourceName: "cpu"
          restartPolicy: "NotRequired"
        readinessProbe: # (4)
          httpGet:
            path: /actuator/health/readiness
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 3
YAML
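Assuming the Deployment is saved as deployment.yaml (again, a hypothetical file name), we deploy the app and wait for the rollout to complete:

```shell
kubectl apply -f deployment.yaml
# Wait until the pod passes its readiness probe
kubectl rollout status deployment/sample-kotlin-spring -n demo
```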

Here are the pod requests and limits configured by Kube Startup CPU Boost during startup. As you can see, the request is set to 300m (a 50% increase over the initial 200m), while the limit is completely removed.

(screenshot: in-place-pod-resize-boost)

Once the application startup process completes, Kube Startup CPU Boost restores the initial request and limit.
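Both states can also be confirmed from the command line. Running the query below during startup and again after the readiness probe succeeds should show the boosted and the restored values, respectively:

```shell
# Print the effective requests/limits of the app container
kubectl get pod -n demo -l app.kubernetes.io/name=sample-kotlin-spring \
  -o jsonpath='{.items[0].spec.containers[0].resources}'
```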

Now we can switch to the Prometheus console to see the history of CPU request values for our pod. As you can see, the request was temporarily increased during the pod startup.

The chart below illustrates CPU usage when the application is launched and then during normal operation.

(screenshot: in-place-pod-resize-metric)
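To reproduce such a chart, you can query the CPU request metric exported by kube-state-metrics (the metric and label names below assume kube-prometheus-stack defaults):

```promql
kube_pod_container_resource_requests{namespace="demo", resource="cpu", pod=~"sample-kotlin-spring.*"}
```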

We can also define fixed resources for a target container. The CPU requests and limits of the selected container will then be set to the given values (1). If you do not want the operator to remove the CPU limit during the boost time, set the REMOVE_LIMITS environment variable to false in the kube-startup-cpu-boost-controller-manager Deployment.

apiVersion: autoscaling.x-k8s.io/v1alpha1
kind: StartupCPUBoost
metadata:
  name: sample-kotlin-spring
  namespace: demo
selector:
  matchExpressions:
  - key: app.kubernetes.io/name
    operator: In
    values: ["sample-kotlin-spring"]
spec:
  resourcePolicy:
    containerPolicies: # (1)
    - containerName: sample-kotlin-spring
      fixedResources:
        requests: "500m"
        limits: "2"
  durationPolicy:
    podCondition:
      type: Ready
      status: "True"
YAML
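The REMOVE_LIMITS variable mentioned above can be set, for example, with kubectl set env on the controller Deployment installed earlier:

```shell
# Keep the original CPU limits in place during the boost
kubectl set env deployment/kube-startup-cpu-boost-controller-manager \
  -n kube-startup-cpu-boost-system REMOVE_LIMITS=false
```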

Conclusion

There are many ways to address application CPU demand during startup. First, you don’t have to set a CPU limit in the Deployment at all. What’s more, many people believe that setting a CPU limit doesn’t make sense anyway, though for different reasons. In that case, the problem of an accurate CPU request remains, but given the short startup timeframe and the fact that usage exceeds the declared value only during that period, it isn’t a material issue.

Other solutions are strictly related to Java features. If we compile the application natively with GraalVM or use the CRaC feature, we will significantly speed up startup and reduce CPU requirements.

Finally, several solutions rely on in-place resizing. If you use Kyverno, consider its mutate policy, which can modify resources in response to an application startup event. The Kube Startup CPU Boost tool described in this article operates similarly but is designed exclusively for this use case. In the near future, Vertical Pod Autoscaler will also offer a CPU boost via in-place resize.
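For illustration only, a Kyverno mutate rule that raises the CPU limit of matching pods at admission time could look roughly like this (a minimal sketch, not the exact policy from my previous article; the policy name and target limit value are arbitrary):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: cpu-boost-on-startup
spec:
  rules:
  - name: raise-cpu-limit-at-creation
    match:
      any:
      - resources:
          kinds: ["Pod"]
          selector:
            matchLabels:
              app.kubernetes.io/name: sample-kotlin-spring
    mutate:
      patchStrategicMerge:
        spec:
          containers:
          # (name) is a Kyverno conditional anchor: patch only this container
          - (name): sample-kotlin-spring
            resources:
              limits:
                cpu: "2"
```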
