Best Practices for Java Apps on Kubernetes

In this article, you will read about best practices for running Java apps on Kubernetes. Most of these recommendations are also valid for other languages. However, I’m considering all the rules in the context of Java’s characteristics and showing solutions and tools available for JVM-based apps. Some of these Kubernetes recommendations are enforced by design in the most popular Java frameworks, like Spring Boot or Quarkus. I’ll show you how to leverage them effectively to simplify a developer’s life.

I’m writing a lot about topics related to both Kubernetes and Java. You can find many practical examples on my blog. Some time ago I published an article similar to this one, but mostly focused on best practices for microservices-based apps. You can find it here.

Don’t Set Limits Too Low

Should we set limits for Java apps on Kubernetes or not? The answer seems obvious. There are many tools that validate your Kubernetes YAML manifests, and they will certainly print a warning if you don’t set CPU or memory limits. However, there are some “hot discussions” in the community about that. Here’s an interesting article that does not recommend setting any CPU limits. Here’s another article written as a counterpoint to the previous one. They consider CPU limits, but we could just as well start a similar discussion about memory limits – especially in the context of Java apps 🙂

However, for memory management, the recommendation is quite different. Let’s look at another article – this time about memory limits and requests. In short, it recommends always setting the memory limit. Moreover, the limit should be the same as the request. In the context of Java apps, it is also important that we can limit memory with JVM parameters like -Xmx, -XX:MaxMetaspaceSize or -XX:ReservedCodeCacheSize. Anyway, from the Kubernetes perspective, the pod receives the resources it requests. The limit has nothing to do with it.

It all leads me to the first recommendation today – don’t set your limits too low. Even if you set a CPU limit, it shouldn’t impact your app. For example, as you probably know, even if your Java app doesn’t consume much CPU during normal operation, it requires a lot of CPU to start fast. For my simple Spring Boot app that connects to MongoDB on Kubernetes, the difference between no limit and even 0.5 core is significant. Without any limit, it starts in under 10 seconds:

(Screenshot: startup time of the sample app without a CPU limit)

With the CPU limit set to 500 millicores, it starts in ~30 seconds:

Of course, we could give more examples, but we will discuss them in the next sections.
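
To make this concrete, here’s a minimal sketch of a container resources section that follows this advice – the values are only placeholders, and the Deployment simply reuses the sample app name. The CPU request reflects normal usage so the scheduler can place the pod correctly, while the CPU limit is deliberately omitted (or could be set generously) so it doesn’t throttle the startup:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-spring-boot-on-kubernetes
spec:
  selector:
    matchLabels:
      app: sample-spring-boot-on-kubernetes
  template:
    metadata:
      labels:
        app: sample-spring-boot-on-kubernetes
    spec:
      containers:
        - name: app
          image: piomin/sample-spring-boot-on-kubernetes
          resources:
            requests:
              cpu: 500m        # roughly the CPU needed under normal load
              memory: 768Mi
            limits:
              memory: 1Gi      # memory limit kept, CPU limit intentionally skipped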

Beginning with Kubernetes 1.27, you can take advantage of a feature called “In-Place Vertical Pod Scaling”. It allows users to resize the CPU/memory resources allocated to pods without restarting the containers. Such an approach can help us speed up Java startup on Kubernetes and, at the same time, keep adequate resource limits (especially CPU limits) for the app. You can read more about it in the following article: https://piotrminkowski.com/2023/08/22/resize-cpu-limit-to-speed-up-java-startup-on-kubernetes/.
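
Here’s a minimal sketch of what that may look like in a pod spec. It assumes the InPlacePodVerticalScaling feature gate is enabled on your cluster, and the resource values are only examples:

apiVersion: v1
kind: Pod
metadata:
  name: sample-spring-boot-on-kubernetes
spec:
  containers:
    - name: app
      image: piomin/sample-spring-boot-on-kubernetes
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired      # CPU can be resized in place, no restart
        - resourceName: memory
          restartPolicy: RestartContainer # memory changes still restart the container
      resources:
        requests:
          cpu: "1"
        limits:
          cpu: "2"   # generous limit for a fast startup, can be lowered in place later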

Consider Memory Usage First

Let’s focus just on the memory limit. If you run a Java app on Kubernetes, you have two levels of limiting maximum usage: the container and the JVM. However, there are also some defaults if you don’t specify any JVM settings. If you don’t set the -Xmx parameter, the JVM sets its maximum heap size to approximately 25% of the available RAM. This value is calculated based on the memory visible inside the container. If you don’t set a limit at the container level, the JVM will see the whole memory of the node.
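
If you prefer not to rely on that 25% default, you can pin the JVM memory areas explicitly. Here’s a hedged sketch of a container-spec fragment doing that through the standard JAVA_TOOL_OPTIONS variable – the specific values are just placeholders you would derive from your own measurements:

containers:
  - name: app
    image: piomin/sample-spring-boot-on-kubernetes
    env:
      - name: JAVA_TOOL_OPTIONS    # picked up automatically by the JVM on startup
        value: "-Xmx256m -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m"
    resources:
      limits:
        memory: 512Mi              # container-level cap above the sum of the JVM areas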

Before running the app on Kubernetes, you should at least measure how much memory it consumes under the expected load. Fortunately, there are tools that can optimize the memory configuration for Java apps running in containers. For example, Paketo Buildpacks comes with a built-in memory calculator. It calculates the -Xmx JVM flag using the formula Heap = Total Container Memory - Non-Heap - Headroom. The non-heap value, in turn, is calculated using the following formula: Non-Heap = Direct Memory + Metaspace + Reserved Code Cache + (Thread Stack * Thread Count).

Paketo Buildpacks is currently the default option for building Spring Boot images (with the mvn spring-boot:build-image command). Let’s try it for our sample app. Assuming we set the memory limit to 512M, it will calculate -Xmx at around 130M.

(Screenshot: Paketo memory calculator output for a 512M memory limit)

Is that fine for my app? I should at least perform some load tests to verify how the app behaves under heavy traffic. But once again – don’t set the limits too low. For example, with a 1024M limit, -Xmx equals 650M.
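
If the calculated heap doesn’t fit your workload, the Paketo memory calculator itself can also be tuned at runtime through environment variables such as BPL_JVM_THREAD_COUNT and BPL_JVM_HEAD_ROOM. Here’s a sketch of a container-spec fragment – treat the values as assumptions to be adjusted after your own tests, not as recommendations:

containers:
  - name: app
    image: piomin/sample-spring-boot-on-kubernetes
    env:
      - name: BPL_JVM_THREAD_COUNT   # fewer reserved thread stacks -> larger calculated heap
        value: "50"
      - name: BPL_JVM_HEAD_ROOM      # percentage of memory kept outside the calculation
        value: "5"
    resources:
      limits:
        memory: 512Mi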

As you can see, we take care of memory usage with JVM parameters. This protects us from the OOM kills described in the article mentioned in the first section. Therefore, setting the request at the same level as the limit does not make much sense. I would recommend setting it a little higher than normal usage – let’s say 20% more.

Proper Liveness and Readiness Probes

Introduction

It is essential to understand the difference between liveness and readiness probes in Kubernetes. If these probes are not implemented carefully, they can degrade the overall operation of a service, for example by causing unnecessary restarts. The third type of probe, the startup probe, is a relatively new feature in Kubernetes. It allows us to avoid setting initialDelaySeconds on liveness or readiness probes and is therefore especially useful if your app takes a long time to start. For more details about Kubernetes probes in general and best practices around them, I can recommend this very interesting article.

A liveness probe is used to decide whether to restart the container or not. If an application is unavailable for any reason, restarting the container can sometimes make sense. On the other hand, a readiness probe is used to decide whether a container can handle incoming traffic. If a pod has been recognized as not ready, it is removed from load balancing. A failure of the readiness probe does not result in a pod restart. The most typical liveness or readiness probe for web applications is implemented as an HTTP endpoint.

Since subsequent failures of the liveness probe result in pod restart, it should not check the availability of your app integrations. Such things should be verified by the readiness probe.

Configuration Details

The good news is that the most popular Java frameworks, like Spring Boot or Quarkus, provide an auto-configured implementation of both Kubernetes probes. They follow best practices, so we usually don’t have to worry about the basics. However, in Spring Boot, besides including the Actuator module, you need to enable them with the following property:

management:
  endpoint: 
    health:
      probes:
        enabled: true

Since Spring Boot Actuator provides several endpoints (e.g. metrics, traces), it is a good idea to expose it on a different port than the default one (usually 8080). Of course, the same rule applies to other popular Java frameworks. On the other hand, a good practice is to check your main app port – especially in the readiness probe. Since it defines whether our app is ready to process incoming requests, the probe should also listen on the main port. It’s just the opposite with the liveness probe. If, let’s say, the whole worker thread pool is busy, I don’t want to restart my app. I just don’t want to receive incoming traffic for some time.

We can also customize other aspects of Kubernetes probes. Let’s say that our app connects to an external system, but we don’t want to verify that integration in our readiness probe, because it is not critical and doesn’t have a direct impact on our operational status. Here’s a configuration that includes only a selected set of integrations in the probe (1) and also exposes readiness on the main server port (2).

spring:
  application:
    name: sample-spring-boot-on-kubernetes
  data:
    mongodb:
      host: ${MONGO_URL}
      port: 27017
      username: ${MONGO_USERNAME}
      password: ${MONGO_PASSWORD}
      database: ${MONGO_DATABASE}
      authentication-database: admin

management:
  endpoint.health:
    show-details: always
    group:
      readiness:
        include: mongo # (1)
        additional-path: server:/readiness # (2)
    probes:
      enabled: true
  server:
    port: 8081
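
On the Kubernetes side, the matching probe definitions could then look more or less like the sketch below. It assumes the setup from the configuration above: the Actuator liveness endpoint on the management port 8081 and the readiness group additionally exposed at /readiness on the main port 8080. The startup probe thresholds are only examples:

containers:
  - name: app
    image: piomin/sample-spring-boot-on-kubernetes
    ports:
      - containerPort: 8080   # main server port
      - containerPort: 8081   # Spring Boot Actuator (management) port
    startupProbe:
      httpGet:
        path: /actuator/health/liveness
        port: 8081
      periodSeconds: 3
      failureThreshold: 30    # gives the app up to ~90s to start
    livenessProbe:
      httpGet:
        path: /actuator/health/liveness
        port: 8081
    readinessProbe:
      httpGet:
        path: /readiness      # exposed on the main port via additional-path
        port: 8080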

Our application can hardly ever exist without external systems like databases, message brokers, or other applications. When configuring the readiness probe, we should consider the connection settings to those systems carefully. First of all, you should consider the situation when an external service is not available. How will you handle it? I suggest decreasing the connection timeouts to lower values, as shown below.

spring:
  application:
    name: sample-spring-kotlin-microservice
  datasource:
    url: jdbc:postgresql://postgres:5432/postgres
    username: postgres
    password: postgres123
    hikari:
      connection-timeout: 2000
      initialization-fail-timeout: 0
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
  rabbitmq:
    host: rabbitmq
    port: 5672
    connection-timeout: 2000

Choose The Right JDK

If you have already built images with a Dockerfile, it is possible that you were using the official OpenJDK base image from Docker Hub. However, the announcement on the image site currently says that it is officially deprecated and all users should find suitable replacements. I guess it may be quite confusing, so you will find a detailed explanation of the reasons here.

All right, so let’s consider which alternative we should choose. Different vendors provide several replacements. If you are looking for a detailed comparison between them, you should go to the following site. It recommends using Eclipse Temurin in version 21.

On the other hand, the most popular image build tools, like Jib or Cloud Native Buildpacks, automatically choose a vendor for you. By default, Jib uses Eclipse Temurin, while Paketo Buildpacks uses the BellSoft Liberica implementation. Of course, you can easily override these settings. I think it might make sense if, for example, you run your app in an environment matched to the JDK provider, like AWS and Amazon Corretto.
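
For example, with Jib the base image can be overridden through the jib.from.image build property. Here’s a hedged sketch of a Skaffold artifact doing that – the Amazon Corretto tag is just an assumption for illustration:

apiVersion: skaffold/v2beta22
kind: Config
metadata:
  name: sample-spring-boot-on-kubernetes
build:
  artifacts:
    - image: piomin/sample-spring-boot-on-kubernetes
      jib:
        args:
          - -Djib.from.image=amazoncorretto:21   # replace the default Eclipse Temurin base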

Let’s say we use Paketo Buildpacks and Skaffold for deploying Java apps on Kubernetes. In order to replace the default BellSoft Liberica buildpack with another one, we just need to set it explicitly in the buildpacks section. Here’s an example that leverages the Amazon Corretto buildpack.

apiVersion: skaffold/v2beta22
kind: Config
metadata:
  name: sample-spring-boot-on-kubernetes
build:
  artifacts:
    - image: piomin/sample-spring-boot-on-kubernetes
      buildpacks:
        builder: paketobuildpacks/builder:base
        buildpacks:
          - paketo-buildpacks/amazon-corretto
          - paketo-buildpacks/java
        env:
          - BP_JVM_VERSION=21

We can also easily test the performance of our apps with different JDK vendors. If you are looking for an example of such a comparison, you can read my article describing the tests and results. I measured the performance of different JDKs for a Spring Boot 3 app that interacts with a Mongo database, using several of the available Paketo Java Buildpacks.

Consider Migration To Native Compilation 

Native compilation is a real “game changer” in the Java world. But I can bet that not many of you use it – especially in production. Of course, there were (and still are) numerous challenges in migrating existing apps to native compilation. The static code analysis performed by GraalVM at build time can result in errors like ClassNotFound or MethodNotFound. To overcome these challenges, we need to provide several hints to let GraalVM know about the dynamic elements of the code. The number of those hints usually depends on the number of libraries used and the number of language features the app relies on.

Java frameworks like Quarkus or Micronaut try to address the challenges related to native compilation by design. For example, they avoid using reflection wherever possible. Spring Boot has also improved its native compilation support a lot through the Spring Native project. So, my advice in this area is: if you are creating a new application, prepare it so that it is ready for native compilation. For example, with Quarkus you can simply generate a Maven configuration that contains a dedicated profile for building a native executable.

<profiles>
  <profile>
    <id>native</id>
    <activation>
      <property>
        <name>native</name>
      </property>
    </activation>
    <properties>
      <skipITs>false</skipITs>
      <quarkus.package.type>native</quarkus.package.type>
    </properties>
  </profile>
</profiles>

Once you add it, you can run a native build with the following command:

$ mvn clean package -Pnative

Then you can analyze whether there are any issues during the build. Even if you do not run native apps in production now (for example, because your organization doesn’t approve it), you should include GraalVM compilation as a step in your acceptance pipeline. You can easily build a Java native image for your app with the most popular frameworks. For example, with Spring Boot you just need to provide the following configuration in your Maven pom.xml:

<plugin>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>build-info</goal>
        <goal>build-image</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <image>
      <builder>paketobuildpacks/builder:tiny</builder>
      <env>
        <BP_NATIVE_IMAGE>true</BP_NATIVE_IMAGE>
        <BP_NATIVE_IMAGE_BUILD_ARGUMENTS>
          --allow-incomplete-classpath
        </BP_NATIVE_IMAGE_BUILD_ARGUMENTS>
      </env>
    </image>
  </configuration>
</plugin>

Configure Logging Properly

Logging is probably not the first thing you think about when writing your Java apps. However, at the cluster scale it becomes very important, since we need to be able to collect and store log data, and finally to find and present a particular entry quickly. The best practice is to write your application logs to the standard output (stdout) and standard error (stderr) streams.

Fluentd is a popular open-source log aggregator that allows you to collect logs from the Kubernetes cluster, process them, and then ship them to a data storage backend of your choice. It integrates seamlessly with Kubernetes deployments. Fluentd tries to structure data as JSON to unify logging across different sources and destinations. Given that, probably the best approach is to produce logs in this format in the first place. With JSON, we can also easily include additional fields for tagging logs and then search them in a visualization tool using various criteria.

In order to format our logs as JSON readable by Fluentd, we can include the Logstash Logback Encoder library in our Maven dependencies.

<dependency>
   <groupId>net.logstash.logback</groupId>
   <artifactId>logstash-logback-encoder</artifactId>
   <version>7.2</version>
</dependency>

Then we just need to set a default console log appender for our Spring Boot application in the file logback-spring.xml.

<configuration>
    <appender name="consoleAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>
    <logger name="jsonLogger" additivity="false" level="DEBUG">
        <appender-ref ref="consoleAppender"/>
    </logger>
    <root level="INFO">
        <appender-ref ref="consoleAppender"/>
    </root>
</configuration>

Should we avoid using additional log appenders and print logs only to the standard output? From my experience, the answer is – no. You can still use alternative mechanisms for shipping the logs, especially if you use more than one log collection stack in your organization – for example, an internal stack on Kubernetes and a global stack outside of it. Personally, I’m using a tool that helps me avoid performance problems – a message broker acting as a proxy. In Spring Boot we can easily use RabbitMQ for that. Just include the following starter:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-amqp</artifactId>
</dependency>

Then you need to provide an appender configuration similar to the one below in logback-spring.xml:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>

  <springProperty name="destination" source="app.amqp.url" />

  <appender name="AMQP"
		class="org.springframework.amqp.rabbit.logback.AmqpAppender">
    <layout>
      <pattern>
{
  "time": "%date{ISO8601}",
  "thread": "%thread",
  "level": "%level",
  "class": "%logger{36}",
  "message": "%message"
}
      </pattern>
    </layout>

    <addresses>${destination}</addresses>	
    <applicationId>api-service</applicationId>
    <routingKeyPattern>logs</routingKeyPattern>
    <declareExchange>true</declareExchange>
    <exchangeName>ex_logstash</exchangeName>

  </appender>

  <root level="INFO">
    <appender-ref ref="AMQP" />
  </root>

</configuration>

Create Integration Tests

Ok, I know – it’s not directly related to Kubernetes. However, since we use Kubernetes to manage and orchestrate containers, we should also run integration tests against containers. Fortunately, Java frameworks can simplify that process a lot. For example, Quarkus allows us to annotate a test with @QuarkusIntegrationTest. It is a really powerful solution in conjunction with the Quarkus container build feature. We can run the tests against an already-built image containing the app. First, let’s include the Quarkus Jib module:

<dependency>
   <groupId>io.quarkus</groupId>
   <artifactId>quarkus-container-image-jib</artifactId>
</dependency>

Then we have to enable the container build by setting the quarkus.container-image.build property to true in the application.properties file. In the test class, we can use the @TestHTTPResource and @TestHTTPEndpoint annotations to inject the test server URL. Then we create a client with RestClientBuilder and call the service started in the container. The name of the test class is not accidental: in order to be automatically detected as an integration test, it has the IT suffix.

@QuarkusIntegrationTest
public class EmployeeControllerIT {

    @TestHTTPEndpoint(EmployeeController.class)
    @TestHTTPResource
    URL url;

    @Test
    void add() {
        EmployeeService service = RestClientBuilder.newBuilder()
                .baseUrl(url)
                .build(EmployeeService.class);
        Employee employee = new Employee(1L, 1L, "Josh Stevens", 
                                         23, "Developer");
        employee = service.add(employee);
        assertNotNull(employee.getId());
    }

    @Test
    public void findAll() {
        EmployeeService service = RestClientBuilder.newBuilder()
                .baseUrl(url)
                .build(EmployeeService.class);
        Set<Employee> employees = service.findAll();
        assertTrue(employees.size() >= 3);
    }

    @Test
    public void findById() {
        EmployeeService service = RestClientBuilder.newBuilder()
                .baseUrl(url)
                .build(EmployeeService.class);
        Employee employee = service.findById(1L);
        assertNotNull(employee.getId());
    }
}

You can find more details about that process in my previous article about advanced testing with Quarkus. The final effect is visible in the picture below. When we run the tests during the build with the mvn clean verify command, our test is executed after building the container image.

(Screenshot: integration tests executed after building the container image)

That Quarkus feature is based on the Testcontainers framework. We can also use Testcontainers with Spring Boot. Here’s a sample test of a Spring REST app and its integration with a PostgreSQL database.

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@Testcontainers
@TestMethodOrder(MethodOrderer.OrderAnnotation.class)
public class PersonControllerTests {

    @Autowired
    TestRestTemplate restTemplate;

    @Container
    static PostgreSQLContainer<?> postgres = 
       new PostgreSQLContainer<>("postgres:15.1")
            .withExposedPorts(5432);

    @DynamicPropertySource
    static void registerPostgresProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
    }

    @Test
    @Order(1)
    void add() {
        Person person = Instancio.of(Person.class)
                .ignore(Select.field("id"))
                .create();
        person = restTemplate.postForObject("/persons", person, Person.class);
        Assertions.assertNotNull(person);
        Assertions.assertNotNull(person.getId());
    }

    @Test
    @Order(2)
    void updateAndGet() {
        final Integer id = 1;
        Person person = Instancio.of(Person.class)
                .set(Select.field("id"), id)
                .create();
        restTemplate.put("/persons", person);
        Person updated = restTemplate.getForObject("/persons/{id}", Person.class, id);
        Assertions.assertNotNull(updated);
        Assertions.assertNotNull(updated.getId());
        Assertions.assertEquals(id, updated.getId());
    }

}

Final Thoughts

I hope this article will help you avoid some common pitfalls when running Java apps on Kubernetes. Treat it as a summary of other people’s recommendations I found in similar articles and of my own experience in that area. Maybe you will find some of these rules quite controversial. Feel free to share your opinions and feedback in the comments – it will also be valuable for me. If you liked this article, I once again recommend reading another one from my blog, more focused on running microservices-based apps on Kubernetes – Best Practices For Microservices on Kubernetes. It also contains several useful (I hope) recommendations.

10 COMMENTS

Halil-Cem Gürsoy

Dear Piotr,
you write “Anyway, from the Kubernetes perspective, the pod receives the resources it requests. The limit has nothing to do with it.”.
Unfortunately, this is not correct.
The requests (spec.containers[].resources.requests.memory and spec.containers[].resources.requests.cpu) are used by the scheduler to identify the node on which to start a container.
And later on, the memory request value is used to set the oom-score-adj parameter for the kernel. Based on this, the kernel memory manager decides which processes have to be killed in case of memory shortage or node eviction.

    piotr.minkowski

    Ok, but what exactly is not correct? I guess you mean that the pod may not get the resources it requests and therefore won’t be scheduled on the node?

Matt Henry

Don’t configure your app to log to AMQP. 12-Factor principles say that the application should log to the console (stdout) and it should be the infrastructure’s responsibility to ship the logs, using something like Fluentd or Filebeat.

    piotr.minkowski

    Yes, I know that. That’s why in the summary I write that there are some controversial bits of advice in the article. But if you carefully read what I wrote, and also read the 11th rule of 12-factor, I think you may come to a similar opinion as mine, since RabbitMQ is not the destination of my logs. The routing is not done by my app – but by the AMQP exchange. You can then send the logs queued by RabbitMQ anywhere else – e.g. to Logstash.

wind57

The problem with setting the CPU limit is not the limit itself, imho. It is the fact that requests and limits on CPU translate to very different things, which are rarely understood or even thought about properly. When I had a lot more time, I even investigated that a lot: https://stackoverflow.com/questions/55047093/whats-the-difference-between-pod-resources-limits-and-resources-requests-in-kub/70591202#70591202 Otherwise, there are multiple people as well that I know who recommend not setting those limits at all, agreed.

Halil-Cem Gürsoy

Please excuse my misleading statement.
My point is that there is no guarantee that the resources that were requested will be available later. One example is node eviction if the node goes low on resources.

Anyway, I just forgot to say – a good article.

dnastacio

@Halil-Cem Gürsoy, a container, while running, will always have the requested resources. That is the guarantee.

If a parent pod has to be evicted, the pod containers are terminated and only then the resources are released for other purposes.

KAMESH

nice article, well thought out on various facets of the java app on a k8s env.

    piotr.minkowski

    Thanks!

Halil-Cem Gürsoy

@dnastacio – sorry for the late reply, I didn’t get a notification.

You have no guarantees.
Regarding the CPU, ultimately the kernel CFS decides how much CPU your process gets, and this depends on how all the other processes on the node are started.
And regarding the memory, especially on cgroup v1 systems, there is no guarantee for the requested memory! This changes on cgroup v2, if this feature is configured in the kubelet – memory QoS is still alpha!
