Java Flight Recorder on Kubernetes

In this article, you will learn how to continuously monitor apps on Kubernetes with Java Flight Recorder and Cryostat. Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data generated by a Java app. It is designed for use even in heavily loaded production environments, since it causes almost no performance overhead. We can say that Java Flight Recorder acts similarly to an airplane’s black box: even if the JVM crashes, we can analyze the diagnostic data collected just before the failure. This makes JFR especially useful in an environment with many running apps – like Kubernetes.
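Before we move to Kubernetes, it is worth knowing that you can try JFR on any JVM with a couple of commands. Here is a minimal sketch (the JVM flag and jcmd subcommands are standard JDK tooling; the PID, jar, and file names are just examples):

```shell
# start a 60-second recording at JVM launch and dump it to a file
java -XX:StartFlightRecording=duration=60s,filename=recording.jfr -jar app.jar

# or attach to an already running JVM (PID 12345 is an example)
jcmd 12345 JFR.start name=demo
jcmd 12345 JFR.dump name=demo filename=demo.jfr
jcmd 12345 JFR.stop name=demo
```

Cryostat automates exactly this kind of start/dump/stop lifecycle for every detected container.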

Assuming that we are running many Java apps on Kubernetes, we should be interested in a tool that helps to automatically gather the data generated by Java Flight Recorder. Here comes Cryostat. It allows us to securely manage JFR recordings for containerized Java workloads. With the built-in discovery mechanism, it can detect all the apps that expose JFR data. Depending on the use case, we can store and analyze recordings directly on the Kubernetes cluster with the Cryostat Dashboard, or export the recorded data to perform a more in-depth analysis.

If you are interested in more topics related to Java apps on Kubernetes, you can take a look at some other posts on my blog. The following article describes a list of best practices for running Java apps on Kubernetes. You can also read about how to resize the CPU limit to speed up Java startup on Kubernetes here.

Source Code

If you would like to try it out yourself, you can always take a look at my source code. In order to do that, you need to clone my GitHub repository and go to the callme-service directory. After that, you should just follow my instructions. Let’s begin.

Install Cryostat on Kubernetes

In the first step, we install Cryostat on Kubernetes using its operator. In order to use and manage operators on Kubernetes, we should have the Operator Lifecycle Manager (OLM) installed on the cluster. The operator-sdk binary provides a command to easily install and uninstall OLM:

$ operator-sdk olm install

Alternatively, you can use the Helm chart to install Cryostat on Kubernetes. Firstly, let’s add the following repository:
$ helm repo add openshift https://charts.openshift.io/

Then, install the chart with the following command:
$ helm install my-cryostat openshift/cryostat --version 0.4.0

Once the OLM is running on our cluster, we can proceed to the Cryostat installation. We can find the required YAML manifest with the Subscription declaration in the Operator Hub. Let’s just apply the manifest to the cluster with the following command:

$ kubectl create -f https://operatorhub.io/install/cryostat-operator.yaml

By default, this operator will be installed in the operators namespace and will be usable from all namespaces in the cluster. After installation, we can verify if the operator works fine by executing the following command:

$ kubectl get csv -n operators

In order to simplify the Cryostat installation process, we can use OpenShift. With OpenShift we don’t need to install OLM, since it is already there. We just need to find the “Red Hat build of Cryostat” operator in the Operator Hub and install it using OpenShift Console. By default, the operator is available in the openshift-operators namespace.

Then, let’s create a namespace dedicated to running Cryostat and our sample app. The name of the namespace is demo-jfr.

$ kubectl create ns demo-jfr

Cryostat recommends using cert-manager for traffic encryption. In our exercise, we disable that integration for simplification purposes. However, in a production environment, you should install cert-manager unless you use another solution for encrypting traffic. In order to run Cryostat in the selected namespace, we need to create the Cryostat object. The parameter spec.enableCertManager should be set to false.

apiVersion: operator.cryostat.io/v1beta1
kind: Cryostat
metadata:
  name: cryostat-sample
  namespace: demo-jfr
spec:
  enableCertManager: false
  eventTemplates: []
  minimal: false
  reportOptions:
    replicas: 0
  storageOptions:
    pvc:
      annotations: {}
      labels: {}
      spec: {}
  trustedCertSecrets: []

If everything goes fine, you should see the following pod in the demo-jfr namespace:

$ kubectl get po -n demo-jfr
NAME                               READY   STATUS    RESTARTS   AGE
cryostat-sample-5c57c9b8b8-smzx9   3/3     Running   0          60s

Here’s a list of Kubernetes Services. The Cryostat Dashboard is exposed by the cryostat-sample Service on port 8181.

$ kubectl get svc -n demo-jfr
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
cryostat-sample           ClusterIP   172.31.56.83    <none>        8181/TCP,9091/TCP   70m
cryostat-sample-grafana   ClusterIP   172.31.155.26   <none>        3000/TCP            70m

We can access the Cryostat dashboard using the Kubernetes Ingress or OpenShift Route. Currently, there are no apps to monitor.
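If you don’t have an Ingress or Route configured yet, a quick way to reach the dashboard locally is kubectl port-forward (assuming the Service name shown above):

```shell
# forward the dashboard port to localhost
kubectl port-forward svc/cryostat-sample 8181:8181 -n demo-jfr
# then open http://localhost:8181 in the browser
```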

Create Sample Java App

We build a sample Java app using the Spring Boot framework. Our app exposes a single REST endpoint. As you can see, the endpoint implementation is very simple. The pingWithRandomDelay() method adds a random delay between 0 and 3 seconds and returns a string. However, there is one interesting thing inside that method. We are creating the ProcessingEvent object (1). Then, we call its begin method just before sleeping the thread (2). After the thread is resumed, we call the commit method on the ProcessingEvent object (3). In this inconspicuous way, we are generating our first custom JFR event. This event aims to monitor the processing time of our method.

@RestController
@RequestMapping("/callme")
public class CallmeController {

   private static final Logger LOGGER = LoggerFactory.getLogger(CallmeController.class);

   private Random random = new Random();
   private AtomicInteger index = new AtomicInteger();

   @Autowired
   private Optional<BuildProperties> buildProperties;

   @Value("${VERSION}")
   private String version;

   @GetMapping("/ping-with-random-delay")
   public String pingWithRandomDelay() throws InterruptedException {
      int r = random.nextInt(3000);
      int i = index.incrementAndGet();
      ProcessingEvent event = new ProcessingEvent(i); // (1)
      event.begin(); // (2)
      LOGGER.info("Ping with random delay: id={}, name={}, version={}, delay={}", i,
             buildProperties.isPresent() ? buildProperties.get().getName() : "callme-service", version, r);
      Thread.sleep(r);
      event.commit(); // (3)
      return "I'm callme-service " + version;
   }

}

Let’s switch to the ProcessingEvent implementation. Our custom event needs to extend the jdk.jfr.Event abstract class. It contains a single field, id. We can use some additional annotations to improve the event presentation in the JFR graphical tools. The event will be visible under the name set in the @Name annotation and in the category set in the @Category annotation. We also need to annotate the field with @Label to make it visible as part of the event.

@Name("ProcessingEvent")
@Category("Custom Events")
@Label("Processing Time")
public class ProcessingEvent extends Event {
    @Label("Event ID")
    private Integer id;

    public ProcessingEvent(Integer id) {
        this.id = id;
    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }
}

Of course, our app will generate a lot of standard JFR events useful for profiling and monitoring. But we could also monitor our custom event.
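If you are concerned about overhead, the jdk.jfr.Event API also lets you check whether an event will actually be recorded before doing any expensive work. Here is a minimal, standalone sketch using a simplified, field-based variant of our event (SimpleProcessingEvent and JfrGuardDemo are hypothetical names, not classes from the sample app):

```java
import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

// Hypothetical simplified variant of the article's ProcessingEvent
@Name("demo.SimpleProcessingEvent")
@Category("Custom Events")
@Label("Processing Time")
class SimpleProcessingEvent extends Event {
    @Label("Event ID")
    int id;
}

public class JfrGuardDemo {
    public static void main(String[] args) throws InterruptedException {
        SimpleProcessingEvent event = new SimpleProcessingEvent();
        event.begin();
        Thread.sleep(50); // the work being measured
        event.end();
        // shouldCommit() is false when no recording has this event enabled,
        // so we can skip building any expensive payload
        if (event.shouldCommit()) {
            event.id = 1;
            event.commit();
        }
        System.out.println("recording active: " + event.isEnabled());
    }
}
```

With no active recording, isEnabled() and shouldCommit() return false, so the guard keeps the hot path essentially free.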

Build App Image and Deploy on Kubernetes

Once we finish the implementation, we may build the container image of our Spring Boot app. Spring Boot comes with a feature for building container images based on the Cloud Native Buildpacks. In the Maven pom.xml you will find a dedicated profile under the build-image id. Once you activate such a profile, it will build the image using the Paketo builder-jammy-base image.

<profile>
  <id>build-image</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
        <configuration>
          <image>
            <builder>paketobuildpacks/builder-jammy-base:latest</builder>
            <name>piomin/${project.artifactId}:${project.version}</name>
          </image>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>build-image</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>

Before running the build we should start Docker on the local machine. After that, we should execute the following Maven command:

$ mvn clean package -Pbuild-image -DskipTests

With the build-image profile activated, the Spring Boot Maven Plugin builds the image of our app. You should get a similar result. In my case, the image tag is piomin/callme-service:1.2.1.

By default, Paketo Java Buildpacks use the BellSoft Liberica JDK. With the Paketo BellSoft Liberica Buildpack, we can easily enable Java Flight Recorder for the container using the BPL_JFR_ENABLED environment variable. In order to expose data for Cryostat, we also need to enable the JMX port. In theory, we could use the BPL_JMX_ENABLED and BPL_JMX_PORT environment variables for that. However, that option adds some extra parameters to the java command that break Cryostat discovery. This issue has already been described here. Therefore we will use the JAVA_TOOL_OPTIONS environment variable to set the required JVM parameters directly on the running command.

Instead of exposing the JMX port for discovery, we can include the Cryostat agent in the app dependencies. In that case, we should set the address of the Cryostat API in the Kubernetes Deployment manifest. However, I prefer an approach that doesn’t require any changes on the app side.
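For reference, the agent-based approach would boil down to adding a dependency roughly like this in the pom.xml (the io.cryostat:cryostat-agent coordinates are my assumption based on the project name, and the version is only an example – check the Cryostat documentation for the current ones):

```xml
<dependency>
  <groupId>io.cryostat</groupId>
  <artifactId>cryostat-agent</artifactId>
  <!-- version is an example; use the release matching your Cryostat server -->
  <version>0.4.0</version>
</dependency>
```

The agent then needs the Cryostat API address passed via environment variables in the Deployment, which is exactly the app-side coupling we want to avoid here.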

Now, let’s get back to Cryostat app discovery. Cryostat is able to automatically detect pods with a JMX port exposed. It requires a specific configuration of the Kubernetes Service: we need to set the name of the port to jfr-jmx. In theory, we can expose JMX on any port we want, but for me anything other than 9091 caused discovery problems in Cryostat. In the Deployment definition, we have to set the BPL_JFR_ENABLED environment variable to true, and JAVA_TOOL_OPTIONS to -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=9091.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: callme-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: callme-service
  template:
    metadata:
      labels:
        app: callme-service
    spec:
      containers:
        - name: callme-service
          image: piomin/callme-service:1.2.1
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
            - containerPort: 9091
          env:
            - name: VERSION
              value: "v1"
            - name: BPL_JFR_ENABLED
              value: "true"
            - name: JAVA_TOOL_OPTIONS
              value: "-Dcom.sun.management.jmxremote.port=9091 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
---
apiVersion: v1
kind: Service
metadata:
  name: callme-service
  labels:
    app: callme-service
spec:
  type: ClusterIP
  ports:
  - port: 8080
    name: http
  - port: 9091
    name: jfr-jmx
  selector:
    app: callme-service

Let’s apply our deployment manifest to the demo-jfr namespace:

$ kubectl apply -f k8s/deployment-jfr.yaml -n demo-jfr

Here’s a list of pods of our callme-service app:

$ kubectl get po -n demo-jfr -l app=callme-service -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE
callme-service-6bc5745885-kvqfr   1/1     Running   0          31m   10.134.0.29   worker-cluster-lvsqq-1
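To give Java Flight Recorder something to record, let’s generate some traffic to the endpoint that emits our custom event (using port-forward to reach the Service locally):

```shell
kubectl port-forward svc/callme-service 8080:8080 -n demo-jfr &
# call the endpoint several times to emit ProcessingEvent
for i in $(seq 1 20); do
  curl http://localhost:8080/callme/ping-with-random-delay
  echo
done
```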

Using Cryostat with JFR

View Default Dashboards

Cryostat automatically detects all the pods behind a Kubernetes Service that exposes the JMX port. Once we switch to the Cryostat Dashboard, we will see the name of our pod in the “Target” dropdown. The default dashboard shows diagrams illustrating CPU load, heap memory usage, and the number of running Java threads.

java-flight-recorder-kubernetes-dashboard

Then, we can go to the “Recordings” section. It shows a list of active recordings made by Java Flight Recorder for our app running on Kubernetes. By default, Cryostat creates and starts a single recording for each detected target.

We can expand the selected recording to see a detailed view. It provides a summary panel divided into several categories like heap, memory leak, or exceptions. It highlights warnings in yellow and problems in red.

java-flight-recorder-kubernetes-panel

We can display a detailed description of each case. We just need to click on the selected field with a problem name. The detailed description will appear in the context menu.

java-flight-recorder-kubernetes-description

Create and Use a Custom Event Template

We can create a custom recording strategy by defining a new event template. Firstly, we need to go to the “Events” section, and then to the “Event Templates” tab. There are three built-in templates, and we can use any of them as a base for our custom template. After deciding which one to choose, we can download it to our laptop. The default file extension is *.jfc.

java-flight-recorder-kubernetes-event-templates
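A *.jfc template is just an XML file with per-event settings, so you can also inspect or tweak it by hand. Here is a shortened sketch of what the format looks like (the structure follows the standard JFR configuration format; the concrete events and values are only examples):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration version="2.0" label="Continuous Detailed">
  <event name="jdk.GarbageCollection">
    <setting name="enabled">true</setting>
    <setting name="threshold">0 ms</setting>
  </event>
  <event name="jdk.ThreadDump">
    <setting name="enabled">true</setting>
    <setting name="period">60 s</setting>
  </event>
</configuration>
```

That said, editing these files by hand is error-prone, which is why we use a dedicated editor in the next step.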

In order to edit *.jfc files, we need a dedicated tool: JDK Mission Control. Each vendor provides such a tool for its distribution of the JDK. In our case, it is BellSoft Liberica. Once we download and install Liberica Mission Control on the laptop, we should go to Window -> Flight Recording Template Manager.

java-flight-recorder-kubernetes-mission-control

With the Flight Recording Template Manager, we can import and edit an exported event template. I chose a higher monitoring level for “Garbage Collection”, “Allocation Profiling”, “Compiler”, and “Thread Dump”.

java-flight-recorder-kubernetes-template-manager

Once the new template is ready, we should save it under a selected name. For me, it is “Continuous Detailed”. After that, we need to export the template to a file.

Then, we need to switch to the Cryostat Dashboard. We have to import the newly created template exported to the *.jfc file.

Once you import the template, you should see a new strategy in the “Event Templates” section.

We can create a recording based on our custom “Continuous_Detailed” template. After some time, Cryostat should gather the data generated by Java Flight Recorder for the app running on Kubernetes. However, this time we want to perform some advanced analysis using Liberica Mission Control rather than just the Cryostat Dashboard. Therefore we will export the recording to a *.jfr file. Such a file may then be imported into the JDK Mission Control tool.
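Before reaching for a GUI, you can also take a quick look at an exported recording with the jfr command-line tool shipped with the JDK (the file name is an example):

```shell
# high-level summary of the recording: event counts, sizes, time range
jfr summary recording.jfr
# print only our custom event
jfr print --events ProcessingEvent recording.jfr
```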

Use the JDK Mission Control Tool

Let’s open the exported *.jfr file with Liberica Mission Control. Once we do, we can analyze all the important aspects related to the performance of our Java app. We can display a table with memory allocation per object type.

We can display a list of running Java threads.

Finally, we go to the “Event Browser” section. In the “Custom Events” category we should find our custom event under the name determined by the @Label annotation on the ProcessingEvent class. We can see the history of all generated JFR events together with the duration, start time, and the name of the processing thread.

Final Thoughts

Cryostat helps you manage Java Flight Recorder on Kubernetes at scale. It provides a graphical dashboard that allows us to monitor all the Java workloads that expose JFR data over JMX. Importantly, even after an app crash we can export the archived monitoring report and analyze it using advanced tools like JDK Mission Control.
