Java Flight Recorder on Kubernetes
In this article, you will learn how to continuously monitor apps on Kubernetes with Java Flight Recorder and Cryostat. Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data generated by the Java app. It is designed for use even in heavily loaded production environments since it causes almost no performance overhead. We can say that Java Flight Recorder acts similarly to an airplane’s black box. Even if the JVM crashes, we can analyze the diagnostic data collected just before the failure. This fact makes JFR especially usable in an environment with many running apps – like Kubernetes.
Assuming that we are running many Java apps on Kubernetes, we should be interested in a tool that helps to automatically gather data generated by Java Flight Recorder. Here comes Cryostat. It allows us to securely manage JFR recordings for containerized Java workloads. With the built-in discovery mechanism, it can detect all the apps that expose JFR data. Depending on the use case, we can store and analyze recordings directly in the Cryostat Dashboard running on the Kubernetes cluster, or export the recorded data to perform a more in-depth analysis.
If you are interested in more topics related to Java apps on Kubernetes, you can take a look at some other posts on my blog. The following article describes a list of best practices for running Java apps on Kubernetes. You can also read, for example, about how to resize the CPU limit to speed up Java startup on Kubernetes here.
Source Code
If you would like to try it yourself, you can take a look at my source code. To do that, clone my GitHub repository and go to the callme-service directory. After that, just follow my instructions. Let’s begin.
Install Cryostat on Kubernetes
In the first step, we install Cryostat on Kubernetes using its operator. In order to use and manage operators on Kubernetes, we should have the Operator Lifecycle Manager (OLM) installed on the cluster. The operator-sdk binary provides a command to easily install and uninstall OLM:
$ operator-sdk olm install
Alternatively, you can use the Helm chart to install Cryostat on Kubernetes. Firstly, let’s add the following repository:
$ helm repo add openshift https://charts.openshift.io/
Then, install the chart with the following command:
$ helm install my-cryostat openshift/cryostat --version 0.4.0
Once the OLM is running on our cluster, we can proceed to the Cryostat installation. We can find the required YAML manifest with the Subscription declaration in the Operator Hub. Let’s just apply the manifest to the target cluster with the following command:
$ kubectl create -f https://operatorhub.io/install/cryostat-operator.yaml
By default, this operator will be installed in the operators namespace and will be usable from all namespaces in the cluster. After installation, we can verify if the operator works fine by executing the following command:
$ kubectl get csv -n operators
In order to simplify the Cryostat installation process, we can use OpenShift. With OpenShift we don’t need to install OLM, since it is already there. We just need to find the “Red Hat build of Cryostat” operator in the Operator Hub and install it using the OpenShift Console. By default, the operator is available in the openshift-operators namespace.
Then, let’s create a namespace dedicated to running Cryostat and our sample app. The name of the namespace is demo-jfr.
$ kubectl create ns demo-jfr
Cryostat recommends using cert-manager for traffic encryption. In our exercise, we disable that integration for simplicity. However, in a production environment, you should install cert-manager unless you use another solution for encrypting traffic. In order to run Cryostat in the selected namespace, we need to create the Cryostat object. The parameter spec.enableCertManager should be set to false.
apiVersion: operator.cryostat.io/v1beta1
kind: Cryostat
metadata:
  name: cryostat-sample
  namespace: demo-jfr
spec:
  enableCertManager: false
  eventTemplates: []
  minimal: false
  reportOptions:
    replicas: 0
  storageOptions:
    pvc:
      annotations: {}
      labels: {}
      spec: {}
  trustedCertSecrets: []
If everything goes fine, you should see the following pod in the demo-jfr namespace:
$ kubectl get po -n demo-jfr
NAME READY STATUS RESTARTS AGE
cryostat-sample-5c57c9b8b8-smzx9 3/3 Running 0 60s
Here’s a list of Kubernetes Services. The Cryostat Dashboard is exposed by the cryostat-sample Service on port 8181.
$ kubectl get svc -n demo-jfr
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cryostat-sample ClusterIP 172.31.56.83 <none> 8181/TCP,9091/TCP 70m
cryostat-sample-grafana ClusterIP 172.31.155.26 <none> 3000/TCP 70m
We can access the Cryostat dashboard using a Kubernetes Ingress or an OpenShift Route. Currently, there are no apps to monitor.
Create Sample Java App
We build a sample Java app using the Spring Boot framework. Our app exposes a single REST endpoint. As you can see, the endpoint implementation is very simple. The pingWithRandomDelay() method adds a random delay between 0 and 3 seconds and returns a string. However, there is one interesting thing inside that method. We create the ProcessingEvent object (1). Then, we call its begin method just before putting the thread to sleep (2). After the thread resumes, we call the commit method on the ProcessingEvent object (3). In this inconspicuous way, we generate our first custom JFR event. This event aims to monitor the processing time of our method.
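import java.util.Optional;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.info.BuildProperties;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/callme")
public class CallmeController {
    private static final Logger LOGGER = LoggerFactory.getLogger(CallmeController.class);
    private final Random random = new Random();
    private final AtomicInteger index = new AtomicInteger();
    // Assumed declaration: optional Spring Boot build metadata, used for the logged app name
    @Autowired
    private Optional<BuildProperties> buildProperties;
    @Value("${VERSION}")
    private String version;
    @GetMapping("/ping-with-random-delay")
    public String pingWithRandomDelay() throws InterruptedException {
        int r = random.nextInt(3000); // delay in milliseconds
        int i = index.incrementAndGet();
        ProcessingEvent event = new ProcessingEvent(i); // (1)
        event.begin(); // (2)
        LOGGER.info("Ping with random delay: id={}, name={}, version={}, delay={}", i,
                buildProperties.isPresent() ? buildProperties.get().getName() : "callme-service", version, r);
        Thread.sleep(r);
        event.commit(); // (3)
        return "I'm callme-service " + version;
    }
}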
Let’s switch to the ProcessingEvent implementation. Our custom event needs to extend the jdk.jfr.Event abstract class. It contains a single field, id. We can use some additional annotations to improve the event presentation in the JFR graphical tools. The event is identified by the name set in the @Name annotation, displayed under the label set in the class-level @Label annotation, and grouped under the category set in the @Category annotation. We also annotate the id field with @Label so that it appears under a readable name as part of the event.
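import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

@Name("ProcessingEvent")
@Category("Custom Events")
@Label("Processing Time")
public class ProcessingEvent extends Event {
    // JFR records only primitive, String, Thread, and Class fields;
    // a boxed Integer would be silently left out of the event.
    @Label("Event ID")
    private int id;
    public ProcessingEvent(int id) {
        this.id = id;
    }
    public int getId() {
        return id;
    }
    public void setId(int id) {
        this.id = id;
    }
}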
Of course, our app will generate a lot of standard JFR events useful for profiling and monitoring. But we could also monitor our custom event.
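To check that a custom event is really recorded before deploying anything, we can start an in-process recording, commit a few events, and read them back from the dump. The following is only a minimal sketch: the DemoProcessingEvent class and the JfrRoundTrip helper are hypothetical names introduced for illustration, and only the jdk.jfr API calls come from the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical stand-in for the article's ProcessingEvent, so the sketch compiles on its own.
@Name("demo.ProcessingEvent")
@Category("Custom Events")
@Label("Processing Time")
class DemoProcessingEvent extends Event {
    @Label("Event ID")
    int id;

    DemoProcessingEvent(int id) {
        this.id = id;
    }
}

class JfrRoundTrip {

    // Records `count` events in-process and returns the ids read back from the dump file.
    static List<Integer> recordAndRead(int count) throws IOException, InterruptedException {
        Path file = Files.createTempFile("processing-events", ".jfr");
        try (Recording recording = new Recording()) {
            recording.start();
            for (int i = 1; i <= count; i++) {
                DemoProcessingEvent event = new DemoProcessingEvent(i);
                event.begin();
                try {
                    Thread.sleep(5); // simulated work
                } finally {
                    event.commit(); // commit even if the work throws
                }
            }
            recording.stop();
            recording.dump(file);
        }
        List<Integer> ids = new ArrayList<>();
        for (RecordedEvent e : RecordingFile.readAllEvents(file)) {
            if (e.getEventType().getName().equals("demo.ProcessingEvent")) {
                ids.add(e.getInt("id"));
            }
        }
        Files.deleteIfExists(file);
        return ids;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Recorded event ids: " + recordAndRead(3));
    }
}
```

Wrapping the work in try/finally is a design choice of this sketch: it keeps the event from being lost when the timed code throws, which the simpler begin/commit sequence in the controller does not guarantee.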
Build App Image and Deploy on Kubernetes
Once we finish the implementation, we may build the container image of our Spring Boot app. Spring Boot comes with a feature for building container images based on the Cloud Native Buildpacks. In the Maven pom.xml you will find a dedicated profile under the build-image id. Once you activate such a profile, it will build the image using the Paketo builder-jammy-base image.
<profile>
  <id>build-image</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
        <configuration>
          <image>
            <builder>paketobuildpacks/builder-jammy-base:latest</builder>
            <name>piomin/${project.artifactId}:${project.version}</name>
          </image>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>build-image</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
Before running the build we should start Docker on the local machine. After that, we should execute the following Maven command:
$ mvn clean package -Pbuild-image -DskipTests
With the build-image profile activated, the Spring Boot Maven Plugin builds the image of our app. In my case, the image tag is piomin/callme-service:1.2.1.
By default, Paketo Java Buildpacks use BellSoft Liberica JDK. With the Paketo BellSoft Liberica Buildpack, we can easily enable Java Flight Recorder for the container using the BPL_JFR_ENABLED environment variable. In order to expose data for Cryostat, we also need to enable the JMX port. In theory, we could use the BPL_JMX_ENABLED and BPL_JMX_PORT environment variables for that. However, that option adds some extra parameters to the java command that break the Cryostat discovery. This issue has already been described here. Therefore, we will use the JAVA_TOOL_OPTIONS environment variable to pass the required JVM parameters directly to the java command.
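Since JAVA_TOOL_OPTIONS simply passes -D properties to the JVM, it can be handy to verify what the JVM actually picked up, for example by logging it at startup. The following is a small diagnostic sketch and not part of the sample app; the class name JfrJmxCheck is made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import jdk.jfr.FlightRecorder;

class JfrJmxCheck {

    // Collects the JFR availability flag and the JMX remote properties seen by this JVM.
    static Map<String, String> report() {
        Map<String, String> result = new LinkedHashMap<>();
        result.put("jfr.available", String.valueOf(FlightRecorder.isAvailable()));
        for (String key : new String[] {
                "com.sun.management.jmxremote.port",
                "com.sun.management.jmxremote.ssl",
                "com.sun.management.jmxremote.authenticate"}) {
            // "(not set)" means the corresponding -D option did not reach the JVM.
            result.put(key, System.getProperty(key, "(not set)"));
        }
        return result;
    }

    public static void main(String[] args) {
        report().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```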
Instead of exposing the JMX port for discovery, we can include the Cryostat agent in the app dependencies. In that case, we should set the address of the Cryostat API in the Kubernetes Deployment manifest. However, I prefer an approach that doesn’t require any changes on the app side.
Now, let’s get back to the Cryostat app discovery. Cryostat is able to automatically detect pods with a JMX port exposed. It requires a specific configuration of the Kubernetes Service. We need to set the name of the port to jfr-jmx. In theory, we can expose JMX on any port we want, but for me anything other than 9091 caused discovery problems on Cryostat. In the Deployment definition, we have to set the BPL_JFR_ENABLED env to true, and the JAVA_TOOL_OPTIONS to -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=9091.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: callme-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: callme-service
  template:
    metadata:
      labels:
        app: callme-service
    spec:
      containers:
        - name: callme-service
          image: piomin/callme-service:1.2.1
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
            - containerPort: 9091
          env:
            - name: VERSION
              value: "v1"
            - name: BPL_JFR_ENABLED
              value: "true"
            - name: JAVA_TOOL_OPTIONS
              value: "-Dcom.sun.management.jmxremote.port=9091 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
---
apiVersion: v1
kind: Service
metadata:
  name: callme-service
  labels:
    app: callme-service
spec:
  type: ClusterIP
  ports:
    - port: 8080
      name: http
    - port: 9091
      name: jfr-jmx
  selector:
    app: callme-service
Let’s apply our deployment manifest to the demo-jfr namespace:
$ kubectl apply -f k8s/deployment-jfr.yaml -n demo-jfr
Here’s a list of pods of our callme-service app:
$ kubectl get po -n demo-jfr -l app=callme-service -o wide
NAME READY STATUS RESTARTS AGE IP NODE
callme-service-6bc5745885-kvqfr 1/1 Running 0 31m 10.134.0.29 worker-cluster-lvsqq-1
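The JMX port matters because the JVM exposes Flight Recorder through a management bean (FlightRecorderMXBean) that remote clients can reach over JMX. As a rough sketch of what such a client could look like, the code below lists the recordings of a target JVM. The JfrOverJmx class and the callme-service:9091 address are assumptions for illustration, and this is not a description of how Cryostat itself is implemented.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

import jdk.management.jfr.FlightRecorderMXBean;
import jdk.management.jfr.RecordingInfo;

class JfrOverJmx {

    // Lists "name (state)" for every recording visible through the given JMX connection.
    static List<String> listRecordings(MBeanServerConnection connection) throws IOException {
        FlightRecorderMXBean jfr =
                ManagementFactory.getPlatformMXBean(connection, FlightRecorderMXBean.class);
        List<String> result = new ArrayList<>();
        for (RecordingInfo info : jfr.getRecordings()) {
            result.add(info.getName() + " (" + info.getState() + ")");
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumed address: the Service name and JMX port from the manifest above,
        // reachable from inside the cluster.
        String hostAndPort = args.length > 0 ? args[0] : "callme-service:9091";
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + hostAndPort + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            listRecordings(connector.getMBeanServerConnection()).forEach(System.out::println);
        }
    }
}
```

Keeping listRecordings independent of the connection setup means the same method can also be tried against the local platform MBean server, without any network access.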
Using Cryostat with JFR
View Default Dashboards
Cryostat automatically detects all the pods behind a Kubernetes Service that exposes the JMX port. Once we switch to the Cryostat Dashboard, we will see the name of our pod in the “Target” dropdown. The default dashboard shows diagrams illustrating CPU load, heap memory usage, and the number of running Java threads.
Then, we can go to the “Recordings” section. It shows a list of active recordings made by Java Flight Recorder for our app running on Kubernetes. By default, Cryostat creates and starts a single recording for each detected target.
We can expand the selected recording to see a detailed view. It provides a summary panel divided into several categories like heap, memory leak, or exceptions. It highlights warnings in yellow and problems in red.
We can display a detailed description of each case. We just need to click on the selected field with a problem name. The detailed description will appear in the context menu.
Create and Use a Custom Event Template
We can create a custom recording strategy by defining a new event template. Firstly, we need to go to the “Events” section, and then to the “Event Templates” tab. There are three built-in templates. We can use each of them as a base for our custom template. After deciding which of them to choose, we can download it to our laptop. The default file extension is *.jfc.
In order to edit the *.jfc files we need a special tool called JDK Mission Control. Several JDK vendors ship their own build of this tool. In our case, it is Liberica Mission Control from BellSoft. Once we download and install Liberica Mission Control on the laptop, we should go to Window -> Flight Recording Template Manager.
With the Flight Recording Template Manager, we can import and edit an exported event template. I chose a higher level of monitoring for “Garbage Collection”, “Allocation Profiling”, “Compiler”, and “Thread Dump”.
Once a new template is ready, we should save it under the selected name. For me, it is the “Continuous Detailed” name. After that, we need to export the template to the file.
Then, we need to switch to the Cryostat Dashboard. We have to import the newly created template exported to the *.jfc file.
Once you import the template, you should see a new strategy in the “Event Templates” section.
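Underneath, an event template is a set of JFR settings for individual event types. To illustrate the mechanism in code, the sketch below loads one of the configurations shipped with the JDK and overrides the thread dump settings before starting a recording. The TemplateSketch class, the recording name, and the 10-second period are illustrative assumptions, and the JDK's built-in configurations are not necessarily the same as the templates offered by Cryostat.

```java
import java.io.IOException;
import java.text.ParseException;
import java.util.HashMap;
import java.util.Map;

import jdk.jfr.Configuration;
import jdk.jfr.Recording;

class TemplateSketch {

    // Loads a configuration shipped with the JDK ("default" or "profile") and
    // returns a copy of its settings with more detailed thread dump collection.
    static Map<String, String> detailedSettings(String baseTemplate)
            throws IOException, ParseException {
        Configuration base = Configuration.getConfiguration(baseTemplate);
        Map<String, String> settings = new HashMap<>(base.getSettings());
        // Keys in the flattened settings map follow the "<event name>#<setting>" convention.
        settings.put("jdk.ThreadDump#enabled", "true");
        settings.put("jdk.ThreadDump#period", "10 s"); // illustrative value
        return settings;
    }

    public static void main(String[] args) throws Exception {
        try (Recording recording = new Recording()) {
            recording.setName("continuous-detailed-sketch");
            recording.setSettings(detailedSettings("default"));
            recording.start();
            System.out.println("Started " + recording.getName()
                    + " with " + recording.getSettings().size() + " settings");
            recording.stop();
        }
    }
}
```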
We can create a recording based on our custom “Continuous_Detailed” template. After some time, Cryostat should gather data generated by the Java Flight Recorder for the app running on Kubernetes. However, this time we want to perform a more advanced analysis using Liberica Mission Control rather than just the Cryostat Dashboard. Therefore, we will export the recording to a *.jfr file. Such a file can then be imported into the JDK Mission Control tool.
Use the JDK Mission Control Tool
Let’s open the exported *.jfr file with Liberica Mission Control. Once we do it, we can analyze all the important aspects related to the performance of our Java app. We can display a table with memory allocation per object type.
We can display a list of running Java threads.
Finally, we go to the “Event Browser” section. In the “Custom Events” category we should find our custom event under the name determined by the @Label annotation on the ProcessingEvent class. We can see the history of all generated JFR events together with the duration, start time, and the name of the processing thread.
Final Thoughts
Cryostat helps you manage Java Flight Recorder on Kubernetes at scale. It provides a graphical dashboard that allows us to monitor all the Java workloads that expose JFR data over JMX. The important thing is that even after an app crash we can export the archived monitoring report and analyze it using advanced tools like JDK Mission Control.