Distributed Tracing with Istio, Quarkus and Jaeger

In this article, you will learn how to configure distributed tracing for your service mesh with Istio and Quarkus. For test purposes, we will build and run Quarkus microservices on Kubernetes. The communication between them is going to be managed by Istio. Istio service mesh uses Jaeger as a distributed tracing system.

This time I won’t cover Istio basics. Although our configuration is not complicated, you may want to read an introduction to Istio before we start.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that you need to clone my GitHub repository. Then you should go to the mesh-with-db directory. After that, you should just follow my instructions 🙂

Service Mesh Architecture

Let’s start with our microservices architecture. There are two applications: person-app and insurance-app. As you probably guessed, the person-app stores and returns information about insured people, while the insurance-app keeps insurance data. Each service has a separate database. We deploy the person-app in two versions. The v2 version contains one additional field, externalId.

The following picture illustrates our scenario. Istio splits traffic between the two versions of the person-app. By default, it splits the traffic 50/50. If a request contains the X-Version header, Istio routes it to the particular version of the person-app. Of course, the possible values of the header are v1 and v2.

quarkus-istio-tracing-arch

Distributed Tracing with Istio

Istio generates distributed trace spans for each managed service. It means that every request sent inside the mesh carries the following HTTP headers: x-request-id, x-b3-traceid, x-b3-spanid, x-b3-parentspanid, x-b3-sampled, and x-b3-flags.

So, every single request incoming from the Istio gateway contains X-B3-SpanId, X-B3-TraceId, and some other B3 headers. The X-B3-SpanId indicates the position of the current operation in the trace tree, while all spans in a trace share the same X-B3-TraceId. At first glance, you may be surprised that Istio does not propagate B3 headers in client calls. To clarify, if one service communicates with another service using e.g. a REST client, you will see two different traces. The first of them is related to the incoming API call, while the second is related to the client call to the other API endpoint. That’s not exactly what we would like to achieve, right?

Let’s visualize our problem. If you call the insurance-app through the Istio gateway, you will see the first trace in Jaeger. During that call, the insurance-app calls an endpoint of the person-app using the Quarkus REST client. That’s another, separate trace in Jaeger. Our goal here is to propagate all the required B3 headers to the person-app as well. You can find the list of required headers in the Istio documentation.

quarkus-istio-tracing-details
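
If you would like to double-check which of these headers actually reach your application, you can temporarily register a JAX-RS request filter that logs them. The class below is only a debugging sketch, not a part of the demo code (the javax.* imports match the Quarkus generation used in this article; in Quarkus 3 they move to the jakarta.* packages):

import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.ext.Provider;

@Provider
public class TracingHeadersLogFilter implements ContainerRequestFilter {

   @Override
   public void filter(ContainerRequestContext ctx) {
      // Print the tracing headers injected by the Envoy sidecar for every incoming request
      System.out.printf("x-request-id=%s, x-b3-traceid=%s, x-b3-spanid=%s%n",
            ctx.getHeaderString("x-request-id"),
            ctx.getHeaderString("x-b3-traceid"),
            ctx.getHeaderString("x-b3-spanid"));
   }
}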

Of course, that’s not the only thing we will do today. We will also prepare Istio rules that simulate latency in the communication, which is a good scenario for using a tracing tool. Also, I’m going to show you how to easily deploy microservices on Kubernetes using the Quarkus features built for that.

I’m running Istio and Jaeger on OpenShift. More precisely, I’m using OpenShift Service Mesh, which is Red Hat’s service mesh implementation based on Istio. It doesn’t have any impact on the exercise, so you may as well repeat all the steps on plain Kubernetes.

Create Microservices with Quarkus

Let’s begin with the insurance-app. Here’s the class responsible for the REST endpoint implementation. There are several methods there. However, the most important one for us is getInsuranceDetailsById, which calls the GET /persons/{id} endpoint of the person-app. In order to use the Quarkus REST client extension, we need to inject the client bean with the @RestClient annotation.

@Path("/insurances")
public class InsuranceResource {

   @Inject
   Logger log;
   @Inject
   InsuranceRepository insuranceRepository;
   @Inject @RestClient
   PersonService personService;

   @POST
   @Transactional
   public Insurance addInsurance(Insurance insurance) {
      insuranceRepository.persist(insurance);
      return insurance;
   }

   @GET
   public List<Insurance> getInsurances() {
      return insuranceRepository.listAll();
   }

   @GET
   @Path("/{id}")
   public Insurance getInsuranceById(@PathParam("id") Long id) {
      return insuranceRepository.findById(id);
   }

   @GET
   @Path("/{id}/details")
   public InsuranceDetails getInsuranceDetailsById(@PathParam("id") Long id, @HeaderParam("X-Version") String version) {
      log.infof("getInsuranceDetailsById: id=%d, version=%s", id, version);
      Insurance insurance = insuranceRepository.findById(id);
      InsuranceDetails insuranceDetails = new InsuranceDetails();
      insuranceDetails.setPersonId(insurance.getPersonId());
      insuranceDetails.setAmount(insurance.getAmount());
      insuranceDetails.setType(insurance.getType());
      insuranceDetails.setExpiry(insurance.getExpiry());
      insuranceDetails.setPerson(personService.getPersonById(insurance.getPersonId()));
      return insuranceDetails;
   }

}
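
For clarity, the InsuranceDetails object returned above is a plain DTO that combines the insurance data with the person fetched from the person-app. A minimal sketch could look as follows (only the fields set in getInsuranceDetailsById are shown, and the field types are assumptions):

// A sketch of the InsuranceDetails DTO; adjust the field types to your model
public class InsuranceDetails {

   private Long personId;
   private int amount;
   private String type;
   private String expiry;
   private Person person;

   public Long getPersonId() { return personId; }
   public void setPersonId(Long personId) { this.personId = personId; }
   public int getAmount() { return amount; }
   public void setAmount(int amount) { this.amount = amount; }
   public String getType() { return type; }
   public void setType(String type) { this.type = type; }
   public String getExpiry() { return expiry; }
   public void setExpiry(String expiry) { this.expiry = expiry; }
   public Person getPerson() { return person; }
   public void setPerson(Person person) { this.person = person; }
}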

As you probably remember from the previous section, we need to propagate several headers responsible for Istio tracing to the downstream Quarkus service. Let’s take a look at the REST client declared in the PersonService interface. In order to send some additional headers with the request, we need to annotate the interface with @RegisterClientHeaders. Then we have two options: we can provide a custom headers factory as shown below, or we can use the property org.eclipse.microprofile.rest.client.propagateHeaders with a list of headers.

@Path("/persons")
@RegisterRestClient
@RegisterClientHeaders(RequestHeaderFactory.class)
public interface PersonService {

   @GET
   @Path("/{id}")
   Person getPersonById(@PathParam("id") Long id);
}
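
For reference, the property-based alternative mentioned above boils down to a single entry in application.properties. In that case, you keep the @RegisterClientHeaders annotation without a custom factory class and list the headers to forward (a sketch limited to the headers we care about here):

org.eclipse.microprofile.rest.client.propagateHeaders=X-Version,X-B3-TraceId,X-B3-SpanId,X-B3-ParentSpanId,X-B3-Sampled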

Now, let’s take a look at the implementation of the REST endpoint in the person-app. The getPersonById method at the bottom is responsible for finding a person by the id field. It is the target method called by the PersonService client.

@Path("/persons")
public class PersonResource {

   @Inject
   Logger log;
   @Inject
   PersonRepository personRepository;

   @POST
   @Transactional
   public Person addPerson(Person person) {
      personRepository.persist(person);
      return person;
   }

   @GET
   public List<Person> getPersons() {
      return personRepository.listAll();
   }

   @GET
   @Path("/{id}")
   public Person getPersonById(@PathParam("id") Long id) {
      log.infof("getPersonById: id=%d", id);
      Person p = personRepository.findById(id);
      log.infof("getPersonById: %s", p);
      return p;
   }
}

Finally, here’s the implementation of our custom client headers factory. It needs to implement the ClientHeadersFactory interface and its update() method. We are not doing anything complicated here: we forward the B3 tracing headers and the X-Version header used by Istio to route between the two versions of the person-app. I also added some logs. Since we provide a custom factory, I didn’t use the already mentioned header propagation based on the org.eclipse.microprofile.rest.client.propagateHeaders property.

@ApplicationScoped
public class RequestHeaderFactory implements ClientHeadersFactory {

   @Inject
   Logger log;

   @Override
   public MultivaluedMap<String, String> update(MultivaluedMap<String, String> inHeaders,
                                                 MultivaluedMap<String, String> outHeaders) {
      String version = inHeaders.getFirst("x-version");
      log.infof("Version Header: %s", version);
      String traceId = inHeaders.getFirst("x-b3-traceid");
      log.infof("Trace Header: %s", traceId);
      MultivaluedMap<String, String> result = new MultivaluedHashMap<>();
      result.add("X-Version", version);
      result.add("X-B3-TraceId", traceId);
      result.add("X-B3-SpanId", inHeaders.getFirst("x-b3-spanid"));
      result.add("X-B3-ParentSpanId", inHeaders.getFirst("x-b3-parentspanid"));
      return result;
   }
}

Run Quarkus Applications on Kubernetes

Before we test Istio tracing, we need to deploy our Quarkus microservices on Kubernetes. We may do it in several different ways. One of them is provided directly by Quarkus: it can generate the Deployment manifests automatically during the build. It turns out that we can apply Istio manifests to the Kubernetes cluster this way as well. Firstly, we need to include the following two modules.

<dependency>
  <groupId>io.quarkus</groupId>
  <artifactId>quarkus-openshift</artifactId>
</dependency>
<dependency>
  <groupId>me.snowdrop</groupId>
  <artifactId>istio-client</artifactId>
  <version>1.7.7.1</version>
</dependency>

If you deploy on plain Kubernetes, just replace the quarkus-openshift module with the quarkus-kubernetes module. In the next step, we need to provide some configuration settings in the application.properties file. In order to enable deployment during the build, we set the property quarkus.kubernetes.deploy to true. We can configure several aspects of the Kubernetes Deployment. For example, we may enable the Istio proxy by setting the sidecar.istio.io/inject annotation to true, or add the labels required for routing: app and version (in this case). Finally, our application connects to a database, so we need to inject the credentials from a Kubernetes Secret.

quarkus.container-image.group = demo-mesh
quarkus.container-image.build = true
quarkus.kubernetes.deploy = true
quarkus.kubernetes.deployment-target = openshift
quarkus.kubernetes-client.trust-certs = true

quarkus.openshift.deployment-kind = Deployment
quarkus.openshift.labels.app = quarkus-insurance-app
quarkus.openshift.labels.version = v1
quarkus.openshift.annotations."sidecar.istio.io/inject" = true
quarkus.openshift.env.mapping.postgres_user.from-secret = insurance-db
quarkus.openshift.env.mapping.postgres_user.with-key = database-user
quarkus.openshift.env.mapping.postgres_password.from-secret = insurance-db
quarkus.openshift.env.mapping.postgres_password.with-key = database-password
quarkus.openshift.env.mapping.postgres_db.from-secret = insurance-db
quarkus.openshift.env.mapping.postgres_db.with-key = database-name

Here’s the part of the configuration responsible for the database connection settings. The name of the key after quarkus.openshift.env.mapping maps to the name of an environment variable, for example the postgres_password key maps to the POSTGRES_PASSWORD variable.

quarkus.datasource.db-kind = postgresql
quarkus.datasource.username = ${POSTGRES_USER}
quarkus.datasource.password = ${POSTGRES_PASSWORD}
quarkus.datasource.jdbc.url = jdbc:postgresql://person-db:5432/${POSTGRES_DB}
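
The database Secrets referenced in these settings (insurance-db and person-db) have to exist in the target namespace before the deployment. If you need to create one manually, a sketch with placeholder values could look like this:

$ kubectl create secret generic person-db \
    --from-literal=database-user=person \
    --from-literal=database-password=person123 \
    --from-literal=database-name=person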

If there are any additional manifests to apply, we should place them inside the src/main/kubernetes directory. This applies, for example, to the Istio configuration. So, now the only thing we need to do is build the application. Firstly, go to the quarkus-person-app directory and run the following command. Then go to the quarkus-insurance-app directory and do the same.

$ mvn clean package
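
Once both builds finish, you can verify that the applications have been deployed. Each pod should report two running containers: the application and the Envoy sidecar. The demo-mesh namespace below is just an assumption matching the image group configured earlier, so adjust it to your project:

$ kubectl get pods -n demo-mesh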

Traffic Management with Istio

There are two versions of the person-app application. So, let’s create the DestinationRule object containing two subsets v1 and v2 based on the version label.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: quarkus-person-app-dr
spec:
  host: quarkus-person-app
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2

In the next step, we need to create the VirtualService object for the quarkus-person-app service. The routing between versions is based on the X-Version header. If it is not set, Istio sends 50% of the traffic to the v1 version and 50% to the v2 version. Also, we inject a delay into the v2 route using the Istio HTTPFaultInjection API. It adds a 3-second delay to 50% of the incoming requests.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: quarkus-person-app-vs
spec:
  hosts:
    - quarkus-person-app
  http:
    - match:
        - headers:
            X-Version:
              exact: v1
      route:
        - destination:
            host: quarkus-person-app
            subset: v1
    - match:
        - headers:
            X-Version:
              exact: v2
      route:
        - destination:
            host: quarkus-person-app
            subset: v2
      fault:
        delay:
          fixedDelay: 3s
          percentage:
            value: 50
    - route:
        - destination:
            host: quarkus-person-app
            subset: v1
          weight: 50
        - destination:
            host: quarkus-person-app
            subset: v2
          weight: 50

Now, let’s create the Istio Gateway object. Replace the CLUSTER_DOMAIN variable with your cluster’s domain name:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: microservices-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - quarkus-insurance-app.apps.$CLUSTER_DOMAIN
        - quarkus-person-app.apps.$CLUSTER_DOMAIN
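
If you don’t know your cluster’s domain, you can check the external address of the Istio ingress gateway Service. The istio-system namespace below is the upstream default; on OpenShift Service Mesh the control plane project may have a different name:

$ kubectl get svc istio-ingressgateway -n istio-system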

In order to forward traffic from the gateway, the VirtualService needs to refer to that gateway.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: quarkus-insurance-app-vs
spec:
  hosts:
    - quarkus-insurance-app.apps.$CLUSTER_DOMAIN
  gateways:
    - microservices-gateway
  http:
    - match:
        - uri:
            prefix: "/insurance"
      rewrite:
        uri: " "
      route:
        - destination:
            host: quarkus-insurance-app
          weight: 100

Now you can call the insurance-app service:

$ curl http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/1/details

We can verify all the existing Istio objects using Kiali.
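
You can also list them with kubectl, since these resource types are registered by the Istio CRDs:

$ kubectl get gateway,virtualservice,destinationrule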

Testing Istio Tracing with Quarkus

First of all, you can generate many requests using the siege tool. There are multiple ways to run it. We can prepare the file with example requests as shown below:

http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/1
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/2
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/3
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/4
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/5
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/6
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/7
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/8
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons/9
http://quarkus-person-app.apps.${CLUSTER_DOMAIN}/person/persons
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/1/details
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/2/details
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/3/details
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/4/details
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/5/details
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/6/details
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/1
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/2
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/3
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/4
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/5
http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/6

Now we need to set that file as an input for the siege command. We can also set the number of repeats (-r) and concurrent threads (-c).

$ siege -f k8s/traffic/urls.txt -i -v -r 500 -c 10 --no-parser

It takes some time before the command finishes. In the meantime, let’s try to send a single request to the insurance-app with the X-Version header set to v2. The Istio quarkus-person-app-vs VirtualService delays 50% of such requests by 3 seconds. Repeat the request until you get a response delayed by about 3 seconds.
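
The call is the same as the one we used before, just with the routing header added:

$ curl -H "X-Version: v2" http://quarkus-insurance-app.apps.${CLUSTER_DOMAIN}/insurance/insurances/1/details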

In the log of the insurance-app you should see the values of the X-Version and X-B3-TraceId headers printed by our RequestHeaderFactory.

Then, let’s switch to the Jaeger console. We can find the request by the guid:x-request-id tag.

Here’s the result of our search:

quarkus-istio-tracing-jaeger

We have a full trace of the request, including the communication between the Istio gateway and the insurance-app, as well as between the insurance-app and the person-app. In order to see the details of the trace, just click the record. You will see the trace timeline and the request/response structure. You can easily verify that the latency occurs somewhere between the insurance-app and the person-app, because the person-app processed the request in only 4.46 ms.

quarkus-istio-tracing-jaeger-timeline
