A Deep Dive Into Spring Cloud Load Balancer

Spring Cloud is currently on the verge of large changes. I have been writing about it in my previous article A New Era of Spring Cloud. While almost all of Spring Cloud Netflix components will be removed in the next release, it seems that the biggest change is a replacement of Ribbon client into Spring Cloud Load Balancer.
Currently, there are not many articles about Spring Cloud Load Balancer online. In fact, this component is still under active development, so we could expect some new features in the near future. Netflix Ribbon client is a stable solution, but unfortunately not developed anymore. However, it is still used as a default load balancer in all Spring Cloud projects, and has many interesting features like integration with circuit breaker or load balancing according to an average response time from service instances. Currently, such features are not available for Spring Cloud Load Balancer, but we can create some custom code to implement them. In this article I’m going to show you how to use spring-cloud-loadbalancer module with RestTemplate for communication between applications, how to implement custom load balancer basing on average response time, and finally how to provide static list of service addresses.

If you are interested in more detailed explanation of Spring Cloud components used for inter-service communication you should refer to the third part of my online course Microservices With Spring Boot And Spring Cloud: Part 3 – Inter-service communication.

Example

You can find a source code snippets related to this article in my GitHub repository https://github.com/piomin/course-spring-microservices.git. That repository is also used for my online course, so I decided to extend it with the new examples. All the required changes were performed in directory inter-communication/inter-caller-service inside that repository. The code is written in Kotlin.
There are three applications, which are a part of our sample system: discovery-server (Spring Cloud Netflix Eureka), inter-callme-service (Spring Boot application that expose REST API), and finally inter-caller-service (Spring Boot application that calls endpoints exposed by inter-callme-service).

How to start with Spring Cloud Load Balancer

To enable Spring Cloud Load Balancer for our application we first need to include the following starter to Maven dependencies. That module may be also included together with some other Spring Cloud starters.

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-loadbalancer</artifactId>
</dependency>

Because Ribbon is still used as a default client-side load balancer for REST-based communication between applications we need to disable it in application properties. Here’s a fragment of application.yml file.


spring:
  application:
    name: inter-caller-service
  cloud:
    loadbalancer:
      ribbon:
        enabled: false

For discovery integration we also need to include spring-cloud-starter-netflix-eureka-client. To use RestTemplate with a client-side load balancer we should define the bean visible below and annotate it with @LoadBalanced. As you on the code below I’m also setting interceptor on RestTemplate, but more about it in the next section.

@Bean
@LoadBalanced
fun template(): RestTemplate = RestTemplateBuilder()
		.interceptors(responseTimeInterceptor())
		.build()

Adapt traffic to average response time

Spring Cloud Load Balancer provides a simple round robin rule for load balancing between multiple instances of a single service. Our goal here is to implement a rule, which measures each application response time and gives a weight according to that time. The longer the response time, the less weight it will get. The rule should randomly pick an instance where the possibility is determined by its weight. To record response time of each call we need to set an already mentioned interceptor that implements ClientHttpRequestInterceptor. Interceptor is executed on every request (1). Since the implementation is very typical, one line requires explanation (2). I’m getting the address of the target application from a thread scoped variable existing in Slf4J MDC. Of course I could also implement a simple thread scoped context based on ThreadLocal, but MDC is used here just for simplification.

class ResponseTimeInterceptor(private val responseTimeHistory: ResponseTimeHistory) : ClientHttpRequestInterceptor {

    private val logger: Logger = LoggerFactory.getLogger(ResponseTimeInterceptor::class.java)

    override fun intercept(request: HttpRequest, array: ByteArray,
                           execution: ClientHttpRequestExecution): ClientHttpResponse {
        val startTime: Long = System.currentTimeMillis()
        val response: ClientHttpResponse = execution.execute(request, array) // 1
        val endTime: Long = System.currentTimeMillis()
        val responseTime: Long = endTime - startTime
        logger.info("Response time: instance->{}, time->{}", MDC.get("address"), responseTime)
        responseTimeHistory.addNewMeasure(MDC.get("address"), responseTime) // 2
        return response
    }
}

Of course, counting an average response time is just a part of our job. The most important is the implementation of our custom load balancer, which is visible below. It should implement interface ReactorServiceInstanceLoadBalancer. It need to inject ServiceInstanceListSupplier bean to fetch a list of available instances of a given service in overridden method choose. While choosing the right instance we are analyzing the average response time for each instance saved in ResponseTimeHistory by ResponseTimeInterceptor. In the beginning our load balancer acts like a simple round robin.

class WeightedTimeResponseLoadBalancer(
        private val serviceInstanceListSupplierProvider: ObjectProvider<ServiceInstanceListSupplier>,
        private val serviceId: String,
        private val responseTimeHistory: ResponseTimeHistory) : ReactorServiceInstanceLoadBalancer {

    private val logger: Logger = LoggerFactory.getLogger(WeightedTimeResponseLoadBalancer::class.java)
    private val position: AtomicInteger = AtomicInteger()

    override fun choose(request: Request<*>?): Mono<Response<ServiceInstance>> {
        val supplier: ServiceInstanceListSupplier = serviceInstanceListSupplierProvider
                .getIfAvailable { NoopServiceInstanceListSupplier() }
        return supplier.get().next()
                .map { serviceInstances: List<ServiceInstance> -> getInstanceResponse(serviceInstances) }
    }

    private fun getInstanceResponse(instances: List<ServiceInstance>): Response<ServiceInstance> {
        return if (instances.isEmpty()) {
            EmptyResponse()
        } else {
            val address: String? = responseTimeHistory.getAddress(instances.size)
            val pos: Int = position.incrementAndGet()
            var instance: ServiceInstance = instances[pos % instances.size]
            if (address != null) {
                val found: ServiceInstance? = instances.find { "${it.host}:${it.port}" == address }
                if (found != null)
                    instance = found
            }
            logger.info("Current instance: [address->{}:{}, stats->{}ms]", instance.host, instance.port,
                    responseTimeHistory.stats["${instance.host}:${instance.port}"])
            MDC.put("address", "${instance.host}:${instance.port}")
            DefaultResponse(instance)
        }
    }
}

Here’s the implementation of ResponseTimeHistory bean, which is responsible for storing measures and selecting the instance of service based on computed weight.

class ResponseTimeHistory(private val history: MutableMap<String, Queue<Long>> = mutableMapOf(),
                          val stats: MutableMap<String, Long> = mutableMapOf()) {

    private val logger: Logger = LoggerFactory.getLogger(ResponseTimeHistory::class.java)

    fun addNewMeasure(address: String, measure: Long) {
        var list: Queue<Long>? = history[address]
        if (list == null) {
            history[address] = LinkedList<Long>()
            list = history[address]
        }
        logger.info("Adding new measure for->{}, measure->{}", address, measure)
        if (measure == 0L)
            list!!.add(1L)
        else list!!.add(measure)
        if (list.size > 9)
            list.remove()
        stats[address] = countAvg(address)
        logger.info("Counting avg for->{}, stat->{}", address, stats[address])
    }

    private fun countAvg(address: String): Long {
        val list: Queue<Long>? = history[address]
        return list?.sum()?.div(list.size) ?: 0
    }

    fun getAddress(numberOfInstances: Int): String? {
        if (stats.size < numberOfInstances)
            return null
        var sum: Long = 0
        stats.forEach { sum += it.value }
        var r: Long = Random.nextLong(100)
        var current: Long = 0
        stats.forEach {
            val weight: Long = (sum - it.value)*100 / sum
            logger.info("Weight for->{}, value->{}, random->{}", it.key, weight, r)
            current += weight
            if (r <= current)
                return it.key
        }
        return null
    }

}

Customizing Spring Cloud Load Balancer

The implementation of our mechanism for weighted response time rule is ready, so the last step is to apply it to Spring Cloud Load Balancer. To do that we need to create a dedicated configuration class with ReactorLoadBalancer bean declaration as shown below.

class CustomCallmeClientLoadBalancerConfiguration(private val responseTimeHistory: ResponseTimeHistory) {

    @Bean
    fun loadBalancer(environment: Environment, loadBalancerClientFactory: LoadBalancerClientFactory):
            ReactorLoadBalancer<ServiceInstance> {
        val name: String? = environment.getProperty("loadbalancer.client.name")
        return WeightedTimeResponseLoadBalancer(
                loadBalancerClientFactory.getLazyProvider(name, ServiceInstanceListSupplier::class.java),
                name!!, responseTimeHistory)
    }
}

The custom configuration may be passed to a load balancer using annotation @LoadBalancerClient. The name of client should be the same as registered in discovery. This part of code is currently commented out in the GitHub repository, so if you would like to enable it for testing just uncomment it.

@SpringBootApplication
@LoadBalancerClient(value = "inter-callme-service", configuration = [CustomCallmeClientLoadBalancerConfiguration::class])
class InterCallerServiceApplication {

    @Bean
    fun responseTimeHistory(): ResponseTimeHistory = ResponseTimeHistory()

    @Bean
    fun responseTimeInterceptor(): ResponseTimeInterceptor = ResponseTimeInterceptor(responseTimeHistory())

    // THE REST OF IMPLEMENTATION...
}

Customizing instance list supplier

Currently Spring Cloud Load Balancer does not support a static list of instances set in configuration properties (unlike Netflix Ribbon). We can easily add such a mechanism. The static list of instances for every service will be defined as shown below.

spring:
  application:
    name: inter-caller-service
  cloud:
    loadbalancer:
      ribbon:
        enabled: false
      instances:
        - name: inter-callme-service
          servers: localhost:59600, localhost:59800

As the first step, we should define a class that implements interface ServiceInstanceListSupplier and overrides two methods: getServiceId() and get(). The following implementation of ServiceInstanceListSupplier takes the list of service addresses from application properties through @ConfigurationProperties.

class StaticServiceInstanceListSupplier(private val properties: LoadBalancerConfigurationProperties,
                                        private val environment: Environment) : ServiceInstanceListSupplier {

    override fun getServiceId(): String = environment.getProperty("loadbalancer.client.name")!!

    override fun get(): Flux<MutableList<ServiceInstance>> {
        val serviceConfig: LoadBalancerConfigurationProperties.ServiceConfig? =
                properties.instances.find { it.name == serviceId }
        val list: MutableList<ServiceInstance> =
                serviceConfig!!.servers.split(",", ignoreCase = false, limit = 0)
                        .map { StaticServiceInstance(serviceId, it) }.toMutableList()
        return Flux.just(list)
    }

}

Here’s the implementation of configuration class with properties.

@Configuration
@ConfigurationProperties("spring.cloud.loadbalancer")
class LoadBalancerConfigurationProperties {

    val instances: MutableList<ServiceConfig> = mutableListOf()

    class ServiceConfig {
        var name: String = ""
        var servers: String = ""
    }

}

The same as for the previous sample we should also register our implementation of ServiceInstanceListSupplier as a bean inside custom configuration class.

class CustomCallmeClientLoadBalancerConfiguration) {

    @Bean
    fun discoveryClientServiceInstanceListSupplier(discoveryClient: ReactiveDiscoveryClient, environment: Environment,
        zoneConfig: LoadBalancerZoneConfig, context: ApplicationContext,
        properties: LoadBalancerConfigurationProperties): ServiceInstanceListSupplier {
        val delegate = StaticServiceInstanceListSupplier(properties, environment)
        val cacheManagerProvider = context.getBeanProvider(LoadBalancerCacheManager::class.java)
        return if (cacheManagerProvider.ifAvailable != null) {
            CachingServiceInstanceListSupplier(delegate, cacheManagerProvider.ifAvailable)
        } else delegate
    }
}

Testing Spring Cloud Load Balancer

To test the solution implemented for the purpose of this article you should:

Run the instance of discovery server (only if StaticServiceInstanceListSupplier is disabled)
Run two instances of inter-callme-service (for one selected instance activate random delay using VM parameter -Dspring.profiles.active=delay)
Run instance of inter-caller-service, which is available on port 8080
Send some test requests to inter-caller-service using command, for example curl -X POST http://localhost:8080/caller/random-send/12345

Our test scenario is visualized in the following picture.

spring-cloud-load-balancer-arch

Conclusion

Currently, Spring Cloud Load Balancer does not offer such many interesting features for inter-service communication as the Netflix Ribbon client. Of course, it is still being actively developed by the Spring Team. The good news is that we can easily customize Spring Cloud Load Balancer to add some custom features. In this article I demonstrated how to provide more advanced load balancing algorithms or create custom instances of list suppliers.

16 COMMENTS

Kyriakos MandalasMay 13, 2020 6:25 pm

As mentioned, Spring Cloud Load Balancer provides simple round robin rule. Are there any other built-in rules? For example based on resources like CPU or heap usage? Or we need to implemented custom rules for these as well?

Piotr MińkowskiMay 13, 2020 6:34 pm

Currently you don’t have any more advanced load-balanced rules: https://github.com/spring-cloud/spring-cloud-commons/tree/master/spring-cloud-loadbalancer/src/main/java/org/springframework/cloud/loadbalancer/core

Kyriakos MandalasMay 14, 2020 9:10 am

Thanks for the prompt reply. One final but important question: the custom rule you describe, applies only to inter-service communication balancing? Or it is effective as well when requests are routed via Spring Cloud Gateway to the various service instances?

Piotr MińkowskiMay 14, 2020 9:31 am

The communication between gateway and microservices is defacto inter-service communication too 🙂 Gateway uses discovery to locate service and uses the same lb logic as other Spring Cloud applications. So, the answer is that it should work for gateway the same as for applications demonstrated in this article. However, I haven’t verified it on gateway yet.

Kyriakos MandalasMay 13, 2020 6:25 pm

Piotr MińkowskiMay 13, 2020 6:34 pm

Kyriakos MandalasMay 14, 2020 9:10 am

Piotr MińkowskiMay 14, 2020 9:31 am

Kyriakos MandalasMay 22, 2020 6:08 pm

Hello. Just to inform that I have tried it with the gateway with a custom load balancer (more specifically based on a custom actuator metric) and it does work as expected.

Piotr MińkowskiMay 22, 2020 6:12 pm

Hello. How did you test it? Have you got any source code?

Kyriakos MandalasMay 22, 2020 6:18 pm

I will provide github link in a couple of days. I tested it on local machine with eureka, gateway and 2 instances of a microservice along with a traffic load simulator.

Kyriakos MandalasMay 22, 2020 6:08 pm

Hello. Just to inform that I have tried it with the gateway with a custom load balancer (more specifically based on a custom actuator metric) and it does work as expected.

Piotr MińkowskiMay 22, 2020 6:12 pm

Hello. How did you test it? Have you got any source code?

Kyriakos MandalasMay 22, 2020 6:18 pm

I will provide github link in a couple of days. I tested it on local machine with eureka, gateway and 2 instances of a microservice along with a traffic load simulator.

Kyriakos MandalasJune 3, 2020 2:54 pm

Not yet polished but please have a look at: https://github.com/petros94/smart-home-websockets/blob/custom-load-balancer/gateway-server/src/main/java/com/pmitseas/gateway/lb/LeastConnLoadBalancer.java and https://github.com/petros94/smart-home-websockets/blob/custom-load-balancer/gateway-server/src/main/java/com/pmitseas/gateway/GatewayApplication.java

Your feedback/observations is more than welcome.

Kyriakos.

piotr.minkowskiAugust 22, 2020 12:08 pm

I’ll try to take a look on that next week. Sorry, for the delay …

Piotr's TechBlog

A Deep Dive Into Spring Cloud Load Balancer