A Deep Dive Into Spring Cloud Load Balancer

Spring Cloud is on the verge of major changes, which I described in my previous article A New Era of Spring Cloud. Almost all Spring Cloud Netflix components will be removed in the next release, and the biggest change seems to be the replacement of the Ribbon client with Spring Cloud Load Balancer.
Currently, there are not many articles about Spring Cloud Load Balancer online. The component is still under active development, so we can expect new features in the near future. The Netflix Ribbon client is a stable solution, but unfortunately it is no longer developed. It is still the default load balancer in all Spring Cloud projects, and it has many interesting features, such as integration with a circuit breaker or load balancing based on the average response time of service instances. Such features are not yet available in Spring Cloud Load Balancer, but we can write some custom code to implement them. In this article I’m going to show you how to use the spring-cloud-loadbalancer module with RestTemplate for communication between applications, how to implement a custom load balancer based on average response time, and finally how to provide a static list of service addresses.

If you are interested in a more detailed explanation of the Spring Cloud components used for inter-service communication, you should refer to the third part of my online course, Microservices With Spring Boot And Spring Cloud: Part 3 – Inter-service communication.


You can find the source code related to this article in my GitHub repository https://github.com/piomin/course-spring-microservices.git. That repository is also used for my online course, so I decided to extend it with the new examples. All the required changes were made in the directory inter-communication/inter-caller-service inside that repository. The code is written in Kotlin.
Our sample system consists of three applications: discovery-server (Spring Cloud Netflix Eureka), inter-callme-service (a Spring Boot application that exposes a REST API), and finally inter-caller-service (a Spring Boot application that calls the endpoints exposed by inter-callme-service).

How to start

To enable Spring Cloud Load Balancer for our application we first need to add the spring-cloud-starter-loadbalancer starter to the Maven dependencies (this module may also be included together with some other Spring Cloud starters).


Because Ribbon is still the default client-side load balancer for REST-based communication between applications, we need to disable it in the application properties. Here’s a fragment of the application.yml file.

spring:
  application:
    name: inter-caller-service
  cloud:
    loadbalancer:
      ribbon:
        enabled: false

For discovery integration we also need to include spring-cloud-starter-netflix-eureka-client. To use RestTemplate with a client-side load balancer, we define a RestTemplate bean and annotate it with @LoadBalanced. As you can see in the code below, I’m also setting an interceptor on the RestTemplate, but more about that in the next section.

@Bean
@LoadBalanced
fun template(): RestTemplate = RestTemplateBuilder()
        .interceptors(responseTimeInterceptor())
        .build()

Adapt traffic to average response time

Spring Cloud Load Balancer provides a simple round-robin rule for load balancing between multiple instances of a single service. Our goal here is to implement a rule that measures the response time of each instance and assigns a weight according to that time. The longer the response time, the lower the weight. The rule should randomly pick an instance, with a probability determined by its weight. To record the response time of each call we need to set the already mentioned interceptor, which implements ClientHttpRequestInterceptor. The interceptor is executed on every request (1). The implementation is fairly typical, and only one line requires explanation (2): I’m getting the address of the target application from a thread-scoped variable stored in the SLF4J MDC. Of course, I could also implement a simple thread-scoped context based on ThreadLocal, but MDC is used here for simplicity.

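import org.slf4j.Logger
import org.slf4j.LoggerFactory
import org.slf4j.MDC
import org.springframework.http.HttpRequest
import org.springframework.http.client.ClientHttpRequestExecution
import org.springframework.http.client.ClientHttpRequestInterceptor
import org.springframework.http.client.ClientHttpResponse

class ResponseTimeInterceptor(private val responseTimeHistory: ResponseTimeHistory) : ClientHttpRequestInterceptor {

    private val logger: Logger = LoggerFactory.getLogger(ResponseTimeInterceptor::class.java)

    override fun intercept(request: HttpRequest, array: ByteArray,
                           execution: ClientHttpRequestExecution): ClientHttpResponse {
        val startTime: Long = System.currentTimeMillis()
        val response: ClientHttpResponse = execution.execute(request, array) // 1
        val endTime: Long = System.currentTimeMillis()
        val responseTime: Long = endTime - startTime
        logger.info("Response time: instance->{}, time->{}", MDC.get("address"), responseTime)
        // The address was stored in MDC by the load balancer when it chose the instance
        responseTimeHistory.addNewMeasure(MDC.get("address"), responseTime) // 2
        return response
    }
}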
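
The ThreadLocal-based alternative to MDC mentioned earlier could be sketched as follows. This is only an illustration: the name TargetAddressContext and its methods are hypothetical and do not come from the sample repository.

```kotlin
// Hypothetical thread-scoped holder for the address of the instance chosen by the load balancer.
object TargetAddressContext {
    private val holder = ThreadLocal<String?>()

    // Would be called by the load balancer right after it picks an instance
    fun set(address: String) = holder.set(address)

    // Would be called by the interceptor when it records the response time
    fun get(): String? = holder.get()

    // Clears the value so that a pooled thread does not carry it into the next request
    fun clear() = holder.remove()
}

fun main() {
    TargetAddressContext.set("localhost:59600")
    println(TargetAddressContext.get()) // the value is visible on the current thread

    // A different thread has its own (empty) slot
    val other = Thread { println(TargetAddressContext.get()) }
    other.start()
    other.join()

    TargetAddressContext.clear()
    println(TargetAddressContext.get())
}
```

As with MDC, such a holder only works when the load balancer and the interceptor run on the same thread.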

Of course, computing the average response time is just a part of our job. The most important part is the implementation of our custom load balancer, which is shown below. It implements the ReactorServiceInstanceLoadBalancer interface. It needs a ServiceInstanceListSupplier bean injected in order to fetch the list of available instances of a given service in the overridden choose method. When choosing an instance, we analyze the average response time of each instance, saved in ResponseTimeHistory by ResponseTimeInterceptor. In the beginning, before measurements are available for every instance, our load balancer acts like a simple round robin.

class WeightedTimeResponseLoadBalancer(
        private val serviceInstanceListSupplierProvider: ObjectProvider<ServiceInstanceListSupplier>,
        private val serviceId: String,
        private val responseTimeHistory: ResponseTimeHistory) : ReactorServiceInstanceLoadBalancer {

    private val logger: Logger = LoggerFactory.getLogger(WeightedTimeResponseLoadBalancer::class.java)
    private val position: AtomicInteger = AtomicInteger()

    override fun choose(request: Request<*>?): Mono<Response<ServiceInstance>> {
        val supplier: ServiceInstanceListSupplier = serviceInstanceListSupplierProvider
                .getIfAvailable { NoopServiceInstanceListSupplier() }
        // Take the first emitted list of instances and pick one of them
        return supplier.get().next()
                .map { serviceInstances: List<ServiceInstance> -> getInstanceResponse(serviceInstances) }
    }

    private fun getInstanceResponse(instances: List<ServiceInstance>): Response<ServiceInstance> {
        return if (instances.isEmpty()) {
            EmptyResponse()
        } else {
            // Address picked by weight, or null until every instance has been measured
            val address: String? = responseTimeHistory.getAddress(instances.size)
            // Round-robin fallback; the mask keeps the index non-negative after the counter overflows
            val pos: Int = position.incrementAndGet() and Int.MAX_VALUE
            var instance: ServiceInstance = instances[pos % instances.size]
            if (address != null) {
                val found: ServiceInstance? = instances.find { "${it.host}:${it.port}" == address }
                if (found != null)
                    instance = found
            }
            logger.info("Current instance: [address->{}:{}, stats->{}ms]", instance.host, instance.port,
                    responseTimeHistory.stats["${instance.host}:${instance.port}"])
            // Store the chosen address so that ResponseTimeInterceptor can attribute the measurement
            MDC.put("address", "${instance.host}:${instance.port}")
            DefaultResponse(instance)
        }
    }
}
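
One detail worth keeping in mind for a round-robin counter based on AtomicInteger is integer overflow: once the counter wraps around, a plain modulo would produce a negative index. The sketch below shows one common way to keep the index in range. RoundRobinCounter is a hypothetical helper written for this illustration, not a class from the repository.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper that returns the next index in round-robin order.
class RoundRobinCounter(start: Int = 0) {
    private val position = AtomicInteger(start)

    fun nextIndex(size: Int): Int {
        require(size > 0) { "size must be positive" }
        // incrementAndGet() wraps to Int.MIN_VALUE after Int.MAX_VALUE;
        // the mask clears the sign bit so the modulo result is never negative
        val pos = position.incrementAndGet() and Int.MAX_VALUE
        return pos % size
    }
}

fun main() {
    val counter = RoundRobinCounter()
    println((1..5).map { counter.nextIndex(3) }) // prints [1, 2, 0, 1, 2]

    val nearOverflow = RoundRobinCounter(Int.MAX_VALUE)
    println(nearOverflow.nextIndex(3)) // still a valid index after the counter wraps
}
```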

Here’s the implementation of the ResponseTimeHistory bean, which is responsible for storing measures and selecting a service instance based on the computed weight.

import java.util.LinkedList
import java.util.Queue
import kotlin.random.Random
import org.slf4j.Logger
import org.slf4j.LoggerFactory

class ResponseTimeHistory(private val history: MutableMap<String, Queue<Long>> = mutableMapOf(),
                          val stats: MutableMap<String, Long> = mutableMapOf()) {

    private val logger: Logger = LoggerFactory.getLogger(ResponseTimeHistory::class.java)

    fun addNewMeasure(address: String, measure: Long) {
        var list: Queue<Long>? = history[address]
        if (list == null) {
            history[address] = LinkedList<Long>()
            list = history[address]
        }
        logger.info("Adding new measure for->{}, measure->{}", address, measure)
        if (measure == 0L)
            list!!.add(1L) // assumption: a 0 ms reading is stored as 1 ms so the sum in getAddress stays positive
        else list!!.add(measure)
        if (list.size > 9)
            list.remove() // assumption: drop the oldest measure to keep a sliding window
        stats[address] = countAvg(address)
        logger.info("Counting avg for->{}, stat->{}", address, stats[address])
    }

    // Average of the measures currently kept for the given address
    private fun countAvg(address: String): Long {
        val list: Queue<Long>? = history[address]
        return list?.sum()?.div(list.size) ?: 0L
    }

    // Returns the address picked by weight, or null if some instance has no measures yet
    fun getAddress(numberOfInstances: Int): String? {
        if (stats.size < numberOfInstances)
            return null
        var sum: Long = 0
        stats.forEach { sum += it.value }
        val r: Long = Random.nextLong(100)
        var current: Long = 0
        stats.forEach {
            // The lower the average time of an instance, the higher its weight
            val weight: Long = (sum - it.value)*100 / sum
            logger.info("Weight for->{}, value->{}, random->{}", it.key, weight, r)
            current += weight
            if (r <= current)
                return it.key
        }
        return null
    }
}
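
The weights computed in getAddress add up to roughly 100 only when there are exactly two instances. With n instances they add up to about (n - 1) * 100, so a random number below 100 favours the entries that come first in the map. As a sketch of one possible generalization (this is not the code from the repository), the function below draws the random number from the actual total weight. The function name pickWeighted and the sample addresses are hypothetical.

```kotlin
import kotlin.random.Random

// Picks an address with probability proportional to (sum - avg),
// so that slower instances are chosen less often.
// Returns null when there is nothing to choose from.
fun pickWeighted(stats: Map<String, Long>, random: Random = Random.Default): String? {
    if (stats.isEmpty()) return null
    if (stats.size == 1) return stats.keys.first()

    val sum = stats.values.sum()
    // The weight of an instance is the combined average time of all the other instances
    val weights = stats.mapValues { sum - it.value }
    val total = weights.values.sum()
    if (total <= 0L) return stats.keys.first()

    // Draw from [0, total) and walk through the cumulative weights
    var r = random.nextLong(total)
    for ((address, weight) in weights) {
        if (r < weight) return address
        r -= weight
    }
    return null // not reached when total > 0
}

fun main() {
    val stats = mapOf("localhost:59600" to 100L, "localhost:59800" to 300L, "localhost:59900" to 600L)
    val hits = mutableMapOf<String, Int>()
    repeat(10_000) {
        val address = pickWeighted(stats)!!
        hits[address] = (hits[address] ?: 0) + 1
    }
    println(hits) // the fastest instance should be picked most often
}
```

With the sample values the weights are 900, 700 and 400, so the three instances should receive about 45%, 35% and 20% of the picks.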


Customizing LoadBalancer

The implementation of the weighted response time rule is ready, so the last step is to apply it to Spring Cloud Load Balancer. To do that we need to create a dedicated configuration class with a ReactorLoadBalancer bean declaration, as shown below.

class CustomCallmeClientLoadBalancerConfiguration(private val responseTimeHistory: ResponseTimeHistory) {

    @Bean
    fun loadBalancer(environment: Environment, loadBalancerClientFactory: LoadBalancerClientFactory):
            ReactorLoadBalancer<ServiceInstance> {
        // Name of the client (service) this load balancer instance is created for
        val name: String? = environment.getProperty("loadbalancer.client.name")
        return WeightedTimeResponseLoadBalancer(
                loadBalancerClientFactory.getLazyProvider(name, ServiceInstanceListSupplier::class.java),
                name!!, responseTimeHistory)
    }
}

The custom configuration may be passed to a load balancer with the @LoadBalancerClient annotation. The client name should match the name registered in discovery. This part of the code is currently commented out in the GitHub repository, so if you would like to enable it for testing, just uncomment it.

@SpringBootApplication // assumed: standard Spring Boot entry point annotation
@LoadBalancerClient(value = "inter-callme-service", configuration = [CustomCallmeClientLoadBalancerConfiguration::class])
class InterCallerServiceApplication {

    @Bean
    fun responseTimeHistory(): ResponseTimeHistory = ResponseTimeHistory()

    @Bean
    fun responseTimeInterceptor(): ResponseTimeInterceptor = ResponseTimeInterceptor(responseTimeHistory())
}


Customizing instance list supplier

Currently, Spring Cloud Load Balancer does not support a static list of instances set in configuration properties (unlike Netflix Ribbon). We can easily add such a mechanism. The static list of instances for every service will be defined as shown below.

spring:
  application:
    name: inter-caller-service
  cloud:
    loadbalancer:
      ribbon:
        enabled: false
      instances:
        - name: inter-callme-service
          servers: localhost:59600, localhost:59800

As the first step, we should define a class that implements the ServiceInstanceListSupplier interface and overrides two methods: getServiceId() and get(). The following implementation of ServiceInstanceListSupplier takes the list of service addresses from the application properties through @ConfigurationProperties.

class StaticServiceInstanceListSupplier(private val properties: LoadBalancerConfigurationProperties,
                                        private val environment: Environment) : ServiceInstanceListSupplier {

    override fun getServiceId(): String = environment.getProperty("loadbalancer.client.name")!!

    override fun get(): Flux<MutableList<ServiceInstance>> {
        val serviceConfig: LoadBalancerConfigurationProperties.ServiceConfig? =
                properties.instances.find { it.name == serviceId }
        // Trim each entry, because the configured list may contain spaces after the commas
        val list: MutableList<ServiceInstance> =
                serviceConfig!!.servers.split(",")
                        .map { StaticServiceInstance(serviceId, it.trim()) }.toMutableList()
        return Flux.just(list)
    }
}
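
The StaticServiceInstance class used above is not shown in the article. Whatever its exact shape, it has to turn entries such as localhost:59600 into a host and a port. A minimal sketch of that parsing step could look like the following. The HostPort type and the parseServers function are hypothetical names introduced only for this illustration. Each entry is trimmed, since the configured list contains a space after the comma.

```kotlin
// Hypothetical value type for one configured address
data class HostPort(val host: String, val port: Int)

// Parses a comma-separated list such as "localhost:59600, localhost:59800"
fun parseServers(servers: String): List<HostPort> =
    servers.split(",")
        .map { it.trim() }
        .filter { it.isNotEmpty() }
        .map { entry ->
            // Split on the last colon; an entry without a colon is rejected
            val host = entry.substringBeforeLast(":", missingDelimiterValue = "")
            val port = entry.substringAfterLast(":", missingDelimiterValue = "").toIntOrNull()
                ?: throw IllegalArgumentException("Invalid address: $entry")
            require(host.isNotEmpty()) { "Invalid address: $entry" }
            HostPort(host, port)
        }

fun main() {
    println(parseServers("localhost:59600, localhost:59800"))
    // prints [HostPort(host=localhost, port=59600), HostPort(host=localhost, port=59800)]
}
```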


Here’s the configuration class that holds these properties.

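// Annotations restored as an assumption: the prefix follows the YAML structure shown earlier
@Configuration
@ConfigurationProperties("spring.cloud.loadbalancer")
class LoadBalancerConfigurationProperties {

    val instances: MutableList<ServiceConfig> = mutableListOf()

    class ServiceConfig {
        var name: String = ""
        var servers: String = ""
    }
}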


As in the previous sample, we should also register our implementation of ServiceInstanceListSupplier as a bean inside a custom configuration class.

class CustomCallmeClientLoadBalancerConfiguration {

    @Bean
    fun discoveryClientServiceInstanceListSupplier(discoveryClient: ReactiveDiscoveryClient, environment: Environment,
        zoneConfig: LoadBalancerZoneConfig, context: ApplicationContext,
        properties: LoadBalancerConfigurationProperties): ServiceInstanceListSupplier {
        val delegate = StaticServiceInstanceListSupplier(properties, environment)
        // Wrap the static supplier with a cache only if a cache manager is available
        val cacheManager = context.getBeanProvider(LoadBalancerCacheManager::class.java).ifAvailable
        return if (cacheManager != null) {
            CachingServiceInstanceListSupplier(delegate, cacheManager)
        } else delegate
    }
}


To test the solution implemented for this article you should:

  1. Run an instance of the discovery server (only if StaticServiceInstanceListSupplier is disabled)
  2. Run two instances of inter-callme-service (for one of them, activate a random delay with the VM parameter -Dspring.profiles.active=delay)
  3. Run an instance of inter-caller-service, which is available on port 8080
  4. Send some test requests to inter-caller-service, for example with curl -X POST http://localhost:8080/caller/random-send/12345

Our test scenario is visualized on the following picture.



Currently, Spring Cloud Load Balancer does not offer as many interesting features for inter-service communication as the Netflix Ribbon client. Of course, it is still being actively developed by the Spring team. The good news is that we can easily customize Spring Cloud Load Balancer to add some custom features. In this article I demonstrated how to provide a more advanced load balancing algorithm and how to create a custom instance list supplier.

7 thoughts on “A Deep Dive Into Spring Cloud Load Balancer”

  1. As mentioned, Spring Cloud Load Balancer provides a simple round-robin rule. Are there any other built-in rules? For example, rules based on resources like CPU or heap usage? Or do we need to implement custom rules for these as well?


      1. Thanks for the prompt reply. One final but important question: does the custom rule you describe apply only to inter-service communication balancing? Or is it effective as well when requests are routed via Spring Cloud Gateway to the various service instances?


      2. The communication between the gateway and microservices is de facto inter-service communication too 🙂 The gateway uses discovery to locate a service and uses the same load balancing logic as other Spring Cloud applications. So, the answer is that it should work for the gateway the same as for the applications demonstrated in this article. However, I haven’t verified it on the gateway yet.


  2. Hello. Just to inform that I have tried it with the gateway with a custom load balancer (more specifically based on a custom actuator metric) and it does work as expected.


      1. I will provide github link in a couple of days. I tested it on local machine with eureka, gateway and 2 instances of a microservice along with a traffic load simulator.

