A Deep Dive Into Spring Cloud Load Balancer

Spring Cloud is on the verge of major changes, which I described in my previous article, A New Era of Spring Cloud. While almost all Spring Cloud Netflix components will be removed in the next release, the biggest change seems to be the replacement of the Ribbon client with Spring Cloud Load Balancer.
Currently, there are not many articles about Spring Cloud Load Balancer online. The component is still under active development, so we can expect new features in the near future. The Netflix Ribbon client is a stable solution, but unfortunately it is no longer developed. However, it is still used as the default load balancer in all Spring Cloud projects, and it has many interesting features, such as integration with a circuit breaker or load balancing based on the average response time of service instances. Such features are not yet available in Spring Cloud Load Balancer, but we can write some custom code to implement them. In this article I'm going to show you how to use the spring-cloud-loadbalancer module with RestTemplate for communication between applications, how to implement a custom load balancer based on average response time, and finally how to provide a static list of service addresses.
You can find the source code related to this article in my GitHub repository https://github.com/piomin/course-spring-microservices.git. That repository is also used for my online course, so I decided to extend it with the new examples. All the required changes were made in the directory inter-communication/inter-caller-service inside that repository. The code is written in Kotlin.
There are three applications in our sample system: discovery-server (Spring Cloud Netflix Eureka), inter-callme-service (a Spring Boot application that exposes a REST API), and inter-caller-service (a Spring Boot application that calls the endpoints exposed by inter-callme-service).
How to start with Spring Cloud Load Balancer
To enable Spring Cloud Load Balancer for our application, we first need to add the spring-cloud-starter-loadbalancer starter to the Maven dependencies. That module may also be pulled in together with some other Spring Cloud starters.
Because Ribbon is still used as the default client-side load balancer for REST-based communication between applications, we need to disable it in the application properties. Here's a fragment of application.yml:
spring:
  application:
    name: inter-caller-service
  cloud:
    loadbalancer:
      ribbon:
        enabled: false
For discovery integration we also need to include spring-cloud-starter-netflix-eureka-client. To use RestTemplate with a client-side load balancer, we should define the bean visible below and annotate it with @LoadBalanced. As you can see in the code below, I'm also setting an interceptor on RestTemplate, but more about that in the next section.
@Bean
@LoadBalanced
fun template(): RestTemplate = RestTemplateBuilder()
    .interceptors(responseTimeInterceptor())
    .build()
Adapt traffic to average response time
Spring Cloud Load Balancer provides a simple round-robin rule for load balancing between multiple instances of a single service. Our goal here is to implement a rule that measures each instance's response time and assigns a weight according to that time. The longer the response time, the lower the weight. The rule should pick an instance at random, with a probability determined by its weight. To record the response time of each call, we need to set the already mentioned interceptor, which implements ClientHttpRequestInterceptor. The interceptor is executed on every request (1). The implementation is fairly typical, so only one line requires explanation (2): I'm getting the address of the target application from a thread-scoped variable stored in the SLF4J MDC. Of course, I could also implement a simple thread-scoped context based on ThreadLocal, but MDC is used here just for simplicity.
class ResponseTimeInterceptor(private val responseTimeHistory: ResponseTimeHistory) : ClientHttpRequestInterceptor {

    private val logger: Logger = LoggerFactory.getLogger(ResponseTimeInterceptor::class.java)

    override fun intercept(request: HttpRequest, array: ByteArray,
                           execution: ClientHttpRequestExecution): ClientHttpResponse {
        val startTime: Long = System.currentTimeMillis()
        val response: ClientHttpResponse = execution.execute(request, array) // 1
        val endTime: Long = System.currentTimeMillis()
        val responseTime: Long = endTime - startTime
        logger.info("Response time: instance->{}, time->{}", MDC.get("address"), responseTime)
        responseTimeHistory.addNewMeasure(MDC.get("address"), responseTime) // 2
        return response
    }
}
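The ThreadLocal alternative mentioned above could look roughly like the following. This is a minimal sketch, not code from the repository: TargetAddressContext and its methods are hypothetical names introduced only for illustration.

```kotlin
// Hypothetical helper (not part of the article's repository):
// a thread-scoped holder for the address chosen by the load balancer.
object TargetAddressContext {
    private val holder = ThreadLocal<String?>()

    // Called by the load balancer once an instance has been chosen.
    fun set(address: String) = holder.set(address)

    // Called by the interceptor; returns null if nothing was set on this thread.
    fun get(): String? = holder.get()

    // Clear after the request so pooled threads do not keep a stale address.
    fun clear() = holder.remove()
}
```

With such a holder, the load balancer would call TargetAddressContext.set(...) where it currently calls MDC.put(...), and the interceptor would read the value with get() and call clear() once the response time is recorded. Like the MDC approach, it only works as long as both steps run on the same thread.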
Of course, computing the average response time is just a part of our job. The most important part is the implementation of our custom load balancer, which is visible below. It should implement the ReactorServiceInstanceLoadBalancer interface. It needs an injected ServiceInstanceListSupplier bean to fetch the list of available instances of a given service in the overridden choose method. While choosing the right instance, we analyze the average response time for each instance saved in ResponseTimeHistory by ResponseTimeInterceptor. In the beginning, before an average has been recorded for every instance, our load balancer acts like a simple round robin.
class WeightedTimeResponseLoadBalancer(
        private val serviceInstanceListSupplierProvider: ObjectProvider<ServiceInstanceListSupplier>,
        private val serviceId: String,
        private val responseTimeHistory: ResponseTimeHistory) : ReactorServiceInstanceLoadBalancer {

    private val logger: Logger = LoggerFactory.getLogger(WeightedTimeResponseLoadBalancer::class.java)
    private val position: AtomicInteger = AtomicInteger()

    override fun choose(request: Request<*>?): Mono<Response<ServiceInstance>> {
        val supplier: ServiceInstanceListSupplier = serviceInstanceListSupplierProvider
                .getIfAvailable { NoopServiceInstanceListSupplier() }
        return supplier.get().next()
                .map { serviceInstances: List<ServiceInstance> -> getInstanceResponse(serviceInstances) }
    }

    private fun getInstanceResponse(instances: List<ServiceInstance>): Response<ServiceInstance> {
        return if (instances.isEmpty()) {
            EmptyResponse()
        } else {
            val address: String? = responseTimeHistory.getAddress(instances.size)
            val pos: Int = position.incrementAndGet()
            // Round-robin fallback; floorMod keeps the index valid after the counter overflows
            var instance: ServiceInstance = instances[Math.floorMod(pos, instances.size)]
            if (address != null) {
                val found: ServiceInstance? = instances.find { "${it.host}:${it.port}" == address }
                if (found != null)
                    instance = found
            }
            logger.info("Current instance: [address->{}:{}, stats->{}ms]", instance.host, instance.port,
                    responseTimeHistory.stats["${instance.host}:${instance.port}"])
            // Expose the chosen address to ResponseTimeInterceptor on the same thread
            MDC.put("address", "${instance.host}:${instance.port}")
            DefaultResponse(instance)
        }
    }
}
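A counter-based round-robin index has to stay non-negative once the AtomicInteger wraps around, because the plain % operator returns a negative result for a negative counter. The sketch below isolates that detail. RoundRobinPicker is a hypothetical stand-alone class used only for illustration, not part of Spring Cloud Load Balancer or the repository.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-alone picker illustrating an overflow-safe round-robin index.
class RoundRobinPicker {
    private val position = AtomicInteger()

    // Returns the next item in rotation, or null for an empty list.
    fun <T> pick(items: List<T>): T? {
        if (items.isEmpty()) return null
        // floorMod always yields a value in 0 until items.size,
        // even after the counter has overflowed to a negative number.
        val index = Math.floorMod(position.incrementAndGet(), items.size)
        return items[index]
    }
}
```

Because the counter is incremented before it is used, the first call returns the second element of the list, which matches the behavior of incrementAndGet() in the load balancer above.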
Here's the implementation of the ResponseTimeHistory bean, which is responsible for storing measures and selecting the service instance based on the computed weight.
class ResponseTimeHistory(private val history: MutableMap<String, Queue<Long>> = mutableMapOf(),
                          val stats: MutableMap<String, Long> = mutableMapOf()) {

    private val logger: Logger = LoggerFactory.getLogger(ResponseTimeHistory::class.java)

    fun addNewMeasure(address: String, measure: Long) {
        var list: Queue<Long>? = history[address]
        if (list == null) {
            history[address] = LinkedList<Long>()
            list = history[address]
        }
        logger.info("Adding new measure for->{}, measure->{}", address, measure)
        if (measure == 0L)
            list!!.add(1L) // store 0 ms as 1 ms so the sum of averages never becomes 0
        else list!!.add(measure)
        if (list.size > 9)
            list.remove() // drop the oldest measure to keep a sliding window
        stats[address] = countAvg(address)
        logger.info("Counting avg for->{}, stat->{}", address, stats[address])
    }

    private fun countAvg(address: String): Long {
        val list: Queue<Long>? = history[address]
        return list?.sum()?.div(list.size) ?: 0
    }

    fun getAddress(numberOfInstances: Int): String? {
        // Until every instance has an average, let the caller fall back to round robin
        if (stats.size < numberOfInstances)
            return null
        var sum: Long = 0
        stats.forEach { sum += it.value }
        val r: Long = Random.nextLong(100)
        var current: Long = 0
        stats.forEach {
            // The slower the instance, the smaller its weight
            val weight: Long = (sum - it.value) * 100 / sum
            logger.info("Weight for->{}, value->{}, random->{}", it.key, weight, r)
            current += weight
            if (r <= current)
                return it.key
        }
        return null
    }
}
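To make the weight formula concrete: two instances averaging 100 ms and 300 ms get weights (400 - 100) * 100 / 400 = 75 and (400 - 300) * 100 / 400 = 25. With n instances the weights add up to roughly (n - 1) * 100, so comparing the running total against a random number below 100 only gives the intended distribution for two instances. The sketch below reproduces the formula and shows one possible variant that draws from the total weight instead. Both functions are hypothetical helpers written for this explanation, not code from the repository.

```kotlin
import kotlin.random.Random

// Weights computed with the same formula as getAddress(): (sum - avg) * 100 / sum.
fun weights(avgMillis: Map<String, Long>): Map<String, Long> {
    val sum = avgMillis.values.sum()
    // Assumption: with no usable data, treat all instances as equal.
    if (sum == 0L) return avgMillis.mapValues { 1L }
    return avgMillis.mapValues { (sum - it.value) * 100 / sum }
}

// Assumed variant: draw a number below the total weight, so the
// selection stays proportional for any number of instances.
fun pickWeighted(avgMillis: Map<String, Long>, random: Random = Random.Default): String? {
    val w = weights(avgMillis)
    val total = w.values.sum()
    // A single instance gets weight 0; just return it (or null for an empty map).
    if (total <= 0) return w.keys.firstOrNull()
    var r = random.nextLong(total) // 0 <= r < total
    for ((address, weight) in w) {
        if (r < weight) return address
        r -= weight
    }
    return null
}
```

Passing a seeded Random makes the selection repeatable, which is convenient when checking that the faster instance really is chosen more often.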
Customizing Spring Cloud Load Balancer
The implementation of our weighted response time rule is ready, so the last step is to apply it to Spring Cloud Load Balancer. To do that, we need to create a dedicated configuration class with a ReactorLoadBalancer bean declaration, as shown below.
class CustomCallmeClientLoadBalancerConfiguration(private val responseTimeHistory: ResponseTimeHistory) {

    @Bean
    fun loadBalancer(environment: Environment, loadBalancerClientFactory: LoadBalancerClientFactory):
            ReactorLoadBalancer<ServiceInstance> {
        // Name of the client this configuration is instantiated for
        val name: String? = environment.getProperty("loadbalancer.client.name")
        return WeightedTimeResponseLoadBalancer(
                loadBalancerClientFactory.getLazyProvider(name, ServiceInstanceListSupplier::class.java),
                name!!, responseTimeHistory)
    }
}
The custom configuration may be passed to a load balancer using the @LoadBalancerClient annotation. The name of the client should be the same as the name registered in discovery. This part of the code is currently commented out in the GitHub repository, so if you would like to enable it for testing, just uncomment it.
@SpringBootApplication
@LoadBalancerClient(value = "inter-callme-service", configuration = [CustomCallmeClientLoadBalancerConfiguration::class])
class InterCallerServiceApplication {

    @Bean
    fun responseTimeHistory(): ResponseTimeHistory = ResponseTimeHistory()

    @Bean
    fun responseTimeInterceptor(): ResponseTimeInterceptor = ResponseTimeInterceptor(responseTimeHistory())
}
Customizing instance list supplier
Currently Spring Cloud Load Balancer does not support a static list of instances set in configuration properties (unlike Netflix Ribbon). We can easily add such a mechanism. The static list of instances for every service will be defined as shown below.
spring:
  application:
    name: inter-caller-service
  cloud:
    loadbalancer:
      ribbon:
        enabled: false
      instances:
        - name: inter-callme-service
          servers: localhost:59600, localhost:59800
As the first step, we should define a class that implements the ServiceInstanceListSupplier interface and overrides two methods: getServiceId() and get(). The following implementation of ServiceInstanceListSupplier takes the list of service addresses from application properties through @ConfigurationProperties.
class StaticServiceInstanceListSupplier(private val properties: LoadBalancerConfigurationProperties,
                                        private val environment: Environment) : ServiceInstanceListSupplier {

    override fun getServiceId(): String = environment.getProperty("loadbalancer.client.name")!!

    override fun get(): Flux<MutableList<ServiceInstance>> {
        val serviceConfig: LoadBalancerConfigurationProperties.ServiceConfig? =
                properties.instances.find { it.name == serviceId }
        // Trim each entry: the configured list contains a space after the comma
        val list: MutableList<ServiceInstance> =
                serviceConfig!!.servers.split(",")
                        .map { StaticServiceInstance(serviceId, it.trim()) }.toMutableList()
        return Flux.just(list)
    }
}
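The StaticServiceInstance class used above lives in the repository and is not shown in this article. As a rough illustration of what turning the servers property into host and port values involves, here is a minimal sketch. HostPort and parseServers are hypothetical names, and the real class may parse the address differently.

```kotlin
// Hypothetical value type standing in for the host/port part of a service instance.
data class HostPort(val host: String, val port: Int)

// Splits a comma-separated "host:port" list, tolerating spaces around the commas.
// An entry without a numeric port makes toInt() throw NumberFormatException.
fun parseServers(servers: String): List<HostPort> =
    servers.split(",")
        .map { it.trim() }
        .filter { it.isNotEmpty() }
        .map { entry ->
            val host = entry.substringBeforeLast(":")
            val port = entry.substringAfterLast(":").toInt()
            HostPort(host, port)
        }
```

For the value configured earlier, parseServers("localhost:59600, localhost:59800") yields two entries for localhost, with ports 59600 and 59800.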
Here’s the implementation of configuration class with properties.
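@Configuration
@ConfigurationProperties("spring.cloud.loadbalancer") // prefix assumed to match the application.yml fragment
class LoadBalancerConfigurationProperties {
    val instances: MutableList<ServiceConfig> = mutableListOf()

    class ServiceConfig {
        var name: String = ""
        var servers: String = ""
    }
}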
As in the previous sample, we should also register our implementation of ServiceInstanceListSupplier as a bean inside a custom configuration class.
class CustomCallmeClientLoadBalancerConfiguration {

    @Bean
    fun discoveryClientServiceInstanceListSupplier(discoveryClient: ReactiveDiscoveryClient, environment: Environment,
                                                   zoneConfig: LoadBalancerZoneConfig, context: ApplicationContext,
                                                   properties: LoadBalancerConfigurationProperties): ServiceInstanceListSupplier {
        val delegate = StaticServiceInstanceListSupplier(properties, environment)
        // Wrap the static supplier with caching only when a cache manager is available
        val cacheManager = context.getBeanProvider(LoadBalancerCacheManager::class.java).ifAvailable
        return if (cacheManager != null) {
            CachingServiceInstanceListSupplier(delegate, cacheManager)
        } else delegate
    }
}
Testing Spring Cloud Load Balancer
To test the solution implemented for the purpose of this article you should:
- Run an instance of the discovery server (only if StaticServiceInstanceListSupplier is disabled)
- Run two instances of inter-callme-service (for one selected instance, activate a random delay using the VM parameter -Dspring.profiles.active=delay)
- Run an instance of inter-caller-service, which is available on port 8080
- Send some test requests to inter-caller-service, for example:
curl -X POST http://localhost:8080/caller/random-send/12345
Our test scenario is visualized in the following picture.
Currently, Spring Cloud Load Balancer does not offer as many interesting features for inter-service communication as the Netflix Ribbon client. Of course, it is still being actively developed by the Spring team. The good news is that we can easily customize Spring Cloud Load Balancer to add some custom features. In this article I demonstrated how to provide more advanced load balancing algorithms and how to create custom instance list suppliers.