Interesting Facts About Java Streams and Collections

This article shows some interesting features of Java Streams and Collections you may not have heard about. We will look at both the latest API enhancements and older ones that have existed for years. This is my personal list of features I have used recently or came across while reading articles about Java. If you are interested in Java, you can find similar articles on my blog. In one of them, you can find a list of lesser-known but useful Java libraries.

Source Code

If you would like to try it yourself, you can take a look at my source code. To do that, clone my GitHub repository and switch to the jdk22 branch. Then just follow my instructions.

Mutable or Immutable

Java's approach to collection immutability can be annoying. How do you know whether a Java collection is mutable or immutable? Unlike Kotlin, for example, Java does not provide dedicated interfaces and implementations for mutable and immutable collections. Of course, you can switch to the Eclipse Collections library, which clearly separates readable, mutable, and immutable types. However, if you stay with the standard Java collections, let's analyze the situation using the java.util.List interface as an example. Java 16 introduced the Stream.toList() method for converting a stream into a list. You probably use it quite often 🙂 It is worth mentioning that this method returns an unmodifiable List that allows nulls.

var l = Stream.of(null, "Green", "Yellow").toList();
assertEquals(3, l.size());
assertThrows(UnsupportedOperationException.class, () -> l.add("Red"));
assertThrows(UnsupportedOperationException.class, () -> l.set(0, "Red"));

So, the Stream.toList() method is not just a drop-in replacement for the older approach based on Collectors.toList(). In practice, the older Collectors.toList() returns a modifiable List (currently an ArrayList) that also allows nulls, although its Javadoc makes no guarantee about the mutability of the returned list.

var l = Stream.of(null, "Green", "Yellow").collect(Collectors.toList());
l.add("Red");
assertEquals(4, l.size());

Finally, let’s look at the remaining option: an unmodifiable List that does not allow nulls. To get it, we have to use the Collectors.toUnmodifiableList() method as shown below.

assertThrows(NullPointerException.class, () ->
        Stream.of(null, "Green", "Yellow")
                .collect(Collectors.toUnmodifiableList()));
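
The three variants can also be compared side by side without a test framework. The following is a minimal sketch; the class and helper names (ListMutabilityDemo, isUnmodifiable, rejectsNulls) are made up for illustration and are not part of the article's repository.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not taken from the article's repository.
class ListMutabilityDemo {

    // Returns true when the list refuses to accept a new element.
    static boolean isUnmodifiable(List<String> list) {
        try {
            list.add("Red");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Returns true when Collectors.toUnmodifiableList() refuses a null element.
    static boolean rejectsNulls() {
        try {
            Stream.of(null, "Green").collect(Collectors.toUnmodifiableList());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fromToList = Stream.of(null, "Green").toList();
        List<String> fromCollector = Stream.of(null, "Green").collect(Collectors.toList());

        System.out.println(isUnmodifiable(fromToList));    // true
        System.out.println(isUnmodifiable(fromCollector)); // false (an ArrayList in current JDKs)
        System.out.println(rejectsNulls());                // true
    }
}
```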

Grouping and Aggregations with Java Streams

Java Streams introduces several useful methods that allow us to group and aggregate collections using different criteria. We can find those methods in the java.util.stream.Collectors class. Let’s create the Employee record for testing purposes.

public record Employee(String firstName, 
                       String lastName, 
                       String position, 
                       int salary) {}

Now, let’s assume we have a stream of employees and we want to calculate the sum of salaries grouped by position. To achieve this, we combine two methods from the Collectors class: groupingBy and summingInt. If you would like to calculate the average salary by position instead, you can just replace the summingInt method with the averagingInt method.

Stream<Employee> s1 = Stream.of(
    new Employee("AAA", "BBB", "developer", 10000),
    new Employee("AAB", "BBC", "architect", 15000),
    new Employee("AAC", "BBD", "developer", 13000),
    new Employee("AAD", "BBE", "tester", 7000),
    new Employee("AAE", "BBF", "tester", 9000)
);

var m = s1.collect(Collectors.groupingBy(Employee::position, 
   Collectors.summingInt(Employee::salary)));
assertEquals(3, m.size());
// assertEquals takes the expected value first, then the actual one
assertEquals(23000, m.get("developer"));
assertEquals(15000, m.get("architect"));
assertEquals(16000, m.get("tester"));
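
The averagingInt variant mentioned above could look like the sketch below. The wrapper class AverageSalaryDemo and its helper methods are hypothetical names used only to keep the example self-contained; the Employee record and the sample data mirror the ones from this section.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; Employee mirrors the record defined in the article.
class AverageSalaryDemo {

    record Employee(String firstName, String lastName, String position, int salary) {}

    // Groups employees by position and averages the salaries in each group.
    static Map<String, Double> averageSalaryByPosition(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(Employee::position,
                        Collectors.averagingInt(Employee::salary)));
    }

    static List<Employee> sampleEmployees() {
        return List.of(
                new Employee("AAA", "BBB", "developer", 10000),
                new Employee("AAB", "BBC", "architect", 15000),
                new Employee("AAC", "BBD", "developer", 13000),
                new Employee("AAD", "BBE", "tester", 7000),
                new Employee("AAE", "BBF", "tester", 9000));
    }

    public static void main(String[] args) {
        // Note that averagingInt produces Double values, not Integer.
        System.out.println(averageSalaryByPosition(sampleEmployees()));
    }
}
```

For the sample data this gives 11500.0 for developers, 15000.0 for the architect, and 8000.0 for testers.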

For simpler grouping, we can use the partitioningBy method. It always returns a map with two entries: one for the elements that match the predicate and one for those that do not. So, for the same employees as in the previous example, we can use partitioningBy to divide them into those with salaries higher than 10000 and those with salaries lower than or equal to 10000.

// A stream can be consumed only once, so s1 has to be created again before this call.
var m = s1.collect(Collectors.partitioningBy(emp -> emp.salary() > 10000));
assertEquals(2, m.size());
assertEquals(2, m.get(true).size());
assertEquals(3, m.get(false).size());
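
The claim that partitioningBy always returns both entries is easy to check with a predicate that nothing matches. This is a small sketch with a hypothetical PartitioningDemo class; it works on plain salaries instead of Employee objects to stay short.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class, not taken from the article's repository.
class PartitioningDemo {

    // true -> salaries above the threshold, false -> salaries at or below it.
    static Map<Boolean, List<Integer>> partition(List<Integer> salaries, int threshold) {
        return salaries.stream()
                .collect(Collectors.partitioningBy(s -> s > threshold));
    }

    public static void main(String[] args) {
        // Unlike a Stream, a List can be streamed as many times as needed.
        List<Integer> salaries = List.of(10000, 15000, 13000, 7000, 9000);

        System.out.println(partition(salaries, 10000));

        // Nobody earns more than 20000, yet the "true" key is still present
        // and maps to an empty list.
        System.out.println(partition(salaries, 20000));
    }
}
```

With groupingBy, a group that has no elements would simply be missing from the resulting map.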

Let’s take a look at another example. This time, we will count the number of times each element occurs in the collection. Once again, we can use the groupingBy method, but this time in conjunction with the Collectors.counting() method.

var m = Stream.of(2, 3, 4, 2, 3, 5, 1, 3, 4, 4)
   .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
assertEquals(5, m.size());
// Collectors.counting() produces Long values
assertEquals(3L, m.get(4));

The Map.merge() Method

In the previous examples, we used methods provided by Java Streams to perform grouping and aggregations. Now, the question is whether we can do the same thing with the standard Java collections without converting them to streams. The answer is yes: we can easily do it with the Map.merge() method. It is probably the most versatile operation among all the Java key-value methods. The Map.merge() method either puts the given value under the key (if the key is absent) or combines the existing value with the given one using the supplied remapping function. Let’s rewrite the previous examples to switch from Java streams to collections. Here’s the implementation for counting the number of times each element occurs in the collection.

var map = new HashMap<Integer, Integer>();
var nums = List.of(2, 3, 4, 2, 3, 5, 1, 3, 4, 4);
nums.forEach(num -> map.merge(num, 1, Integer::sum));
assertEquals(5, map.size());
assertEquals(3, map.get(4));

Then, we can implement the operation of calculating the sum of salary grouped by the position. So, we are grouping by the emp.position() and calculating the total salary by summing the previous value with the value taken from the current Employee in the list. The results are the same as in the examples from the previous section.

var s1 = List.of(
   new Employee("AAA", "BBB", "developer", 10000),
   new Employee("AAB", "BBC", "architect", 15000),
   new Employee("AAC", "BBD", "developer", 13000),
   new Employee("AAD", "BBE", "tester", 7000),
   new Employee("AAE", "BBF", "tester", 9000)
);
var map = new HashMap<String, Integer>();
s1.forEach(emp -> map.merge(emp.position(), emp.salary(), Integer::sum));
assertEquals(3, map.size());
assertEquals(23000, map.get("developer"));
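
To make the behavior of Map.merge() more tangible, here is a short walk-through of its three possible outcomes. It is only a sketch; the MapMergeDemo class and the values in it are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not taken from the article's repository.
class MapMergeDemo {

    static Map<String, Integer> demo() {
        Map<String, Integer> totals = new HashMap<>();

        // Key absent: the value is stored as is, the function is not called.
        totals.merge("developer", 10000, Integer::sum);

        // Key present: the function combines the old and the new value.
        totals.merge("developer", 13000, Integer::sum);

        totals.merge("tester", 7000, Integer::sum);

        // A remapping function that returns null removes the mapping.
        totals.merge("tester", 0, (oldValue, newValue) -> null);

        return totals;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // {developer=23000}
    }
}
```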

Use EnumSet for Java Enum

If you are storing enums inside Java collections, you should use EnumSet instead of, for example, the more popular HashSet. The EnumSet and EnumMap collections are specialized versions of Set and Map that are built for enums. These implementations are designed to use less memory and to perform much better than general-purpose collections (EnumSet is internally backed by a bit vector). They also provide some methods dedicated to simplifying integration with Java enums. To compare processing time between EnumSet and a standard Set, we can prepare a simple test. In this test, I’m creating an EnumSet with a subset of the enum values and then checking whether all of them exist in a target EnumSet that contains every value.

var x = EnumSet.of(
    EmployeePosition.SRE,
    EmployeePosition.ARCHITECT,
    EmployeePosition.DEVELOPER);
long beg = System.nanoTime();
for (int i = 0; i < 100_000_000; i++) {
   var es = EnumSet.allOf(EmployeePosition.class);
   es.containsAll(x);
}
long end = System.nanoTime();
System.out.println(x.getClass() + ": " + (end - beg)/1e9);

Here’s a similar test without EnumSet:

var x = Set.of(
    EmployeePosition.SRE,
    EmployeePosition.ARCHITECT,
    EmployeePosition.DEVELOPER);
long beg = System.nanoTime();
for (int i = 0; i < 100_000_000; i++) {
   var hs = Set.of(EmployeePosition.values());
   hs.containsAll(x);
}
long end = System.nanoTime();
System.out.println(x.getClass() + ": " + (end - beg)/1e9);

The difference in processing time between the two variants shown above is pretty significant, even though a plain timing loop like this gives only a rough measurement.

class java.util.ImmutableCollections$SetN: 8.577672411
class java.util.RegularEnumSet: 0.184956851
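
As for the enum-specific helper methods mentioned earlier, the sketch below shows a few of them: EnumSet.range(), EnumSet.complementOf(), and EnumMap. The EmployeePosition enum here is an assumption: only SRE, ARCHITECT, and DEVELOPER appear in the tests above, while TESTER, MANAGER, and the declaration order are invented for this example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Hypothetical demo class, not taken from the article's repository.
class EnumSetDemo {

    // Assumed enum: TESTER, MANAGER and the declaration order are made up.
    enum EmployeePosition { DEVELOPER, TESTER, SRE, ARCHITECT, MANAGER }

    // All constants from DEVELOPER to ARCHITECT, in declaration order.
    static EnumSet<EmployeePosition> technical() {
        return EnumSet.range(EmployeePosition.DEVELOPER, EmployeePosition.ARCHITECT);
    }

    // Every constant that is not part of the technical set.
    static EnumSet<EmployeePosition> nonTechnical() {
        return EnumSet.complementOf(technical());
    }

    // EnumMap keeps its keys in declaration order and works with Map.merge().
    static Map<EmployeePosition, Integer> headcount() {
        Map<EmployeePosition, Integer> counts = new EnumMap<>(EmployeePosition.class);
        counts.merge(EmployeePosition.SRE, 1, Integer::sum);
        counts.merge(EmployeePosition.DEVELOPER, 1, Integer::sum);
        counts.merge(EmployeePosition.SRE, 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(technical());    // [DEVELOPER, TESTER, SRE, ARCHITECT]
        System.out.println(nonTechnical()); // [MANAGER]
        System.out.println(headcount());    // {DEVELOPER=1, SRE=2}
    }
}
```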

Java 22 Stream Gatherers

The latest JDK 22 release introduces a new addition to Java streams called gatherers (JEP 461). Gatherers enrich the Java Stream API with support for custom intermediate operations. Thanks to that, we can transform data streams in ways that were previously complex or not directly supported by the existing API. In JDK 22 this is a preview feature, so we need to explicitly enable it in the compiler configuration, and the same --enable-preview flag also has to be passed to the JVM when the code is run. Here’s the modification in the Maven pom.xml:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.13.0</version>
  <configuration>
    <release>22</release>
    <compilerArgs>
      <arg>--enable-preview</arg>
    </compilerArgs>
  </configuration>
</plugin>

The aim of this article is not to provide a detailed explanation of Stream Gatherers. So, if you are looking for an intro, it’s worth reading the following article. However, just to give you an example of how we can leverage gatherers, I will provide a simple implementation of a circuit breaker with the Gatherers.windowSliding() method. To show what exactly happens inside, I’m logging the intermediate elements with the peek() method. Our implementation opens the circuit breaker if there are more than 10 errors within a single window.

var errors = Stream.of(2, 0, 1, 3, 4, 2, 3, 0, 3, 1, 0, 0, 1)
                .gather(Gatherers.windowSliding(4))
                .peek(a -> System.out.println(a))
                // sum the errors in the window and compare with the threshold
                .map(x -> x.stream().mapToInt(Integer::intValue).sum() > 10)
                .toList();
System.out.println(errors);

The size of our sliding window is 4. Therefore, each time the windowSliding() gatherer emits a list of 4 consecutive elements. For each intermediate list, I’m summing the values and checking if the total number of errors is greater than 10. Here’s the output. As you can see, the circuit breaker opens only for the [3, 4, 2, 3] fragment of the source stream, where the sum is 12.

[2, 0, 1, 3]
[0, 1, 3, 4]
[1, 3, 4, 2]
[3, 4, 2, 3]
[4, 2, 3, 0]
[2, 3, 0, 3]
[3, 0, 3, 1]
[0, 3, 1, 0]
[3, 1, 0, 0]
[1, 0, 0, 1]
[false, false, false, true, false, false, false, false, false, false]
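
If you cannot enable preview features, the same windows can be approximated with plain lists. The sketch below is not how Gatherers.windowSliding() is implemented; it is an eager, list-only stand-in that uses subList(), with a hypothetical SlidingWindowDemo class and method names. It does not try to mirror the gatherer's handling of inputs shorter than the window.

```java
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical demo class, not taken from the article's repository.
class SlidingWindowDemo {

    // Returns every run of `size` consecutive elements as a view of the source list.
    // A source shorter than `size` yields no windows in this sketch.
    static <T> List<List<T>> windows(List<T> source, int size) {
        return IntStream.rangeClosed(0, source.size() - size)
                .mapToObj(i -> source.subList(i, i + size))
                .toList();
    }

    // true means the number of errors in the window exceeds the threshold.
    static List<Boolean> breakerStates(List<Integer> errors, int size, int threshold) {
        return windows(errors, size).stream()
                .map(w -> w.stream().mapToInt(Integer::intValue).sum() > threshold)
                .toList();
    }

    public static void main(String[] args) {
        var errors = List.of(2, 0, 1, 3, 4, 2, 3, 0, 3, 1, 0, 0, 1);
        // [false, false, false, true, false, false, false, false, false, false]
        System.out.println(breakerStates(errors, 4, 10));
    }
}
```

Unlike the gatherer, this version needs the whole input in memory, so it is only suitable for collections and not for long or infinite streams.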

Use the Stream.reduce() Method

Finally, the last feature in our article. I believe that the Stream.reduce() method is neither very well known nor widely used. However, it is very interesting. For example, we can use it to sum all the numbers in a Java List. The first parameter of the reduce method is the identity (initial) value, while the second is an accumulator function that combines the partial result with the next element.

var listOfNumbers = List.of(1, 2, 3, 4, 5);
var sum = listOfNumbers.stream().reduce(0, Integer::sum);
assertEquals(15, sum);
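
Stream.reduce() has two more overloads that are worth knowing. The sketch below, with a hypothetical ReduceDemo class, shows the single-argument form, which returns an Optional because there is no identity value to fall back on, and the three-argument form, where the result type differs from the element type.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical demo class, not taken from the article's repository.
class ReduceDemo {

    // No identity value: the result is empty for an empty stream.
    static Optional<Integer> max(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::max);
    }

    // Identity, accumulator and combiner: elements are Strings, the result is an Integer.
    // The combiner merges partial results and matters for parallel streams.
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (sum, word) -> sum + word.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(max(List.of(1, 2, 3, 4, 5)));                    // Optional[5]
        System.out.println(max(List.of()));                                 // Optional.empty
        System.out.println(totalLength(List.of("Green", "Yellow", "Red"))); // 14
    }
}
```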

Final Thoughts

Of course, this is just a small list of interesting facts about Java streams and collections. If you have other favorite features, you can share them in the article comments.
