Stream Collectors

We have seen many examples in earlier lessons of a terminal operation producing a count, returning a boolean or optional, or simply printing results to a console. But what if we want our terminal operation to output its results to a collection we can use afterwards, or to aggregate, group and/or partition the stream?

Well, we can do this using the overloaded collect() method of the Stream<E> interface, one version of which takes a Collector object as a parameter. However, before we look at using the collect() method in detail, it's worth looking at what we can pass to it, and to do this we need to look at the Collector<T,A,R> interface. The Collector<T,A,R> interface provides methods for implementing a specific type of mutable reduction known as a collector.

A mutable reduction is one where the reduced value is a mutable result container, e.g. a HashSet<E>, and elements are incorporated by updating the state of the result rather than by replacing the result.
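To make the distinction concrete, the following sketch contrasts the two styles. It uses the three-argument collect() overload of Stream with a HashSet as the mutable result container, and reduce() as an ordinary (immutable) reduction for comparison. The class and method names (MutableReductionDemo, distinctWords, totalLength) are made up for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutableReductionDemo {

    // Mutable reduction: a HashSet is created and then updated in place.
    static Set<String> distinctWords(List<String> words) {
        return words.stream()
                .collect(() -> new HashSet<String>(),           // supplier: new empty container
                         (set, word) -> set.add(word),          // accumulator: fold one element in
                         (left, right) -> left.addAll(right));  // combiner: merge partial containers
    }

    // Ordinary reduction for comparison: each step yields a new value.
    static int totalLength(List<String> words) {
        return words.stream()
                .map(String::length)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("one", "two", "two", "three");
        System.out.println(distinctWords(words).size()); // 3
        System.out.println(totalLength(words));          // 14
    }
}
```

In the first method the same container object is modified as elements arrive, whereas in the second every step replaces the running total with a new value.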

We will also investigate the Collectors class in some detail. It is a utility class whose static methods return ready-made implementations of the Collector<T,A,R> interface, including ones to create collections, find aggregates and group and/or partition a stream.

Collection creation and aggregations will be looked at in the Collecting & Aggregating Streams lesson.

Grouping & Partitioning our streams will be discussed in the Grouping & Partitioning Streams lesson.

Collector<T,A,R> Interface

The Collector<T,A,R> interface consists of five implementation methods and an overloaded static of() method for Collector creation.

You can implement the Collector<T,A,R> interface yourself for a specific purpose or, more generally, use the static methods of the Collectors class, which return ready-made Collector implementations.

Type Parameter Mnemonics:

  1. T - Generic type of input elements to reduction operation
  2. A - Mutable accumulation type of reduction operation (object accumulated on during collection)
  3. R - Result type of reduction operation
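A small sketch may help tie the three type parameters to real types. It uses the static Collector.of() factory method to build a Collector<String, StringBuilder, String>, so here T is String, A is StringBuilder and R is String. The class name TypeParameterDemo and the method name concatenating() are invented for this example.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class TypeParameterDemo {

    // T = String (input), A = StringBuilder (mutable accumulator), R = String (result)
    static Collector<String, StringBuilder, String> concatenating() {
        return Collector.of(
                StringBuilder::new,                   // Supplier<A>
                (builder, s) -> builder.append(s),    // BiConsumer<A, T>
                (left, right) -> left.append(right),  // BinaryOperator<A>
                StringBuilder::toString);             // Function<A, R>
    }

    public static void main(String[] args) {
        String result = Stream.of("a", "b", "c").collect(concatenating());
        System.out.println(result); // abc
    }
}
```

Because A and R differ here, the finisher has real work to do: it converts the StringBuilder into the final String.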

The following table lists the implementation methods of the Collector<T,A,R> interface:

Implementation Methods

accumulator()
Return type: BiConsumer<A,T>
Description: A function that folds a value into a mutable result container.
Code example:
public BiConsumer<List<T>, T> accumulator() {
    return List::add;
}

characteristics()
Return type: Set<Characteristics>
Description: Returns a Set indicating the characteristics of this Collector.
Code example:
public Set<Characteristics> characteristics() {
    // CONCURRENT is omitted because a LinkedList is not thread-safe
    return Collections.unmodifiableSet(
        EnumSet.of(IDENTITY_FINISH));
}

combiner()
Return type: BinaryOperator<A>
Description: A function that accepts two partial results and merges them.
Code example:
public BinaryOperator<List<T>> combiner() {
    return (listA, listB) -> {
        listA.addAll(listB);
        return listA;
    };
}

finisher()
Return type: Function<A,R>
Description: Performs the final transformation from the intermediate accumulation type A to the final result type R.
Code example:
public Function<List<T>, List<T>> finisher() {
    return Function.identity();
}

supplier()
Return type: Supplier<A>
Description: Creates and returns a new mutable result container.
Code example:
public Supplier<List<T>> supplier() {
    return LinkedList::new;
}

The following code shows an example implementation of the Collector<T,A,R> interface:



package info.java8;

import java.util.*;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

import static java.util.stream.Collector.Characteristics.*;

public class CollectorImpl<T> implements Collector<T, Set<T>, Set<T>> {

    @Override
    public Supplier<Set<T>> supplier() {
        return HashSet::new;
    }

    @Override
    public BiConsumer<Set<T>, T> accumulator() {
        return Set::add;
    }

    @Override
    public BinaryOperator<Set<T>> combiner() {
        return (setA, setB) -> {
            setA.addAll(setB);
            return setA;
        };
    }

    @Override
    public Function<Set<T>, Set<T>> finisher() {
        return Function.identity();
    }

    @Override
    public Set<Characteristics> characteristics() {
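        // CONCURRENT is not declared: a HashSet is not thread-safe, so it must
        // not be updated from several threads at once.
        return Collections.unmodifiableSet(
                EnumSet.of(UNORDERED, IDENTITY_FINISH));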
    }
}


Let's go through the code!

The supplier() method returns a function that creates an empty instance of the accumulator used during collection; in our case an empty Set<T> object.

The accumulator() method returns a function that adds the current item to the accumulator used during collection; in our case a Set<T> object. The function is a BiConsumer<A,T>, so it updates the accumulator in place and returns nothing.

The combiner() method returns a function that merges two partial results from subparts of the stream and returns the combined result; in our case a Set<T> object.

The finisher() method returns the function that performs the final transformation from the intermediate accumulation type A to the final result type R; in our case both are of type Set<T>, so the identity function is used.

The characteristics() method returns an immutable set of Characteristics which define the behaviour of this Collector.
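To see how these methods cooperate, here is a rough sketch that calls the supplier, accumulator, combiner and finisher functions by hand, roughly the way a stream might for two chunks of data. It is an illustration only: the class name CollectorStepsDemo and the methods toHashSet() and collectInTwoChunks() are made up for this example, and the collector is built with Collector.of() as a stand-in for a hand-written implementation class. How a real stream splits its work is up to the stream implementation.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorStepsDemo {

    // Stand-in collector built with Collector.of(). This overload has no finisher
    // argument and adds the IDENTITY_FINISH characteristic itself.
    static Collector<String, Set<String>, Set<String>> toHashSet() {
        return Collector.of(
                () -> new HashSet<String>(),
                (set, item) -> set.add(item),
                (setA, setB) -> { setA.addAll(setB); return setA; },
                Collector.Characteristics.UNORDERED);
    }

    // Applies the collector's functions by hand to two chunks of input.
    static Set<String> collectInTwoChunks(String[] first, String[] second) {
        Collector<String, Set<String>, Set<String>> collector = toHashSet();

        Set<String> left = collector.supplier().get();      // new empty container
        for (String item : first) {
            collector.accumulator().accept(left, item);     // fold element into container
        }

        Set<String> right = collector.supplier().get();
        for (String item : second) {
            collector.accumulator().accept(right, item);
        }

        Set<String> merged = collector.combiner().apply(left, right); // merge partial results
        return collector.finisher().apply(merged);                    // identity in this case
    }

    public static void main(String[] args) {
        Set<String> byHand = collectInTwoChunks(
                new String[] {"a", "b"}, new String[] {"b", "c"});
        Set<String> byStream = Stream.of("a", "b", "b", "c").collect(toHashSet());
        System.out.println(byHand.equals(byStream)); // true
    }
}
```

In everyday code you would simply pass the collector to collect(), as the last two lines of main() do, and let the stream call these functions for you.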

The Characteristics class is an enumeration containing the following items:

  1. CONCURRENT - The accumulator() function can be called from multiple threads for the same result container. If not also marked as UNORDERED, then it should only be evaluated concurrently if applied to an unordered data source.
  2. UNORDERED - The results of the reduction aren't affected by traversal and accumulation order.
  3. IDENTITY_FINISH - Indicates the finisher() function is the same as the identity function and can be left out.
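The collectors supplied by the Collectors class report their characteristics in the same way, so we can inspect them. The sketch below checks two collectors whose characteristics are stated in their API documentation: toSet() is documented as unordered, and groupingByConcurrent() as concurrent and unordered. The class name CharacteristicsDemo and its method names are hypothetical.

```java
import java.util.Set;
import java.util.stream.Collector.Characteristics;
import java.util.stream.Collectors;

class CharacteristicsDemo {

    // Characteristics reported by the collector returned from Collectors.toSet()
    static Set<Characteristics> ofToSet() {
        return Collectors.toSet().characteristics();
    }

    // Characteristics reported by a groupingByConcurrent() collector
    static Set<Characteristics> ofGroupingByConcurrent() {
        return Collectors.groupingByConcurrent((String s) -> s.length()).characteristics();
    }

    public static void main(String[] args) {
        System.out.println(ofToSet().contains(Characteristics.UNORDERED));                 // true
        System.out.println(ofGroupingByConcurrent().contains(Characteristics.CONCURRENT)); // true
        System.out.println(ofGroupingByConcurrent().contains(Characteristics.UNORDERED));  // true
    }
}
```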

Collectors Class

The Collectors class provides many static methods that return Collector implementations: for collecting a stream into a list, set or map, or into a specific collection implementation, for aggregating, grouping and partitioning, as well as utility methods.
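As a quick taste of the collection-creating methods, here is a minimal sketch using toList(), toSet(), toCollection() and toMap(). The class CollectionCollectorsDemo, its method names and the sample words are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class CollectionCollectorsDemo {

    // toList(): accumulate the mapped elements into a List
    static List<String> upperCased(List<String> words) {
        return words.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    // toSet(): duplicates are dropped
    static Set<Integer> distinctLengths(List<String> words) {
        return words.stream().map(String::length).collect(Collectors.toSet());
    }

    // toCollection(): choose the collection implementation, here a sorted TreeSet
    static TreeSet<String> sortedDistinct(List<String> words) {
        return words.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    // toMap(): the two-argument variant throws IllegalStateException on duplicate
    // keys, so duplicates are removed with distinct() first
    static Map<String, Integer> lengthByWord(List<String> words) {
        return words.stream().distinct().collect(Collectors.toMap(w -> w, String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "fig");
        System.out.println(upperCased(words));     // [PEAR, FIG, APPLE, FIG]
        System.out.println(sortedDistinct(words)); // [apple, fig, pear]
        System.out.println(distinctLengths(words).size());
        System.out.println(lengthByWord(words).get("apple")); // 5
    }
}
```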

The following table lists the main static methods of the Collectors class:

Collectors Static Methods

Method | Notes | Description | Code Examples
averagingDouble() | There are also Int and Long variants (averagingInt(), averagingLong()). | Returns a Collector of the arithmetic mean (average) of a double-valued function applied to the input elements. | Aggregates From Streams
collectingAndThen() | Useful for further transformations. | Adapts a Collector to perform an additional finishing transformation. | Multilevel Grouping
counting() | The result type is Long. | Returns a Collector that counts the number of input elements. | Grouping Streams
groupingBy() | There are three variants of this method taking 1, 2 or 3 parameters. | Returns a Collector implementing a group by operation on type T input elements, grouped by a classification function, and returning the results in a Map<K, V>. | Grouping, Multilevel Grouping, Mapping Our Groupings
groupingByConcurrent() | There are three variants of this method taking 1, 2 or 3 parameters. | Returns a concurrent Collector implementing a group by operation on type T input elements, grouped by a classification function, and returning the results in a ConcurrentMap<K, V>. |
joining() | There are also variants that add a delimiter, and a delimiter with a prefix and suffix. | Returns a Collector that concatenates the input elements into a String, in encounter order. | Aggregates From Streams
mapping() | Useful with groupingBy(). | Adapts a Collector accepting elements of type U to one accepting elements of type T by applying a mapping function to every input element prior to accumulation. | Mapping Our Groupings
maxBy() | The result is an Optional<T>. | Returns a Collector that produces the maximal element according to a given Comparator. | Aggregates From Streams
minBy() | The result is an Optional<T>. | Returns a Collector that produces the minimal element according to a given Comparator. | Aggregates From Streams
partitioningBy() | There are two variants of this method taking 1 or 2 parameters. | Returns a Collector that partitions the input elements according to a Predicate and organizes them into a Map<Boolean, List<T>>. | Partitioning
reducing() | There are three variants of this method taking 1, 2 or 3 parameters. | Returns a Collector that performs a reduction of its input elements under a specified BinaryOperator. |
summarizingDouble() | There are also Int and Long variants (summarizingInt(), summarizingLong()). | Returns a Collector that applies a double-producing mapping function to each input element, returning summary statistics for the resultant values. | Aggregates From Streams
summingDouble() | There are also Int and Long variants (summingInt(), summingLong()). | Returns a Collector that produces the sum of a double-valued function applied to the input elements. | Aggregates From Streams
toCollection() | Use for specific collection implementations. | Returns a Collector that accumulates the input elements into a new Collection<E>, in encounter order. | Collections From Streams
toList() | There is also a Set<E> variant of this method, toSet(). | Returns a Collector that accumulates elements into a new List<E>. | Collections From Streams
toMap() | There are three variants of this method taking 2, 3 or 4 parameters. | Returns a Collector that accumulates elements into a Map<K,V> whose keys and values are the result of applying the provided mapping functions to the input elements. | Collections From Streams
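Before moving on, here is a brief sketch of a few of the aggregating, grouping and partitioning methods from the table in action. The class AggregatingCollectorsDemo, its method names and the sample words are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AggregatingCollectorsDemo {

    // counting(): number of input elements
    static long count(List<String> words) {
        return words.stream().collect(Collectors.counting());
    }

    // joining(): delimiter, prefix and suffix variant
    static String joined(List<String> words) {
        return words.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    // averagingInt(): arithmetic mean of an int-valued function
    static double averageLength(List<String> words) {
        return words.stream().collect(Collectors.averagingInt(String::length));
    }

    // groupingBy(): one-parameter variant, groups into Map<K, List<T>>
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    // partitioningBy(): two-parameter variant with a downstream collector
    static Map<Boolean, Long> countByLongerThanThree(List<String> words) {
        return words.stream().collect(
                Collectors.partitioningBy(w -> w.length() > 3, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "fig");
        System.out.println(count(words));          // 4
        System.out.println(joined(words));         // [pear, fig, apple, fig]
        System.out.println(averageLength(words));  // 3.75
        System.out.println(byLength(words).get(3)); // [fig, fig]
        System.out.println(countByLongerThanThree(words).get(true)); // 2
    }
}
```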

Related Quiz

Streams Quiz 9 - Stream Collectors Quiz

Lesson 9 Complete

In this lesson we looked at the Collector<T,A,R> interface and the Collectors class.

What's Next?

In the next lesson we look at grouping and partitioning.