Java Stream APIJ8 Home « Java Stream API

We start this lesson by looking at the various types of collections and their corresponding interfaces and which type of collection is suitable for specific types of data. We then look at usage of the Comparable interface. After this we look at the equals() method of the Object class which only ever returns true when the reference variables refer to the same object which isn't helpful when we want our collections sorted and stored dependent upon fields in the collection, so we look at correct and incorrect overrides of corresponding equals() and hashCode() methods and explain the difference between == and the equals() method. Generics were introduced with java and we look at the reasons for doing this. We then look at usage of type parameters in class/interface declarations, instance variables, method arguments, and return types before looking at generic methods as well as methods that make use of wildcard types. We finish the lesson by looking at array and list manipulation using the java.util package, usage of the java.util.Comparator and java.lang.Comparable interfaces for sorting collections and the effect of natural ordering of primitive wrapper classes and java.lang.String on sorting.

Lets take a look at the points outlined at Oracle Java SE 8 Programmer II for this part of the certification.

  • Java Stream API
    1. Develop code to extract data from an object using peek() and map() methods including primitive versions of the map() method.
    2. Search for data by using search methods of the Stream classes including findFirst, findAny, anyMatch, allMatch, noneMatch.
    3. Develop code that uses the Optional class.
    4. Develop code that uses Stream data methods and calculation methods.
    5. Sort a collection using Stream API.
    6. Save results to a collection using the collect method and group/partition data using the Collectors class.
    7. Use flatMap() methods in the Stream API.
Collection Types/Utilities & Concrete Implementation Classes
Collection Hierarchy Map Hierarchy Utilities Hierarchy
ListsQueuesSetsMapsUtilities
ArrayListPriorityQueueHashSetHashMapArrays
VectorLinkedHashSetHashtableCollections
LinkedListTreeSetLinkedHashMap
TreeMap

The table above shows we have two utility classes for use with collections as well as four collection types, these being lists, maps, queues and sets. Each collection type can be further divided by whether it is ordered and unsorted, or both sorted and ordered or finally both unsorted and unordered. Ordered collections allow us to iterate over our collections in a specific pre-determined way. Sorted collections are ordered according to the sort order and as such the ordering is defined by the sorting rules. This is the reason we can't have sorted unordered collections, as the sort order implicitly defines the ordering; but we can have ordered collections that are unsorted.

The sort order is derived from properties within the objects being sorted and can be based on natural ordering such as alphabetically or numerically or in a bespoke order as defined by the programmer. We will expand on the last paragraph when we look at the sorted collection types the terminology applies to as we work through the section.

There is quite a lot of information to assimilate when using collections so lets look at some hierarchy diagrams to clarify the points made so far and give an overview of what's to come:

Collection Hierarchy DiagramTop

The diagram below is a representation of the Collection hierarchy and covers the interfaces and classes studied in the Java section. The diagram has several interfaces missing and also the java.util.EnumSet<E> and java.util.Stack<E> concrete implemetations which are not covered on the site, but should help in visualisation:

collection hierarchy

Collection Interfaces & ClassesTop

The table below gives a description of each interface and class within the above diagram. Click a link to go to detailed information about a particular class:

Interface/Class Description
Collection<E>The root interface in the Collection hierarchy which the List<E>, Queue<E> and Set<E> interfaces extend. There is no direct implementation of the Collection<E> interface within the JDK.
List<E>Interface for ordered collections of elements.
Queue<E>Interface for holding elements prior to processing.
Set<E>Interface for unique collections of elements.
ArrayList<E>Random access, resizable-array implementation of the List<E> interface that implements all optional list operations and permits all elements, including null.
Vector<E>Synchronized random access resizable-array implementation of the List<E> interface.
LinkedList<E>Sequential access linked implementation of the List<E> interface that implements all optional list operations, and permits all elements, including null. The class also implements the Queue<E> interface, providing first-in-first-out queue operations.
PriorityQueue<E>Unbounded priority queue implementation of the Queue<E> interface based on a priority heap that does not permit null elements.
HashSet<E>Implementation of the Set<E> interface using a HashMap instance.
LinkedHashSet<E>Hash table and linked list implementation of the Set<E> interface that implements all optional set operations and permits all elements, including null.
SortedSet<E>Interface for unique collections of sorted elements.
TreeSet<E>Implementation of the Set<E> interface using a TreeMap instance.

Collection Types and OrderingTop

The table below extrapolates the commented information about ordering from the diagram above into a more reable tabular format:

Collection Type Ordering
ListsQueuesSetsOrderedSorted
ArrayList<E>By the indexNo
Vector<E>By the indexNo
LinkedList<E>By the indexNo
PriorityQueue<E>SortedPIPO
HashSet<E>NoNo
LinkedHashSet<E>By insertion orderNo
TreeSet<E>SortedNatural/bespoke order

SetsTop

So what type of collection is a Set? A Set doesn't allow duplicates so every entry within a Set must be unique and as the name implies the interface models the mathematical set abstraction. In practical terms this means that we can never have more than one element within the Set referencing the same object or more than one element within the Set referencing two objects that are considered equal. See the Checking Object Equality section from the OO Concepts - The Object Superclass section of the site for a reminder of what constitutes object equality. We can also only have a maximum of one entry within the set with a value of null.

HashSetsTop

A HashSet is an unordered and unsorted implementation of the Set<E> interface, backed by a hash table using a HashMap instance. This type of set makes no guarantees over iteration order and in particular no guarantees that the iteration order will remain constant over time.

The hashcode of the object being inserted into the collection is used so the more efficient your override of the hashCode() method of the Object class is, the more efficient the HashSet. This also means that if you want to find out if an object exists within the set, or you want to delete an object from the set, then you need a reference to the object in question.

This type of set is a good choice when you want a collection with no duplicates and fast access and the order doesn't matter when iterating over it.

See HashSets for example code.

LinkedHashSetsTop

A LinkedHashSet is a Hash table and linked list implementation of the Set interface with predictable iteration order and differs from a HashSet by maintaintaining a doubly-linked list running through all of its elements. The linked list defines the iteration order of a LinkedHashSet by the order in which elements are inserted into the set.

This type of set is a good choice when you want a collection with no duplicates and fast access and you want to maintain an insertion order when iterating over it.

See LinkedHashSets for example code.

TreeSetsTop

A TreeSet is a sorted implementation of the Set<E> interface, backed by a TreeMap instance. This type of set guarantees that the set will be sorted in ascending order, according to the natural order of the elements contained within it, or by a custom comparator/comparable, dependant upon the type of constructor used on creation.

The ordering maintained by a set whether through natural order of the elements contained within it or by a custom comparator/comparable, must be consistent with the equals() method override rules of the Object class if it is to correctly implement the Set<E> interface and obey the rules of equality.

This type of set is a good choice when you want a sorted collection with no duplicates with the option of custom sorting.

See TreeSets for example code.

ListsTop

So what type of collection is a List? Lists are ordered collections that use an index for the ordering of the elements within the collection. As such lists have methods for the index that you don't find in other collection types within the The Collections Framework. An element is added to a list by specifying the index position or to the end of the list when no index is specified.

ArrayListsTop

An ArrayList is a resizable-array, ordered implementation of the List<E> interface that permits all elements including null. ArrayList allows random access to elements and is similar to Vector apart from being unsynchronized.

This type of list is a good choice when you want fast access and iteration but are not doing a lot of insertions and deletions.

See ArrayLists for example code.

VectorsTop

A Vector is a resizable-array, ordered implementation of the List<E> interface that permits all elements including null. Vector allows random access to elements and is similar to ArrayList apart from being synchronized. In fact Vector is one of the two original collections shipped with Java, the other being Hashtable. Vector was retrofitted to implement List<E> interface, so that it became a part of The Collections Framework when the framework was introduced in version 1.2.

There is no sound reason since the introduction of the ArrayList<E> class to use the Vector<E> class. If you need an ArrayList to be synchronized this can be achieved using methods of the Collections class without all the overheads of using Vector<E> synchronized methods.

See Vectors for example code.

LinkedListsTop

A LinkedList is an ordered implementation of the List<E> interface that is accessed sequentialy using a doubly-linked list running through all of its elements. Because of this, the sequential access and the fact that the LinkedList<E> class also implements the Queue<E> interface, there are methods unique to this type of List.

This type of list is a good choice when you are doing a lot of insertions and deletions and because of the implementation of the Queue<E> interface is also a good candidate for stacks and queues.

See LinkedLists for example code.

QueuesTop

So what type of collection is a Queue? A Queue is an ordered list of actions that are to be processed and as such does not allow null elements. Typically, a Queue would be in FIFO (first-in, first-out) order although other orders such as LIFO (last-in, first-out), natural-ordering and custom comparator order are also possible.

PriorityQueuesTop

A PriorityQueue is a sorted, ordered implementation of the Queue<E> interface that permits all elements except null. A PriorityQueue queue is ordered in PIPO (priority-in, priority-out) order. This is achieved by natural order or a custom comparator to do the sorting and elements sorted first get the highest priority, in other words the ordering of the elements represents their priority within the queue.

See PriorityQueues for example code.

Map Hierarchy DiagramTop

The diagram below is a representation of the Map hierarchy and covers the interfaces and classes studied in the Java section. The diagram has several interfaces missing and also the java.util.WeakHashMap<K,V> and java.util.IdentityHashMap<K,V> concrete implemetations which are not covered on the site, but should help in visualisation:

map hierarchy

Map Interfaces & ClassesTop

The table below gives a description of each interface and class within the above diagram. Click a link to go to detailed information about a particular class:

Interface/Class Description
Map<K,V>The root interface in the map hierarchy which the SortedMap<K,V> interface extends.
HashMap<K,V>Hash table based implementation of the Map<K,V> interface that implements all optional map operations and permits all elements, including null and the null key.
LinkedHashMap<K,V>Hash table and linked list implementation of the Map<K,V> interface with predictable ordering and permits all elements, including null elements.
Hashtable<K,V>Hash table implementation of the Map<K,V> interface that doesn't allow null elements or null keys.
SortedMap<K,V>Interface for sorted map elements.
TreeMap<K,V>Random access tree based implementation of the SortedMap<K,V> interface.

Map Types and OrderingTop

The table below extrapolates the commented information about ordering from the diagram above into a more reable tabular format:

Collection Type Ordering
MapOrderedSorted
HashMap<K,V>NoNo
LinkedHashMap<K,V>By insertion order or last access orderNo
Hashtable<K,V>NoNo
TreeMap<K,V>SortedBy natural order or bespoke order

MapsTop

So what type of collection is a Map? Maps have unique identifiers as their keys which are mapped to a value, similar to key/value pairs in other languages. A Map doesn't allow duplicates keys so every entry within a Map must be unique. In practical terms this means that we can never have more than one element within the Map referencing the same object or more than one element within the Map referencing two objects that are considered equal. See the Checking Object Equality section from the OO Concepts - The Object Superclass section of the site for a reminder of what constitutes object equality. We can also only have a maximum of one entry within the set with a value of null.

Another feature of Map collection is how the Map<K,V> interface provides three different collection views, allowing us to view a map collection as a set of keys, a collection of values and as a set of key/value mappings.

HashMapTop

A HashMap is an unordered and unsorted implementation of the Map<K,V> interface that allows null values and a null key. This type of map makes no guarantees over iteration order and in particular no guarantees that the iteration order will remain constant over time.

The hashcode of the object being inserted into the collection is used so the more efficient your override of the hashCode() method of the Object class is, the more efficient the HashMap. This also means that if you want to find out if an object exists within the map, or you want to delete an object from the map, then you need a reference to the object in question.

This type of map is a good choice when you want a collection with no duplicates and fast access and the order doesn't matter when iterating over it.

See HashMap for example code.

LinkedHashMapsTop

A LinkedHashMap is a Hash table and linked list implementation of the Map<K,V> interface with predictable iteration order and differs from a HashSet by maintaintaining a doubly-linked list running through all of its elements. The linked list defines the iteration order of a LinkedHashMap by the order in which elements are inserted into the set.

This type of set is a good choice when you want a collection with no duplicates and fast access and you want to maintain an insertion order when iterating over it.

See LinkedHashMaps for example code.

HashtablesTop

A Hashtable is an unordered and unsorted implementation of the Map<K,V> interface that doesn't allow null values or a null key. This type of map makes no guarantees over iteration order and in particular no guarantees that the iteration order will remain constant over time. Hashtable allows random access to elements and is similar to HashMap apart from being synchronized. In fact Hashtable is one of the two original collections shipped with Java, the other being Vector.

There is no sound reason since the introduction of the HashMap<K,V> class to use the Hashtable<K,V> class. If you need an HashMap to be synchronized this can be achieved using methods of the Collections class without all the overheads of using Hashtable<K,V> synchronized methods.

See Hashtables for example code.

TreeMapsTop

A TreeMap is a sorted implementation of the Map<K,V> interface. This type of map guarantees that the map will be sorted in ascending order, according to the natural order of the elements contained within it, or by a custom comparator/comparable, dependant upon the type of constructor used on creation.

The ordering maintained by a map whether through natural order of the elements contained within it or by a custom comparator/comparable, must be consistent with the equals() method override rules of the Object class if it is to correctly implement the Map<K,V> interface and obey the rules of equality.

This type of map is a good choice when you want a sorted collection with no duplicates with the option of custom sorting.

See TreeMaps for example code.

Utilities Hierarchy DiagramTop

The diagram below is a representation of the utilities hierarchy and covers the java.util.Arrays and java.util.Collections classes which were studied in the Java section. These classes are genral utility classes and give us some useful static methods to use with arrays and collections.

utilities hierarchy

Utilities ClassesTop

The table below gives a description of the classes within the above diagram. Click a link to go to detailed information about a particuar class:

Class Description
java.util.ArraysUtility class for manipulating arrays.
java.util.CollectionsUtility class for manipulating collections.

Checking Object EqualityTop

The equals() method of the Object class indicates whether another object is equal to the invoking object by using the == relational operator. What the method is actually doing is checking the bits in both reference variables and returning a boolean value. Therefore the equals() method of the Object class only ever returns true when the reference variables refer to the same object. Following is the contract specified in the Oracle documentation for the equals() method of the Object class:

  • It is reflexive: for any non-null reference value x, x.equals(x) should return true.
  • It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
  • It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
  • It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
  • For any non-null reference value x, x.equals(null) should return false.

Overriding equals()Top

But what if we want to think of two objects as equal if part of their object state is shared? Well then we need to override the equals() method of the Object class.

See OO Concepts - Overriding equals() for example code.

The hashCode() MethodTop

The hashCode() method returns the hash code value associated with the object. Following is the contract specified in the Oracle documentation for the hashCode() method of the Object class:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode() method must consistently return the same integer, provided no information used in equals() comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

From this contract we can see that the hashCode() method of the Object class, relies on the equals() method of the Object class. What this means to us is whenever we override the equals method of the Object class, we should also override the hashCode() method to keep the general contract for the hashCode() method, which states that equal objects must have equal hash codes. We can tabularize the association between the equals and hashCode() contracts to make understanding easier:

Condition Mandatory Not Mandatory But Possible
From the equals() viewpoint
if x.equals(y) returns truex.hashCode() == y.hashCode()
if x.equals(y) returns falseNo hashCode() requirement
From the hashCode() viewpoint
x.hashCode() == y.hashCode()x.equals(y) returns true
x.hashCode() != y.hashCode()x.equals(y) returns false

See OO Concepts - Overriding hashCode() for example code.

GenericsTop

When we talk about generics or a generic type what we are actually talking about is a parameterized type. So a class, interface or method that has has one or more generic types in its definition can be thought of as generic and we can replace the generic type with a parameterized type. When we define the actual interface, class or method within our code we replace the formal type parameter with the actual type parameter required. As an example we saw in the Collection Hierarchy Diagram that the List interface is specified as List<E>. The following code shows a declaration of an ArrayList<E> using a polymorphic declaration that will only except String objects:


/*
  In the following code snippet we replace the formal type parameter E
  with an actual type parameter String
*/
List<String> aList = new ArrayList<String>();

So what do generics bring to the party? Even pre 1.5 we could create generalized interfaces, classes or methods by using references of the Object superclass. So you will see a lot of generalized pre 1.5 stuff that works on Object but this approach could not guarantee type safety. The problem being that you need to cast from Object to the actual type being worked on.

For instance every time you wanted to read from a collection you would need to cast every object you read. If an object of the wrong type was inserted into the collection then the cast could cause a runtime exception. The exception might be in a program completely removed from the collection itself and might not happen for a long time, making every collection a possible ticking timebomb. The exception to this were Array objects where we have always implicitly declared the type on array creation, so arrays have always been type safe.

With generic type declarations of collections the compiler knows that only objects of the specified type can be added to this collection or throws up a compiler error. Through this mechanism we can achieve type safety for our collections and not have to worry about someone else inputting an invalid object into our collections.

Because of the huge impact generics had on the language the developers at Sun microsystems, the original writers of Java and still in control when 1.5 was released, made the decision to keep raw types in the language. This decision was made so that the billions of lines of code already written in Java were still compatible in java. So wherever you see generic code in the language you can, although not in any way recommended, use raw types without any parameterisation instead.

Generic Terminology
Term Examples Notes
Raw TypeList
Map
No parameterization.
Generic TypeList<E>
Map<K,V>
An interface, class or method followed by one or more formal type parameters within angled brackets.
Formal Type parameterE
K,V
One or more formal type parameters delimitied by commas, which can be thought of as placeholders for the actual type parameters.
Parameterized TypeList<String>
Map<aKey,aValue>
An interface, class or method followed by one or more actual type parameters within angled brackets.
Actual Type parameterString
aKey,aValue
One or more actual type parameters delimitied by commas. Object types are allowed when using generics but primitive types are not.
Bounded Type<T extends Number>Create a superclass boundary which all actual type parameters must be or extend from.
Unbounded Wildcard TypeList<?>Unknown type.
Bounded Wildcard TypesList<? extends Number>Create a superclass boundary which all unknown types must be or extend from.

Click a link in the table above to see usage for the generic type.

java.util.Arrays ClassTop

The java.util.Arrays class gives us methods to search, compare, fill, get a hashcode, sort and give a readable string representation of arrays. All the methods accept Object and most are overloaded to accept a primitive type, appropriate to the method in question. The class also includes a static factory that allows arrays to be viewed as lists.

See java.util.Arrays for example code.

java.util.Collections ClassTop

The java.util.Collections class static methods give us ways to search, typesafe, empty, synchronize, immute, sort and several other ways to work with our collections. As a reminder collections only work with reference variables, so if you are using primitive variables then these need to be boxed before adding to a collection, or use an array instead.

See java.util.Collections for example code.

Sorting CollectionsTop

In our final look at collection we look at sorting our collections using the Comparable and Comparator interfaces. We have already seen examples of sorted collections when we looked at the TreeSet and TreeMap classes in the Java section. We used some simple code examples to show how these collection types worked and were sorted in natural order according to their types. Maybe what wasn't apparent at the time was that their was no problem with ordering these collections as we were using String and Integer objects implement the Comparable interface and its only method compareTo(). We also used the Collections.sort() method in the last lesson to sort a list, this also used a String object.

But what if we create a TreeSet or TreeMap collection or try to use the Collections.sort() method to sort a custom class? To answer this question we will use the Contractor class we created in the Overriding equals() section of the The Object Superclass lesson. The Contractor class was tested to ensure that the overrides of equals() and hashCode() worked correctly. When creating your own collections you will always want to override these methods if you want meaningful, efficient collections.


java.lang.Comparable<T> InterfaceTop

Classes can be sorted by using the java.lang.Comparable interface. We need to implement the Comparable interface and its only method compareTo() within our classes. The compareTo() method returns a negative integer, zero, or a positive integer dependant upon this object being less than, equal to, or greater than the specified object.

See the java.lang.Comparable interface for example code.


java.util.Comparator<T> InterfaceTop

The second way to sort collections is by using the java.util.Comparator interface. The Comparable interface works fine but you have to change the class you want sorted and is a 'one-trick pony' because you can only sort the collection in one way. But what about sorting classes we can't modify or we want to sort in different ways. Luckily for us Java also has the java.util.Comparator<T> interface, which allows us to sort a class without changing the code within that class, which also means we can have multiple sort sequences as well. All we need to do is create a class that implements the java.util.Comparator<T> interface and its compare() method.

See the java.util.Comparator<T> interface for example code.


Related Java Tutorials

Collections/Generics - Collections Overview
Collections/Generics - Sets
Collections/Generics - Lists
Collections/Generics - Queues
Collections/Generics - Maps
Collections/Generics - Generics
Collections/Generics - Sorting Collections
OO Concepts - The Object Superclass