Generics and CollectionsJ8 Home « Generics and Collections
We start this lesson by looking at the various types of collections and their corresponding interfaces and which type of collection is suitable for specific types of data. We then look at usage of
the Comparable interface. After this we look at the equals()
method of the Object
class which only ever returns true
when the reference variables refer to the same object which
isn't helpful when we want our collections sorted and stored dependent upon fields in the collection, so we look at correct and incorrect overrides of corresponding equals()
and
hashCode()
methods and explain the difference between ==
and the equals()
method. Generics were introduced with java and we look at the reasons for doing
this. We then look at usage of type parameters in class/interface declarations, instance variables, method arguments, and return types before looking at generic methods as well as methods that
make use of wildcard types. We finish the lesson by looking at array and list manipulation using the java.util
package, usage of the java.util.Comparator
and java.lang.Comparable
interfaces for sorting collections and the effect of natural ordering of primitive wrapper classes and java.lang.String
on sorting.
Lets take a look at the points outlined at Oracle Java SE 8 Programmer II for this part of the certification.
- Generics and Collections
- Create and use a generic class.
- Create and use ArrayList, TreeSet, TreeMap, and ArrayDeque objects.
- Use java.util.Comparator and java.lang.Comparable interfaces.
- Collections Streams and Filters.
- Iterate using forEach methods of Streams and List.
- Describe Stream interface and Stream pipeline.
- Filter a collection by using lambda expressions.
- Use method references with Streams.
Collection Hierarchy | Map Hierarchy | Utilities Hierarchy | ||
---|---|---|---|---|
Lists | Queues | Sets | Maps | Utilities |
ArrayList | PriorityQueue | HashSet | HashMap | Arrays |
Vector | LinkedHashSet | Hashtable | Collections | |
LinkedList | TreeSet | LinkedHashMap | ||
TreeMap |
The table above shows we have two utility classes for use with collections as well as four collection types, these being lists, maps, queues and sets. Each collection type can be further divided by whether it is ordered and unsorted, or both sorted and ordered or finally both unsorted and unordered. Ordered collections allow us to iterate over our collections in a specific pre-determined way. Sorted collections are ordered according to the sort order and as such the ordering is defined by the sorting rules. This is the reason we can't have sorted unordered collections, as the sort order implicitly defines the ordering; but we can have ordered collections that are unsorted.
The sort order is derived from properties within the objects being sorted and can be based on natural ordering such as alphabetically or numerically or in a bespoke order as defined by the programmer. We will expand on the last paragraph when we look at the sorted collection types the terminology applies to as we work through the section.
There is quite a lot of information to assimilate when using collections so lets look at some hierarchy diagrams to clarify the points made so far and give an overview of what's to come:
Collection Hierarchy DiagramTop
The diagram below is a representation of the Collection hierarchy and covers the interfaces and classes studied in the Java section. The diagram has several interfaces missing and also the
java.util.EnumSet<E>
and java.util.Stack<E>
concrete implemetations which are not covered on the site, but should help in visualisation:
Collection Interfaces & ClassesTop
The table below gives a description of each interface and class within the above diagram. Click a link to go to detailed information about a particular class:
Interface/Class | Description |
---|---|
Collection<E> | The root interface in the Collection hierarchy which the List<E> , Queue<E> and Set<E> interfaces extend. There is no direct implementation of the Collection<E> interface within the JDK. |
List<E> | Interface for ordered collections of elements. |
Queue<E> | Interface for holding elements prior to processing. |
Set<E> | Interface for unique collections of elements. |
ArrayList<E> | Random access, resizable-array implementation of the List<E> interface that
implements all optional list operations and permits all elements, including null . |
Vector<E> | Synchronized random access resizable-array implementation of the List<E> interface. |
LinkedList<E> | Sequential access linked implementation of the List<E> interface that implements all optional list
operations, and permits all elements, including null . The class also implements the Queue<E> interface, providing first-in-first-out queue operations. |
PriorityQueue<E> | Unbounded priority queue implementation of the Queue<E> interface based on a priority heap that does not permit null elements. |
HashSet<E> | Implementation of the Set<E> interface using a HashMap instance. |
LinkedHashSet<E> | Hash table and linked list implementation of the Set<E> interface that
implements all optional set operations and permits all elements, including null . |
SortedSet<E> | Interface for unique collections of sorted elements. |
TreeSet<E> | Implementation of the Set<E> interface using a TreeMap instance. |
Collection Types and OrderingTop
The table below extrapolates the commented information about ordering from the diagram above into a more reable tabular format:
Collection Type | Ordering | |||
---|---|---|---|---|
Lists | Queues | Sets | Ordered | Sorted |
ArrayList<E> | By the index | No | ||
Vector<E> | By the index | No | ||
LinkedList<E> | By the index | No | ||
PriorityQueue<E> | Sorted | PIPO | ||
HashSet<E> | No | No | ||
LinkedHashSet<E> | By insertion order | No | ||
TreeSet<E> | Sorted | Natural/bespoke order |
SetsTop
So what type of collection is a Set? A Set doesn't allow duplicates so every entry within a Set must be unique and as the name implies the interface models the mathematical set abstraction. In practical
terms this means that we can never have more than one element within the Set referencing the same object or more than one element within the Set referencing two objects that are considered equal. See the
Checking Object Equality section from the OO Concepts - The Object
Superclass section of the
site for a reminder of what constitutes object equality. We can also only have a maximum of one entry within the set with a value of null
.
HashSetsTop
A HashSet is an unordered and unsorted implementation of the Set<E>
interface, backed by a hash table using a HashMap
instance. This type of set
makes no guarantees over iteration order and in particular no guarantees that the iteration order will remain constant over time.
The hashcode of the object being inserted into the collection is used so the more efficient your override of the hashCode()
method of the
Object
class is, the more efficient the HashSet. This also means that if you want to find out if an object exists within the set, or you want to delete an
object from the set, then you need a reference to the object in question.
This type of set is a good choice when you want a collection with no duplicates and fast access and the order doesn't matter when iterating over it.
See HashSets for example code.
LinkedHashSetsTop
A LinkedHashSet is a Hash table and linked list implementation of the Set interface with predictable iteration order and differs from a HashSet by maintaintaining a doubly-linked list running through all of its elements. The linked list defines the iteration order of a LinkedHashSet by the order in which elements are inserted into the set.
This type of set is a good choice when you want a collection with no duplicates and fast access and you want to maintain an insertion order when iterating over it.
See LinkedHashSets for example code.
TreeSetsTop
A TreeSet is a sorted implementation of the Set<E>
interface, backed by a TreeMap
instance. This type of set guarantees that the set will be sorted
in ascending order, according to the natural order of the elements contained within it, or by a custom comparator/comparable, dependant upon the type of constructor used on creation.
The ordering maintained by a set whether through natural order of the elements contained within it or by a custom comparator/comparable, must be consistent with the equals()
method override rules of the Object
class if it is to correctly implement the Set<E>
interface and obey the rules of equality.
This type of set is a good choice when you want a sorted collection with no duplicates with the option of custom sorting.
See TreeSets for example code.
ListsTop
So what type of collection is a List? Lists are ordered collections that use an index for the ordering of the elements within the collection. As such lists have methods for the index that you don't find in other collection types within the The Collections Framework. An element is added to a list by specifying the index position or to the end of the list when no index is specified.
ArrayListsTop
An ArrayList is a resizable-array, ordered implementation of the List<E>
interface that permits all elements including null
. ArrayList allows random access to elements and is similar to
Vector apart from being unsynchronized.
This type of list is a good choice when you want fast access and iteration but are not doing a lot of insertions and deletions.
See ArrayLists for example code.
VectorsTop
A Vector is a resizable-array, ordered implementation of the List<E>
interface that permits all elements including null
. Vector allows random access to elements and is similar to
ArrayList apart from being synchronized. In fact Vector is one of the two original collections shipped with Java, the other being Hashtable. Vector was retrofitted to implement List<E>
interface, so that it became a part of The Collections Framework when the framework was introduced in version 1.2.
There is no sound reason since the introduction of the ArrayList<E>
class to use the Vector<E>
class. If you need an ArrayList to be synchronized this can be achieved using
methods of the Collections
class without all the overheads of using Vector<E>
synchronized methods.
See Vectors for example code.
LinkedListsTop
A LinkedList is an ordered implementation of the List<E>
interface that is accessed sequentialy using a doubly-linked list running through all of its elements. Because of this, the sequential
access and the fact that the LinkedList<E>
class also implements the Queue<E>
interface, there are methods unique to this type of List.
This type of list is a good choice when you are doing a lot of insertions and deletions and because of the implementation of the Queue<E>
interface is also a good candidate for stacks and queues.
See LinkedLists for example code.
QueuesTop
So what type of collection is a Queue? A Queue is an ordered list of actions that are to be processed and as such does not allow null
elements. Typically, a Queue would be in FIFO
(first-in, first-out) order although other orders such as LIFO (last-in, first-out), natural-ordering and custom comparator order are also possible.
PriorityQueuesTop
A PriorityQueue is a sorted, ordered implementation of the Queue<E>
interface that permits all elements except null
. A PriorityQueue queue is ordered in PIPO
(priority-in, priority-out) order. This is achieved by natural order or a custom comparator to do the sorting and elements sorted first get the highest priority, in other words the ordering of the
elements represents their priority within the queue.
See PriorityQueues for example code.
Map Hierarchy DiagramTop
The diagram below is a representation of the Map hierarchy and covers the interfaces and classes studied in the Java section. The diagram has several interfaces missing and also the
java.util.WeakHashMap<K,V>
and java.util.IdentityHashMap<K,V>
concrete implemetations which are not covered on the site, but should help in visualisation:
Map Interfaces & ClassesTop
The table below gives a description of each interface and class within the above diagram. Click a link to go to detailed information about a particular class:
Interface/Class | Description |
---|---|
Map<K,V> | The root interface in the map hierarchy which the SortedMap<K,V> interface extends. |
HashMap<K,V> | Hash table based implementation of the Map<K,V> interface that
implements all optional map operations and permits all elements, including null and the null key. |
LinkedHashMap<K,V> | Hash table and linked list implementation of the Map<K,V> interface
with predictable ordering and permits all elements, including null elements. |
Hashtable<K,V> | Hash table implementation of the Map<K,V> interface
that doesn't allow null elements or null keys. |
SortedMap<K,V> | Interface for sorted map elements. |
TreeMap<K,V> | Random access tree based implementation of the SortedMap<K,V> interface. |
Map Types and OrderingTop
The table below extrapolates the commented information about ordering from the diagram above into a more reable tabular format:
Collection Type | Ordering | |
---|---|---|
Map | Ordered | Sorted |
HashMap<K,V> | No | No |
LinkedHashMap<K,V> | By insertion order or last access order | No |
Hashtable<K,V> | No | No |
TreeMap<K,V> | Sorted | By natural order or bespoke order |
MapsTop
So what type of collection is a Map? Maps have unique identifiers as
their keys which are mapped to a value, similar to key/value pairs in other languages. A Map doesn't allow duplicates keys so every entry within a Map must be unique. In practical terms this means that we can
never have more than one element within the Map referencing the same object or more than one element within the Map referencing two objects that are considered equal. See the
Checking Object Equality section from the OO Concepts - The Object
Superclass section of the site for a
reminder of what constitutes object equality. We can also only have a maximum of one entry within the set with a value of null
.
Another feature of Map collection is how the Map<K,V>
interface provides three different collection views, allowing us to view a map collection as a set of keys, a collection of values and
as a set of key/value mappings.
HashMapTop
A HashMap is an unordered and unsorted implementation of the Map<K,V>
interface that allows null
values and a null
key. This type of map makes no guarantees over
iteration order and in particular no guarantees that the iteration order will remain constant over time.
The hashcode of the object being inserted into the collection is used so the more efficient your override of the hashCode()
method of the
Object
class is, the more efficient the HashMap. This also means that if you want to find out if an object exists within the map, or you want to delete an
object from the map, then you need a reference to the object in question.
This type of map is a good choice when you want a collection with no duplicates and fast access and the order doesn't matter when iterating over it.
See HashMap for example code.
LinkedHashMapsTop
A LinkedHashMap is a Hash table and linked list implementation of the Map<K,V>
interface with predictable iteration order and differs from a HashSet by maintaintaining a doubly-linked list
running through all of its elements. The linked list defines the iteration order of a LinkedHashMap by the order in which elements are inserted into the set.
This type of set is a good choice when you want a collection with no duplicates and fast access and you want to maintain an insertion order when iterating over it.
See LinkedHashMaps for example code.
HashtablesTop
A Hashtable is an unordered and unsorted implementation of the Map<K,V>
interface that doesn't allow null
values or a null
key. This type of map makes no guarantees
over iteration order and in particular no guarantees that the iteration order will remain constant over time. Hashtable allows random access to elements and is similar to HashMap apart from being synchronized.
In fact Hashtable is one of the two original collections shipped with Java, the other being Vector.
There is no sound reason since the introduction of the HashMap<K,V>
class to use the Hashtable<K,V>
class. If you need an HashMap to be synchronized this can be achieved
using methods of the Collections
class without all the overheads of using Hashtable<K,V>
synchronized methods.
See Hashtables for example code.
TreeMapsTop
A TreeMap is a sorted implementation of the Map<K,V>
interface. This type of map guarantees that the map will be sorted in ascending order, according to the natural order of the
elements contained within it, or by a custom comparator/comparable, dependant upon the type of constructor used on creation.
The ordering maintained by a map whether through natural order of the elements contained within it or by a custom comparator/comparable, must be consistent with the equals()
method override rules of the Object
class if it is to correctly implement the Map<K,V>
interface and obey the rules of equality.
This type of map is a good choice when you want a sorted collection with no duplicates with the option of custom sorting.
See TreeMaps for example code.
Utilities Hierarchy DiagramTop
The diagram below is a representation of the utilities hierarchy and covers the java.util.Arrays
and java.util.Collections
classes which were studied in the Java section. These classes
are genral utility classes and give us some useful static methods to use with arrays and collections.
Utilities ClassesTop
The table below gives a description of the classes within the above diagram. Click a link to go to detailed information about a particuar class:
Class | Description |
---|---|
java.util.Arrays | Utility class for manipulating arrays. |
java.util.Collections | Utility class for manipulating collections. |
Checking Object EqualityTop
The equals()
method of the Object
class indicates whether another object is equal to the invoking object by using the ==
relational operator. What the method is actually
doing is checking the bits in both reference variables and returning a boolean
value. Therefore the equals()
method of the Object
class only ever returns true
when the
reference variables refer to the same object. Following is the contract specified in the Oracle documentation for the equals()
method of the Object
class:
- It is reflexive: for any non-null reference value
x
,x.equals(x)
should returntrue
. - It is symmetric: for any non-null reference values
x
andy
,x.equals(y)
should returntrue
if and only ify.equals(x)
returnstrue
. - It is transitive: for any non-null reference values
x
,y
, andz
, ifx.equals(y)
returnstrue
andy.equals(z)
returnstrue
, thenx.equals(z)
should returntrue
. - It is consistent: for any non-null reference values
x
andy
, multiple invocations ofx.equals(y)
consistently returntrue
or consistently returnfalse
, provided no information used in equals comparisons on the objects is modified. - For any non-null reference value
x
,x.equals(null)
should returnfalse
.
Overriding equals()
Top
But what if we want to think of two objects as equal if part of their object state is shared? Well then we need to override the equals()
method of the Object
class.
See OO Concepts - Overriding equals()
for example code.
The hashCode()
MethodTop
The hashCode()
method returns the hash code value associated with the object. Following is the contract specified in the Oracle documentation for the hashCode()
method of the Object
class:
- Whenever it is invoked on the same object more than once during an execution of a Java application, the
hashCode()
method must consistently return the same integer, provided no information used inequals()
comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application. - If two objects are equal according to the
equals(Object)
method, then calling thehashCode()
method on each of the two objects must produce the same integer result. - It is not required that if two objects are unequal according to the
equals(Object)
method, then calling thehashCode()
method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
From this contract we can see that the hashCode()
method of the Object
class, relies on the equals()
method of the Object
class. What this means to us is whenever we
override the equals
method of the Object
class, we should also override the hashCode()
method to keep the general contract for the hashCode()
method,
which states that equal objects must have equal hash codes. We can tabularize the association between the equals
and hashCode()
contracts to make understanding easier:
Condition | Mandatory | Not Mandatory But Possible |
---|---|---|
From the equals() viewpoint | ||
if x.equals(y) returns true | x.hashCode() == y.hashCode() | |
if x.equals(y) returns false | No hashCode() requirement | |
From the hashCode() viewpoint | ||
x.hashCode() == y.hashCode() | x.equals(y) returns true | |
x.hashCode() != y.hashCode() | x.equals(y) returns false |
See OO Concepts - Overriding hashCode()
for example code.
GenericsTop
When we talk about generics or a generic type what we are actually talking about is a parameterized type. So a class, interface or method that has has one or more generic types in
its definition can be thought of as generic and we can replace the generic type with a parameterized type. When we define the actual interface, class or method within our code we replace
the formal type parameter with the actual type parameter required. As an example we saw in the Collection Hierarchy Diagram that the List
interface is specified as List<E>
. The following code shows a declaration of an ArrayList<E>
using a polymorphic declaration that will
only except String objects:
/*
In the following code snippet we replace the formal type parameter E
with an actual type parameter String
*/
List<String> aList = new ArrayList<String>();
So what do generics bring to the party? Even pre 1.5 we could create generalized interfaces, classes or methods by using references of the Object
superclass.
So you will see a lot of generalized pre 1.5 stuff that works on Object
but this approach could not guarantee type safety. The problem being that you
need to cast from Object
to the actual type being worked on.
For instance every time you wanted to read from a collection you would need to cast every object you read. If an object of the wrong type was inserted into the collection then the cast could cause a runtime
exception. The exception might be in a program completely removed from the collection itself and might not happen for a long time, making every collection a possible ticking timebomb. The exception to this
were Array
objects where we have always implicitly declared the type on array creation, so arrays have always been type safe.
With generic type declarations of collections the compiler knows that only objects of the specified type can be added to this collection or throws up a compiler error. Through this mechanism we can achieve type safety for our collections and not have to worry about someone else inputting an invalid object into our collections.
Because of the huge impact generics had on the language the developers at Sun microsystems, the original writers of Java and still in control when 1.5 was released, made the decision to keep raw types in the language. This decision was made so that the billions of lines of code already written in Java were still compatible in java. So wherever you see generic code in the language you can, although not in any way recommended, use raw types without any parameterisation instead.
Term | Examples | Notes |
---|---|---|
Raw Type | List | No parameterization. |
Generic Type | List<E> | An interface, class or method followed by one or more formal type parameters within angled brackets. |
Formal Type parameter | E | One or more formal type parameters delimitied by commas, which can be thought of as placeholders for the actual type parameters. |
Parameterized Type | List<String> | An interface, class or method followed by one or more actual type parameters within angled brackets. |
Actual Type parameter | String | One or more actual type parameters delimitied by commas. Object types are allowed when using generics but primitive types are not. |
Bounded Type | <T extends Number> | Create a superclass boundary which all actual type parameters must be or extend from. |
Unbounded Wildcard Type | List<?> | Unknown type. |
Bounded Wildcard Types | List<? extends Number> | Create a superclass boundary which all unknown types must be or extend from. |
Click a link in the table above to see usage for the generic type.
java.util.Arrays
ClassTop
The java.util.Arrays
class gives us methods to search, compare, fill, get a hashcode, sort and give a readable string representation of arrays. All the methods accept Object
and most
are overloaded to accept a primitive type, appropriate to the method in question. The class also includes a static factory that allows arrays to be viewed as lists.
See java.util.Arrays
for example code.
java.util.Collections
ClassTop
The java.util.Collections
class static methods give us ways to search, typesafe, empty, synchronize, immute, sort and several other ways to work with our collections. As a reminder collections
only work with reference variables, so if you are using primitive variables then these need to be boxed before adding to a collection, or use an array instead.
See java.util.Collections
for example code.
Sorting CollectionsTop
In our final look at collection we look at sorting our collections using the Comparable
and Comparator
interfaces. We have already seen examples of sorted collections when we looked at
the TreeSet and TreeMap classes in the Java section. We used some simple code examples to show
how these collection types worked and were sorted in natural order according to their types. Maybe what wasn't apparent at the time was that their was no problem with ordering these collections as we
were using String
and Integer
objects implement the Comparable
interface and its only method compareTo()
. We also used the Collections.sort()
method in the last lesson to sort a list, this also used a String
object.
But what if we create a TreeSet or TreeMap collection or try to use the Collections.sort()
method to sort a custom class? To answer this question we will use the Contractor
class we created in the Overriding equals()
section of the The Object
Superclass lesson. The Contractor
class was tested to ensure that the overrides of
equals()
and hashCode()
worked correctly. When creating your own collections you will
always want to override these methods if you want meaningful, efficient collections.
java.lang.Comparable<T>
InterfaceTop
Classes can be sorted by using the java.lang.Comparable
interface. We need to implement the Comparable
interface and its only method compareTo()
within our classes. The compareTo()
method returns a negative integer, zero, or a positive integer dependant upon this object being less than, equal to, or greater than the specified object.
See the java.lang.Comparable
interface for example code.
java.util.Comparator<T>
InterfaceTop
The second way to sort collections is by using the java.util.Comparator
interface. The Comparable
interface works fine but you have to change the
class you want sorted and is a 'one-trick pony' because you can only sort the collection in one way. But what about sorting classes we can't modify or we want to sort in different ways. Luckily for us Java
also has the java.util.Comparator<T>
interface, which allows us to sort a class without changing the code within that class, which also means we can have multiple sort sequences as well. All
we need to do is create a class that implements the java.util.Comparator<T>
interface and its compare()
method.
See the java.util.Comparator<T>
interface for example code.
Related Java Tutorials
Collections/Generics - Collections Overview
Collections/Generics - Sets
Collections/Generics - Lists
Collections/Generics - Queues
Collections/Generics - Maps
Collections/Generics - Generics
Collections/Generics - Sorting Collections
OO Concepts - The Object Superclass