Java I/O Fundamentals

We start this lesson by looking at code that uses the primitive wrapper classes and/or autoboxing and unboxing. We then look at the differences between the String, StringBuilder and StringBuffer classes. After this we investigate how we can use classes within the java.io package to read from and write to files, using the BufferedReader, BufferedWriter, File, FileReader, FileWriter and PrintWriter classes, sometimes in unison, to create a software solution. We then use classes from the java.text package to correctly format or parse dates, numbers and currency values, using the appropriate methods for either the default locale or a specific locale. We finish the lesson by looking at regular expressions and examining how to format and tokenize our data in Java.

Let's take a look at the points Oracle outlines for this part of the Java SE 8 Programmer II certification.

  • Java I/O Fundamentals
    1. Read and write data from the console.
    2. Use BufferedReader, BufferedWriter, File, FileReader, FileWriter, FileInputStream, FileOutputStream, ObjectOutputStream, ObjectInputStream, and PrintWriter in the java.io package.

The Wrapper Classes

The wrapper classes provided in Java allow us to wrap any of the primitive types in an object. The benefits of this are two-fold:

  1. Wrapped primitives can be used in object-centric activities such as collections or where we need to use an object.
  2. Wrapped primitives give us access to utility functions that we can use with primitives.

The following table lists the primitive types, their wrapper classes and the constructors available:

Primitive Type | Description | Wrapper | Constructor Arguments | Examples
boolean and char
boolean | true/false values | Boolean | boolean or String | Boolean b1 = new Boolean(true); Boolean b2 = new Boolean("false");
char | Character | Character | char | Character c = new Character('a');
signed numeric integers
byte | 8-bit integer | Byte | byte or String | byte b = 24; Byte b1 = new Byte(b); Byte b2 = new Byte("24");
short | Short integer | Short | short or String | short s = 48; Short s1 = new Short(s); Short s2 = new Short("48");
int | Integer | Integer | int or String | Integer i1 = new Integer(64); Integer i2 = new Integer("64");
long | Long integer | Long | long or String | Long l1 = new Long(128); Long l2 = new Long("128");
signed floating point
float | Single-precision float | Float | double, float or String | double d = 123.456; Float f1 = new Float(d); Float f2 = new Float(123.456f); Float f3 = new Float("123.456f");
double | Double-precision float | Double | double or String | Double d1 = new Double(456.789); Double d2 = new Double("456.789");
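As a minimal sketch of the two benefits listed above, the following example stores wrapped values in a collection and uses wrapper utility methods. The class name WrapperDemo and the sum() helper are made up for illustration. It uses the valueOf() factory method rather than the constructors from the table, since the wrapper constructors are deprecated from Java 9 onwards.

```java
import java.util.ArrayList;
import java.util.List;

class WrapperDemo {
    // Sums a list of Integer objects; each element is unboxed to an int by the compiler.
    static int sum(List<Integer> values) {
        int total = 0;
        for (Integer value : values) {
            total += value; // unboxing: Integer -> int
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        values.add(64);                       // autoboxing: int -> Integer
        values.add(Integer.valueOf("36"));    // factory method that parses a String
        int parsed = Integer.parseInt("100"); // utility method returning a primitive
        System.out.println(sum(values) + parsed);      // 200
        System.out.println(Integer.toBinaryString(5)); // 101
    }
}
```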

Strings In Java

Many older computer languages use the 7-bit ASCII character set, which has a range of 0 to 127, to represent the characters of a string. Java uses the Unicode character set, and a Java char is a 16-bit value with a range of 0 to 65,535; characters outside that range are represented by a pair of char values. The ASCII character set is a subset of Unicode and as such ASCII characters are still valid in Java. In Java strings are objects, and like any other object we can create (instantiate) them. Another thing to note about strings in Java is that they are immutable, which means that once a String object has been created its value can never be changed.
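The short sketch below illustrates immutability: a method that appears to change a string actually returns a new String object and leaves the original untouched. The class and method names are hypothetical and exist only for this illustration.

```java
class ImmutableStringDemo {
    // Returns the original string alongside the result of toUpperCase().
    static String[] upper(String original) {
        String changed = original.toUpperCase(); // creates a new String object
        return new String[] { original, changed };
    }

    public static void main(String[] args) {
        String[] result = upper("java");
        System.out.println(result[0]); // java (the original is unchanged)
        System.out.println(result[1]); // JAVA
        System.out.println((int) Character.MAX_VALUE); // 65535, the largest char value
    }
}
```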

See String Immutability for more on this.

See String Creation & Efficiency for more information on how Java uses the string constant pool.

String, StringBuilder, StringBuffer Differences

String immutability can improve efficiency, but the downside is that we can end up with a lot of strings that get lost in the string constant pool when we reassign or discard our reference variables.

Luckily for us Java comes with the predefined StringBuffer and StringBuilder classes, which we can use to modify character sequences in place without creating a new object for every change. So when you are doing a lot of string manipulation these are the classes to use. The StringBuilder class was introduced in Java 5 as an alternative to the older StringBuffer class. Both classes have the same API; the difference is that the StringBuffer class is thread safe because its methods are synchronized. For most situations when manipulating strings, thread safety isn't an issue and so the StringBuilder class is the better option for efficiency.
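A minimal sketch of in-place string manipulation follows. The join() helper and class name are invented for this example, and the StringBuffer lines simply show that the same method calls are available on both classes.

```java
class StringBuilderDemo {
    // Joins words with single spaces, reusing one StringBuilder for every append.
    static String join(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(word);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] { "one", "two", "three" })); // one two three

        // StringBuffer offers the same methods, with synchronization added.
        StringBuffer buffer = new StringBuffer("abc");
        buffer.reverse().insert(0, '>');
        System.out.println(buffer); // >cba
    }
}
```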

See the StringBuilder class lesson for more information about the StringBuilder and StringBuffer classes.

Character Streams

Character streams are defined within two class hierarchies, one for input and one for output:

  • The Writer class is the abstract superclass of all character output streams
  • The Reader class is the abstract superclass of all character input streams

These classes define the characteristics that are common to character input and character output streams, which are implemented in the concrete subclasses of each hierarchy.

Character Output Stream Hierarchy

The diagram below shows the classes in the character output stream hierarchy, of which the Writer class is the abstract superclass:

[Diagram: character output stream hierarchy]

Class | Description
Writer | Abstract character stream superclass which describes this type of output stream.
BufferedWriter | Buffered output character stream.
CharArrayWriter | Character buffer output stream.
FilterWriter | Abstract character stream for writing filtered streams.
OutputStreamWriter | Output stream that acts as a bridge from character streams to byte streams by encoding characters into bytes.
FileWriter | Output stream for writing characters to a file.
PipedWriter | Piped character output stream.
PrintWriter | Convenience output character stream to add functionality to another stream, an example being to print to the console using print() and println().
StringWriter | Output stream for writing characters to a string.
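The sketch below shows one way the output classes from the table can be chained together: a FileWriter is wrapped in a BufferedWriter, which is wrapped in a PrintWriter. The class name, the writeLines() helper and the temporary file name are assumptions made for this example.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class WriterDemo {
    // Writes each line to the file; try-with-resources flushes and closes the whole chain.
    static void writeLines(File file, String... lines) throws IOException {
        try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(file)))) {
            for (String line : lines) {
                out.println(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("writer-demo", ".txt"); // hypothetical file name
        file.deleteOnExit();
        writeLines(file, "first line", "second line");
        System.out.println(file.length() > 0); // true
    }
}
```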


Character Input Stream Hierarchy

The diagram below shows the classes in the character input stream hierarchy, of which the Reader class is the abstract superclass:

[Diagram: character input stream hierarchy]

Class | Description
Reader | Abstract character stream superclass which describes this type of input stream.
BufferedReader | Buffered input character stream.
LineNumberReader | Input character stream that keeps a count of line numbers.
CharArrayReader | Character buffer input stream.
FilterReader | Abstract character stream for reading filtered streams.
PushbackReader | Character stream reader containing functionality to return characters to the input stream.
InputStreamReader | Input stream that acts as a bridge for decoding byte streams into character streams.
FileReader | Input stream for reading characters from a file.
PipedReader | Piped character input stream.
StringReader | Input stream for reading characters from a string.
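As a rough counterpart for input, the following sketch wraps a FileReader in a BufferedReader and reads a file line by line. The class name, the readLines() helper and the temporary file are assumptions for illustration; the file is written first only so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ReaderDemo {
    // Reads every line of the file; readLine() returns null at end of input.
    static List<String> readLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("reader-demo", ".txt"); // hypothetical file name
        file.deleteOnExit();
        try (FileWriter out = new FileWriter(file)) {
            out.write("alpha\nbeta\n");
        }
        System.out.println(readLines(file)); // [alpha, beta]
    }
}
```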


Other Java I/O Classes

The diagram below shows some other pertinent classes in the java.io package that are not part of the stream hierarchies above:

[Diagram: other java.io classes]

Class | Description
File | Abstract representation of file and directory pathnames.
FileDescriptor | Opaque handle to the underlying machine-specific structure.
RandomAccessFile | Allows reading and writing of bytes to a random access file.
StreamTokenizer | Parses an input stream into 'tokens'.


The java.io.File Class

In this part of the lesson we look at the File class in the java.io package and how we use objects of this class to represent file and directory pathnames that may or may not already exist on a hard drive. A File object doesn't contain the file in question or any data associated with the file; it just acts like a pointer to the file, and its pathname can be relative or absolute:

  1. Relative pathname - A relative pathname omits the root and is resolved against another location, by default the current working directory. So this would be directoryName/fileName.extension.
    For example the relative pathname images/icon.png
  2. Absolute pathname - An absolute pathname is the complete address of the resource, starting from the root of the file system.
    For example the absolute pathname /home/user/images/icon.png (a hypothetical location)
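The following sketch illustrates that a File object is only a pathname: it can be created and queried whether or not anything exists at that location. The class name and describe() helper are hypothetical, and images/icon.png is just the sample pathname from the list above.

```java
import java.io.File;

class FileDemo {
    // Summarises a pathname without touching the file system.
    static String describe(File file) {
        return file.getName() + " in " + file.getParent() + ", absolute=" + file.isAbsolute();
    }

    public static void main(String[] args) {
        File relative = new File("images", "icon.png"); // parent directory plus child name
        System.out.println(describe(relative)); // icon.png in images, absolute=false

        // These results depend on the machine and working directory the code runs in.
        System.out.println(relative.exists());
        System.out.println(relative.getAbsolutePath());
    }
}
```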

Date, Number & Currency Classes

The table below gives a brief description of the date, number and currency classes that we will be using in this lesson:

Class | Description
java.util.Date | The Date class allows us to create an object that represents a specific instant in time.
java.util.Calendar | The Calendar class allows us to get an instance of a Calendar object which we can use to convert and manipulate dates and times.
java.util.Locale | The Locale class allows us to create an object that represents a specific geographical, political, or cultural region of the world. We can then use the Locale object in conjunction with the DateFormat or NumberFormat classes to get locale specific dates, times, numbers and currencies for that locale.
java.text.DateFormat | The DateFormat class provides us with methods to format dates in various styles and for different locales.
java.text.NumberFormat | The NumberFormat class provides us with methods to format numbers and currencies for different locales.

The java.util.Date Class

The java.util.Date class represents a specific instant in time with millisecond precision. The Date class reflects the standard computer epoch, which starts at 1 January 1970 00:00:00 UTC (Coordinated Universal Time). A Date object created using the long argument represents the number of milliseconds since this epoch. Looking at the official documentation for the java.util.Date class, we can see there is one other non-deprecated Date constructor, which creates a Date object that represents the date and time at which the object was allocated, to the nearest millisecond.

A lot of the methods in the Date class were deprecated from the language in the JDK 1.1 release as the Date class didn't do a very good job of handling internationalisation and localisation. The java.util.Calendar was introduced into the language in the JDK 1.1 release to replace the Date class for easier date manipulation, internationalisation and localisation and we will look at this class in the next part of this lesson.

The Date object still serves some purpose though:

  • It's an easy way to get the current date and time if you don't want to do a lot of manipulation, localise or internationalise the date and time.
  • Good for getting a universal time not affected by time zones.
  • Can be useful for simple date comparisons.
  • The DateFormat class needs a Date object for some of its formatting methods. So a Date object can act as a bridge between date manipulation in the Calendar class and formatting of the manipulated date in the DateFormat class.
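The two non-deprecated constructors and a simple comparison can be sketched as follows. The class name and the isBefore() helper are made up for this example.

```java
import java.util.Date;

class DateDemo {
    // Compares two instants given as milliseconds since 1 January 1970 00:00:00 UTC.
    static boolean isBefore(long millisA, long millisB) {
        return new Date(millisA).before(new Date(millisB));
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // long constructor: milliseconds since the epoch
        Date now = new Date();     // no-argument constructor: the moment of allocation
        System.out.println(epoch.getTime());          // 0
        System.out.println(epoch.before(now));        // true
        System.out.println(epoch.compareTo(now) < 0); // true
        System.out.println(isBefore(2000L, 1000L));   // false
    }
}
```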

See Date Example for code examples.

java.util.Calendar Class

We can manipulate Date objects using the getTime() and setTime() methods of the java.util.Date class, but this is a cumbersome way to manipulate dates. The java.util.Calendar class is an easier way to manipulate dates and offers a lot of methods to do it. To get instances of the java.util.Calendar class we have to use one of the overloaded getInstance() static factory methods.
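A minimal sketch of date manipulation with Calendar follows. It uses the getInstance() overload that takes a time zone and a locale so that the result does not depend on the machine's defaults; the class name, the helper method and the chosen date are assumptions for illustration.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class CalendarDemo {
    // Starts at 31 January 2020 and adds one month.
    static Calendar endOfJanuaryPlusOneMonth() {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.US);
        cal.clear();                         // reset all fields, including the time of day
        cal.set(2020, Calendar.JANUARY, 31); // months are zero-based constants
        cal.add(Calendar.MONTH, 1);          // the day is pulled back to the last valid day
        return cal;
    }

    public static void main(String[] args) {
        Calendar cal = endOfJanuaryPlusOneMonth();
        System.out.println(cal.get(Calendar.DAY_OF_MONTH));               // 29
        System.out.println(cal.get(Calendar.MONTH) == Calendar.FEBRUARY); // true
        Date asDate = cal.getTime(); // bridge back to a Date, for example for formatting
        System.out.println(asDate.getTime());
    }
}
```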

See Calendar Example for code examples.

java.util.Locale Class

The java.util.Locale class allows us to create an object that represents a specific geographical, political, or cultural region of the world. We can then use the Locale object in conjunction with the java.text.DateFormat or java.text.NumberFormat classes to get locale specific dates, times, numbers and currencies for that locale.

Operations that require specific locale information to perform tasks are known as locale-sensitive and as such use a Locale object to tailor user information accordingly. As examples, this could include the way dates are displayed, the way numbers are formatted or the way currencies are viewed.

So once we set a specific locale we can use this object in conjunction with the java.text.DateFormat or java.text.NumberFormat classes to format locale-sensitive data for our users. In this way we can handle internationalisation and localisation when we want it, or do nothing and automatically use a default locale otherwise.

See Locale Example for code examples.

java.text.DateFormat Class

The java.text.DateFormat class allows us, not surprisingly, to format dates and times as well as parse them in a language-independent manner. On inspection of the official documentation for the DateFormat class we can see it is abstract and so cannot be instantiated directly. To get instances of the DateFormat class we have to use one of the overloaded static factory methods available, and we can also pass options to these factory methods to return dates and times with different levels of detail.

The table below just shows the formats we can pass to the static factory methods of the DateFormat class:

Date Format Style | Description
SHORT | Completely numeric representation for a date or time.
MEDIUM | A longer representation than SHORT.
LONG | A longer representation than MEDIUM.
FULL | Longest representation for a date or time.
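The four styles can be tried out with a sketch like the one below, which passes a style and a locale to the getDateInstance() factory method. The class name, the format() helper and the fixed instant are assumptions; the exact text produced varies with the locale data shipped in the JDK, so no output is shown.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateFormatDemo {
    // Formats a date in the given style and locale, using UTC so the result is repeatable.
    static String format(Date date, int style, Locale locale) {
        DateFormat df = DateFormat.getDateInstance(style, locale);
        df.setTimeZone(TimeZone.getTimeZone("UTC"));
        return df.format(date);
    }

    public static void main(String[] args) {
        Date date = new Date(1580428800000L); // 31 January 2020 00:00:00 UTC
        int[] styles = { DateFormat.SHORT, DateFormat.MEDIUM, DateFormat.LONG, DateFormat.FULL };
        for (int style : styles) {
            System.out.println(format(date, style, Locale.US)
                    + " / " + format(date, style, Locale.FRANCE));
        }
    }
}
```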

See Date Format Example for code examples.

java.text.NumberFormat Class

The java.text.NumberFormat class allows us to format numbers as well as parse numbers for any locale. On inspection of the official documentation for the java.text.NumberFormat class we can see it is abstract and so cannot be instantiated directly. To get instances of the NumberFormat class we have to use one of the overloaded static factory methods available, and we choose the factory method according to whether we want formatted numbers or currencies.
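A small sketch of the factory methods follows, formatting the same value for two locales and parsing a locale-specific string back into a number. The class and helper names are made up for the example.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class NumberFormatDemo {
    // General-purpose number format for the given locale.
    static String number(double value, Locale locale) {
        return NumberFormat.getInstance(locale).format(value);
    }

    // Currency format for the given locale.
    static String currency(double value, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(value);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(number(1234567.891, Locale.US));      // 1,234,567.891
        System.out.println(number(1234567.891, Locale.GERMANY)); // 1.234.567,891
        System.out.println(currency(1234.5, Locale.US));         // $1,234.50

        // Parsing uses the same locale rules: '.' groups and ',' is the decimal mark here.
        Number parsed = NumberFormat.getInstance(Locale.GERMANY).parse("1.234,5");
        System.out.println(parsed.doubleValue()); // 1234.5
    }
}
```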

See Number Format Example for code examples.

See Locale-sensitive Dates & Numbers for code examples of locale-sensitive data using a combination of the classes discussed in this lesson.

Regular Expressions

A regular expression is a string containing normal characters as well as metacharacters which make a pattern we can use to match data. The metacharacters are used to represent concepts such as positioning, quantity and character types. Searching through data for specific characters or groups of characters is known as pattern matching and is generally done from the left to the right of the input character sequence.

Regular expressions are a large topic, but here we will just cover the parts you need to know for certification. For a full list of regex constructs, see the java.util.regex package in Oracle's online API Specification for the Java™ 2 Platform Standard Edition 5.0.

In the examples below, backslashes are doubled as they would be inside a Java string literal, and the text matched in each sample is shown in [square brackets].

Metacharacter | Meaning | Examples
Escape/Unescape
\ | Gives a special meaning to a character that is otherwise treated literally (escape), or removes the special meaning of a metacharacter so that it is treated literally (unescape). | Literal content: d matches the character d; \\d matches a digit character. Unescaping special characters: d+ matches one or more d characters; d\\+ matches the literal text d+.
Quantifiers
? | Matches preceding item 0 or 1 times. | do? applied to: Every [d]ig has its [d]ay; Every [do]g has its [d]ay; Shut that [do]ooor; Can you see me (no match)
* | Matches preceding item 0 or more times. | do* applied to: Every [d]ig has its [d]ay; Every [do]g has its [d]ay; Shut that [doooo]r; Can you see me (no match)
+ | Matches preceding item 1 or more times. | do+ applied to: Every dig has its day (no match); Every [do]g has its day; Shut that [doooo]r; Can you see me (no match)
Predefined Character Classes
. | Matches any single character except line terminators, unless the DOTALL flag is specified. | .t applied to: This Time[ t]onig[ht]; this is good (no match, as no character precedes the t)
\d | Find a digit character. Same as the range check [0-9]. | \\d applied to: Was it [7][6] or [7][7]
\s | Find a whitespace character. | \\s applied to: Beware[ ]of[ ]the[ ]dog
\w | Find a word character. A word character is a character in ranges a-z, A-Z, 0-9 and also includes the _ (underscore) symbol. Same as the range check [A-Za-z0-9_]. | \\w applied to: 76% off_sales. £12 only (matches every character except %, the full stop, £ and the spaces)
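The patterns in the table can be checked with a short sketch using the Pattern and Matcher classes from java.util.regex. The class name and the findAll() helper are assumptions for illustration; the helper simply collects every match found from left to right.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexDemo {
    // Returns every non-overlapping match of the regex, scanning left to right.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("do?", "Every dog has its day")); // [do, d]
        System.out.println(findAll("do*", "Shut that doooor"));      // [doooo]
        System.out.println(findAll("do+", "Every dig has its day")); // []
        System.out.println(findAll("\\d", "Was it 76 or 77"));       // [7, 6, 7, 7]
        System.out.println(findAll("d\\+", "d+ and dd"));            // [d+]
        System.out.println(findAll("\\s", "Beware of the dog").size()); // 3
    }
}
```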

See the Regular Expressions lesson for more code examples and usage of regular expressions.

Formatting

In this part of the lesson we look at formatting our output, for which Java offers us several options. We will look at formatting data using the java.util.Formatter class as well as using the static format() method of the java.lang.String class. We finish off our look at formatting output with the printf() method contained in the java.io.PrintStream and java.io.PrintWriter classes.

Formatting Overview

Producing formatted output requires a format string and an argument list. The formatted output is a String object derived from the format string, which may contain fixed text as well as one or more embedded format specifiers that are applied to the arguments in the argument list. The argument list can be empty when no specifier refers to an argument.

Format specifiers that do not correspond to an argument (for example %n for a line separator and %% for a literal percent sign) have the following syntax:


// Format specifier syntax without a corresponding argument
%[flags][width]conversion
  • The optional flags is a set of characters that modify the output format where the set of valid flags depends on the conversion.
  • The optional width is a non-negative decimal integer indicating the minimum number of characters to be written to the output.
  • The required conversion is a character indicating content to be inserted in the output.

Format specifiers used to represent date and time types have the following syntax:


// Format specifier syntax with argument list for date and time types
%[argument_index$][flags][width]conversion
  • The optional argument_index is a decimal integer indicating the position of the argument in the argument list. The first argument is referenced by "1$", the second by "2$" and so on.
  • The optional flags and width are defined as above.
  • With dates the required conversion is a two character sequence where the first character is 't' or 'T' and the second character indicates the format to be used.

Format specifiers for general, character, and numeric types have the following syntax:


// Format specifier syntax with argument list for general, character, and numeric types
%[argument_index$][flags][width][.precision]conversion
  • The optional argument_index, flags and width are defined as above.
  • The optional precision is a non-negative decimal integer generally used to restrict the number of characters but specific behavior depends on the conversion.
  • The required conversion is a character indicating how the argument should be formatted, where the set of valid conversions for a given argument depend on the argument's data type.

The table below lists some conversions with their descriptions. You can find the complete list of flags and conversions in the API documentation for the java.util.Formatter class.

Conversion Symbol | Description
General, character and numeric conversions
b | Formats boolean true or false
c | Formats as a Unicode character
d | Formats as a decimal integer
f | Formats the argument as a floating point decimal.
o | Formats as an octal integer
s | Formats the argument as a string.
x | Formats as a hexadecimal integer
Date and time conversion suffixes (used after 't' or 'T', for example %tA)
A | Locale-specific full name of day of the week, "Monday", "Tuesday"....
B | Locale-specific full month name, "January", "February"....
Y | Year formatted as at least four digits, with leading zeros for years less than 1000
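The conversions in the table can be sketched together as below, using String.format() with an explicit locale so the output is repeatable. The class name, helper methods and sample values are invented for illustration; note the use of flags, width, precision and the 1$ argument index.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

class ConversionDemo {
    // General, character and numeric conversions with width, flags and precision.
    static String general() {
        return String.format(Locale.ROOT, "%b|%c|%5d|%-6s|%.2f|%o|%x",
                true, 'J', 42, "io", 3.14159, 8, 255);
    }

    // Date conversions: 't' plus a suffix, all reading the first argument via 1$.
    static String date() {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.US);
        cal.clear();
        cal.set(2020, Calendar.JANUARY, 31);
        return String.format(Locale.US, "%1$tA, %1$tB %1$tY", cal);
    }

    public static void main(String[] args) {
        System.out.println(general()); // true|J|   42|io    |3.14|10|ff
        System.out.println(date());    // Friday, January 2020
        System.out.println(String.format("100%%")); // 100% (no argument needed)
    }
}
```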

The java.util.Formatter Class

The java.util.Formatter class allows us to format output to a variety of destinations through a wide range of constructors. The API documentation is extremely detailed, so a single example is enough to get the idea.
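As one possible illustration, the sketch below uses the Formatter constructor that takes an Appendable destination and a locale, and writes a fixed-width row into a StringBuilder. The class name, the row() helper and the column widths are assumptions.

```java
import java.util.Formatter;
import java.util.Locale;

class FormatterDemo {
    // Builds a fixed-width row: left-justified name, right-justified quantity and price.
    static String row(String name, int quantity, double price) {
        StringBuilder sb = new StringBuilder();
        try (Formatter formatter = new Formatter(sb, Locale.ROOT)) {
            formatter.format("%-8s|%3d|%8.2f", name, quantity, price);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(row("widget", 7, 12.5)); // widget  |  7|   12.50
    }
}
```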

See java.util.Formatter for code examples and usage.

The String.format() Method

The String.format() static method allows us to format an output string and is overloaded to accept either a format string and argument list, or a locale, format string and argument list. The examples for this method use the second overload, which accepts a locale, format string and argument list.
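A minimal sketch of the locale-accepting overload follows; the same format string and value produce different grouping and decimal separators depending on the locale passed in. The class name and amount() helper are hypothetical.

```java
import java.util.Locale;

class StringFormatDemo {
    // The ',' flag adds locale-specific grouping; .2 limits output to two decimal places.
    static String amount(Locale locale, double value) {
        return String.format(locale, "%,.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(amount(Locale.US, 1234567.891));      // 1,234,567.89
        System.out.println(amount(Locale.GERMANY, 1234567.891)); // 1.234.567,89
    }
}
```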

See String.format() for code examples and usage.

The printf() Method

The printf() method allows us to format output to a java.io.PrintStream or java.io.PrintWriter stream. These classes also contain a method called format() which produces the same results, so whatever you read here for the printf() method can also be applied to the format() method. The examples use the printf() method from the PrintStream class. If you remember from the Java I/O Overview lesson, System.out is of type PrintStream and so is used for convenience.
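The sketch below calls printf() and format() on System.out, and also on a PrintWriter wrapped around a StringWriter so the result can be inspected. The class name, the render() helper and the sample values are assumptions.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Locale;

class PrintfDemo {
    // Formats into an in-memory PrintWriter; %n emits the platform line separator.
    static String render(String name, int score) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        out.printf(Locale.ROOT, "%-5s scored %03d%n", name, score);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.printf("%s has %d items%n", "cart", 3); // cart has 3 items
        System.out.format("%s has %d items%n", "cart", 3); // same result as printf()
        System.out.print(render("Ann", 7));                // Ann   scored 007
    }
}
```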

See printf() for code examples and usage.

Tokenizing

We finish off our tour of the Java API by looking at tokenizing our data. For this we will first look at the split() method of the String class, which uses a regular expression delimiter to tokenize our data. After this we look at the java.util.Scanner class; objects of this class allow us to break input into tokens using a delimiter pattern which defaults to whitespace or can be set using a regular expression.

The split() Method

The split() method will split a string around matches of the given regular expression, returning the results in a String array. The split() method is overloaded and will accept a regex string and a limit argument of type int denoting the number of times the pattern is to be applied. The second form just requires a regex string and in this form it is the same as invoking the split() method with the limit set to zero. An explanation of how values passed to the limit parameter affect the number of times the pattern is to be applied follows:

  • limit < 0
    Pattern will be applied as many times as possible, output array can have any length.
  • limit = 0
    Pattern will be applied as many times as possible, output array can have any length and trailing empty strings are discarded.
  • limit > 0
    Pattern will be applied at most limit - 1 times, output array length will be no greater than limit, and the last entry will contain all input beyond the last matched delimiter.
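The effect of the three kinds of limit can be sketched with a string that contains empty tokens. The class name, the tokens() helper and the sample data are made up for the example.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on commas using the given limit.
    static String[] tokens(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String data = "a,b,,c,,";
        // One-argument form: same as limit 0, so trailing empty strings are discarded.
        System.out.println(Arrays.toString(data.split(",")));     // [a, b, , c]
        System.out.println(Arrays.toString(tokens(data, 0)));     // [a, b, , c]
        // Negative limit: trailing empty strings are kept.
        System.out.println(Arrays.toString(tokens(data, -1)));    // [a, b, , c, , ]
        // Positive limit: pattern applied once, the remainder stays in the last entry.
        System.out.println(Arrays.toString(tokens(data, 2)));     // [a, b,,c,,]
    }
}
```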

See the split() method for code examples.

The java.util.Scanner Class

The java.util.Scanner class is a simple text scanner which allows us to parse primitive data types and strings using regular expressions. Objects of this class allow us to break input into tokens using a delimiter pattern. The resulting tokens can then be converted into values of different types using one of the nextXxx() methods available in the java.util.Scanner class, such as nextInt() or nextDouble(). The Scanner class can be used with the default delimiter of whitespace and also with a delimiter created using a regular expression.
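Both delimiter styles can be sketched as follows. The class name and the sumInts() helper are hypothetical; passing null for the delimiter keeps the default of whitespace.

```java
import java.util.Scanner;

class ScannerDemo {
    // Adds up every token that parses as an int and skips the rest.
    static int sumInts(String input, String delimiterRegex) {
        int total = 0;
        try (Scanner scanner = new Scanner(input)) {
            if (delimiterRegex != null) {
                scanner.useDelimiter(delimiterRegex); // replace the whitespace default
            }
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    total += scanner.nextInt();
                } else {
                    scanner.next(); // consume a token that is not an int
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumInts("10 20 thirty 40", null));  // 70
        System.out.println(sumInts("1, 2 ,3", "\\s*,\\s*"));   // 6
    }
}
```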

See the java.util.Scanner class for code examples.

Related Java Tutorials

Fundamentals - Primitive Variables
API Contents - The String class
API Contents - The StringBuilder class
API Contents - The java.io.File Class
API Contents - The java.io.PrintWriter Class
API Contents - The java.io.BufferedReader Class
API Contents - The java.io.FileReader Class
API Contents - The java.io.BufferedWriter Class
API Contents - The java.io.FileWriter Class
API Contents - Dates, Numbers & Currencies
API Contents - Regular Expressions
API Contents - Formatting & Tokenizing