Exploring Java

Chapter 8 - Input/Output Facilities

Contents:
Streams
Files
Serialization
Data compression

In this chapter, we'll continue our exploration of the Java API by looking at many of the classes in the java.io package. These classes support a number of forms of input and output; I expect you'll use them often in your Java applications. Figure 8.1 shows the class hierarchy of the java.io package.

We'll start by looking at the stream classes in java.io; these classes are all subclasses of the basic InputStream, OutputStream, Reader, and Writer classes. Then we'll examine the File class and discuss how you can interact with the filesystem using classes in java.io. Finally, we'll take a quick look at the data compression classes provided in java.util.zip.

8.1 Streams

All fundamental I/O in Java is based on streams. A stream represents a flow of data, or a channel of communication with (at least conceptually) a writer at one end and a reader at the other. When you are working with terminal input and output, reading or writing files, or communicating through sockets in Java, you are using a stream of one type or another. So you can see the forest without being distracted by the trees, I'll start by summarizing the different types of streams.

Figure 8.1: The java.io package

[Graphic: Figure 8-1]

InputStream/OutputStream

Abstract classes that define the basic functionality for reading or writing an unstructured sequence of bytes. All other byte streams in Java are built on top of the basic InputStream and OutputStream.

Reader/Writer

Abstract classes that define the basic functionality for reading or writing an unstructured sequence of characters. All other character streams in Java are built on top of Reader and Writer.

InputStreamReader/OutputStreamWriter

"Bridge" classes that convert bytes to characters and vice versa.

DataInputStream/DataOutputStream

Specialized stream filters that add the ability to read and write simple data types like numeric primitives and String objects.

BufferedInputStream/BufferedOutputStream /BufferedReader/BufferedWriter

Specialized streams that incorporate buffering for additional efficiency.

PrintWriter

A specialized character stream that makes it simple to print text.

PipedInputStream/PipedOutputStream /PipedReader/PipedWriter

"Double-ended" streams that always occur in pairs. Data written into a PipedOutputStream or PipedWriter is read from its corresponding PipedInputStream or PipedReader.

FileInputStream/FileOutputStream /FileReader/FileWriter

Implementations of InputStream, OutputStream, Reader, and Writer that read from and write to files on the local filesystem.

Streams in Java are one-way streets. The java.io input and output classes represent the ends of a simple stream, as shown in Figure 8.2. For bidirectional conversations, we use one of each type of stream.

Figure 8.2: Basic input and output stream functionality

[Graphic: Figure 8-2]

InputStream and OutputStream are abstract classes that define the lowest-level interface for all byte streams. They contain methods for reading or writing an unstructured flow of byte-level data. Because these classes are abstract, you can never create a "pure" input or output stream. Java implements subclasses of these for activities like reading and writing files, and communicating with sockets. Because all byte streams inherit the structure of InputStream or OutputStream, the various kinds of byte streams can be used interchangeably. For example, a method often takes an InputStream as an argument. This means the method accepts any subclass of InputStream. Specialized types of streams can also be layered to provide higher-level functionality, such as buffering or handling larger data types.

In Java 1.1, new classes based around Reader and Writer were added to the java.io package. Reader and Writer are very much like InputStream and OutputStream, except that they deal with characters instead of bytes. As true character streams, these classes correctly handle Unicode characters, which was not always the case with the byte streams. However, some sort of bridge is needed between these character streams and the byte streams of physical devices like disks and networks. InputStreamReader and OutputStreamWriter are special classes that use an encoding scheme to translate between character and byte streams.

We'll discuss all of the interesting stream types in this section, with the exception of FileInputStream, FileOutputStream, FileReader, and FileWriter. We'll postpone the discussion of file streams until the next section, where we'll cover issues involved with accessing the filesystem in Java.

Terminal I/O

The prototypical example of an InputStream object is the standard input of a Java application. Like stdin in C or cin in C++, this object reads data from the program's environment, which is usually a terminal window or a command pipe. The java.lang.System class, a general repository for system-related resources, provides a reference to standard input in the static variable in. System also provides objects for standard output and standard error in the out and err variables, respectively. The following example shows the correspondence:

InputStream stdin = System.in; 
OutputStream stdout = System.out; 
OutputStream stderr = System.err; 

This example hides the fact that System.out and System.err aren't really OutputStream objects, but more specialized and useful PrintStream objects. I'll explain these later, but for now we can reference out and err as OutputStream objects, since they are a kind of OutputStream by inheritance.

We can read a single byte at a time from standard input with the InputStream's read() method. If you look closely at the API, you'll see that the read() method of the base InputStream class is actually an abstract method. What lies behind System.in is an implementation of InputStream, so it's valid to call read() for this stream:

try { 
    int val = System.in.read(); 
    ... 
} 
catch ( IOException e ) { 
} 

As is the convention in C, read() provides a byte of information, but its return type is int. A return value of -1 indicates a normal end of stream has been reached; you'll need to test for this condition when using the simple read() method. If an error occurs during the read, an IOException is thrown. All basic input and output stream commands can throw an IOException, so you should arrange to catch and handle them as appropriate.

To retrieve the value as a byte, perform the cast:

byte b = (byte) val; 

Of course, you'll need to check for the end-of-stream condition before you perform the cast. An overloaded form of read() fills a byte array with as much data as possible up to the limit of the array size and returns the number of bytes read:

byte [] bity = new byte [1024]; 
int got = System.in.read( bity ); 

We can also check the number of bytes available for reading on an InputStream with the available() method. Once we have that information, we can create an array of exactly the right size:

int waiting = System.in.available(); 
if ( waiting > 0 ) { 
    byte [] data = new byte [ waiting ]; 
    System.in.read( data ); 
    ... 
} 

InputStream provides the skip() method as a way of jumping over a number of bytes. Depending on the implementation of the stream and if you aren't interested in the intermediate data, skipping bytes may be more efficient than reading them. The close() method shuts down the stream and frees up any associated system resources. It's a good idea to close a stream when you are done using it.

Character Streams

The InputStream and OutputStream subclasses of Java 1.0.2 included methods for reading and writing strings, but most of them operated by assuming that a sixteen-bit Unicode character was equivalent to an eight-bit byte in the stream. This only works for Latin-1 (ISO8859-1) characters, so the character stream classes Reader and Writer were introduced in Java 1.1. Two special classes, InputStreamReader and OutputStreamWriter, bridge the gap between the world of character streams and the world of byte streams. These are character streams that are wrapped around an underlying byte stream. An encoding scheme is used to convert between bytes and characters. An encoding scheme name can be specified in the constructor of InputStreamReader or OutputStreamWriter. Another constructor simply accepts the underlying stream and uses the system's default encoding scheme. For example, let's parse a human-readable string from the standard input into an integer. We'll assume that the bytes coming from System.in use the system's default encoding scheme.

try {
    InputStreamReader converter = new InputStreamReader(System.in);
    BufferedReader in = new BufferedReader(converter);
    
    String text = in.readLine();
    int i = NumberFormat.getInstance().parse(text).intValue();
} 
catch ( IOException e ) { }
catch ( ParseException pe ) { } 

First, we wrap an InputStreamReader around System.in. This object converts the incoming bytes of System.in to characters using the default encoding scheme. Then, we wrap a BufferedReader around the InputStreamReader. BufferedReader gives us the readLine() method, which we can use to retrieve a full line of text into a String. The string is then parsed into an integer using the techniques described in Chapter 7.

We could have programmed the previous example using only byte streams, and it would have worked for users in the United States, at least. So why go to the extra trouble of using character streams? Character streams were introduced in Java 1.1 to correctly support Unicode strings. Unicode was designed to support almost all of the written languages of the world. If you want to write a program that works in any part of the world, in any language, you definitely want to use streams that don't mangle Unicode.

So how do you decide when you need a byte stream and when you need a character stream? If you want to read or write character strings, use some variety of Reader or Writer. Otherwise a byte stream should suffice. Let's say, for example, that you want to read strings from a file that was written out by a Java 1.0.2 application. In this case you could simply create a FileReader, which will convert the bytes in the file to characters using the system's default encoding scheme. If you have a file in a specific encoding scheme, you can create an InputStreamReader with that encoding scheme and read characters from it. Another example comes from the Internet. Web servers serve files as byte streams. If you want to read Unicode strings from a file with a particular encoding scheme, you'll need an appropriate InputStreamReader wrapped around the socket's InputStream.

Stream Wrappers

What if we want to do more than read and write a mess of bytes or characters? Many of the InputStream, OutputStream, Reader, and Writer classes wrap other streams and add new features. A filtered stream takes another stream in its constructor; it delegates calls to the underlying stream while doing some additional processing of its own.

In Java 1.0.2, all wrapper streams were subclasses of FilterInputStream and FilterOutputStream. The character stream classes introduced in Java 1.1 break this pattern, but they operate in the same way. For example, BufferedInputStream extends FilterInputStream in the byte world, but BufferedReader extends Reader in the character world. It doesn't really matter--both classes accept a stream in their constructor and perform buffering. Like the byte stream classes, the character stream classes include the abstract FilterReader and FilterWriter classes, which simply pass all method calls to an underlying stream.

The FilterInputStream, FilterOutputStream, FilterReader, and FilterWriter classes themselves aren't useful; they must be subclassed and specialized to create a new type of filtering operation. For example, specialized wrapper streams like DataInputStream and DataOutputStream provide additional methods for reading and writing primitive data types.

As we said, when you create an instance of a filtered stream, you specify another stream in the constructor. The specialized stream wraps an additional layer of functionality around the other stream, as shown in Figure 8.3. Because filtered streams themselves are subclasses of the fundamental stream types, filtered streams can be layered on top of each other to provide different combinations of features. For example, you could wrap a PushbackReader around a LineNumberReader that was wrapped around a FileReader.

Figure 8.3: Layered streams

[Graphic: Figure 8-3]

Data streams

DataInputStream and DataOutputStream are filtered streams that let you read or write strings and primitive data types that comprise more than a single byte. DataInputStream and DataOutputStream implement the DataInput and DataOutput interfaces, respectively. These interfaces define the methods required for streams that read and write strings and Java primitive types in a machine-independent manner.

You can construct a DataInputStream from an InputStream and then use a method like readDouble() to read a primitive data type:

DataInputStream dis = new DataInputStream( System.in ); 
double d = dis.readDouble(); 

The above example wraps the standard input stream in a DataInputStream and uses it to read a double value. readDouble() reads bytes from the stream and constructs a double from them. All DataInputStream methods that read primitive types also read binary information.

The DataOutputStream class provides write methods that correspond to the read methods in DataInputStream. For example, writeInt() writes an integer in binary format to the underlying output stream.

The readUTF() and writeUTF() methods of DataInputStream and DataOutputStream read and write a Java String of Unicode characters using the UTF-8 "transformation format." UTF-8 is an ASCII-compatible encoding of Unicode characters commonly used for the transmission and storage of Unicode text.[1]

[1] Check out the URL http://www.stonehand.com/unicode/standard/utf8.html for more information on UTF-8.

We can use a DataInputStream with any kind of input stream, whether it be from a file, a socket, or standard input. The same applies to using a DataOutputStream, or, for that matter, any other specialized streams in java.io.

Buffered streams

The BufferedInputStream, BufferedOutputStream, BufferedReader, and BufferedWriter classes add a data buffer of a specified size to the stream path. A buffer can increase efficiency by reducing the number of physical read or write operations that correspond to read() or write() method calls. You create a buffered stream with an appropriate input or output stream and a buffer size. Furthermore, you can wrap another stream around a buffered stream so that it benefits from the buffering. Here's a simple buffered input stream:

BufferedInputStream bis = new BufferedInputStream(myInputStream, 4096); 
...
bis.read(); 

In this example, we specify a buffer size of 4096 bytes. If we leave off the size of the buffer in the constructor, a reasonably sized one is chosen for us. On our first call to read(), bis tries to fill the entire 4096-byte buffer with data. Thereafter, calls to read() retrieve data from the buffer until it's empty.

A BufferedOutputStream works in a similar way. Calls to write() store the data in a buffer; data is actually written only when the buffer fills up. You can also use the flush() method to wring out the contents of a BufferedOutputStream before the buffer is full.

Some input streams like BufferedInputStream support the ability to mark a location in the data and later reset the stream to that position. The mark() method sets the return point in the stream. It takes an integer value that specifies the number of bytes that can be read before the stream gives up and forgets about the mark. The reset() method returns the stream to the marked point; any data read after the call to mark() is read again.

This functionality is especially useful when you are reading the stream in a parser. You may occasionally fail to parse a structure and so must try something else. In this situation, you can have your parser generate an error (a homemade ParseException) and then reset the stream to the point before it began parsing the structure:

BufferedInputStream input; 
... 
try { 
    input.mark( MAX_DATA_STRUCTURE_SIZE ); 
    return( parseDataStructure( input ) ); 
} 
catch ( ParseException e ) { 
    input.reset(); 
    ... 
} 

The BufferedReader and BufferedWriter classes work just like their byte-based counterparts, but operate on characters instead of bytes.

Print streams

Another useful wrapper stream is java.io.PrintWriter. This class provides a suite of overloaded print() methods that turn their arguments into strings and push them out the stream. A complementary set of println() methods adds a newline to the end of the strings. PrintWriter is the more capable big brother of the PrintStream byte stream. PrintWriter is an unusual character stream because it can wrap either an OutputStream or another Writer. The System.out and System.err streams are PrintStream objects; you have already seen such streams strewn throughout this book:

System.out.print("Hello world...\n"); 
System.out.println("Hello world..."); 
System.out.println( "The answer is: " + 17 ); 
System.out.println( 3.14 ); 

In Java 1.1, the PrintStream class has been enhanced to translate characters to bytes using the system's default encoding scheme. Although PrintStream is not deprecated in Java 1.1, its constructors are. For all new development, use a PrintWriter instead of a PrintStream. Because a PrintWriter can wrap an OutputStream, the two classes are interchangeable.

When you create a PrintWriter object, you can pass an additional boolean value to the constructor. If this value is true, the PrintWriter automatically performs a flush() on the underlying OutputStream or Writer each time it sends a newline:

boolean autoFlush = true; 
PrintWriter p = new PrintWriter( myOutputStream, autoFlush ); 

When this technique is used with a buffered output stream, it corresponds to the behavior of terminals that send data line by line.

Unlike methods in other stream classes, the methods of PrintWriter and PrintStream do not throw IOExceptions. Instead, if we are interested, we can check for errors with the checkError() method:

System.out.println( reallyLongString ); 
if ( System.out.checkError() )                // Uh oh 

Pipes

Normally, our applications are directly involved with one side of a given stream at a time. PipedInputStream and PipedOutputStream (or PipedReader and PipedWriter), however, let us create two sides of a stream and connect them together, as shown in Figure 8.4. This provides a stream of communication between threads, for example.

To create a pipe, we use both a PipedInputStream and a PipedOutputStream. We can simply choose a side and then construct the other side using the first as an argument:

Figure 8.4: Piped streams

[Graphic: Figure 8-4]

PipedInputStream pin = new PipedInputStream(); 
PipedOutputStream pout = new PipedOutputStream( pin ); 

Alternatively :

PipedOutputStream pout = new PipedOutputStream( ); 
PipedInputStream pin = new PipedInputStream( pout ); 

In each of these examples, the effect is to produce an input stream, pin, and an output stream, pout, that are connected. Data written to pout can then be read by pin. It is also possible to create the PipedInputStream and the PipedOutputStream separately, and then connect them with the connect() method.

We can do exactly the same thing in the character-based world, using PipedReader and PipedWriter in place of PipedInputStream and PipedOutputStream.

Once the two ends of the pipe are connected, use the two streams as you would other input and output streams. You can use read() to read data from the PipedInputStream (or PipedReader) and write() to write data to the PipedOutputStream (or PipedWriter). If the internal buffer of the pipe fills up, the writer blocks and waits until more space is available. Conversely, if the pipe is empty, the reader blocks and waits until some data is available. Internally, the blocking is implemented with wait() and notifyAll(), as described in Chapter 6, Threads.

One advantage to using piped streams is that they provide stream functionality in our code, without compelling us to build new, specialized streams. For example, we can use pipes to create a simple logging facility for our application. We can send messages to the logging facility through an ordinary PrintWriter, and then it can do whatever processing or buffering is required before sending the messages off to their ultimate location. Because we are dealing with string messages, we use the character-based PipedReader and PipedWriter classes. The following example shows the skeleton of our logging facility:

import java.io.*; 
 
class LoggerDaemon extends Thread { 
    PipedReader in = new PipedReader();  
 
    LoggerDaemon() { 
        setDaemon( true ); 
        start(); 
    } 
 
    public void run() { 
        BufferedReader din = new BufferedReader( in ); 
        String s; 
  
        try { 
           while ( (s = din.readLine()) != null ) { 
                // process line of data 
                // ... 
            } 
        }  
        catch (IOException e ) { } 
    } 
 
    PrintWriter getWriter() throws IOException { 
        return new PrintWriter( new PipedWriter( in ) ); 
    } 
} 
 
class myApplication { 
    public static void main ( String [] args ) throws IOException { 
        PrintWriter out = new LoggerDaemon().getWriter(); 
 
        out.println("Application starting..."); 
        // ... 
        out.println("Warning: does not compute!"); 
        // ... 
    } 
} 

LoggerDaemon is a daemon thread, so it will die when our application exits. LoggerDaemon reads strings from its end of the pipe, the PipedReader in. LoggerDaemon also provides a method, getWriter(), that returns a PipedWriter that is connected to its input stream. Simply create a new LoggerDaemon and fetch the output stream to begin sending messages.

In order to read strings with the readLine() method, LoggerDaemon wraps a BufferedReader around its PipedReader. For convenience, it also presents its PipedWriter as a PrintWriter, rather than a simple Writer.

One advantage of implementing LoggerDaemon with pipes is that we can log messages as easily as we write text to a terminal or any other stream. In other words, we can use all our normal tools and techniques. Another advantage is that the processing happens in another thread, so we can go about our business while the processing takes place.

There is nothing stopping us from connecting more than two piped streams. For example, we could chain multiple pipes together to perform a series of filtering operations.

Strings to Streams and Back

The StringReader class is another useful stream class. The stream is created from a String; StringReader essentially wraps stream functionality around a String. Here's how to use a StringReader:

String data = "There once was a man from Nantucket..."; 
StringReader sr = new StringReader( data ); 
 
char T = (char)sr.read(); 
char h = (char)sr.read(); 
char e = (char)sr.read(); 

Note that you will still have to catch IOExceptions thrown by some of the StringReader's methods.

The StringReader class is useful when you want to read data in a String as if it were coming from a stream, such as a file, pipe, or socket. For example, suppose you create a parser that expects to read tokens from a stream. But you want to provide a method that also parses a big string. You can easily add one using StringReader.

Turning things around, the StringWriter class lets us write to a character string through an output stream. The internal string grows as necessary to accommodate the data. In the following example, we create a StringWriter and wrap it in a PrintWriter for convenience:

StringWriter buffer = new StringWriter(); 
PrintWriter out = new PrintWriter( buffer ); 
 
out.println("A moose once bit my sister."); 
out.println("No, really!"); 
 
String results = buffer.toString(); 

First we print a few lines to the output stream, to give it some data, then retrieve the results as a string with the toString() method. Alternately, we could get the results as a StringBuffer with the getBuffer() method.

The StringWriter class is useful if you want to capture the output of something that normally sends output to a stream, such as a file or the console. A PrintWriter wrapped around a StringWriter competes with StringBuffer as the easiest way to construct large strings piece by piece. While using a StringBuffer is more efficient, PrintWriter provides more functionality than the normal append() method used by StringBuffer.

rot13InputStream

Before we leave streams, let's try our hand at making one of our own. I mentioned earlier that specialized stream wrappers are built on top of the FilterInputStream and FilterOutputStream classes. It's quite easy to create our own subclass of FilterInputStream that can be wrapped around other streams to add new functionality.

The following example, rot13InputStream, performs a rot13 operation on the bytes that it reads. rot13 is a trivial algorithm that shifts alphanumeric letters to make them not quite human-readable; it's cute because it's symmetric. That is, to "un-rot13" some text, simply rot13 it again. We'll use the rot13InputStream class again in the crypt protocol handler example in Chapter 9, Network Programming, so we've put the class in the example.io package to facilitate reuse. Here's our rot13InputStream class:

package example.io; 
import java.io.*; 
 
public class rot13InputStream extends FilterInputStream { 
 
    public rot13InputStream ( InputStream i ) { 
        super( i ); 
    } 
 
    public int read() throws IOException { 
        return rot13( in.read() ); 
    } 
 
    private int rot13 ( int c ) { 
        if ( (c >= 'A') && (c <= 'Z') )            c=(((c-'A')+13)%26)+'A';        if ( (c >= 'a') && (c <= 'z') ) 
            c=(((c-'a')+13)%26)+'a'; 
        return c; 
    } } 

The FilterInputStream needs to be initialized with an InputStream; this is the stream to be filtered. We provide an appropriate constructor for the rot13InputStream class and invoke the parent constructor with a call to super(). FilterInputStream contains a protected instance variable, in, where it stores the stream reference and makes it available to the rest of our class.

The primary feature of a FilterInputStream is that it overrides the normal InputStream methods to delegate calls to the InputStream in the variable in. So, for instance, a call to read() simply turns around and calls read() on in to fetch a byte. An instance of FilterInputStream itself could be instantiated from an InputStream; it would pass its method calls on to that stream and serve as a pass-through filter. To make things interesting, we can override methods of the FilterInputStream class and do extra work on the data as it passes through.

In our example, we have overridden the read() method to fetch bytes from the underlying InputStream, in, and then perform the rot13 shift on the data before returning it. Note that the rot13() method shifts alphabetic characters, while simply passing all other values, including the end of stream value (-1). Our subclass now acts like a rot13 filter. All other normal functionality of an InputStream, like skip() and available() is unmodified, so calls to these methods are answered by the underlying InputStream.

Strictly speaking, rot13InputStream only works on an ASCII byte stream, since the underlying algorithm is based on the Roman alphabet. A more generalized character scrambling algorithm would have to be based on FilterReader to handle Unicode correctly.

8.2 Files

Unless otherwise restricted, a Java application can read and write to the host filesystem with the same level of access as the user who runs the Java interpreter. Java applets and other kinds of networked applications can, of course, be restricted by the SecurityManager and cut off from these services. We'll discuss applet access at the end of this section. First, let's take a look at the tools for basic file access.

Working with files in Java is still somewhat problematic. The host filesystem lies outside of Java's virtual environment, in the real world, and can therefore still suffer from architecture and implementation differences. Java tries to mask some of these differences by providing information to help an application tailor itself to the local environment; I'll mention these areas as they occur.

java.io.File

The java.io.File class encapsulates access to information about a file or directory entry in the filesystem. It gets attribute information about a file, lists the entries in a directory, and performs basic filesystem operations like removing a file or making a directory. While the File object handles these tasks, it doesn't provide direct access for reading and writing file data; there are specialized streams for that purpose.

File constructors

You can create an instance of File from a String pathname as follows:

File fooFile = new File( "/tmp/foo.txt" ); 
File barDir = new File( "/tmp/bar" ); 

You can also create a file with a relative path like:

File f = new File( "foo" ); 

In this case, Java works relative to the current directory of the Java interpreter. You can determine the current directory by checking the user.dir property in the System Properties list (System.getProperty('user.dir')).

An overloaded version of the File constructor lets you specify the directory path and filename as separate String objects:

File fooFile = new File( "/tmp", "foo.txt" ); 

With yet another variation, you can specify the directory with a File object and the filename with a String:

File tmpDir = new File( "/tmp" ); 
File fooFile = new File ( tmpDir, "foo.txt" ); 

None of the File constructors throw any exceptions. This means the object is created whether or not the file or directory actually exists; it isn't an error to create a File object for an nonexistent file. You can use the exists() method to find out whether the file or directory exists.

Path localization

One of the reasons that working with files in Java is problematic is that pathnames are expected to follow the conventions of the local filesystem. Java's designers intend to provide an abstraction that deals with most system-dependent filename features, such as the file separator, path separator, device specifier, and root directory. Unfortunately, not all of these features are implemented in the current version.

On some systems, Java can compensate for differences such as the direction of the file separator slashes in the above string. For example, in the current implementation on Windows platforms, Java accepts paths with either forward slashes or backslashes. However, under Solaris, Java accepts only paths with forward slashes.

Your best bet is to make sure you follow the filename conventions of the host filesystem. If your application is just opening and saving files at the user's request, you should be able to handle that functionality with the java.awt.FileDialog class. This class encapsulates a graphical file-selection dialog box. The methods of the FileDialog take care of system-dependent filename features for you.

If your application needs to deal with files on its own behalf, however, things get a little more complicated. The File class contains a few static variables to make this task easier. File.separator defines a String that specifies the file separator on the local host (e.g., "/" on UNIX and Macintosh systems and "\" on Windows systems), while File.separatorChar provides the same information in character form. File.pathSeparator defines a String that separates items in a path (e.g., ":" on UNIX systems; ";" on Macintosh and Windows systems); File.pathSeparatorChar provides the information in character form.

You can use this system-dependent information in several ways. Probably the simplest way to localize pathnames is to pick a convention you use internally, say "/", and do a String replace to substitute for the localized separator character:

// We'll use forward slash as our standard 
String path = "mail/1995/june/merle"; 
path = path.replace('/', File.separatorChar); 
File mailbox = new File( path ); 

Alternately, you could work with the components of a pathname and built the local pathname when you need it:

String [] path = { "mail", "1995", "june", "merle" }; 
  
StringBuffer sb = new StringBuffer(path[0]); 
for (int i=1; i< path.length; i++) 
    sb.append( File.separator + path[i] ); 
  File mailbox = new File( sb.toString() ); 

One thing to remember is that Java interprets the backslash character (\) as an escape character when used in a String. To get a backslash in a String, you have to use " \\".

File methods

Once we have a valid File object, we can use it to ask for information about the file itself and to perform standard operations on it. A number of methods lets us ask certain questions about the File. For example, isFile() returns true if the File represents a file, while isDirectory() returns true if it's a directory. isAbsolute() indicates whether the File has an absolute or relative path specification.

The components of the File pathname are available through the following methods: getName(), getPath(), getAbsolutePath(), and getParent(). getName() returns a String for the filename without any directory information; getPath() returns the directory information without the filename. If the File has an absolute path specification, getAbsolutePath() returns that path. Otherwise it returns the relative path appended to the current working directory. getParent() returns the parent directory of the File.

Interestingly, the string returned by getPath() or getAbsolutePath() may not be the same, case-wise, as the actual name of the file. You can retrieve the case-correct version of the file's path using getCanonicalPath(). In Windows 95, for example, you can create a File object whose getAbsolutePath() is C:\Autoexec.bat but whose getCanonicalPath() is C:\AUTOEXEC.BAT.

We can get the modification time of a file or directory with lastModified(). This time value is not useful as an absolute time; you should use it only to compare two modification times. We can also get the size of the file in bytes with length(). Here's a fragment of code that prints some information about a file:

File fooFile = new File( "/tmp/boofa" ); 
 
String type = fooFile.isFile() ? "File " : "Directory "; 
String name = fooFile.getName(); 
long len = fooFile.length(); 
System.out.println(type + name + ", " + len + " bytes " ); 

If the File object corresponds to a directory, we can list the files in the directory with the list() method:

String [] files = fooFile.list(); 

list() returns an array of String objects that contains filenames. (You might expect that list() would return an Enumeration instead of an array, but it doesn't.)

If the File refers to a nonexistent directory, we can create the directory with mkdir() or mkdirs(). mkdir() creates a single directory; mkdirs() creates all of the directories in a File specification. Use renameTo() to rename a file or directory and delete() to delete a file or directory. Note that File doesn't provide a method to create a file; creation is handled with a FileOutputStream as we'll discuss in a moment.

Table 8.1 summarizes the methods provided by the File class.

Table 8.1: File Methods

Method

Return type

Description

canRead()

boolean

Is the file (or directory) readable?

canWrite()

boolean

Is the file (or directory) writable?

delete()

boolean

Deletes the file (or directory)

exists()

boolean

Does the file (or directory) exist?

getAbsolutePath()

String

Returns the absolute path of the file (or directory)

getCanonicalPath()

String

Returns the absolute, case-correct path of the file (or directory)

getName()

String

Returns the name of the file (or directory)

getParent()

String

Returns the name of the parent directory of the file (or directory)

getPath()

String

Returns the path of the file (or directory)

isAbsolute()

boolean

Is the filename (or directory name) absolute?

isDirectory()

boolean

Is the item a directory?

isFile()

boolean

Is the item a file?

lastModified()

long

Returns the last modification time of the file (or directory)

length()

long

Returns the length of the file

list()

String []

Returns a list of files in the directory

mkdir()

boolean

Creates the directory

mkdirs()

boolean

Creates all directories in the path

renameTo(File dest)

boolean

Renames the file (or directory)

File Streams

Java provides two specialized streams for reading and writing files in the filesystem: FileInputStream and FileOutputStream. These streams provide the basic InputStream and OutputStream functionality applied to reading and writing the contents of files. They can be combined with the filtered streams described earlier to work with files in the same way we do other stream communications.

Because FileInputStream is a subclass of InputStream, it inherits all standard InputStream functionality for reading the contents of a file. FileInputStream provides only a low-level interface to reading data, however, so you'll typically wrap another stream like a DataInputStream around the FileInputStream.

You can create a FileInputStream from a String pathname or a File object:

FileInputStream foois = new FileInputStream( fooFile ); 
FileInputStream passwdis = new FileInputStream( "/etc/passwd" ); 

When you create a FileInputStream, Java attempts to open the specified file. Thus, the FileInputStream constructors can throw a FileNotFoundException if the specified file doesn't exist, or an IOException if some other I/O error occurs. You should be sure to catch and handle these exceptions in your code. When the stream is first created, its available() method and the File object's length() method should return the same value. Be sure to call the close() method when you are done with the file.

To read characters from a file, you can wrap an InputStreamReader around a FileInputStream. If you want to use the default character encoding scheme, you can use the FileReader class instead, which is provided as a convenience. FileReader works just like FileInputStream, except that it reads characters instead of bytes and wraps a Reader instead of an InputStream.

The following class, ListIt, is a small utility that displays the contents of a file or directory to standard output:

import java.io.*; 
 
class ListIt { 
    public static void main ( String args[] ) throws Exception { 
        File file =  new File( args[0] ); 
 
        if ( !file.exists() || !file.canRead() ) { 
            System.out.println( "Can't read " + file ); 
            return; 
        } 
 
        if ( file.isDirectory() ) { 
            String [] files = file.list(); 
            for (int i=0; i< files.length; i++) 
                System.out.println( files[i] ); 
        } 
        else 
            try { 
                FileReader fr = new FileReader ( file );
                BufferedReader in = new BufferedReader( fr );
                String line;
                while ((line = in.readLine()) != null)
                    System.out.println(line);
            } 
            catch ( FileNotFoundException e ) {
                System.out.println( "File Disappeared" ); 
            }
    } } 

ListIt constructs a File object from its first command-line argument and tests the File to see if it exists and is readable. If the File is a directory, ListIt prints the names of the files in the directory. Otherwise, ListIt reads and prints the file.

FileOutputStream is a subclass of OutputStream, so it inherits all the standard OutputStream functionality for writing to a file. Just like FileInputStream though, FileOutputStream provides only a low-level interface to writing data. You'll typically wrap another stream like a DataOutputStream or a PrintStream around the FileOutputStream to provide higher-level functionality. You can create a FileOutputStream from a String pathname or a File object. Unlike FileInputStream however, the FileOutputStream constructors don't throw a FileNotFoundException. If the specified file doesn't exist, the FileOutputStream creates the file. The FileOutputStream constructors can throw an IOException if some other I/O error occurs, so you still need to handle this exception.

If the specified file does exist, the FileOutputStream opens it for writing. When you actually call a write() method, the new data overwrites the current contents of the file. If you need to append data to an existing file, you should use a different constructor that accepts an append flag, as shown here:

FileInputStream foois = new FileInputStream( fooFile ); 
FileInputStream passwdis = new FileInputStream( "/etc/passwd" ); 

Antoher alternative for appending files is to use a RandomAccessFile, as I'll discuss shortly.

To write characters (instead of bytes) to a file, you can wrap an OutputStreamWriter around a FileOutputStream. If you want to use the default character encoding scheme, you can use the FileWriter class instead, which is provided as a convenience. FileWriter works just like FileOutputStream, except that it writes characters instead of bytes and wraps a Writer instead of an OutputStream.

The following example reads a line of data from standard input and writes it to the file /tmp/foo.txt:

String s = new BufferedReader( new InputStreamReader( System.in ) ).readLine(); 
 
File out = new File( "/tmp/foo.txt" ); 
FileWriter fw = new FileWriter ( out ); 
PrintWriter pw = new PrintWriter( fw, true ) 
pw.println( s ); 

Notice how we have wrapped a PrintWriter around the FileWriter to facilitate writing the data. To be a good filesystem citizen, you need to call the close() method when you are done with the FileWriter.

java.io.RandomAccessFile

The java.io.RandomAccessFile class provides the ability to read and write data from or to any specified location in a file. RandomAccessFile implements both the DataInput and DataOutput interfaces, so you can use it to read and write strings and Java primitive types. In other words, RandomAccessFile defines the same methods for reading and writing data as DataInputStream and DataOutputStream. However, because the class provides random, rather than sequential, access to file data, it's not a subclass of either InputStream or OutputStream.

You can create a RandomAccessFile from a String pathname or a File object. The constructor also takes a second String argument that specifies the mode of the file. Use "r" for a read-only file or "rw" for a read-write file. Here's how to create a simple database to keep track of user information:

try { 
    RandomAccessFile users = new RandomAccessFile( "Users", "rw" ); 
    ... 
} 
catch (IOException e) {  
} 

When you create a RandomAccessFile in read-only mode, Java tries to open the specified file. If the file doesn't exist, RandomAccessFile throws an IOException. If, however, you are creating a RandomAccessFile in read-write mode, the object creates the file if it doesn't exist. The constructor can still throw an IOException if some other I/O error occurs, so you still need to handle this exception.

After you have created a RandomAccessFile, call any of the normal reading and writing methods, just as you would with a DataInputStream or DataOutputStream. If you try to write to a read-only file, the write method throws an IOException.

What makes a RandomAccessFile special is the seek() method. This method takes a long value and uses it to set the location for reading and writing in the file. You can use the getFilePointer() method to get the current location. If you need to append data on the end of the file, use length() to determine that location. You can write or seek beyond the end of a file, but you can't read beyond the end of a file. The read methods throws a EOFException if you try to do this.

Here's an example of writing some data to our user database:

users.seek( userNum * RECORDSIZE ); 
users.writeUTF( userName ); 
users.writeInt( userID ); 

One caveat to notice with this example is that we need to be sure that the String length for userName, along with any data that comes after it, fits within the boundaries of the record size.

Applets and Files

For security reasons, applets are not permitted to read and write to arbitrary places in the filesystem. The ability of an applet to read and write files, as with any kind of system resource, is under the control of a SecurityManager object. A SecurityManager is installed by the application that is running the applet, such as an applet viewer or Java-enabled Web browser. All filesystem access must first pass the scrutiny of the SecurityManager. With that in mind, applet-viewer applications are free to implement their own schemes for what, if any, access an applet may have.

For example, Sun's HotJava Web browser allows applets to have access to specific files designated by the user in an access-control list. Netscape Navigator, on the other hand, currently doesn't allow applets any access to the filesystem.

It isn't unusual to want an applet to maintain some kind of state information on the system where it's running. But for a Java applet that is restricted from access to the local filesystem, the only option is to store data over the network on its server. Although, at the moment, the Web is a relatively static, read-only environment, applets have at their disposal powerful, general means for communicating data over networks, as you'll see in Chapter 9, Network Programming. The only limitation is that, by convention, an applet's network communication is restricted to the server that launched it. This limits the options for where the data will reside.

The only means of writing data to a server in Java is through a network socket. In Chapter 9, Network Programming we'll look at building networked applications with sockets in detail. With the tools of that chapter it's possible to build powerful client/server applications.