Contents:
Sockets
Datagram Sockets
Working with URLs
Web Browsers and Handlers
Writing a Content Handler
Writing a Protocol Handler
The network is the soul of Java. Most of what is new and exciting
about Java centers around the potential for new kinds of dynamic, networked
applications. This chapter discusses the java.net package, which contains classes
for communications and working with networked resources. These classes fall
into two categories: the sockets API and classes for working with Uniform
Resource Locators (URLs). Figure
9.1 shows all of the classes in java.net.
Java's sockets interface provides access to the standard network
protocols used for communications between hosts on the Internet. Sockets are
the mechanism underlying all other kinds of portable networked communications.
Your processes can use sockets to communicate with a server or peer
applications on the Net, but you have to implement your own application-level
protocols for handling and interpreting the data. Higher-level functionality,
like remote procedure calls and distributed objects, is implemented with
sockets.
The Java URL classes provide an API for accessing well-defined
networked resources, like documents and applications on servers. The classes
use an extensible set of prefabricated protocol and content handlers to perform
the necessary communication and data conversion for accessing URL resources.
With URLs, an application can fetch a complete file or database record from a
server on the network with just a few lines of code. Applications like Web
browsers, which deal with networked content, use the URL class to
simplify the task of network programming. They also take advantage of the
dynamic nature of Java, which allows handlers for new types of URLs to be added
on the fly. As new types of servers and new formats for content evolve,
additional URL handlers can be supplied to retrieve and interpret the data
without modifying the original application.
In this chapter, I'll try to provide some practical and realistic
examples of Java network programming using both APIs. Sadly, the current state
of affairs is disappointing. The real release of HotJava isn't available, and
Netscape Navigator imposes many restrictions on what you can do. In addition, a
few standards that we need haven't been defined. Nevertheless, you can use all
of Java's networking capabilities to build your own free-standing applications.
I'll point out the shortcomings with Netscape Navigator and the standards scene
as I go along.
Sockets are a low-level programming interface for networked
communications. They send streams of data between applications that may or may
not be on the same host. Sockets originated in BSD UNIX and are, in other
languages, hairy and complicated things with lots of small parts that can break
off and choke little children. The reason for this is that most socket APIs can
be used with almost any kind of underlying network protocol. Since the
protocols that transport data across the network can have radically different
features, the socket interface can be quite complex. (For a discussion of
sockets in general, see UNIX Network Programming, by Richard Stevens
[Prentice-Hall].)
Java supports a simplified object-oriented interface to sockets
that makes network communications considerably easier. If you have done network
programming using sockets in C or another structured language, you should be
pleasantly surprised at how simple things can be when objects encapsulate the
gory details. If this is the first time you've come across sockets, you'll find
that talking to another application can be as simple as reading a file or
getting user input. Most forms of I/O in Java, including network I/O, use the
stream classes described in Chapter
8, Input/Output Facilities. Streams provide a unified I/O interface;
reading or writing across the Internet is similar to reading or writing a file
on the local system.
Java provides different kinds of sockets to support two distinct
classes of underlying protocols. In this first section, we'll look at Java's Socket class,
which uses a connection-oriented protocol. A connection-oriented
protocol gives you the equivalent of a telephone conversation; after
establishing a connection, two applications can send data back and forth; the
connection stays in place even when no one is talking. The protocol ensures
that no data is lost and that it always arrives in order. In the next section
we'll look at the DatagramSocket
class, which uses a connectionless protocol. A connectionless protocol
is more like the postal service. Applications can send short messages to each
other, but no attempt is made to keep the connection open between messages, to
keep the messages in order, or even to guarantee that they arrive.
In theory, just about any protocol family can be used underneath
the socket layer: Novell's IPX, Apple's AppleTalk, even the old ChaosNet
protocols. But this isn't a theoretical world. In practice, there's only one
protocol family people care about on the Internet, and only one protocol family
Java supports: the Internet protocols, IP. The Socket class speaks TCP, and the DatagramSocket
class speaks UDP, both standard Internet protocols. These protocols are
available on any system that is connected to the Internet.
When writing network applications, it's common to talk about
clients and servers. The distinction is increasingly vague, but the side that
initiates the conversation is usually the client. The side that accepts
the request to talk is usually the server. In the case where there are
two peer applications using sockets to talk, the distinction is less important,
but for simplicity we'll use the above definition.
For our purposes, the most important difference between a client
and a server is that a client can create a socket to initiate a conversation
with a server application at any time, while a server must prepare to listen
for incoming conversations in advance. The java.net.Socket class represents a
single side of a socket connection on either the client or server. In addition,
the server uses the java.net.ServerSocket
class to wait for connections from clients. An application acting as a server
creates a ServerSocket
object and waits, blocked in a call to its accept() method, until a connection
arrives. When it does, the accept()
method creates a Socket
object the server uses to communicate with the client. A server carries on multiple
conversations at once; there is only a single ServerSocket, but one active Socket object
for each client, as shown in Figure
9.2.
A client needs two pieces of information to locate and connect to
another server on the Internet: a hostname (used to find the host's network
address) and a port number. The port number is an identifier that
differentiates between multiple clients or servers on the same host. A server
application listens on a prearranged port while waiting for connections. Clients
select the port number assigned to the service they want to access. If you
think of the host computers as hotels and the applications as guests, then the
ports are like the guests' room numbers. For one guest to call another, he or
she must know the other party's hotel name and room number.
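To make the hotel analogy concrete, here is a minimal sketch of how a host and a port pair up in code. It uses the standard java.net.InetAddress and InetSocketAddress classes; the class name EndpointDemo and the choice of the loopback address 127.0.0.1 (so that no name server is needed) are assumptions made for illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

class EndpointDemo {
    // Combine a host (name or dotted-quad string) and a port into one endpoint.
    static InetSocketAddress endpoint(String host, int port)
            throws UnknownHostException {
        InetAddress addr = InetAddress.getByName(host); // resolves the name
        return new InetSocketAddress(addr, port);
    }

    public static void main(String[] args) throws UnknownHostException {
        InetSocketAddress smtp = endpoint("127.0.0.1", 25);
        System.out.println("hotel: " + smtp.getAddress().getHostAddress());
        System.out.println("room:  " + smtp.getPort());
    }
}
```

The address identifies the hotel and the port identifies the room; neither is enough on its own to reach a particular application.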
A client application opens a connection to a server by
constructing a Socket
that specifies the hostname and port number of the desired server:
try {
    Socket sock = new Socket("wupost.wustl.edu", 25);
}
catch ( UnknownHostException e ) {
    System.out.println("Can't find host.");
}
catch ( IOException e ) {
    System.out.println("Error connecting to host.");
}
This code fragment attempts to connect a Socket to port
25 (the SMTP mail service) of the host wupost.wustl.edu. The client handles the
possibility that the hostname can't be resolved (UnknownHostException) and that it
might not be able to connect to it (IOException).
As an alternative to using a hostname, you can provide a string
version of the host's IP address:
Socket sock = new Socket("128.252.120.1", 25); // wupost.wustl.edu
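Constructing a Socket with a host and port blocks until the connection succeeds or the system gives up. Where the runtime provides the no-argument Socket constructor and the connect() method that takes a timeout (an assumption about your version of java.net), a client can bound that wait itself. The following is a sketch only; the class name TimedConnect and its open() helper are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class TimedConnect {
    // Open a TCP connection, giving up after timeoutMillis milliseconds.
    static Socket open(String host, int port, int timeoutMillis)
            throws IOException {
        Socket sock = new Socket();                 // unconnected socket
        try {
            sock.connect(new InetSocketAddress(host, port), timeoutMillis);
            return sock;
        } catch (IOException e) {
            sock.close();                           // don't leak the socket
            throw e;
        }
    }
}
```

A failed lookup and a timeout both surface as subclasses of IOException, so the two catch clauses shown earlier still apply to a caller of open().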
Once a connection is made, input and output streams can be
retrieved with the Socket
getInputStream()
and getOutputStream()
methods. The following (rather arbitrary and strange) conversation illustrates
sending and receiving some data with the streams. Refer to Chapter
8, Input/Output Facilities for a complete discussion of working with
streams.
try {
    Socket server = new Socket("foo.bar.com", 1234);
    InputStream in = server.getInputStream();
    OutputStream out = server.getOutputStream();

    // Write a byte
    out.write(42);

    // Say "Hello" (send newline delimited string)
    PrintStream pout = new PrintStream( out );
    pout.println("Hello!");

    // Read a byte (read() returns an int: 0-255, or -1 at end of stream)
    int back = in.read();

    // Read a newline delimited string
    DataInputStream din = new DataInputStream( in );
    String response = din.readLine();

    server.close();
}
catch (IOException e ) { }
In the exchange above, the client first creates a Socket for
communicating with the server. The Socket constructor specifies the
server's hostname (foo.bar.com) and a prearranged port number (1234). Once the
connection is established, the client writes a single byte to the server using
the OutputStream's
write()
method. It then wraps a PrintStream
around the OutputStream
in order to send text more easily. Next, it performs the complementary
operations, reading a byte from the server using InputStream's read() and then
creating a DataInputStream
from which to get a string of text. Finally, it terminates the connection with
the close()
method. All these operations have the potential to generate IOExceptions;
the catch
clause is where our application would deal with these.
After a connection is established, a server application uses the
same kind of Socket
object for its side of the communications. However, to accept a connection from
a client, it must first create a ServerSocket, bound to the correct port. Let's
recreate the previous conversation from the server's point of view:
// Meanwhile, on foo.bar.com...
try {
    ServerSocket listener = new ServerSocket( 1234 );

    while ( !finished ) {
        Socket aClient = listener.accept(); // wait for connection
        InputStream in = aClient.getInputStream();
        OutputStream out = aClient.getOutputStream();

        // Read a byte (read() returns an int: 0-255, or -1 at end of stream)
        int importantByte = in.read();

        // Read a string
        DataInputStream din = new DataInputStream( in );
        String request = din.readLine();

        // Write a byte
        out.write(43);

        // Say "Goodbye"
        PrintStream pout = new PrintStream( out );
        pout.println("Goodbye!");

        aClient.close();
    }

    listener.close();
}
catch (IOException e ) { }
First, our server creates a ServerSocket attached to port 1234. On
some systems there are rules about what ports an application can use. Port
numbers below 1024 are usually reserved for system processes and standard,
well-known services, so we pick a port number outside of this range. The ServerSocket
need be created only once. Thereafter we can accept as many connections as
arrive.
Next we enter a loop, waiting for the accept() method of the ServerSocket to
return an active Socket
connection from a client. When a connection has been established, we perform
the server side of our dialog, then close the connection and return to the top
of the loop to wait for another connection. Finally, when the server
application wants to stop listening for connections altogether, it calls the close() method
of the ServerSocket.[1]
[1] A somewhat
obscure security feature in TCP/IP specifies that if a server socket actively
closes a connection while a client is connected, it may not be able to bind
(attach itself) to the same port on the server host again for a period of time
(the maximum time to live of a packet on the network). It's possible to turn
off this feature, and it's likely that your Java implementation will have done
so.
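The behavior this footnote describes is usually governed by the SO_REUSEADDR socket option. Where java.net provides ServerSocket.setReuseAddress() (an assumption about your runtime), a server can request the option explicitly instead of relying on the implementation's default. A minimal sketch; the class name ReusableListener is hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class ReusableListener {
    // Create a listening socket that asks the OS to allow rebinding to a
    // port that still has old connections lingering in a timed-wait state.
    static ServerSocket listen(int port) throws IOException {
        ServerSocket listener = new ServerSocket();  // not yet bound
        listener.setReuseAddress(true);              // must precede bind()
        listener.bind(new InetSocketAddress(port));
        return listener;
    }
}
```

The option has to be set before the socket is bound, which is why the sketch uses the unbound constructor rather than new ServerSocket(port).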
As you can see, this server is single-threaded; it handles one
connection at a time; it doesn't call accept() to listen for a new connection
until it's finished with the current connection. A more realistic server would
have a loop that accepts connections concurrently and passes them off to their
own threads for processing. (Our tiny HTTP daemon in a later section will do
just this.)
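As a rough illustration of that structure, the sketch below keeps the accept loop free by handing every new Socket to its own thread. It is not the HTTP daemon itself: the class name ThreadedEchoServer, the "echo: " reply format, and the use of lambdas and try-with-resources (which assume a Java 8 or later compiler) are all choices made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

class ThreadedEchoServer implements Runnable {
    private final ServerSocket listener;

    ThreadedEchoServer(int port) throws IOException {
        listener = new ServerSocket(port);   // port 0 = any free port
    }

    int port() { return listener.getLocalPort(); }

    void shutdown() throws IOException { listener.close(); }

    // Accept loop: never talks to a client itself, only hands sockets off.
    public void run() {
        try {
            while (true) {
                Socket client = listener.accept();
                new Thread(() -> serve(client)).start();
            }
        } catch (IOException e) {
            // accept() fails once shutdown() closes the listener; just exit
        }
    }

    // One conversation: echo each line back until the client hangs up.
    private static void serve(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null)
                out.println("echo: " + line);
        } catch (IOException e) {
            System.out.println("I/O error " + e);
        }
    }
}
```

Because each conversation runs in its own thread, a slow or idle client delays only itself; the listener returns to accept() immediately.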
The examples above presuppose the client has permission to
connect to the server, and that the server is allowed to listen on the
specified socket. This is not always the case. Specifically, applets and other
applications run under the auspices of a SecurityManager that can impose
arbitrary restrictions on what hosts they may or may not talk to, and whether
they can listen for connections. The security policy imposed by the current
version of Netscape Navigator allows applets to open socket connections only to
the host that served them. That is, they can talk back only to the server from
which their class files were retrieved. Applets are not allowed to open server
sockets themselves.
Now, this doesn't mean an applet can't cooperate with its server
to communicate with anyone, anywhere. A server could run a proxy that lets the
applet communicate indirectly with anyone it likes. What the current security
policy prevents is malicious applets roaming around inside corporate firewalls.
It places the burden of security on the originating server, and not the client
machine. Restricting access to the originating server limits the usefulness of
"trojan" applications that do annoying things from the client side. You
won't let your proxy mail bomb people, because you'll be blamed.
Many networked workstations run a time service that dispenses
their local clock time on a well-known port. This was a precursor of NTP, the
more general Network Time Protocol. In the next example, DateAtHost,
we'll make a specialized subclass of java.util.Date that fetches the time
from a remote host instead of initializing itself from the local clock. (See Chapter
7, Basic Utility Classes for a complete discussion of the Date class.)
DateAtHost
connects to the time service (port 37) and reads four bytes representing the
time on the remote host. These four bytes are interpreted as an integer
representing the number of seconds since the turn of the century (midnight GMT,
January 1, 1900). DateAtHost
converts this to Java's variant of the absolute time (milliseconds since
January 1, 1970, a date that should be familiar to UNIX users) and then uses
the remote host's time to initialize itself:
import java.net.Socket;
import java.io.*;

public class DateAtHost extends java.util.Date {
    static int timePort = 37;
    static final long offset = 2208988800L; // Seconds from century to
                                            // Jan 1, 1970 00:00 GMT

    public DateAtHost( String host ) throws IOException {
        this( host, timePort );
    }

    public DateAtHost( String host, int port ) throws IOException {
        Socket sock = new Socket( host, port );
        DataInputStream din =
            new DataInputStream(sock.getInputStream());
        int time = din.readInt();
        sock.close();
        // Mask to get the unsigned equivalent of the 32-bit value,
        // whether or not its high bit is set
        setTime( ((time & 0xFFFFFFFFL) - offset) * 1000 );
    }
}
That's all there is to it. It's not very long, even with a few
frills. We have supplied two possible constructors for DateAtHost.
Normally we'll use the first, which simply takes the name of the remote host as
an argument. The second, overloaded constructor specifies the hostname and the
port number of the remote time service. (If the time service were running on a
nonstandard port, we would use the second constructor to specify the alternate
port number.) This second constructor does the work of making the connection
and setting the time. The first constructor simply invokes the second (using
the this()
construct) with the default port as an argument. Supplying simplified
constructors that invoke their siblings with default arguments is a common and
useful technique.
The second constructor opens a socket to the specified port on
the remote host. It creates a DataInputStream to wrap the input stream and then
reads a 4-byte integer using the readInt() method. It's no coincidence the bytes are
in the right order. Java's DataInputStream
and DataOutputStream
classes work with the bytes of integer types in network byte order (most
significant to least significant). The time protocol (and other standard
network protocols that deal with binary data) also uses the network byte order,
so we don't need to call any conversion routines. (Explicit data conversions
would probably be necessary if we were using a nonstandard protocol, especially
when talking to a non-Java client or server.) After reading the data, we're
finished with the socket, so we close it, terminating the connection to the
server. Finally, the constructor initializes the rest of the object by calling Date's setTime() method
with the calculated time value.[2]
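To see network byte order in isolation, the following sketch writes an int through a DataOutputStream into a byte array and reads it back with a DataInputStream. No network is involved; the class name ByteOrderDemo and the sample value 0x01020304 are assumptions chosen so that each byte is easy to recognize.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class ByteOrderDemo {
    // Serialize an int the way DataOutputStream does: most significant byte first.
    static byte[] toNetworkOrder(int value) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeInt(value);
        return buf.toByteArray();
    }

    // Rebuild the int from four bytes in network byte order.
    static int fromNetworkOrder(byte[] four) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(four)).readInt();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = toNetworkOrder(0x01020304);
        for (byte b : wire)
            System.out.print(b + " ");            // prints "1 2 3 4 "
        System.out.println();
        System.out.println(fromNetworkOrder(wire) == 0x01020304); // prints "true"
    }
}
```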
[2] The conversion
first creates a long value, which is the unsigned equivalent of the integer time. It
subtracts an offset to make the time relative to the epoch (January 1, 1970)
rather than the century, and multiplies by 1000 to convert to milliseconds.
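The arithmetic in this footnote can be tried on its own, without a time server. The sketch below uses a bit mask as one way to obtain the unsigned equivalent of the int; the class name TimeProtocolMath and the sample values are assumptions for illustration.

```java
class TimeProtocolMath {
    // Seconds between Jan 1, 1900 and Jan 1, 1970 (00:00 GMT).
    static final long OFFSET = 2208988800L;

    // Convert the signed int read off the wire into Java milliseconds.
    static long toJavaMillis(int wireSeconds) {
        long unsigned = wireSeconds & 0xFFFFFFFFL; // reinterpret as 0..2^32-1
        return (unsigned - OFFSET) * 1000;
    }

    public static void main(String[] args) {
        // 2208988800 does not fit in a positive int, so readInt() would
        // hand it back as a negative number.
        int wire = (int) 2208988800L;
        System.out.println(wire < 0);                // prints "true"
        System.out.println(toJavaMillis(wire));      // prints "0"
        System.out.println(toJavaMillis(wire + 60)); // prints "60000"
    }
}
```

A wire value equal to the offset corresponds to the epoch itself, and each additional second on the wire adds 1000 milliseconds.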
The DateAtHost
class can work with a time retrieved from a remote host almost as easily as Date is used
with the time on the local host. The only additional overhead is that we have
to deal with the possible IOException
that can be thrown by the DateAtHost
constructor:
try {
Date d = new DateAtHost( "sura.net" );
System.out.println( "The time over there is: " + d );
int hours = d.getHours();
int minutes = d.getMinutes();
...
}
catch ( IOException e ) { }
This example fetches the time at the host sura.net and prints its
value. It then looks at some components of the time using the getHours() and getMinutes()
methods of the Date
class.
Have you ever wanted your very own Web server? Well, you're in
luck. In this section, we're going to build TinyHttpd, a minimal but functional HTTP
daemon. TinyHttpd
listens on a specified port and services simple HTTP "get file"
requests. They look something like this:
GET /path/filename [optional stuff]
Your Web browser sends one or more such lines for each document it
retrieves. Upon reading the request, the server tries to open the specified
file and send its contents. If that document contains references to images or
other items to be displayed inline, the browser continues with additional GET requests.
For best performance (especially in a time-slicing environment), TinyHttpd
services each request in its own thread. Therefore, TinyHttpd can service several
requests concurrently.
Over and above the limitations imposed by its simplicity, TinyHttpd
suffers from the limitations imposed by the fickleness of filesystem access, as
discussed in Chapter
8, Input/Output Facilities. It's important to remember that file
pathnames are still architecture dependent--as is the concept of a filesystem
to begin with. This example should work, as is, on UNIX and DOS-like systems,
but may require some customizations to account for differences on other
platforms. It's possible to write more elaborate code that uses the
environmental information provided by Java to tailor itself to the local
system. (Chapter
8, Input/Output Facilities gives some hints about how to do this).
WARNING:
This example will serve files from your host without protection.
Don't try this at work.
Now, without further ado, here's TinyHttpd:
import java.net.*;
import java.io.*;
import java.util.*;

public class TinyHttpd {
    public static void main( String argv[] ) throws IOException {
        ServerSocket ss = new ServerSocket(Integer.parseInt(argv[0]));
        while ( true )
            new TinyHttpdConnection( ss.accept() );
    }
}

class TinyHttpdConnection extends Thread {
    Socket sock;

    TinyHttpdConnection ( Socket s ) {
        sock = s;
        setPriority( NORM_PRIORITY - 1 );
        start();
    }

    public void run() {
        try {
            OutputStream out = sock.getOutputStream();
            String req =
                new DataInputStream(sock.getInputStream()).readLine();
            System.out.println( "Request: "+req );
            StringTokenizer st = new StringTokenizer( req );
            if ( (st.countTokens() >= 2) &&
                    st.nextToken().equals("GET") ) {
                if ( (req = st.nextToken()).startsWith("/") )
                    req = req.substring( 1 );
                if ( req.endsWith("/") || req.equals("") )
                    req = req + "index.html";
                try {
                    FileInputStream fis = new FileInputStream ( req );
                    byte [] data = new byte [ fis.available() ];
                    fis.read( data );
                    fis.close();   // release the file once it is in memory
                    out.write( data );
                }
                catch ( FileNotFoundException e ) {
                    new PrintStream( out ).println("404 Not Found");
                }
            } else
                new PrintStream( out ).println( "400 Bad Request" );
            sock.close();
        }
        catch ( IOException e ) {
            System.out.println( "I/O error " + e );
        }
    }
}
Compile TinyHttpd
and place it in your class path. Go to a directory with some interesting
documents and start the daemon, specifying an unused port number as an
argument. For example:
% java TinyHttpd 1234
You should now be able to use your Web browser to retrieve files
from your host. You'll have to specify the nonstandard port number in the URL.
For example, if your hostname is foo.bar.com, and you started the server as
above, you could reference a file as in:
http://foo.bar.com:1234/welcome.html
TinyHttpd
looks for files relative to its current directory, so the pathnames you provide
should be relative to that location. Retrieved some files? Al'righty then,
let's take a closer look.
TinyHttpd
consists of two classes. The public TinyHttpd class contains the main() method of
our standalone application. It begins by creating a ServerSocket, attached to the
specified port. It then loops, waiting for client connections and creating
instances of the second class, a TinyHttpdConnection thread, to service each request.
The while
loop waits for the ServerSocket
accept()
method to return a new Socket
for each client connection. The Socket is passed as an argument to construct the TinyHttpdConnection
thread that handles it.
TinyHttpdConnection
is a subclass of Thread.
It lives long enough to process one client connection and then dies. TinyHttpdConnection's
constructor does three things. After saving the Socket argument passed by its caller,
it adjusts its own priority and then invokes start() to bring its run() method to
life. By lowering its priority to NORM_PRIORITY-1 (just below the default priority),
we ensure that the threads servicing established connections won't block TinyHttpd's main
thread from accepting new requests. (On a time-slicing system, this is less
important.)
The body of TinyHttpdConnection's
run()
method is where all the magic happens. First, we fetch an OutputStream for
talking back to our client. The second line reads the GET request from
the InputStream
into the variable req.
This request is a single newline-terminated String that looks like the GET request we
described earlier. Since this is the only time we read from this socket, it's
hard to resist the urge to be terse. Alternatively, we could break that
statement into three steps: getting the InputStream, creating the DataInputStream
wrapper, and reading the line. The three-line version is certainly more
readable and should not be noticeably slower.
We then parse the contents of req to extract a filename. The next few
lines are a brief exercise in string manipulation. We create a StringTokenizer
and make sure there are at least two tokens. Using nextToken(), we take the first
token and make sure it's the word GET. (If both conditions aren't met, we have an
error.) Then we take the next token (which should be a filename), assign it to req , and
check whether it begins with "/". If so, we use substring() to
strip the first character, giving us a filename relative to the current
directory. If it doesn't begin with "/", the filename is already
relative to the current directory. Finally, we check to see if the requested
filename looks like a directory name (i.e., ends in slash) or is empty. In
these cases, we append the familiar default filename index.html.
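The parsing steps just described can be pulled out into a small method and tried without a network connection. The sketch below follows the same logic; the class name RequestParser, the method name filenameFor(), and the convention of returning null for a bad request are assumptions made for this example.

```java
import java.util.StringTokenizer;

class RequestParser {
    // Return the relative filename named by a "GET /path" request line,
    // or null if the line is not a GET request.
    static String filenameFor(String requestLine) {
        if (requestLine == null)
            return null;
        StringTokenizer st = new StringTokenizer(requestLine);
        if (st.countTokens() < 2 || !st.nextToken().equals("GET"))
            return null;
        String name = st.nextToken();
        if (name.startsWith("/"))
            name = name.substring(1);          // make it relative
        if (name.endsWith("/") || name.equals(""))
            name = name + "index.html";        // directory -> default file
        return name;
    }

    public static void main(String[] args) {
        System.out.println(filenameFor("GET /welcome.html HTTP/1.0")); // prints "welcome.html"
        System.out.println(filenameFor("GET /"));                      // prints "index.html"
        System.out.println(filenameFor("GET /docs/"));                 // prints "docs/index.html"
        System.out.println(filenameFor("POST /form"));                 // prints "null"
    }
}
```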
Once we have the filename, we try to open the specified file and
load its contents into a large byte array. (We did something similar in the ListIt example
in Chapter
8, Input/Output Facilities.) If all goes well, we write the data out
to the client on the OutputStream.
If we can't parse the request or the file doesn't exist, we wrap our OutputStream
with a PrintStream
to make it easier to send a textual message. Then we return an appropriate HTTP
error message. Finally, we close the socket and return from run(), removing
our Thread.
The biggest problem with TinyHttpd is that there are no
restrictions on the files it can access. With a little trickery, the daemon
will happily send any file in your filesystem to the client. It would be nice
if we could restrict TinyHttpd
to files that are in the current directory, or a subdirectory. To make the
daemon safer, let's add a security manager. I discussed the general framework
for security managers in Chapter
7, Basic Utility Classes. Normally, a security manager is used to
prevent Java code downloaded over the Net from doing anything suspicious.
However, a security manager will serve nicely to restrict file access in a
self-contained application.
Here's the code for the security manager class:
import java.io.*;

class TinyHttpdSecurityManager extends SecurityManager {
    public void checkAccess(Thread g) { }
    public void checkListen(int port) { }
    public void checkLink(String lib) { }
    public void checkPropertyAccess(String key) { }
    public void checkAccept(String host, int port) { }
    public void checkWrite(FileDescriptor fd) { }
    public void checkRead(FileDescriptor fd) { }

    public void checkRead( String s ) {
        if ( new File(s).isAbsolute() || (s.indexOf("..") != -1) )
            throw new
                SecurityException("Access to file : "+s+" denied.");
    }
}
The heart of this security manager is the checkRead()
method. It checks two things: it makes sure that the pathname we've been given
isn't an absolute path, which could name any file in the filesystem; and it
makes sure the pathname doesn't have a double dot (..) in it, which refers to the
parent of the current directory. With these two restrictions, we can be sure
(at least on a UNIX or DOS-like filesystem) that we have restricted access to
only subdirectories of the current directory. If the pathname is absolute or
contains "..",
checkRead()
throws a SecurityException.
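The same idea, keeping a requested name inside a base directory, can also be expressed as a standalone test. The sketch below is a different technique from the string check in checkRead(): it resolves the name against the base and compares the normalized result, using java.nio.file.Path (an assumption about your runtime, since that package is a later addition to Java). The class name PathGuard and the directory name docs are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathGuard {
    // True if 'requested' names something inside 'root' once "." and ".."
    // components have been resolved. This is a purely lexical check; it
    // does not touch the filesystem or follow symbolic links.
    static boolean isInside(Path root, String requested) {
        Path base = root.toAbsolutePath().normalize();
        Path target = base.resolve(requested).normalize();
        return target.startsWith(base);
    }

    public static void main(String[] args) {
        Path root = Paths.get("docs");
        System.out.println(isInside(root, "index.html"));       // prints "true"
        System.out.println(isInside(root, "a/../index.html"));  // prints "true"
        System.out.println(isInside(root, "../secret.txt"));    // prints "false"
    }
}
```

Unlike the indexOf("..") test, this version accepts a name such as a/../index.html, which contains a double dot but never leaves the base directory. Whether that leniency is desirable is a design choice.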
The other do-nothing method implementations--e.g., checkAccess()--allow
the daemon to do its work without interference from the security manager. If we
don't install a security manager, the application runs with no restrictions. However,
as soon as we install any security manager, we inherit implementations of many
"check" routines. The default implementations won't let you do
anything; they just throw a security exception as soon as they are called. We
have to open holes so the daemon can do its own work; it still has to accept
connections, listen on sockets, create threads, read property lists, etc.
Therefore, we override the default checks with routines that allow these
things.
Now you're thinking, isn't that overly permissive? Not for this
application; after all, TinyHttpd
never tries to load foreign classes from the Net. The only code we are
executing is our own, and it's assumed we won't do anything dangerous. If we
were planning to execute untrusted code, the security manager would have to be
more careful about what to permit.
Now that we have a security manager, we must modify TinyHttpd to use
it. Two changes are necessary: we must install the security manager and catch
the security exceptions it generates. To install the security manager, add the
following code at the beginning of TinyHttpd's main() method:
System.setSecurityManager( new TinyHttpdSecurityManager() );
To catch the security exception, add the following catch clause
after FileNotFoundException's
catch
clause:
catch ( SecurityException e ) {
    new PrintStream( out ).println( "403 Forbidden" );
}
Now the daemon can't access anything that isn't within the
current directory or a subdirectory. If it tries to, the security manager
throws an exception and prevents access to the file. The daemon then returns a
standard HTTP error message to the client.
TinyHttpd
still has room for improvement. First, it consumes a lot of memory by
allocating a huge array to read the entire contents of the file all at once. A
more realistic implementation would use a buffer and send large amounts of data
in several passes. TinyHttpd
also fails to deal with simple things like directories. It wouldn't be hard to
add a few lines of code (again, refer to the ListIt example in Chapter
8, Input/Output Facilities) to read a directory and generate linked
HTML listings like most Web servers do.
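The buffered approach mentioned above might look something like the following sketch, which moves data from any InputStream to any OutputStream in fixed-size passes. The class name StreamCopy, the copy() helper, and the 4096-byte buffer size are assumptions; in the daemon, the input would be the FileInputStream and the output the socket's OutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy {
    // Copy everything from 'in' to 'out' through a small reusable buffer.
    // Returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {   // -1 signals end of stream
            out.write(buffer, 0, n);            // write only the bytes read
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = new byte[10000];          // stands in for a file's contents
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(file), sink)); // prints "10000"
    }
}
```

Memory use stays at the size of the buffer no matter how large the file is, and the loop does not depend on available() reporting the full length.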
TinyHttpd
used a Socket
to create a connection to the client using the TCP protocol. In that example,
TCP itself took care of data integrity; we didn't have to worry about data
arriving out of order or incorrect. Now we'll take a walk on the wild side.
We'll build an applet that uses a java.net.DatagramSocket, which uses the UDP
protocol. A datagram is sort of like a "data telegram": it's a
discrete chunk of data transmitted in one packet. Unlike the previous example,
where we could get a convenient OutputStream from our Socket and write the data as if
writing to a file, with a DatagramSocket
we have to work one datagram at a time. (Of course, the TCP protocol was taking
our OutputStream
and slicing the data into packets, but we didn't have to worry about those
details).
UDP doesn't guarantee that the data will get through. If the data
do get through, they may not arrive in the right order; it's even possible for
duplicate datagrams to arrive. Using UDP is something like cutting the pages
out of the encyclopedia, putting them into separate envelopes, and mailing them
to your friend. If your friend wants to read the encyclopedia, it's his or her
job to put the pages in order. If some pages got lost in the mail, your friend
has to send you a letter asking for replacements.
Obviously, you wouldn't use UDP to send a huge amount of data.
But it's significantly more efficient than TCP, particularly if you don't care
about the order in which messages arrive, or whether the data arrive at all.
For example, in a database lookup, the client can send a query; the server's
response itself constitutes an acknowledgment. If the response doesn't arrive
within a certain time, the client can send another query. It shouldn't be hard
for the client to match responses to its original queries. Some important
applications that use UDP are the Domain Name System (DNS) and Sun's Network
Filesystem (NFS).
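The query-and-retry pattern described here can be sketched with a receive timeout. The example assumes a runtime in which DatagramSocket supports setSoTimeout() and try-with-resources; the class name UdpQuery, the ask() helper, and the 1024-byte reply buffer are hypothetical choices, not part of any standard protocol.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

class UdpQuery {
    // Send 'request' to addr:port and wait for one reply datagram.
    // Resends up to 'attempts' times; returns null if nothing ever arrives.
    static byte[] ask(InetAddress addr, int port, byte[] request,
                      int timeoutMillis, int attempts) throws IOException {
        try (DatagramSocket sock = new DatagramSocket()) {
            sock.setSoTimeout(timeoutMillis);       // bound each receive()
            DatagramPacket query =
                new DatagramPacket(request, request.length, addr, port);
            byte[] buf = new byte[1024];
            for (int i = 0; i < attempts; i++) {
                sock.send(query);
                DatagramPacket reply = new DatagramPacket(buf, buf.length);
                try {
                    sock.receive(reply);
                    byte[] answer = new byte[reply.getLength()];
                    System.arraycopy(reply.getData(), 0, answer, 0, answer.length);
                    return answer;                  // the reply is the ack
                } catch (SocketTimeoutException e) {
                    // no answer in time: loop around and ask again
                }
            }
            return null;
        }
    }
}
```

This sketch accepts the first datagram that arrives. A real client would also need some way to match a reply to its query, for example an identifier carried in both.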
In this section we'll build a simple applet, HeartBeat, that
sends a datagram to its server each time it's started and stopped. (See Chapter
10, Understand the Abstract Windowing Toolkit for a complete
discussion of the Applet
class.) We'll also build a simple standalone server application, Pulse, that receives
the datagrams and prints them. By tracking the output, you could have a crude
measure of who is looking at your Web page at any given time. This is
an ideal application for UDP: we don't want the overhead of a TCP socket, and
if datagrams get lost, it's no big deal.
First, the HeartBeat
applet:
import java.net.*;
import java.io.*;

public class HeartBeat extends java.applet.Applet {
    String myHost;
    int myPort;

    public void init() {
        myHost = getCodeBase().getHost();
        myPort = Integer.parseInt( getParameter("myPort") );
    }

    private void sendMessage( String message ) {
        try {
            byte [] data = new byte [ message.length() ];
            message.getBytes(0, data.length, data, 0);
            InetAddress addr = InetAddress.getByName( myHost );
            DatagramPacket pack =
                new DatagramPacket(data, data.length, addr, myPort);
            DatagramSocket ds = new DatagramSocket();
            ds.send( pack );
            ds.close();
        }
        catch ( IOException e ) {
            System.out.println( e );
        }
    }

    public void start() {
        sendMessage("Arrived");
    }

    public void stop() {
        sendMessage("Departed");
    }
}
Compile the applet and include it in an HTML document with an <applet>
tag:
<applet height=10 width=10 code=HeartBeat>
<param name="myPort" value="1234">
</applet>
The myPort
parameter should specify the port number on which our server application
listens for data.
Next, the server-side application, Pulse:
import java.net.*;
import java.io.*;

public class Pulse {
    public static void main( String [] argv ) throws IOException {
        DatagramSocket s =
            new DatagramSocket(Integer.parseInt(argv[0]));
        while ( true ) {
            DatagramPacket packet =
                new DatagramPacket(new byte [1024], 1024);
            s.receive( packet );
            String message = new String(packet.getData(), 0, 0,
                packet.getLength());
            System.out.println( "Heartbeat from: " +
                packet.getAddress().getHostName() + " - " + message );
        }
    }
}
Compile Pulse
and run it on your Web server, specifying a port number as an argument:
% java Pulse 1234
The port number should be the same as the one you used in the myPort parameter
of the <applet>
tag for HeartBeat.
Now, pull up the Web page in your browser. You won't see anything
there (a better application might do something visual as well), but you should
get a blip from the Pulse
application. Leave the page and return to it a few times. Each time the applet
is started or stopped, it sends a message:
Heartbeat from: foo.bar.com - Arrived
Heartbeat from: foo.bar.com - Departed
Heartbeat from: foo.bar.com - Arrived
Heartbeat from: foo.bar.com - Departed
...
Cool, eh? Just remember the datagrams are not guaranteed to
arrive (although it's unlikely you'll see them fail), and it's possible that
you could miss an arrival or a departure. Now let's look at the code.
HeartBeat
overrides the init(),
start(),
and stop()
methods of the Applet
class, and implements one private method of its own, sendMessage(),
that sends a datagram. HeartBeat
begins its life in init(),
where it determines the destination for its messages. It uses the Applet getCodeBase()
and getHost()
methods to find the name of its originating host and fetches the correct port
number from the myPort
parameter of the HTML tag. After init() has finished, the start() and stop() methods
are called whenever the applet is started or stopped. These methods merely call
sendMessage()
with the appropriate message.
sendMessage()
is responsible for sending a String message to the server as a datagram. It takes
the text as an argument, constructs a datagram packet containing the message,
and then sends the datagram. All of the datagram information, including the
destination and port number, is packed into a java.net.DatagramPacket object. The DatagramPacket
is like an addressed envelope, stuffed with our bytes. After the DatagramPacket
is created, sendMessage()
simply has to open a DatagramSocket
and send it.
The first four statements of sendMessage() build the DatagramPacket:
try {
    byte [] data = new byte [ message.length() ];
    message.getBytes(0, data.length, data, 0);
    InetAddress addr = InetAddress.getByName( myHost );
    DatagramPacket pack =
        new DatagramPacket(data, data.length, addr, myPort );
First, the contents of message are placed into an array of
bytes called data.
Next a java.net.InetAddress
object is created from the name myHost. An InetAddress simply holds the network
address information for a host in a special format. We get an InetAddress
object for our host by using the static getByName() method of the InetAddress
class. (We can't construct an InetAddress object directly.) Finally, we call the DatagramPacket constructor
with four arguments: the byte array containing our data, the length of the
data, the destination address object, and the port number.
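As a hedged aside, here is how the same packet-building step might look with an explicit character encoding. The class name PacketBuilder and the choice of UTF-8 are our assumptions; HeartBeat itself copies one byte per character, which is adequate only for ASCII messages.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class PacketBuilder {
    // Hypothetical helper: encodes the message as UTF-8 and wraps it in an
    // addressed packet. The byte count can exceed message.length() when the
    // text contains non-ASCII characters, so the array is sized by the encoder.
    static DatagramPacket build(String message, InetAddress addr, int port) {
        byte[] data = message.getBytes(StandardCharsets.UTF_8);
        return new DatagramPacket(data, data.length, addr, port);
    }

    public static void main(String[] args) {
        DatagramPacket pack = build("Arrived", InetAddress.getLoopbackAddress(), 1234);
        System.out.println(pack.getLength() + " bytes for port " + pack.getPort());
    }
}
```

Building the packet touches no network resources; nothing is transmitted until a DatagramSocket sends it.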
The remaining lines construct a default client DatagramSocket
and call its send()
method to transmit the DatagramPacket;
after sending the datagram, we close the socket:
DatagramSocket ds = new DatagramSocket();
ds.send( pack );
ds.close();
Two operations throw a type of IOException: the InetAddress.getByName()
lookup and the DatagramSocket
send().
InetAddress.getByName()
can throw an UnknownHostException,
which is a type of IOException
that indicates that the host name can't be resolved. If send() throws an
IOException,
it implies a serious client-side problem in talking to the network. We need to
catch these exceptions; our catch
block simply prints a message telling us that something went wrong. If we get
one of these exceptions, we can assume the datagram never arrived. However, we
can't assume the converse. Even if we don't get an exception, we still don't
know that the host is actually accessible or that the data actually arrived;
with a DatagramSocket,
we never find out.
The Pulse
server corresponds to the HeartBeat
applet. First, it creates a DatagramSocket
to listen on our prearranged port. This time, we specify a port number in the
constructor; we get the port number from the command line as a string (argv[0]) and
convert it to an integer with Integer.parseInt(). Note the difference between this
call to the constructor and the call in HeartBeat. In the server, we need to
listen for incoming datagrams on a prearranged port, so we need to specify the
port when creating the DatagramSocket.
In the client, we need only to send datagrams, so we don't have to specify the
port in advance; we build the port number into the DatagramPacket itself.
Second, Pulse
creates an empty DatagramPacket
of a fixed size to receive an incoming datagram. This alternative constructor
for DatagramPacket
takes a byte array and a length as arguments. As much data as possible is
stored in the byte array when it's received. (A practical limit on the size of
a UDP datagram is 8K.) Finally, Pulse calls the DatagramSocket's receive() method
to wait for a packet to arrive. When a packet arrives, its contents are
printed.
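To see both halves of the exchange in one place, the following sketch sends a datagram to a receiver in the same process over the loopback interface. The class name LoopbackPulse, the use of an ephemeral port (port 0), the two-second timeout, and UTF-8 decoding are assumptions made for the illustration; Pulse itself listens on a port given on the command line.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class LoopbackPulse {
    // Hypothetical demo: a "server" socket bound to a known port and a
    // "client" socket on whatever port the system assigns.
    static String roundTrip(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket server = new DatagramSocket(0, loopback);  // port chosen by the OS
             DatagramSocket client = new DatagramSocket()) {
            server.setSoTimeout(2000);  // don't wait forever if the datagram is lost

            byte[] data = message.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(data, data.length,
                                           loopback, server.getLocalPort()));

            // The receiver supplies an empty, fixed-size packet to be filled in.
            DatagramPacket packet = new DatagramPacket(new byte[1024], 1024);
            server.receive(packet);

            // Only the first getLength() bytes of the buffer hold received data.
            return new String(packet.getData(), 0, packet.getLength(),
                              StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Heartbeat: " + roundTrip("Arrived"));
    }
}
```

Note that the decoding step uses packet.getLength() rather than the buffer size, for the same reason Pulse does: the buffer is usually larger than the datagram that arrived.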
As you can see, working with DatagramSocket is slightly more tedious
than working with Sockets.
With datagrams, it's harder to spackle over the messiness of the socket
interface. However, the Java API rather slavishly follows the UNIX interface,
and that doesn't help. I don't see any reason why we have to prepare a datagram
to hand to receive()
(at least for the current functionality); receive() ought to create an appropriate
object on its own and hand it to us, saving us the effort of building the
datagram in advance and unpacking the data from it afterwards. It's easy to
imagine other conveniences; perhaps we'll have them in a future release.
A URL points to an object on the Internet. It's a collection of
information that identifies an item, tells you where to find it, and specifies
a method for communicating with it or retrieving it from its source. A URL
refers to any kind of information source. It might point to static data, such
as a file on a local filesystem, a Web server, or an FTP archive; or it can point
to a more dynamic object such as a news article on a news spool or a record in
a WAIS database. URLs can even refer to less tangible resources such as Telnet
sessions and mailing addresses.
A URL is usually presented as a string of text, like an address.[3]
Since there are many different ways to locate an item on the Net, and different
mediums and transports require different kinds of information, there are
different formats for different kinds of URLs. The most common form specifies
three things: a network host or server, the name of the item and its location
on that host, and a protocol by which the host should communicate:
protocol://hostname/location/item
protocol is an identifier such as "http," "ftp," or
"gopher"; hostname is an Internet hostname; and the location and item
components form a path that identifies the object on that host. Variants of
this form allow extra information to be packed into the URL, specifying things
like port numbers for the communications protocol and fragment identifiers that
reference parts inside the object.
[3] The term URL
was coined by the Uniform Resource Identifier (URI) working group of the IETF
to distinguish URLs from the more general notion of Uniform Resource Names or
URNs. URLs are really just static addresses, whereas URNs would be more
persistent and abstract identifiers used to resolve the location of an object
anywhere on the Net. URLs are defined in RFC 1738 and RFC 1808.
We sometimes speak of a URL that is relative to a base URL. In
that case we are using the base URL as a starting point and supplying
additional information. For example, the base URL might point to a directory on
a Web server; a relative URL might name a particular file in that directory.
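A small sketch of that idea, using the java.net.URL class introduced next: the two-argument constructor takes a base URL and a relative specification. The class name RelativeUrl and the helper resolve() are hypothetical. Recent JDK releases mark the URL constructors as deprecated in favor of java.net.URI, so expect a compiler warning there.

```java
import java.net.MalformedURLException;
import java.net.URL;

class RelativeUrl {
    // Hypothetical helper: interprets `relative` against the base URL.
    static URL resolve(String base, String relative) throws MalformedURLException {
        return new URL(new URL(base), relative);
    }

    public static void main(String[] args) throws MalformedURLException {
        // The base names a directory; the relative part names a file in it.
        URL doc = resolve("http://foo.bar.com/documents/", "homepage.html");
        System.out.println(doc);  // prints http://foo.bar.com/documents/homepage.html
    }
}
```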
A URL is represented by an instance of the java.net.URL
class. A URL
object manages all information in a URL string and provides methods for
retrieving the object it identifies. We can construct a URL object from
a URL specification string or from its component parts:
try {
    URL aDoc = new URL( "http://foo.bar.com/documents/homepage.html" );
    URL sameDoc =
        new URL("http","foo.bar.com","/documents/homepage.html");
}
catch ( MalformedURLException e ) { }
The two URL
objects above point to the same network resource, the homepage.html
document on the server foo.bar.com. Whether or not the resource actually exists
and is available isn't known until we try to access it. At this point, the URL object just
contains data about the object's location and how to access it. No connection
to the server has been made. We can examine the URL's components with the getProtocol(), getHost(), and getFile()
methods. We can also compare it to another URL with the sameFile() method. sameFile()
determines if two URLs point to the same resource. It can be fooled, but sameFile() does
more than compare the URLs for equality; it takes into account the possibility
that one server may have several names, and other factors.
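Here is a brief sketch of examining a URL's components. The class name UrlParts and the helper describe() are our own illustrative names; the accessor methods are the ones named above. No connection is made by any of these calls.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlParts {
    // Hypothetical helper: joins the three components discussed in the text.
    static String describe(URL u) {
        return u.getProtocol() + " | " + u.getHost() + " | " + u.getFile();
    }

    public static void main(String[] args) throws MalformedURLException {
        URL aDoc = new URL("http://foo.bar.com/documents/homepage.html");
        // Note that the file component includes its leading slash.
        System.out.println(describe(aDoc));
        // prints http | foo.bar.com | /documents/homepage.html
    }
}
```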
When a URL
is created, its specification is parsed to identify the protocol component. If
the protocol doesn't make sense, or if Java can't find a protocol handler for
it, the URL constructor throws a MalformedURLException. A protocol handler is a Java
class that implements the communications protocol for accessing the URL
resource. For example, given an "http" URL, Java prepares to use the
HTTP protocol handler to retrieve documents from the specified server.
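The constructor's behavior can be probed without touching the network. In this sketch the class name ProtocolCheck, the helper isUsable(), and the made-up "bogus" protocol are assumptions chosen for the example.

```java
import java.net.MalformedURLException;
import java.net.URL;

class ProtocolCheck {
    // Hypothetical helper: reports whether a URL object can be constructed
    // from the specification. A false result means either the string could
    // not be parsed or no protocol handler was found for it.
    static boolean isUsable(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsable("http://foo.bar.com/index.html"));   // prints true
        System.out.println(isUsable("bogus://foo.bar.com/index.html"));  // prints false
    }
}
```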
The most general way to get data back from a URL is to ask
for an InputStream
from the URL
by calling openStream().
If you're writing an applet that will be running under Netscape, this is about
your only choice. In fact, it's a good choice if you want to receive continuous
updates from a dynamic information source. The drawback is that you have to
parse the contents of an object yourself. Not all types of URLs support the openStream()
method; you'll get an UnknownServiceException
if yours doesn't.
The following code reads a single line from an HTML file:
try {
    URL url = new URL("http://server/index.html");
    DataInputStream dis = new DataInputStream( url.openStream() );
    String line = dis.readLine();
}
catch ( IOException e ) { }
We ask for an InputStream
with openStream(),
and wrap it in a DataInputStream
to read a line of text. Here, because we are specifying the "http"
protocol in the URL, we still require the services of an HTTP protocol handler.
As we'll discuss more in a bit, that brings up some questions about what
handlers we have available to us and where. This example partially works around
those issues because no content handler is involved; we read the data and
interpret it as a content handler would. However, there are even more
limitations on what applets can do right now. For the time being, if you
construct URLs
relative to the applet's code base (as returned by getCodeBase()),
you should be able to use them in applets as in the above example. This should
guarantee that the needed protocol is available and accessible to the applet.
Again, we'll discuss the more general issues a bit later.
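For comparison, here is a hedged sketch of the same read using a BufferedReader, which decodes bytes to characters with a named encoding. The class name FirstLine, the helper firstLine(), the UTF-8 assumption, and the temporary file used in main() are all ours; a "file" URL stands in for the Web server so that the example runs without a network.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FirstLine {
    // Hypothetical helper: opens the URL's stream and returns its first line,
    // assuming the content is UTF-8 text. The stream is closed on the way out.
    static String firstLine(URL url) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a document on a server: a temporary local file.
        Path tmp = Files.createTempFile("index", ".html");
        Files.write(tmp, "<html>\n<body>hello</body>\n</html>\n"
                            .getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(tmp.toUri().toURL()));
        Files.delete(tmp);
    }
}
```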
openStream()
operates at a lower level than the more general content-handling mechanism
implemented by the URL
class. We showed it first because, until some things are settled, you'll be
limited as to when you can use URLs in their more powerful role. When a proper
content handler is available to Java (currently, only if you supply one with
your standalone application), you'll be able to retrieve the object the URL addresses as
a complete object, by calling the URL's getContent() method. getContent() initiates a
connection to the host, fetches the data for you, determines the data's MIME
type, and invokes a content handler to turn the data into a Java object.
For example: given the URL http://foo.bar.com/index.html,
a call to getContent()
uses the HTTP protocol handler to receive the data and the HTML content handler
to turn the data into some kind of object. A URL that points to a plain-text
file would use a text-content handler that might return a String object. A
GIF file might be turned into an Image object for display, using a GIF content
handler. If we accessed the GIF file using an "ftp" URL, Java would
use the same content handler, but would use the FTP protocol handler to receive
the data.
getContent()
returns the output of the content handler. Now we're faced with a problem:
exactly what did we get? Since the content handler can return almost anything,
the return type of getContent()
is Object.
Before doing anything meaningful with this Object, we must cast it into some other
data type that we can work with. For example, if we expect a String, we'll
cast the result of getContent()
to a String:
String content;
try {
    content = (String)myURL.getContent();
}
catch ( Exception e ) { }
Of course, we are presuming we will in fact get a String object
back from this URL.
If we're wrong, we'll get a ClassCastException.
Since it's common for servers to be confused (or even lie) about the MIME types
of the objects they serve, it's wise to catch that exception (it's a subclass
of RuntimeException,
so catching it is optional) or to check the type of the returned object with
the instanceof
operator:
if ( content instanceof String ) {
String s = (String)content;
...
Various kinds of errors can occur when trying to retrieve the
data. For example, getContent()
can throw an IOException
if there is a communications error; IOException is not a type of RuntimeException,
so we must catch it explicitly, or declare that the method calling getContent() can
throw it. Other kinds of errors can happen at the application level: some
knowledge of how the handlers deal with errors is necessary.
For example, consider a URL that refers to a nonexistent file on
an HTTP server. When requested, the server probably returns a valid HTML
document that consists of the familiar "404 Not Found" message. An
appropriate HTML content handler is invoked to interpret this and return it as
it would any other HTML object. At this point, there are several alternatives,
depending entirely on the content handler's implementation. It might return a String
containing the error message; it could also conceivably return some other kind
of object or throw a specialized subclass of IOException. To find out that an error
occurred, the application may have to look directly at the object returned from
getContent().
After all, what is an error to the application may not be an error as far as
the protocol or content handlers are concerned. "404 Not Found" isn't
an error at this level; it's a perfectly valid document.
Another type of error occurs if a content handler that
understands the data's MIME type isn't available. In this case, getContent()
invokes a minimal content handler used for data with an unknown type and
returns the data as a raw InputStream.
A sophisticated application might specialize this behavior to try to decide
what to do with the data on its own.
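One way an application might cope with the uncertain return type is to test for the cases it knows about, including the raw InputStream fallback. This is only a sketch: the class name ContentDispatch, the helper asText(), and the decision to decode a raw stream as UTF-8 are our assumptions, and which object a given handler really returns depends on the handlers installed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ContentDispatch {
    // Hypothetical helper: turns the Object returned by getContent() into
    // text when possible, or returns null for types it does not understand.
    static String asText(Object content) throws IOException {
        if (content instanceof String) {
            return (String) content;              // a handler gave us text directly
        }
        if (content instanceof InputStream) {     // unknown type: raw bytes
            try (InputStream in = (InputStream) content) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                // Assumption: treat the bytes as UTF-8 text.
                return new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream("raw data".getBytes(StandardCharsets.UTF_8));
        System.out.println(asText("plain text"));
        System.out.println(asText(raw));
    }
}
```

In an application, the argument would be the result of myURL.getContent(); the objects in main() merely stand in for it.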
The openStream()
and getContent()
methods both implicitly create a connection to the remote URL object. For
some applications, it may be necessary to use the openConnection() method of the URL to interact
directly with the protocol handler. openConnection() returns a URLConnection
object, which represents a single, active connection to the URL resource.
We'll examine URLConnections
further when we start writing protocol handlers.
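As a preview, here is a minimal sketch of going through openConnection() explicitly. The class name ConnectionDemo, the helper fetch(), and the five-second timeouts are assumptions for illustration; main() again uses a temporary local file in place of a remote server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ConnectionDemo {
    // Hypothetical helper: fetches the complete resource as bytes.
    static byte[] fetch(URL url) throws IOException {
        // openConnection() only creates the connection object; the actual
        // connection is established later, here by getInputStream().
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(5000);   // settings must be made before connecting
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("conn", ".txt");
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(fetch(tmp.toUri().toURL()).length + " bytes read");
        Files.delete(tmp);
    }
}
```

The point of the intermediate object is the window between openConnection() and the first read, during which an application can adjust how the request will be made.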