Lesson 13

Sock it to them

This lesson concerns servers and clients and how to get computers to talk to each other via sockets. Every time you use a web browser or start up a telnet client, you are creating a client-side socket, which connects over the Internet to a similar socket on a server. Once connected, you can request stuff from the server, and the server can request stuff from you. You can even create pairs of sockets on the same computer, to get it to talk to itself. In Perl, sockets can be created in two ways: with some blood, sweat and tears (using the perl built-ins socket, connect, accept and so on), or with the IO::Socket::INET module and its friends. In this lesson, we'll look at how to write a very simple server/client pair that does nothing much using IO::Socket::INET. We'll also write a horribly primitive web-browser, i.e. an HTTP client, and a horribly primitive MP3 server.

Client/server pair

The IO::Socket::INET module is object-oriented, so have a brief look at Perl OO if you need a refresher. We'll look at how to code the server first.

#!/usr/bin/perl
use strict;
use IO::Socket::INET;
my $sock = new IO::Socket::INET
(
    LocalHost     => 'localhost',
    LocalPort     => 1200,
    Proto         => 'tcp',
    Listen        => 1,
    Reuse         => 1,
);
die "Can't create socket: $!\n" unless $sock;

This part of the code simply creates a new IO::Socket::INET object. The new method takes a hash of configuration options. We'll take these one at a time.

Servers and hosts

LocalHost specifies the address the server listens on (the server, or 'host'). To run a local server (just for your machine), you'll need to call it localhost. Local clients can then connect to localhost, or equivalently to the address 127.0.0.1, which is the internet protocol (IP) address set aside for the localhost internal loop.

If you're hazy on IP addresses, then writing servers is probably something you shouldn't be doing. Briefly, computers on the internet have IP addresses, which are something like telephone numbers for computers. These look like xxx.xxx.xxx.xxx, where the xxx are numbers. They are difficult to remember. Consequently, the internet is jammed full of 'domain name servers' (DNS) that do nothing more than convert them back and forth from human-speak to computer-speak. Type this at the command prompt:

ping www.perl.com
Pinging www.perl.com [208.201.239.56] with 32 bytes of data:
Reply ... blah

The ping command sees if a server is alive. See that it converts the www.perl.com domain name into a real IP address 208.201.239.56, by interrogating a DNS. Anyway, back to the matter in hand. The LocalHost parameter is just the name of the server.
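You can do the same name-to-number lookup from Perl. This is only a minimal sketch using the core Socket module: the resolve() helper is a made-up name for illustration, and it only handles old-style IPv4 addresses.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw( inet_aton inet_ntoa );

# resolve() is a hypothetical helper: turn a host name (or a dotted quad)
# into a dotted-quad IPv4 address, or undef if the lookup fails.
sub resolve
{
    my ( $name ) = @_;
    my $packed = inet_aton( $name );    # asks the system resolver (hosts file, DNS)
    return defined $packed ? inet_ntoa( $packed ) : undef;
}

for my $name ( 'localhost', '127.0.0.1' )
{
    my $ip = resolve( $name );
    print "$name => ", ( defined $ip ? $ip : 'lookup failed' ), "\n";
}
```

Swap in any host name you like for 'localhost'; what you get back depends on your resolver, so no output is shown here.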

Ports

If IP addresses are like telephone numbers, then ports are like telephone extensions. Servers may run more than one service. A server running a web (HTTP) service may also be running a telnet service (you'd hope not, but it's always possible) for configuring the web-server, and an FTP server for uploading files to serve, etc. To get at these different services, each one is given a port number. So, after connecting to the server itself, you have to ask for the right port. Some ports are very well known: web-servers (hypertext transfer protocol, HTTP, servers) invariably run on port 80, FTP (file transfer protocol) on port 21, and telnet on port 23. Avoid these, and indeed any port number below 1024, unless you really are writing that service (on UNIX you also need to be root to listen on them). All this leads us to the next configuration option, LocalPort, which is the port you want to use. If you're writing an FTP server (don't!), then use port 21, etc. We'll use a random port number > 1023.
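Perl can look up the well-known port for a service name using the built-in getservbyname. A small sketch: port_for() and is_reserved() are hypothetical helper names, and the fallback table (holding just the three ports mentioned above) is only there in case your system has no services database.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fallback values for the three services mentioned in the text.
my %FALLBACK = ( http => 80, ftp => 21, telnet => 23 );

# Return the TCP port number for a service name, or undef if unknown.
sub port_for
{
    my ( $service ) = @_;
    my $port = getservbyname( $service, 'tcp' );    # scalar context: the port number
    $port = $FALLBACK{ lc $service } unless defined $port;
    return $port;
}

# Ports below 1024 are the ones to avoid for your own experiments.
sub is_reserved
{
    my ( $port ) = @_;
    return $port < 1024 ? 1 : 0;
}

for my $service ( qw( http ftp telnet ) )
{
    my $port = port_for( $service );
    print "$service lives on port $port\n" if defined $port;
}
print "Port 1200 is ", ( is_reserved( 1200 ) ? "reserved" : "fine to play with" ), "\n";
```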

Protocols

The next part of the configuration is the protocol: think of this as the language you are using to talk to the server. Communication over the internet is done using the Internet Protocol, and any packets of data you send to a server will start off with a 'header' saying 'I am an internet protocol data packet from xxx.xxx.xxx.xxx to xxx.xxx.xxx.xxx', in computer-ese.

If you peel off the IP header, you will come to the next layer of the packet onion, which tells you whether the packet is being sent using TCP, UDP or ICMP. TCP (the transmission control protocol) is what you use if you want to make sure your packet gets to where it's supposed to arrive (or at least warns you if it gets lost). UDP is the protocol you can use when you don't give a damn if your packet gets lost, and ICMP is used for control and diagnostic messages (ping uses ICMP). We will use TCP, which is generally what you will want to do too, for general purpose socket programming. The TCP header contains data about the port numbers to use, and so on.

If you peel off the TCP/UDP/ICMP header, you will come to the next layer of the onion, which contains the data you actually sent. Servers and clients have to speak the same language, so the data you send back and forth will have to be structured in a particular way. If you talk to an SMTP server (to send email), you have to abide by yet another protocol, the Simple Mail Transfer Protocol. This may sound quite clever, but when you send email, what actually happens is a rather mundane largely plain-text conversation like this:

Server: 220 server.somewhere.com Simple Mail Transfer Service Ready
Client: EHLO client.elsewhere.com
Server: 250 server.somewhere.com Hello client.elsewhere.com
Client: MAIL FROM:<foo@elsewhere.com>
Server: 250 OK
Client: RCPT TO:<bar@somewhere.com>
Server: 250 OK
Client: DATA
Server: 354 Send Data, end with <CRLF>.<CRLF>
Client: Message: blah...
    (some dull stuff involving Content-type: text/plain,
    the message itself, and other such stuff).
Client: .
Server: 250 OK
Client: QUIT
Server: 221 server.somewhere.com Closing transmission

As you can see, there's nothing very magical about any of this. There will be absolutely nothing magical about the server we'll be writing! BTW, the format of the message itself is subject to yet another protocol, as defined in Request For Comments (RFC) 822 (since superseded by RFC 5322). RFCs are the Internet's official repository of standard protocols. You'll find the specs for SMTP in there too (RFC 821, nowadays RFC 5321).
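Because the conversation is plain text, picking a server reply apart takes one regex. A minimal sketch, assuming the usual SMTP reply shape of a three-digit code, then a space (or a hyphen if more lines of the same reply follow), then free text. parse_reply() is a made-up name, not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_reply() is a hypothetical helper: split one SMTP reply line into
# its numeric code, a "more lines follow" flag, and the human-readable text.
sub parse_reply
{
    my ( $line ) = @_;
    $line =~ s/\r?\n\z//;    # strip the line ending
    return unless $line =~ /^(\d{3})([ -]?)(.*)$/;
    return { code => $1, more => ( $2 eq '-' ? 1 : 0 ), text => $3 };
}

for my $line ( "220 server.somewhere.com Simple Mail Transfer Service Ready\r\n",
               "250-server.somewhere.com Hello client.elsewhere.com\r\n",
               "250 OK\r\n" )
{
    my $reply = parse_reply( $line ) or next;
    printf "code %s%s: %s\n",
        $reply->{code}, ( $reply->{more} ? ' (continued)' : '' ), $reply->{text};
}
```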

The rest

The remaining configurations are simple. Reuse sets the SO_REUSEADDR socket option, which lets the port be reused straight away if the socket was closed abnormally (newer versions of the module prefer the spelling ReuseAddr). The Listen key takes the maximum number of pending client connections to keep in a queue while the server deals with the current one. Here we will set it to one. We have now created a working server that will sit about on port 1200 doing nothing. Let's get it to do something useful. Append this to the code from above:

$sock->autoflush(1);
while ( my $sock_handle = $sock->accept() )
{
    print "Accepting connection from ", $sock_handle->peerhost(), ".\n";
    print $sock_handle "Welcome.\n";
    LINE: while ( $_ = <$sock_handle> )
    {
        chomp;
        print "\tClient request: $_\n";
        for ( $_ )
        {
            /^QUIT$/i && do
                {
                    print $sock_handle "Bye!\n";
                    close $sock_handle;
                    last LINE;
                };
            /^HELP$/i && do
                {
                    print $sock_handle "You'll be lucky.\n";
                    next LINE;
                };
            print $sock_handle "Acknowledged $_.\n";
        }
    }
}
close ( $sock );

Now this server turns on autoflushing for the socket (i.e. it prints immediately to the socket without buffering: otherwise you might print less to the socket than perl thinks is worth printing and Things Will Go Wrong). Then it uses the accept method to accept requests for connections from a client. The object returned by accept() possesses a number of methods you can call on it: particularly useful is peerhost(), which returns the client's IP address.

For each client, the server prints a welcome message, then goes into a loop, looking for HELP or QUIT commands, and otherwise just echoing back whatever it just read, with "Acknowledged blah.". Notice that the $sock_handle acts like a bidirectional filehandle, i.e. one you can both read from and write to. Herein lies a whole world of pain, which we'll get to in a minute. By the way, for( $switch ){ /case1/ && do{ blah() }; /case2/ && do { fooey() }; } is the standard perl idiom for a switch statement, unless you want to install the Switch module, which is very nice.

This server doesn't do anything very useful, it's completely insecure (although it doesn't actually do anything bad, like system($_); instead of print $sock_handle "Acknowledged $_.\n";), and it can only handle one connection at a time. Doing stuff is easy: you can code Perl, so the server can do pretty much anything (caveat scriptor). To make it secure, you'd want some sort of password authentication (and you'd probably want to encrypt the data passed between server and client, e.g. with SSL/TLS, so plaintext passwords don't get sent over the network). And look at the fork() code to see how to spin off child processes to deal with multiple requests simultaneously. These are left as exercises for the reader.
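To give a flavour of the fork() approach, here is one possible sketch of a forking version of the server, assuming a UNIX-like system. The names handle_client() and serve(), the optional limit on the number of clients, and the script name forkserver.pl are all inventions for this example. The parent only accepts connections; each child speaks the same tiny protocol as above to one client.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;
use POSIX ":sys_wait_h";    # for WNOHANG

# Talk to one client: welcome, then acknowledge lines until QUIT.
sub handle_client
{
    my ( $client ) = @_;
    $client->autoflush(1);
    print $client "Welcome.\n";
    while ( defined( my $line = <$client> ) )
    {
        $line =~ s/\r?\n\z//;
        if ( $line =~ /^QUIT$/i ) { print $client "Bye!\n"; last; }
        print $client "Acknowledged $line.\n";
    }
    close $client;
}

# Accept connections, forking one child per client. If $max_clients is
# given, stop after that many (handy for testing); otherwise run forever.
sub serve
{
    my ( $listener, $max_clients ) = @_;
    my ( @children, $served );
    $served = 0;
    while ( !defined $max_clients or $served < $max_clients )
    {
        my $client = $listener->accept() or next;
        my $pid = fork();
        die "Can't fork: $!\n" unless defined $pid;
        if ( $pid == 0 )    # child: deal with this client, then exit
        {
            close $listener;
            handle_client( $client );
            exit 0;
        }
        close $client;      # parent: the child owns the connection now
        push @children, $pid;
        $served++;
        # reap any children that have finished, without blocking
        @children = grep { waitpid( $_, WNOHANG ) == 0 } @children;
    }
    waitpid( $_, 0 ) for @children;
    return $served;
}

# Run as "perl forkserver.pl 1200" to serve on port 1200 until interrupted.
if ( @ARGV )
{
    my $listener = IO::Socket::INET->new(
        LocalHost => 'localhost', LocalPort => $ARGV[0],
        Proto     => 'tcp',       Listen    => 5,  ReuseAddr => 1,
    ) or die "Can't create socket: $!\n";
    serve( $listener );
}
```

The parent closes its copy of each client socket straight after forking. If it didn't, the connection would stay half-open even after the child had finished with it.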

Clients

The client is even easier to code up than the server. Here's the code:

#!/usr/bin/perl
use strict;
use IO::Socket::INET;
my $sock = new IO::Socket::INET
(
    PeerAddr => 'localhost',
    PeerPort => 1200,
    Proto => 'tcp',
);
die "Socket could not be created. $!\n" unless $sock;
$sock->autoflush(1);
my $ack = <$sock>;
print "Server says: $ack";
while ( <STDIN> )
{
    chomp;
    print "Sending message $_ to server\n";
    print $sock "$_\n";
    my $msg = <$sock>;
    print "Server says: $msg";
}
close ( $sock );

Again, we create an IO::Socket::INET object and configure it to connect to an address (the PeerAddr option: note that clients use PeerAddr, whereas servers use LocalHost). This address is obviously localhost, and we connect on the same port, using the same protocol, and turn on autoflushing. Now, we know what the server will do when we connect (look at the server code): it will print a welcome message. We must read this message, or everything will go horribly wrong. This is the world of hurt I mentioned earlier. If we don't know the exact format of the protocol we are creating, then we will end up trying to print to the socket when the server is trying to tell us something (or both ends will sit waiting for the other to speak), and we will end up in deadlock.

Clients and servers must have a protocol that tells them when it's their turn to speak.

Our 'protocol' is simply a) the server speaks first, and b) when we write a line (terminated by a newline character), we expect a line back. So when you type something into the client, it sends it to the server, and then waits for an acknowledgement before trying to send anything else. At the other end, the server listens for a line from the client, then prints an acknowledgement, then listens again. Both ends know what the other end should be doing, and deadlock is avoided.
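The turn-taking can be seen in miniature with a pair of connected sockets inside a single script. This is only a sketch, assuming a UNIX-like system (it uses socketpair with AF_UNIX), and the variable names are arbitrary. Each end strictly alternates: one side writes a line, the other reads it before replying.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;
use IO::Handle;    # gives the handles an autoflush method

# Two connected sockets in one process: one plays client, one plays server.
socketpair( my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
    or die "socketpair failed: $!\n";
$client->autoflush(1);
$server->autoflush(1);

print $server "Welcome.\n";                  # a) the server speaks first
my $welcome = <$client>;                     #    ... so the client must read first

print $client "hello\n";                     # b) the client writes one line
my $request = <$server>;                     #    ... which the server reads
chomp $request;
print $server "Acknowledged $request.\n";    #    ... and answers with one line
my $reply = <$client>;                       #    ... which the client reads

print "Client got: $welcome";
print "Client got: $reply";
```

If you delete the line that reads $welcome, the client's first read after sending "hello" picks up the welcome message instead of the acknowledgement, and the two ends are out of step from then on.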

You could write more complicated protocols: part of the SMTP protocol from earlier was "after receiving the DATA command, the server listens until it gets a line containing nothing but a . (i.e. a \r\n followed by a . followed by a \r\n), before it tries printing to the socket again". More exercises for the reader.
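That "listen until a lone dot" rule is easy to code. A sketch, using an in-memory filehandle to stand in for the socket so you can run it without a server. read_message() is a hypothetical name; the removal of a doubled leading dot follows SMTP's convention of escaping message lines that start with a dot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_message() is a hypothetical helper: read lines from $fh until a line
# holding only "." and return them (without line endings) as an array ref.
# Returns undef if the input ends before the terminator arrives.
sub read_message
{
    my ( $fh ) = @_;
    my @lines;
    while ( defined( my $line = <$fh> ) )
    {
        $line =~ s/\r?\n\z//;
        return \@lines if $line eq '.';    # end of message: our turn to speak
        $line =~ s/^\.//;                  # undo the sender's dot-escaping
        push @lines, $line;
    }
    return;
}

# Pretend this string is what arrives over the socket after DATA.
my $wire = "Subject: test\r\n\r\nHello\r\n..hidden dot\r\n.\r\nQUIT\r\n";
open my $fh, '<', \$wire or die "Can't open in-memory handle: $!\n";
binmode $fh;

my $message = read_message( $fh );
print "Got ", scalar @$message, " lines:\n";
print "  [$_]\n" for @$message;
```

Anything after the terminating dot (here the QUIT) is left unread on the handle, ready for the next round of the conversation.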

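If you come to create a real server/client pair, you will need to consider the things mentioned above: a protocol that tells each end when to speak, security (authentication and encryption), and handling more than one client at a time.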

Crappy web browser

OK, let's write a client that connects to a web server and GETs pages (GET is a command in the HyperText Transfer Protocol (HTTP)). We'll leave writing the server to Apache. We'll do this in two stages. First we will create a subclass of HTML::Parser, which is a module that parses HTML (if you don't have it, then ppm/CPAN install it now!). We will call this MyParser.pm. Second, we will create a simple client that GETs a webpage and uses objects of class MyParser to parse it, then displays it.

So, firstly the subclassing (think of this as a refresher for OO Perl: it certainly doesn't have much to do with sockets!):

package MyParser;
use strict;
use HTML::Parser;
our @ISA = qw( HTML::Parser );
sub new
{
    my $proto = shift;
    my $class = ref( $proto ) || $proto;
    my $parser = HTML::Parser->new( api_version => 3 );
    $parser->handler( 'start', '_tag', "self, tagname, '+1' " );
        # handles <tag>
    $parser->handler( 'end', '_tag', "self, tagname, '-1' " );
        # handles </tag>
    $parser->handler( 'text', '_text', "self, dtext" );
        # handles what's between  them
    return bless $parser, $class;
}
sub _tag
{
    my ( $self, $tag, $num ) = @_;
    $self->{inside}->{$tag} += $num;
}
sub _text
{
    my ( $self, $text ) = @_;
    return if $self->{inside}->{script} 
        || $self->{inside}->{style};
    $self->{contents} .= $text;
}
sub contents
{
    my ( $self ) = @_;
    return $self->{contents};
}
sub clear
{
    my ( $self ) = @_;
    $self->{contents} = "";
}
1;

You may need to look at the POD for HTML::Parser to work out exactly what is going on here. When you call the new method of MyParser, MyParser will create an HTML::Parser object. Then it subclasses it, i.e. MyParser adds new functionality to HTML::Parser. It does this by putting HTML::Parser in @ISA, so the MyParser class will inherit all the useful methods (like parse) from HTML::Parser. Then it sets up some handlers for things that the parser comes across (more in a minute). After setting up these handlers, it reblesses the HTML::Parser object into a new class, which will be the class MyParser (unless someone chooses to inherit new from MyParser, in which case, it could be anything!). This new object is then returned.

The handlers themselves are quite self explanatory:

$parser->handler( 'start', '_tag', "self, tagname, '+1'");

This sets up a handler: every time the parser comes across an 'opening' HTML tag, like <p>, it will invoke the private method _tag() on $parser and pass it the arguments in the double quoted string, i.e. self (i.e. the $parser object), the tagname (i.e. 'p') and the number +1. A similar handler deals with 'exiting' tags like </p>, the difference is that it passes -1 rather than +1, for reasons that will become obvious soon. A final handler for plain text (i.e. the stuff between tags) is set up as well. Now, when the parser comes across an 'opening' tag, it will invoke the _tag() method. This method alters the innards of the $parser object: it adds a hash key (remember most objects are hashrefs) called inside, which itself is a reference to another hash with keys that are the name of HTML tags the parser comes across. Hence, if the parser comes across <p>, the handler you have set up will invoke

$parser->_tag( 'p', '+1' )

and this will have the effect of incrementing $parser->{inside}->{p} by 1.

The _text handling method can now tell whether the text blah is in something 'real' like <i>blah</i>, or in junk like <script>blah</script> or <style>blah</style>. Consequently it can add just useful stuff to the contents key of the $parser hashref.
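If the counting trick isn't obvious, here it is stripped of the parser. This sketch feeds a hand-written list of start/end/text events (made up for the example, standing in for what HTML::Parser would report) through the same +1/-1 bookkeeping. collect_text() is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# collect_text() mimics _tag and _text: keep a per-tag counter of how deep
# inside each tag we are, and keep text only outside <script> and <style>.
sub collect_text
{
    my ( @events ) = @_;
    my ( %inside, $contents );
    $contents = '';
    for my $event ( @events )
    {
        my ( $type, $value ) = @$event;
        if    ( $type eq 'start' ) { $inside{$value} += 1; }
        elsif ( $type eq 'end' )   { $inside{$value} -= 1; }
        else
        {
            next if $inside{script} || $inside{style};
            $contents .= $value;
        }
    }
    return $contents;
}

# Roughly what a parser would report for:
# <script>alert(1)</script><p>Hello, <i>world</i></p>
my @events = (
    [ start => 'script' ], [ text => 'alert(1)' ], [ end => 'script' ],
    [ start => 'p' ],      [ text => 'Hello, ' ],
    [ start => 'i' ],      [ text => 'world' ],    [ end => 'i' ],
    [ end   => 'p' ],
);
print collect_text( @events ), "\n";
```

The counter for script goes up to 1 at the opening tag and back to 0 at the closing tag, so the text in between is skipped and everything else is kept.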

Hope this is all clear. The only public methods MyParser has are contents(), which simply returns what the _text method has been squirreling away, and clear(), which clears the stored text.

If you're a good boy/girl you will of course stuff this module full of POD, $VERSION, etc., now. Back to sockets, and the main script:

#!/usr/bin/perl
use strict;
use IO::Socket::INET;
use MyParser;
my $host = shift;
my $http = IO::Socket::INET->new
(
    PeerAddr => $host,
    PeerPort => "http(80)",
    Proto => 'tcp',
);
die "Can't connect to $host: $!\n" unless $http;
$http -> autoflush(1);
print $http "GET /\n\n";
my $page;
while ( <$http> )
{
    $page .= $_;
}
close $http;
my $parser = new MyParser;
$parser->parse( $page );
$parser->eof();    # no more input: makes the parser flush any text it is still holding
my $contents = $parser->contents();
foreach ( split /\n/, $contents )
{
    next if /^\s*$/;    # skip empty lines as well as whitespace-only ones
    s/\s+/ /g;
    print "$_\n";
}

Finally, we use IO::Socket::INET to create a socket object connected to a host shifted from the command line. We want to interrogate the HTTP port of the server, so we connect using PeerPort 80. The IO::Socket::INET module is bright enough to know that HTTP usually lives on port 80, so we can specify the port as 80, 'http' or even 'http(80)'. Then we print "GET /\n\n" to the socket (i.e. GET the root page from the server). This is the most stripped-down request HTTP has ever allowed, with no version number and no headers: it is enough for a simple server, but many modern servers will insist on a full request line such as GET / HTTP/1.0 plus a Host: header, and will then send headers back in front of the page. Then we sit back, read the output from the socket, parse what we get with MyParser, tidy up the whitespace and print it out. Ta-dah: one extremely primitive web browser.
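For servers that want the fuller form, the request and the reply might be handled along these lines. This is only a sketch: build_request() and split_response() are made-up helper names, the host name is a placeholder, and the canned reply stands in for what you would read from the socket, so nothing here touches the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an HTTP/1.0 request with a Host header; lines end in \r\n and a
# blank line finishes the request.
sub build_request
{
    my ( $host, $path ) = @_;
    $path = '/' unless defined $path and length $path;
    return "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
}

# Split a raw reply into its status line, header lines and body. The
# headers end at the first blank line; everything after it is the page.
sub split_response
{
    my ( $raw ) = @_;
    my ( $head, $body ) = split /\r?\n\r?\n/, $raw, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;
    my ( $status, @headers ) = split /\r?\n/, $head;
    return ( $status, \@headers, $body );
}

print build_request( 'www.example.com' );

# A canned reply, standing in for what would come back over the socket.
my $raw = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<p>Hi</p>\n";
my ( $status, $headers, $body ) = split_response( $raw );
print "Status:  $status\n";
print "Header:  $_\n" for @$headers;
print "Body:    $body";
```

In the browser script you would print the result of build_request( $host ) to $http instead of "GET /\n\n", and hand only the body to the parser.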

Of course, this is all a waste of time! Use the LWP modules to hide all the raw details of sockets and other such nastiness if you want a (much better) web browser. Our browser also won't deal with more complex webpages, such as those in XML, for which you will need an XML parser. Still, at least you won't be daunted by creating sockets, servers and clients now.

Crappy mp3 server

And for the other side of the coin, we'll write a primitive MP3 server (mostly cribbed from perlmonks, but with some modifications to make it easier!), which you can use with an MP3 player able to use HTTP to connect to websites (e.g. Windows Media Player). We'll take it a bit at a time, as there are a few new concepts here too:

#!/usr/bin/perl
use strict;
use IO::Socket::INET;
use File::Find;
my ( $sock, $connection );
$SIG{ INT } = sub
{
    # shut the client connection and the listening socket, if they exist
    $connection->close() if defined $connection;
    $sock->close() if defined $sock;
    die "Caught interrupt\n"
};
my $LIBRARY = "D:/Steve/mp3";

The %SIG hash is a special perl variable. It contains references to subroutines (either anonymous sub{}s, as here, or references to real ones):

$SIG{ INT } = \&interrupt_handling_subroutine;

If a Perl script is interrupted (with Control-C, or similar), it will usually just end there and then. However, if you create an INTerrupt SIGnal handler, this will be called before the script exits, allowing you to clean up any outstanding bits and pieces before the script is really laid to rest. Here we use it to shut down the various connections that the script has opened as it ran, by calling close on them. Objects also have a DESTROY method, which is called automatically when any object is destroyed (capital letters usually mean SOMETHING AUTOMAGICAL HAPPENING in Perl). DESTROY is meant to be called by perl rather than by you, which is why the handler asks for an explicit close instead. You can write your own custom DESTROY subroutine when you code object oriented modules, or rely on perl to do it for you, depending on the complexity (and circularity) of your code. A word of warning: don't try to do anything too clever with $SIG{INT}. Keep it minimal: write an error report, clean up connections to external resources, etc. Note that we had to predeclare $sock and $connection, so that the handler in the %SIG hash has access to these lexically scoped variables.
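You can watch a handler fire without reaching for Control-C by having the script send the signal to itself. A small sketch, assuming a UNIX-like system (signals on Windows are a rather different beast). The $cleaned_up flag is just for the demonstration, and this handler deliberately doesn't die, so the script can carry on and report what happened.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $cleaned_up = 0;

# A minimal handler: note what happened and do nothing clever.
$SIG{ INT } = sub
{
    $cleaned_up++;
    print "Caught interrupt, tidying up\n";
};

kill 'INT', $$;    # send SIGINT to ourselves ($$ is our own process id)

# Give perl a moment to run the handler (it does so between operations).
my $tries = 0;
until ( $cleaned_up or $tries++ >= 10 )
{
    select( undef, undef, undef, 0.1 );    # sleep for a tenth of a second
}

print "Handler ran $cleaned_up time(s)\n";
```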

$sock = new IO::Socket::INET
(
    LocalHost => '127.0.0.1',
    LocalPort => 80,
    Proto => 'tcp',
    Listen => 1,
    Reuse => 1,
);
die "Can't create socket: $!\n" unless $sock;
$sock->autoflush(1);

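This should be obvious now. Here we have used the explicit IP address 127.0.0.1 of localhost, and port 80, so that HTTP clients will find the server without being told a port number. Port 80 is one of the low-numbered ports we said to avoid earlier: on UNIX you will need to be root to listen on it, and it must not already be in use by a real web server.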

my @songs;
find( sub { push @songs, "$File::Find::dir/$_" if /\.mp3$/; }, $LIBRARY );

This script uses File::Find to hunt for mp3's in the path defined in $LIBRARY. I usually put constants (or defaults) like this in capitals at the beginning of the script, so that they are obvious, easy to spot, and most importantly, easy to change.
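To try File::Find out without pointing it at a real music collection, this sketch builds a throwaway directory with a few empty files and searches that instead. find_mp3s() is a made-up helper, and unlike the server code it matches the extension case-insensitively and uses $File::Find::name, both of which are choices made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw( tempdir );

# Return the full paths of all .mp3 files at or below $dir, sorted.
sub find_mp3s
{
    my ( $dir ) = @_;
    my @songs;
    # Inside the callback, $_ is the bare file name and
    # $File::Find::name is the full path to it.
    find( sub { push @songs, $File::Find::name if -f $_ and /\.mp3$/i; }, $dir );
    my @sorted = sort @songs;
    return @sorted;
}

# Build a small fake library in a temporary directory.
my $library = tempdir( CLEANUP => 1 );
mkdir "$library/album" or die "Can't mkdir: $!\n";
for my $file ( 'one.mp3', 'notes.txt', 'album/two.MP3' )
{
    open my $fh, '>', "$library/$file" or die "Can't create $file: $!\n";
    close $fh;
}

print "$_\n" for find_mp3s( $library );
```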

CONNECTION: while ( $connection = $sock->accept() )
{
    print "Accepting connection from ", $connection->peerhost(), " .\n";
    print $connection "HTTP/1.0 200 OK\n";
    print $connection "Content-Type: audio/x-mp3stream\n";
    print $connection "Cache-Control: no-cache \n";
    print $connection "Pragma: no-cache \n";
    print $connection "Connection: close \n";
    print $connection "x-audiocast-name: Crappy MP3 Server\n\n";

This server is designed to be accessed by HTTP clients, so this is a big old pile of HTTP headers so the client knows what to expect.

    my $song = $songs[ rand @songs ];
    local *SONG;
    open( SONG, "<", $song ) or die "Can't open $song: $!\n";    # brackets matter: without them 'or' sticks to $song
    print "\tPlaying $song\n";
    binmode( SONG ); #for windows

Here we grab a random song out of the @songs array, and open the file for reading. We've been very good and localised the filehandle (well, the typeglob) with local *SONG. You can't my a bareword filehandle like SONG, so local is the best we can do to scope it (beware: local *SONG localises all variables named SONG, i.e. not only does it prevent you mucking up a filehandle called SONG in the body of the script, it will also prevent you accessing any package variables named $SONG or %SONG from outside the scope too! local is evil. Use with caution). Since perl 5.6 you can sidestep all this by writing open( my $song_fh, "<", $song ), which gives you a properly scoped lexical filehandle.

The binmode function sets the mode of the filehandle to binary (rather than the default 'text' mode). Under UNIX, this does nothing, but under Windows, you really need to do this. The problem is that Windows thinks a newline is actually a carriage return followed by a line feed, \r\n. UNIX thinks a newline is just a line feed, \n, and MacOS<10 thinks it's an \r. This is very confusing. Using binmode for binary filehandles under Windows prevents over-helpful translation of \r\n to \n, which would otherwise mangle your binary files. Hope that's clear: use binmode on Windows if you are opening a binary file.
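A quick way to convince yourself that binmode keeps binary data intact is a round trip: write every possible byte value to a file, read it back, and compare. This sketch uses a temporary file, and the variable names are arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Every byte value from 0 to 255, including \r (13) and \n (10).
my $bytes = join '', map { chr } 0 .. 255;

my ( $out, $filename ) = tempfile( UNLINK => 1 );
binmode( $out );                      # no newline translation on the way out
print {$out} $bytes;
close $out or die "Can't close $filename: $!\n";

open( my $in, '<', $filename ) or die "Can't open $filename: $!\n";
binmode( $in );                       # ... nor on the way back in
my $copy = do { local $/; <$in> };    # slurp the whole file
close $in;

printf "Wrote %d bytes, read back %d bytes: %s\n",
    length( $bytes ), length( $copy ),
    ( $copy eq $bytes ? 'identical' : 'DIFFERENT' );
```

On Windows, try commenting out the two binmode calls and running it again to see the translation at work.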

Now we have opened the file, we are simply going to print it to the socket in 1024 byte chunks, and assume the client at the other end can buffer the data suitably (this is not a perfect solution, but it's easier to understand than streaming data):

    my ( $read_ok, $print_ok ) = ( 1, 1 );
    while ( $read_ok and $print_ok )
    {
        my $chunk;
        $read_ok = read ( SONG, $chunk, 1024 );
        if ( defined $chunk and defined $read_ok )
        {
            $print_ok = print $connection $chunk;
        }
    }
    close SONG;
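    # Always close the connection when we are done (or the client has gone),
    # otherwise the client won't see the end of the stream until the next
    # client connects.
    $connection->close();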
}
$sock->close();
exit( 0 );

The only new thing here is the read function: read takes three parameters, a filehandle, a scalar variable in which to dump stuff read from the filehandle, and a number of bytes to read. So here, we read the MP3 file in 1024 byte chunks from the filehandle SONG, dumping the bits into $chunk as we go. read returns the actual number of bytes read, 0 at the end of the file, and undef if something went wrong. Assuming there is something in $chunk, and nothing went wrong, we then print this $chunk to the socket.
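The same read-a-chunk, print-a-chunk loop can be tried in isolation using in-memory filehandles in place of the MP3 file and the socket. A sketch: copy_in_chunks() is a hypothetical helper, and the 2500 bytes of 'x' are stand-in data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy everything from $in to $out in chunks of $size bytes. Returns the
# number of bytes copied, or undef if a print fails (e.g. the client left).
sub copy_in_chunks
{
    my ( $in, $out, $size ) = @_;
    my $total = 0;
    while ( 1 )
    {
        my $got = read( $in, my $chunk, $size );
        die "Read failed: $!\n" unless defined $got;    # undef: something went wrong
        last if $got == 0;                              # 0: end of file
        my $print_ok = print {$out} $chunk;
        return unless $print_ok;
        $total += $got;
    }
    return $total;
}

my $source = 'x' x 2500;    # stand-in for the contents of an MP3 file
my $sink   = '';            # stand-in for the socket
open( my $in,  '<', \$source ) or die "Can't open source: $!\n";
open( my $out, '>', \$sink )   or die "Can't open sink: $!\n";
binmode( $in );
binmode( $out );

my $copied = copy_in_chunks( $in, $out, 1024 );
close $in;
close $out;
print "Copied $copied bytes\n";
```

With a 1024 byte chunk size, 2500 bytes arrive as two full chunks and one short one: read simply returns fewer bytes than you asked for when it nears the end of the file.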

You should now be able to run this server, and connect to it with any HTTP aware mp3 client. In Windows Media Player, it's as simple as File…Open URL…http://127.0.0.1/

And I really think that's enough now!
