Classy
You are now an accomplished Perl programmer, and could happily hack away all day writing throwaway scripts. However, sooner or later, you will come to realise that there's more to life than little throwaway scripts. Suddenly, you find that you are copying and pasting reams of code from old scripts into new scripts. Then you find a big old bug in the code you've merrily pasted into forty scripts, and spend a day finding all occurrences of the bug to fix them.
There is another way.
If you ever use the same bit of code two or more times in a single script, you should put it in a subroutine. That way you only have to worry about debugging it once. Likewise, if you ever find yourself using the same subroutine or snippet of code in more than one script, bung it into a module.
Modules are Perl's not-so-secret weapon. If you've not been to CPAN yet, go there now. It's always a good idea (essential, I would argue), to have a look on CPAN before you start any significant project, as the chances are, someone else will have been there before you, written the code, worried about it, debugged it, put fifteen bells and twelve whistles onto it, and released it for all and sundry to use. Don't reinvent the wheel if you don't have to! (Although sometimes it's worth half-reinventing the wheel to prove you can do it for your own satisfaction). So, let's find out how to write a module, which we will imaginatively title MyModule.
There's a lot of things you can mess up if you're writing a module
from scratch, so the best way to do it, even for 'personal' modules you
have no intention of unleashing on the world, is to use a utility called
h2xs. Change to a directory you don't mind creating a
directory called MyModule in, and type:
h2xs -AXn MyModule
at the command prompt. The A and X switches
tell h2xs to make a vanilla module, with no autoloader and no weird-ass XS C-extension. The n switch
tells h2xs the name of your module. If all goes well, you will now have a
directory called MyModule containing the files:
Changes Makefile.PL MANIFEST MyModule.pm README test.pl
You needn't worry about the following unless you actually plan on
unleashing your module on the world:
- Changes (changes since your previous release, i.e. none)
- Makefile.PL (a thing for helping compile your module, only needed if you plan to distribute the module)
- MANIFEST (a list of the files in the distribution, ditto)
- README (what the module does, ditto)
- test.pl (which tests to ensure the module works, which you can write later if you like). This will be replaced by a 1.t in a directory called t under newer perls.
The meat of the module distribution is MyModule.pm
(pm is 'perl module'), which will contain a template
something along the lines of (commenty bits removed):
package MyModule;
use 5.008;
use strict;
use warnings;
require Exporter;
our @ISA = qw(Exporter);
our %EXPORT_TAGS = ( 'all' => [ qw( ) ] );
our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
our @EXPORT = qw( );
our $VERSION = '0.01';
# Preloaded methods go here.
1;
__END__
=head1 NAME

MyModule - Perl extension for blah blah blah

=head1 SYNOPSIS

  use MyModule;

=head1 DESCRIPTION

Stub documentation for MyModule, created by h2xs.

=head2 EXPORT

None by default.

=head1 AUTHOR

A. U. Thor, E<lt>a.u.thor@a.galaxy.far.far.awayE<gt>

=head1 SEE ALSO

L<perl>.

=cut
Let's take this a bit at a time, so we can find out how to edit this
template to do our bidding. Later on, I would recommend either subverting
the output of h2xs, writing your own boilerplate from
scratch, or using Module::Starter, but for the moment, we'll
look at the dirty details of creating modules without the syntactic
sugar…
package MyModule;
The first thing that should be at
the top of any module is a package statement. If you
read the bit on symbol tables, you
may have a vague idea about what a package is. A
package (or name-space) is a way of letting you use the same
names for variables and subroutines in different parts of a program. For
example:
#!/usr/bin/perl
package Foo;
$e = "hello";
print "In package Foo, \$e is $e\n";
package Bar;
$e = "goodbye";
print "But in package Bar, \$e is $e\n";
print "You can still see \$e in package Foo if you fully qualify it...\n";
print "\$Foo::e is still $Foo::e\n";
This prints:
In package Foo, $e is hello
But in package Bar, $e is goodbye
You can still see $e in package Foo if you fully qualify it...
$Foo::e is still hello
In the same way that a command shell will assume you mean the
file.ext in the current working directory, perl assumes you
mean the variable called $e in the current
package. The reason you've not seen the word
package at the top of every script so far is that perl
automatically assumes you are working in package main;
unless you tell it otherwise explicitly. Think of main as
your home package if you like. If you want to fiddle with things from
other packages, you'll need to 'fully qualify' their names
with :: double colons, which are similar to the
/ delimiter in the shell. If you think of
package like chdir, and :: as
/, all will become clear. So the package variable:
$e
in package Foo is called:
$Foo::e
and the subroutine:
function()
in package Foo::Parp is called:
&Foo::Parp::function()
if you have to fully qualify them. Note in the second case that you
can have subpackages (of a sort) with more than one ::
double colon. The reason we create modules in new packages is that if we
wrote this:
# my module
$x = "blah";

# my script
$x = "bobble";
then when we used the module, our script would overwrite the module's
definition of $x, because they would share the same
namespace. When you create modules, you create a new namespace where you
can make and manipulate variables to your heart's content without having
to worry about trashing other people's variables and subroutines of the
same name in other packages. (Note that lexical variables
don't suffer from this problem, which is another reason to use
strict;).
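The same separation applies to subroutines. Here is a minimal sketch, assuming two made-up packages (Foo and Foo::Parp) that each define their own function(); neither tramples the other, and from package main both have to be fully qualified:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two hypothetical packages, each with its own subroutine called function().
package Foo;
sub function { return "Foo's function" }

package Foo::Parp;
sub function { return "Foo::Parp's function" }

# Back home in package main: nothing called function() lives here,
# so both versions have to be fully qualified.
package main;

print Foo::function(), "\n";
print Foo::Parp::function(), "\n";
```

Note that Foo::Parp is not really 'inside' Foo as far as perl is concerned: the two are simply separate namespaces whose names happen to share a prefix.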
That's pretty largely all there is to packages. You can
define several in one file, or spread one over several files, but the
'natural' size of one package is one file. If you create a
file called MyModule.pm, and let it contain the
package MyModule, then perl will be very happy, and the rest
of this tutorial will be nice and easy. Otherwise you're on your own!
Next up:
use 5.008;
use strict;
use warnings;
For the sake of paranoia, use 5.008; means
'die if the version of perl you're running is less than
5.008'. This may be important if you're using something new for perl,
like Unicode support, that old versions of perl don't support. use
strict; is something you ought to have been doing for a while now,
and use warnings; is the newer and better way of saying
-w that we've been using for a while now. Incidentally, if
you hadn't realised, every time you've written use strict;
or use ANYTHING; at the top of a script, you've been using
other people's modules. Modules written with lowercase names like
strict are often called pragmata or pragma modules:
they generally affect how perl deals with your script itself, rather than
giving you extra functionality.
require Exporter;
Now we get into the nitty gritty. For the moment, we'll look at the
require keyword: Exporter is just a perl module
that exports things (like subroutines) from one package to
another. require is very similar to use in that
it loads in the contents of a module, so that you have access to its
functions from your scripts.
The difference between require and use is
that require doesn't import any functions into your
package, and it does what it does at run-time rather
than compile-time.
Qué?! Well, if you were writing a script (which by default would
define itself in package main), and you wanted to use the
function parse() from package MyModule, you
have two ways of doing it.
You can require MyModule; and then call
the function with 'fully qualified' names (the :: double
colon syntax):
# we're in package main if we don't say we're not
require MyModule;
my ( @parsed ) = MyModule::parse( @things_to_parse );
Alternatively, you can use MyModule;
which (if suitably set up) will export the function
parse() from package MyModule into
package main (or wherever you're working), so you can use it
more easily:
# use exports the functions from package MyModule to package main
use MyModule;
my ( @parsed ) = parse( @things_to_parse );
No need to fully qualify the function name. When you require
Exporter; you are asking perl to read in the Exporter
module, but not to import any functions from it. As we don't actually
want to import functions from the Exporter module, we
require, not use it. The other thing about
use is that it does its thing at compile-time, rather than
run-time: this means that when your script is compiled by perl, it will
check to see if you have all the requisite modules before executing
anything, and if you don't have them all, it will die.
require doesn't do this compile-time checking.
You may be able to guess therefore, that use MyModule; is
exactly equivalent to:
BEGIN { require MyModule; MyModule->import(); }
BEGIN{} is a special block that is automagically run
by perl as soon as it has been compiled: it makes things happen while
perl is still compiling the rest of the script (END{} is similar: it's executed just
before your program ends). This effects the compile-time checking.
import() is just a subroutine in the
MyModule.pm file that tells perl which functions to import
into the caller's namespace (i.e. the package,
probably main, that the script use-ing the
module is working in). This effects the function importing.
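To see the compile-time business in action, here is a small sketch that spells out by hand what use does. It borrows List::Util, a core module, purely as a stand-in for MyModule, and the numbered messages are only there to show the order in which things run:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# This line comes first in the file, but it is an ordinary run-time
# statement, so it has to wait until the whole script has been compiled.
print "1. First in the file, but printed second\n";

# A BEGIN block runs as soon as perl has finished compiling it,
# i.e. before any of the ordinary statements in the script.
BEGIN {
    print "2. Second in the file, but printed first\n";
    require List::Util;             # load the module
    List::Util->import( "sum" );    # ask it to export sum() into package main
}

# sum() can now be called unqualified, just as if we had written
# use List::Util "sum";
my $total = sum( 1, 2, 3 );
print "Total is $total\n";
```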
Now it's all very well saying perl will import functions from one package to another, but where does perl look for these packages in the first place? Well, when you create a perl module, you need to save it somewhere perl can find it. Use this:
#!/usr/bin/perl
print "$_\n" foreach @INC;
to list the places in your computer's filesystem that perl will search
for modules in. @INC is like the PATH
environment variable for perl. On perls older than 5.26 you'll notice that ".", the
current working directory (CWD), is one of the places on the list; newer perls
leave it out for security reasons, so there you will need to add it yourself
with use lib "."; (or the -I. switch). So if the CWD is in @INC and
you put MyModule.pm in your CWD, it will be found and used
by perl when a script says use MyModule;. What about that
Package::Subpackage business? If you create a directory
called MyModule in the CWD (say D:/Steve/),
then create a file called Subpackage.pm in it, perl would look for
the package MyModule::Subpackage in
D:/Steve/MyModule/Subpackage.pm. See what I mean about
:: being like the path delimiter / ?
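The mapping from package name to file name can be sketched in a couple of lines. The helper module_path() below is made up for illustration (perl does the equivalent internally); %INC, on the other hand, is a real built-in hash in which perl records where it found each module it has loaded:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a package name into the relative path perl searches for under
# each directory in @INC. module_path() is a hypothetical helper for
# this sketch, not something perl provides.
sub module_path {
    my ( $package ) = @_;
    ( my $path = $package ) =~ s{::}{/}g;    # :: becomes the path delimiter
    return "$path.pm";
}

print module_path( "MyModule::Subpackage" ), "\n";    # MyModule/Subpackage.pm

# Once a module has been loaded, %INC records where perl found it,
# keyed by that same relative path.
require File::Find;
my $key = module_path( "File::Find" );
print "File::Find was loaded from $INC{$key}\n";
```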
So now you know how to go about writing a module: you simply need to
write some functions, and write a subroutine called import
that exports these functions from one package to another. The latter is a
simple matter of setting a typeglob in the caller's symbol table to a
reference to the subroutine you wish to export.
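For the curious, a hand-rolled import might look something like the sketch below. It is deliberately bare-bones (it always exports parse() and ignores any arguments), the body of parse() is just a placeholder, and the module is kept in the same file as the script so the example is self-contained:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package MyModule;

# The function we would like scripts to be able to call as plain parse().
# What it does (uppercasing its arguments) is only a placeholder.
sub parse { return map { uc $_ } @_ }

# A bare-bones import(): it is called as MyModule->import, so the class
# name arrives first in @_.
sub import {
    my $class  = shift;
    my $caller = caller;    # the package that asked for the import

    no strict 'refs';       # symbolic references are needed to reach symbol tables
    *{ "${caller}::parse" } = \&{ "${class}::parse" };
}

package main;

# With MyModule in a file of its own, use MyModule; would do this for us.
BEGIN { MyModule->import }

my @parsed = parse( "some", "things" );
print "@parsed\n";    # SOME THINGS
```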
Erm, yeah. In fact, almost no-one rolls their own import
function. Almost everyone just borrows the one in Exporter,
which is what:
our @ISA = qw( Exporter );
is for. @ISA (that's @rray 'is a') is where you can put
the names of modules that you want perl to search in, to find functions
you can't be bothered to define. So, if you can't be bothered to
define import() yourself, you can tell perl to look for this
function in Exporter.pm instead, hence:
our @ISA = qw( Exporter );
# MyModule IS A Exporter, and inherits functions
# I can't be bothered to define from it
So now, when a script use-s MyModule, it
will use the import() method from the Exporter
module to furnish the script with whatever functions you chose to export
from MyModule.pm. Hope this is all clear!
Well, perhaps not entirely clear, if you're
wondering what our does. As you may have
guessed, our is related to my. When you
use strict; all variables have to be nailed down to a
particular lexical scope with my, which keeps them out of
the symbol table altogether, making them inaccessible from other scopes and
packages. If they're not nailed down, perl will barf. This you know.
However, what happens if you do want someone to be able to see the value
of a variable in your module? For example, in the module File::Find, the variable
$dir ($File::Find::dir in full) contains the current directory being processed, which
is a useful bit of information for scripts using the module. But if you
make $dir a lexically scoped my variable, it
will be invisible outside of the scope in which it is created. For
modules, this means invisible outside of the module itself.
Oops.
This is what our is for. our
explicitly allows you to share nasty global variables, which is
exactly what strict doesn't like. our allows
you to circumvent strict for variables you really
do want to be accessible from anywhere using the
$Package::variable or @MyModule::ISA notation.
Since @ISA needs to be visible outside the scope in which it
is defined (Exporter uses it), we must our it,
not my it.
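A quick sketch of the difference, assuming a toy MyModule squeezed into the same file as the script (the variable names $shared and $private are invented for the example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package MyModule;

    our $shared  = "visible from outside";          # lives in the symbol table
    my  $private = "only visible in this block";    # lexical: no symbol table entry
}

# The our variable can be reached with its fully qualified name...
print "\$MyModule::shared is '$MyModule::shared'\n";

# ...but the my variable cannot: $MyModule::private names a different,
# never-assigned package variable, so it is undefined.
{
    no warnings 'once';
    print "\$MyModule::private is ",
        defined $MyModule::private ? "defined" : "undefined", "\n";
}
```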
That's the worst bit over. The rest of it is just prettification of
the interface. The next lines of MyModule tell the
Exporter module which functions to export from the module if
someone use-s it.
our %EXPORT_TAGS = ( 'all' => [ qw( ) ] );
our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
our @EXPORT = qw( );
our $VERSION = '0.01';
$VERSION is obvious. Like use 5.008; you can
also use MyModule 0.02; This makes perl die if
the version of MyModule you have is older than the version
you want to use.
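Behind the scenes, use MyModule 0.02; asks the module to check its own version by calling the VERSION method that every class inherits from UNIVERSAL. The sketch below makes that call by hand, inside an eval so that the script survives to tell the tale; the in-file MyModule is a stand-in for a real module file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package MyModule;
    our $VERSION = '0.01';
}

# Calling VERSION with a number dies if $MyModule::VERSION is too old.
# Wrapping each call in eval lets us see the complaint without the
# whole script keeling over.
my $ok_old = eval { MyModule->VERSION( '0.01' ); 1 };
my $ok_new = eval { MyModule->VERSION( '0.02' ); 1 };
my $error  = $@;

print "Asking for 0.01: ", ( $ok_old ? "fine" : "dies" ), "\n";
print "Asking for 0.02: ", ( $ok_new ? "fine" : "dies" ), "\n";
print "Perl said: $error" unless $ok_new;
```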
@EXPORT is the easiest way of exporting functions. If
your module contained three functions sublime(),
boil() and melt(), and you wanted to export all
of them to the caller's namespace:
our @EXPORT = qw( sublime boil melt );
would do just that. However, people usually prefer to selectively
import functions, and the use of @EXPORT is discouraged
unless your module is just one or two functions (like
File::Find or File::Path). This is what
@EXPORT_OK is for. Ignore the @{ $EXPORT_TAGS{'all'}
} bit for the minute. If you wanted people to be able to import
these three functions selectively, you could do this:
our @EXPORT_OK = qw( sublime boil melt );
Then users of your module could:
use MyModule "sublime", "boil";
# or
use MyModule qw( sublime boil ); # avoid all those quotes
if they had no interest in importing the melt() function
and polluting their namespace.
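You can watch selective importing at work with a module you already have. List::Util is a core module that exports nothing unless asked, so it makes a handy guinea pig in this sketch:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Ask for max() by name; nothing else gets imported.
use List::Util qw( max );

print "Biggest is ", max( 3, 17, 5 ), "\n";

# min() was not requested, so package main has no such subroutine...
print "Can main min? ", ( main->can( "min" ) ? "yes" : "no" ), "\n";

# ...although it is still there under its fully qualified name.
print "Smallest is ", List::Util::min( 3, 17, 5 ), "\n";
```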
Finally, the %EXPORT_TAGS is very useful: it allows you
to define groups of functions to export (see
CGI for an example). Say you want people to be able to
import your three functions as a lump without having to go to all the
trouble of writing three whole things:
use MyModule qw( sublime boil melt );
you can create an export tag called all, which contains
all three functions. %EXPORT_TAGS is just a hash of
key/value pairs. The keys are the names of the tags you want
to define, and the values are an arrayref of the functions
you want to dump in the tag:
our %EXPORT_TAGS = ( 'all' => [ qw( sublime boil melt ) ] );
# or
our %EXPORT_TAGS = ( 'all' => [ "sublime", "boil", "melt" ] );
With this defined, you can:
use MyModule qw( :all );
and Exporter will conveniently translate the tag
:all into the list of three functions you have defined with
the all key in the %EXPORT_TAGS hash. If you do
define an :all tag, which is probably good practice, you can
then use it in @EXPORT_OK:
our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
That is, it's OK to export all the functions listed under the
all key of %EXPORT_TAGS. Note the
@{ } dereferencing syntax from the last lesson.
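Putting the export machinery together, here is a sketch you can run as a single file. Two things in it are assumptions made purely to keep it self-contained: the module sits in a BEGIN block rather than in its own MyModule.pm, and the entry in %INC is there to convince perl that the file has already been loaded. The bodies of the three functions are invented too.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# In real life all of this would live in MyModule.pm.
BEGIN {
    package MyModule;
    require Exporter;
    our @ISA         = qw( Exporter );
    our %EXPORT_TAGS = ( 'all' => [ qw( sublime boil melt ) ] );
    our @EXPORT_OK   = ( @{ $EXPORT_TAGS{'all'} } );

    sub sublime { return "solid to gas" }
    sub boil    { return "liquid to gas" }
    sub melt    { return "solid to liquid" }

    $INC{'MyModule.pm'} = __FILE__;    # pretend MyModule.pm has been loaded
}

use MyModule qw( :all );    # Exporter expands :all to sublime, boil, melt

print "sublime: ", sublime(), "\n";
print "boil:    ", boil(),    "\n";
print "melt:    ", melt(),    "\n";
```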
Finally, after all the package, exportation and global
variables nonsense, we finally get onto the beef:
# Preloaded methods go here.
1;
__END__
This bit is just a perl program. Go write it in the space #
Preloaded methods go here. Mostly, you'll only be defining
subroutines here, since these are what you usually want to export. The
1; is needed because all modules have to return TRUE when
they load: this ensures they do. The __END__ token is a
signal to perl to stop reading, since after this comes the documentation
for the module, and this is of interest only to perldoc, not
to perl itself.
Perldocumenting yourself
=head1 NAME

MyModule - Perl extension for blah blah blah

=head1 SYNOPSIS

  use MyModule;

=head1 DESCRIPTION

Stub documentation for MyModule, created by h2xs.

=head2 EXPORT

None by default.

=head1 AUTHOR

A. U. Thor, E<lt>a.u.thor@a.galaxy.far.far.awayE<gt>

=head1 SEE ALSO

L<perl>.

=cut
Perl documentation is written in POD (plain old documentation) format,
which is a markup language like HTML, but simpler. perldoc
can read and display the POD embedded in a module, which makes it the
perfect tool for documenting your module so you don't forget how it
works, and so others can use it without getting up close and personal
with the source code. Things starting = are processing
directives. I think you can guess what head1 and
head2 do. =cut is the signal for the end of the
POD. Some other useful directives are:
=over 4
and
=back
=over indents the text by some amount (here 4 spaces),
and =back restores the indent to 0. You'll notice that if
you want a newline in your POD, you need a blank line: POD is otherwise
newline-insensitive.
=item * function()
is used to create itemised lists, with a pretty * as a
bullet point. Like HTML, POD uses angle brackets to mark up certain bits
of text, but unlike HTML/XML (with its <open-tag>
</close-tag> syntax), the thing you want to italicise, or
whatever, goes inside the brackets:
I<text>
will put text in italics. B<text> does
bold, C<blah> does code, L<foobar>
does links (here L<perl> links to the perl manpages),
and E<> does escapes like E<lt> and
E<gt> for < and >.
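Here is a small sketch that uses most of these directives in one go. The module name Phase and its melt() function are made up for the occasion; the point is the blank lines around each directive and the way the code carries on after =cut:

```perl
#!/usr/bin/perl
use strict;
use warnings;

=head1 NAME

Phase - a made-up module, used only to show off some POD

=head1 FUNCTIONS

=over 4

=item * melt( $substance )

Returns a string describing the I<melted> substance. B<Does not> check
that C<$substance> is something you could actually melt. See
L<perlpod> for the full list of directives.

=back

=cut

# perl skips everything from the first =directive to =cut, so the
# code carries on here as if the documentation were not there.
sub melt {
    my ( $substance ) = @_;
    return "a puddle of $substance";
}

print melt( "ice" ), "\n";
```

Save it to a file and run perldoc on that file to see how the markup is rendered.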
Documenting your code is essential if you want people to use it: don't
fall into the trap of assuming a) everyone's stupid and you're going to
let them wallow in it or b) everyone will know how to use your code by
osmosing it in. Documentation is extremely important: if you have a
memory like mine, you won't remember how to use your own scripts in six
months' time, so write the documentation now, so you don't have to
remember the entire script later. Nuff rant. The easiest way to learn POD
documentation is to use perldoc to read some prettily
formatted documentation, then look at the module itself to see what it looks like in
code. It's not very difficult. Just do it.
My first module
The hello world module. I think this should all be very obvious
(srand seeds perl's random number generator,
rand(NUMBER) generates a random number between 0 and NUMBER,
and ||= is an assignment operator for ||, which
is a perl idiom for 'default': A ||= B is the shorthand for
A = A || B, which means 'A equals B unless A is already true,
i.e. something other than 0, the empty string or undef'):
package Hello;
use 5.006;
use strict;
use warnings;
require Exporter;
our @ISA = qw( Exporter );
#no need for export tags or for export_ok in a single function module
our @EXPORT = qw( hello );
our $VERSION = '0.01';
srand;
sub hello
{
my $name = shift;
$name ||= "you";
my $message = rand(1) > 0.5 ? "a waste of time" : "a lot of fun";
return "Hello, $name, isn't this $message?\n";
}
1; # Magical TRUE value that all modules must return when they are loaded
__END__
=head1 NAME

Hello - Perl extension for printing a stupid message

=head1 SYNOPSIS

  use Hello;
  $msg = hello( "Steve" );
  print $msg;

=head1 DESCRIPTION

Stub documentation for Hello, created by h2xs. It looks like the author
of the module took careful note of the importance of documentation,
and here it is:

=head2 EXPORTED FUNCTIONS

=over 4

=item * hello( $arg )

Randomly returns one of two stupid messages for $arg, which should be a name,
but will default to 'you'.

=back

=head1 AUTHOR

Steve Cook, E<lt>steve@steve.gb.comE<gt>

=head1 SEE ALSO

L<perl>.

=cut
Then all we need to do is save the module in the root of one of the
directories in @INC (i.e. the CWD, or similar)
and:
#!/usr/bin/perl
use strict;
use warnings;
use Hello;
print hello( "Perl novice" );
Object orientation
Well, that's how to write a perl module that exports some functions that others might find useful. What about you Java programmers who just have to encapsulate everything into an object? For those who have no idea what an object is, think of Windows, or the Gnome desktop: object oriented programming (OOP) doesn't really have anything to do with graphical user interfaces, but they are similar in that they abstract the implementation from the interface: it doesn't matter how icky the goo of code and data under the bonnet is, all you get to see is the shiny buttons and pretty output.
Some definitions: an object is a thingy (in perl, objects generally are thingies, i.e. a gelatinous mass of references), containing data which has some associated methods, which do something to the data when you call them. Object oriented programming has a lot of pretentious terminology, so keep your eye out for highfalutin words for simple ideas. The main idea of OOP is to keep data and the functions that manipulate that data together in an otherwise opaque object. Simple as that.
In object oriented programming, everything starts by creating an
object of a particular class, usually
with the new method:
my $cat = Cat->new(); # create new object $cat of class Cat
and continues by making the object do things to itself, such as with a
method called feed:
$cat->feed( "Mechanically recovered meat sludge" );
# invoke method feed on object $cat
If you were writing this with 'normal' non-OO perl, you might create a
hashref called $cat:
$cat = { stomach => "empty" };
and write a function called feed():
sub feed
{
my ( $cat, $food ) = @_;
$cat->{ stomach } = $food;
# better start getting used to these reference thingies
}
so you could call:
feed( $cat, "Mechanically recovered meat sludge" );
to feed the cat. However, in OOP, the data and the functions (methods)
are incestuously tied up with each other. This is bad (as it makes OO
programs chunkier and slower) and good (because it hides all the
implementation under the bonnet, and keeps the data within the object,
rather than cluttering up your program with lots of variables). Although
the non-OO program above with $cat and sub feed
works fine, you have to worry about the $cat, what its keys
and values are, what the return values of sub feed are, and
the fact that cat is a hashref (not an arrayref), and so on. And so would
anyone else trying to write new functions for the cat such as
worm and spay. In OOP, the object is the centre
of all data and manipulations thereof. OO encapsulates all the details of
what is going on, so the user doesn't have to see the code's innards, and
presents them in a black box with big shiny buttons called methods. All
you need to know is which buttons to press (see the Windows analogy); you
need know nothing about what is going on inside.
As a user of the code anyway. If you want to write the code, you'll have to know the guts intimately. Objects are implemented by simple modules in perl 5. In fact, let's dump the terminology for a minute. In Perl:
- A class is just a module.
- An object is just a reference or similar thingy.
- A method is just a subroutine.
Classes are actually easier to write than vanilla modules at first.
Here is the start of an OO Cat module:
package Cat;
use 5.008;
use strict;
use warnings;
our $VERSION = '0.01';
our @ISA = ();
#We'll fill in the gaps here presently
1;
__END__
There's no need to worry about exporting functions, as the whole point
of objects is that objects look after their own functions (methods)
themselves. Hence, no @EXPORT, etc.
@ISA takes on a special importance in OO programming. As we
said earlier, @ISA contains places to look if you can't find
a function in the module itself. In OO programming, looking somewhere
else is called inheriting methods. We'll cover this
presently.
Now, as you may have gathered, Cat is actually a
class, not an object. An object is a
particular instance of a class: the object Steve Cook is
a particular instance of class Human, perhaps. The class (module)
provides the code to generate new objects, hence every class needs
something to make new objects with, a 'class
method' called a constructor that
instantiates new objects. In perl, you can call this
method anything you like, but it's best to stick with common parlance and
call it new like everyone else:
sub new
{
my $class = shift;
my $self = { stomach => "empty" }; # lovely hashrefs
bless $self, $class;
return $self;
}
This method can be called in two equivalent ways in a script:
use Cat;
my $mr_tibbles = new Cat;
my $mrs_tibbles = Cat->new();
I prefer the latter (the former can lead to some nasty syntactic
ambiguities). The new method is just a subroutine, a factory
for making objects of class Cat. When you create an OO
module, you need to be aware of one extremely important fact: the
name of the class, or the object you call a method on, is the first thing
in the @_ of the subroutine that implements it.
So:
Cat->new();
will do something along the lines of calling the function new(
"Cat" ); in package Cat. This seems fairly obvious,
but wait till we get to 'object methods'. So the
new() method we wrote gets "Cat" when it is called and it
shifts this into $class. So it will know what
sort of an object it should make. As an aside, don't be tempted to
hardcode the class, as in:
$class = "Cat";
because this will break should anyone want to make a 'subclass' out of your
class: if someone wants to implement a class called
Tabby and inherit your new()
constructor, the hardcoded new() will merrily make objects
of the wrong class (i.e. Cat, not
Tabby). This is a Bad Thing.
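A sketch of the problem and its cure. HardcodedCat and HardcodedTabby are invented names for the broken pair; the subclasses are set up with @ISA, which is explained properly further on, and ref() is used to ask each object what class it belongs to:

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    # The constructor ignores the class it was called on...
    package HardcodedCat;
    sub new { my $self = {}; bless $self, "HardcodedCat"; return $self }

    package HardcodedTabby;
    our @ISA = qw( HardcodedCat );
}
{
    # ...whereas this one blesses into whatever class it is handed.
    package Cat;
    sub new { my $class = shift; my $self = {}; bless $self, $class; return $self }

    package Tabby;
    our @ISA = qw( Cat );
}

my $wrong = HardcodedTabby->new();
my $right = Tabby->new();

print "HardcodedTabby->new() made a ", ref( $wrong ), "\n";    # HardcodedCat
print "Tabby->new() made a ",          ref( $right ), "\n";    # Tabby
```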
Next, the constructor creates the data the object needs. This is
conventionally called $self, but doesn't have to be. This is
conventionally a hashref, but doesn't have to be. TIMTOWTDI. Here, we are
implementing the Cat as a hash. Almost. Perl objects are
scalars, so we'll actually use an anonymous hashref. In this we put our 'stomach'
stuff. Then comes the important bit. We know our class. We have our data.
We need to glue these together to form an object. bless does
this:
bless $self, $class;
makes the data in $self an instance of class
$class. And it returns this blessed hashref, to be captured
by our user's script in $mr_tibbles. That is all there is to
constructing an object: in fact, all a constructor need really do is:
sub new { bless {}, $_[0] } # perl's smallest constructor
Now, if you wanted to see what $mr_tibbles actually looks
like on the inside, you can investigate him using the dereferencing
operator ->, so:
$contents = $mr_tibbles->{ "stomach" };
will get you 'empty'. To be really clever:
use Data::Dumper;
print Dumper( $mr_tibbles );
will spray $mr_tibbles's guts out all over the screen.
However, such direct dissection is generally considered extremely bad OO
form. The only way to investigate $mr_tibbles should be
via the object methods (big shiny buttons ahoy)
that we can call on him:
sub feed
{
    my ( $self, $food ) = @_;
    $self->{ stomach } = $food if defined $food;
    return $self->{ stomach };
}
feed() is such a method. You call the method with a
-> (which is the same as . for most OO
languages, and due to mutate in Perl
6):
$mr_tibbles->feed( "Mechanically recovered meat sludge" );
The -> here is being used not to dereference
a reference, but to call a method on $mr_tibbles. This dual
use for -> confused the life out of me at first, but if
you're careful to note the brackets, you'll be OK:
$thing->{ key };
# hashref dereference, note the {}
$thing->[ index ];
# arrayref dereference, note the []
$thing->( args );
# coderef dereference, note the ()
$thing->method( args );
# method call on object $thing, optional arguments in ()
Now, remember what I said: the object ($mr_tibbles) you
call a method on is the first thing passed in @_. So to the
method feed, @_ is ( $mr_tibbles,
"Mechanically recovered meat sludge" ). These are assigned to
$self and $food respectively. Then, if the
$food is defined, it's put into $mr_tibbles's
stomach with $self->{ stomach } = $food;
If no food is passed:
$contents = $mr_tibbles->feed();
does nothing to $mr_tibbles: the stomach contents are
unchanged. However, because the method always finishes with
return $self->{ stomach }; it
can both alter (mutate) $mr_tibbles's
stomach contents and just report (access) what he's
eaten. How useful.
The ref operator will usually return what a reference
refers to (ARRAY, SCALAR, HASH, etc.), as you know. However, if
we call it on an object, it will return the class the object belongs to.
So:
ref ( $mr_tibbles );
will return Cat. This can be useful for debugging. We will now add some more object methods:
sub hairball
{
my ( $self ) = @_;
my $vomit = $self->feed();
$self->feed( "empty" );
return $vomit;
}
This demonstrates that you could (and probably should) use methods even within the class. You could've written:
sub hairball
{
my ( $self ) = @_;
my $vomit = $self->{ stomach };
$self->{ stomach } = "empty";
return $vomit;
}
and manipulated the cat's innards directly, but using the first version protects you from your own changes to your own code: let your methods do everything for you and it will save you a lot of grief when you decide to rearrange the innards of the cat later.
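Here is the whole Cat so far gathered into one runnable sketch, with the class squeezed into the same file as the script for convenience. It assumes that feed() returns the stomach contents (rather than its argument), so that calling it with no food reports what is already in there; hairball() relies on that.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Cat;

    sub new {
        my $class = shift;
        my $self  = { stomach => "empty" };
        bless $self, $class;
        return $self;
    }

    # Combined mutator/accessor: store the food if any was given, then
    # report the stomach contents either way.
    sub feed {
        my ( $self, $food ) = @_;
        $self->{ stomach } = $food if defined $food;
        return $self->{ stomach };
    }

    # Uses feed() rather than touching $self->{ stomach } directly.
    sub hairball {
        my ( $self ) = @_;
        my $vomit = $self->feed();
        $self->feed( "empty" );
        return $vomit;
    }
}

my $mr_tibbles = Cat->new();
print "He is a ", ref( $mr_tibbles ), "\n";

$mr_tibbles->feed( "Mechanically recovered meat sludge" );
print "Stomach: ",            $mr_tibbles->feed(),     "\n";
print "Hairball: ",           $mr_tibbles->hairball(), "\n";
print "Stomach afterwards: ", $mr_tibbles->feed(),     "\n";
```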
A little earlier, we mentioned inheritance, but what is it? To see,
let's implement a rudimentary Tabby class that inherits from
Cat.
package Tabby;
use strict;
# blah blah blah,
our @ISA = qw( Cat );
sub miaow
{
my( $self ) = @_;
print "Miaow\n";
}
1;
When you:
my $tabitha = new Tabby;
you'll get a new Tabby cat. Even though there's no method
called new() in package Tabby. That's where
@ISA comes in: a Tabby IS A Cat,
and if perl can't find the relevant method in Tabby, it'll
search the packages in @ISA (i.e. Cat) to find
the method instead. So Tabby does exactly what
Cat does, only you can make her miaow. Wow. If
you wanted to be more practical, you could define your own
new:
package Tabby;
use strict;
# blah blah blah,
our @ISA = qw( Cat );
sub new
{
    my ( $class ) = @_;
    my $self = $class->SUPER::new();
    # inherit cattiness from SUPER-class, i.e. Cat, by calling the
    # superclass's constructor
    $self->{ breed } = "Tabby";
    bless $self, $class; # harmless but redundant: Cat's new() has already blessed into $class
    return $self;
}
sub breed
{
    my ( $self ) = @_;
    return $self->{ breed };
}
1;
When we make a new Tabby, we're actually letting
Cat build the object for us, by calling the SUPER::new() method. We
then shove an extra bit of information into the $self
hashref. Because Cat's new() blesses into whatever
$class it is handed (here Tabby), the object is already a
Tabby when it comes back, so the second bless merely repeats the job.
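The two classes can be tried out together in one file, as in the sketch below. Two liberties are taken: miaow() returns its noise instead of printing it, and since Cat's new() blesses into whatever $class it is handed, the Tabby constructor here does not bother with a second bless.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Cat;
    sub new {
        my $class = shift;
        my $self  = { stomach => "empty" };
        return bless $self, $class;
    }
    sub feed {
        my ( $self, $food ) = @_;
        $self->{ stomach } = $food if defined $food;
        return $self->{ stomach };
    }
}
{
    package Tabby;
    our @ISA = qw( Cat );
    sub new {
        my ( $class ) = @_;
        my $self = $class->SUPER::new();    # Cat builds (and blesses) the object
        $self->{ breed } = "Tabby";
        return $self;
    }
    sub breed { my ( $self ) = @_; return $self->{ breed } }
    sub miaow { return "Miaow" }
}

my $tabitha = Tabby->new();

print "Class: ",     ref( $tabitha ),   "\n";    # Tabby
print "Breed: ",     $tabitha->breed(), "\n";    # Tabby's own method
print "Stomach: ",   $tabitha->feed(),  "\n";    # inherited from Cat
print "Is a Cat? ",  ( $tabitha->isa( "Cat" ) ? "yes" : "no" ), "\n";
```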
What happens if you want your whole class to have some data (rather than each individual object)? Say you want to know how many cats you have created:
package Cat;
my $census = 0;
sub new
{
my $class = shift;
my $self = { stomach => "empty", _census => \$census };
bless $self, $class;
++ ${ $self->{ _census } };
return $self;
}
sub census
{
    my $self = shift;
    return ref $self ? ${ $self->{ _census } } : $census;
}
sub DESTROY
{
    my $self = shift;
    -- ${ $self->{ _census } };
}
The part that initially implements the census is:
my $self = { stomach => "empty", _census => \$census };
Now, you may be wondering why we're using some tortuous scalar reference, and then having to do some horrible backflips to increase the census by one later:
++ ${ $self->{ _census } };
and to retrieve it in the method census(), (note this can
and should be able to be called as a class or object method):
return ref $self ? ${ $self->{ _census } } : $census;
and to decrement it in the destructor method
DESTROY, which is automatically called when an object is
destroyed:
--${ $self->{ _census } };
The reason for the scalar reference is that if we don't take a
reference to $census, and instead try to decrement
$census directly in DESTROY, we could end up
decrementing the wrong $census if our object methods
(e.g. DESTROY) were inherited. For example, if
Tabby inherited Cat's DESTROY
method and you directly decremented $census in this method,
you'd end up decrementing the Cat census, not the
Tabby census. This is because the $census that
DESTROY can see is the one defined in package
Cat. Technically 'object methods execute in the context in which
they were defined (i.e. package Cat), not in the
context that invoked them (i.e. package Tabby)'.
Decrementing the Cat census when a Tabby is
DESTROYed is probably not what you want to do (or it might
be: either way, you need to think about it). However, by always using a
reference to $census, we ensure that if Tabby
inherits DESTROY, but supplies its own $census
class data and new constructor, then DESTROY
will decrement the Tabby $census, not the Cat
$census. (Read that again until it makes sense!).
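If reading it again doesn't help, try running it. In this sketch both classes share a file, and Tabby supplies its own $census and its own new but inherits census() and DESTROY from Cat. The cats' names are, of course, made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Cat;
    my $census = 0;    # class data for Cat

    sub new {
        my $class = shift;
        my $self  = bless { stomach => "empty", _census => \$census }, $class;
        ++${ $self->{ _census } };
        return $self;
    }
    sub census {
        my $self = shift;
        return ref $self ? ${ $self->{ _census } } : $census;
    }
    sub DESTROY {
        my $self = shift;
        --${ $self->{ _census } };    # whichever census this object points at
    }
}
{
    package Tabby;
    our @ISA = qw( Cat );
    my $census = 0;    # Tabby's own class data, separate from Cat's

    sub new {
        my $class = shift;
        my $self  = bless { stomach => "empty", _census => \$census }, $class;
        ++${ $self->{ _census } };
        return $self;
    }
    # census() and DESTROY() are inherited from Cat
}

my $tom     = Cat->new();
my $tabitha = Tabby->new();
my $tigger  = Tabby->new();
print "Cats: ", $tom->census(), ", Tabbies: ", $tabitha->census(), "\n";    # 1 and 2

undef $tigger;    # last reference gone, so the inherited DESTROY runs
print "Cats: ", $tom->census(), ", Tabbies: ", $tabitha->census(), "\n";    # 1 and 1
```

Had DESTROY decremented $census directly, the second line would have reported no Cats and two Tabbies.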
You may also be wondering why the underscore in _census.
The reason for the underscore is that _hashkeys and
_methods look special to C++ programmers, since they
indicate that the data are private. In OO perl, it's considered bad form
for a script to mess with the insides of an object (like the value of
$self->{ stomach }) directly. It's considered
unforgivably bad form to mess with a private $self->{
_underscored } value. You can do it (unlike in C++, where
private means private, with razor wire), but there's
probably a very good reason why you shouldn't.
There's loads more to OO programming: if you're interested, try
perldoc perltoot (perldoc perlootut and perldoc perlobj from perl 5.16 onwards), which'll tell you even more of the gory
details of method inheritance, multiple inheritance (a class can inherit
from more than one parent), the SUPER and
UNIVERSAL classes, and AUTOLOAD. A brief word
on the last: when you write OO perl, you will soon get bored of creating
a thousand methods all of the form:
sub X
{
    my ( $self, $x ) = @_;
    $self->{ X } = $x if defined $x;
    return $self->{ X };
}
to access the data in the object. Autoloading allows perl to mimic methods for these, so you don't have to. Autoloading is a dirty hack though so don't use it. ☺.
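If you want to see the dirty hack anyway, the sketch below is roughly what it looks like. It assumes a Cat with two bits of data (stomach and an invented name), and lets a single AUTOLOAD subroutine stand in for an accessor for each. perl calls AUTOLOAD whenever it can't find the method you asked for, with the full name of the missing method in the package variable $AUTOLOAD.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Cat;
    our $AUTOLOAD;    # perl puts the name of the missing method in here

    sub new {
        my $class = shift;
        return bless { stomach => "empty", name => "unnamed" }, $class;
    }

    # Called whenever a method cannot be found in Cat (or its parents).
    sub AUTOLOAD {
        my ( $self, $value ) = @_;
        ( my $method = $AUTOLOAD ) =~ s/.*:://;    # strip the package name
        return if $method eq "DESTROY";            # don't autoload the destructor

        # Only pretend to be an accessor for keys the object really has.
        die "No such method: $method\n"
            unless ref $self and exists $self->{ $method };

        $self->{ $method } = $value if defined $value;
        return $self->{ $method };
    }
}

my $cat = Cat->new();
$cat->name( "Mr Tibbles" );
print $cat->name(), " has a stomach that is ", $cat->stomach(), "\n";
```

One of the reasons it's dirty: a typo in a method name is no longer caught until the script actually runs into it.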
