What does "S3 methods" mean in R?
From http://adv-r.had.co.nz/OO-essentials.html:
R’s three OO systems differ in how classes and methods are defined:
S3 implements a style of OO programming called generic-function OO. This is different from most programming languages, like Java, C++ and C#, which implement message-passing OO. With message-passing, messages (methods) are sent to objects and the object determines which function to call. Typically, this object has a special appearance in the method call, usually appearing before the name of the method/message: e.g. canvas.drawRect("blue"). S3 is different. While computations are still carried out via methods, a special type of function called a generic function decides which method to call, e.g., drawRect(canvas, "blue"). S3 is a very casual system. It has no formal definition of classes.
S4 works similarly to S3, but is more formal. There are two major differences to S3. S4 has formal class definitions, which describe the representation and inheritance for each class, and has special helper functions for defining generics and methods. S4 also has multiple dispatch, which means that generic functions can pick methods based on the class of any number of arguments, not just one.
Reference classes, called RC for short, are quite different from S3 and S4. RC implements message-passing OO, so methods belong to classes, not functions. $ is used to separate objects and methods, so method calls look like canvas$drawRect("blue"). RC objects are also mutable: they don’t use R’s usual copy-on-modify semantics, but are modified in place. This makes them harder to reason about, but allows them to solve problems that are difficult to solve with S3 or S4.
There’s also one other system that’s not quite OO, but it’s important to mention here:
- base types, the internal C-level types that underlie the other OO systems. Base types are mostly manipulated using C code, but they’re important to know about because they provide the building blocks for the other OO systems.
Most of the relevant information can be found by looking at ?S3 or ?UseMethod, but in a nutshell:
S3 refers to a scheme of method dispatching. If you've used R for a while, you'll notice that there are print, predict and summary methods for a lot of different kinds of objects.
In S3, this works by:
- setting the class of objects of interest (e.g.: the return value of a call to the function glm has class glm)
- providing a method with the general name (e.g. print), then a dot, and then the class name (e.g.: print.glm)
- some preparation has to have been done to this general name (print) for this to work, but if you're simply looking to conform to existing method names, you don't need this (see the help I referred to earlier if you do).
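The points above can be sketched in a few lines. This is only a minimal illustration: the constructor new_fit and the fields it stores are made up for the example, and the class name mykindoffit is borrowed from the predict example in this answer.

```r
# Hypothetical constructor: returns a plain list tagged with class "mykindoffit".
new_fit <- function(coef) {
  structure(list(coef = coef), class = "mykindoffit")
}

# S3 method for the existing generic print(): named <generic>.<class>.
print.mykindoffit <- function(x, ...) {
  cat("<mykindoffit> with", length(x$coef), "coefficient(s)\n")
  invisible(x)
}

myfit <- new_fit(c(1.5, -0.2))
class(myfit)   # the class attribute that dispatch looks at
print(myfit)   # print() dispatches to print.mykindoffit
```

No extra preparation is needed here because print is already a generic.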
To the eye of the beholder, and particularly, the user of your newly created funky model fitting package, it is much more convenient to be able to type predict(myfit, type="class") than predict.mykindoffit(myfit, type="class").
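To see that the two calls really are equivalent, here is a toy sketch. The "model" (a list holding a threshold), the newdata argument and the "score" type are all assumptions invented for the illustration; only the names predict, myfit and mykindoffit come from the text above.

```r
# Hypothetical toy "model": just a threshold, tagged with the class.
myfit <- structure(list(threshold = 0.5), class = "mykindoffit")

# Method for the existing stats::predict generic.
predict.mykindoffit <- function(object, newdata, type = c("class", "score"), ...) {
  type <- match.arg(type)
  if (type == "score") return(newdata)
  ifelse(newdata > object$threshold, "high", "low")
}

x <- c(0.2, 0.9)
via_generic <- predict(myfit, newdata = x, type = "class")             # what users type
via_method  <- predict.mykindoffit(myfit, newdata = x, type = "class") # same result
identical(via_generic, via_method)
```

The user only has to remember predict; the class of myfit picks the method.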
There is quite a bit more to it, but this should get you started. There are quite a few disadvantages to this way of dispatching methods based upon an attribute (class) of objects (and C purists probably lie awake at night in horror of it), but for a lot of situations, it works decently. With the current version of R, newer ways have been implemented (S4 and reference classes), but most people still (only) use S3.
To get you started with S3, look at the code for the median function. Typing median at the command prompt reveals that it has one line in its body, namely
UseMethod("median")
That means that it is an S3 generic: calling it dispatches to an S3 method chosen by the class of its argument. In other words, you can have a different median method for different S3 classes. To list all the possible median methods, type
methods(median) #actually not that interesting.
In this case, there's only one method, the default, which is called for anything. You can see the code for that by typing
median.default
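The same pattern works for a generic of your own. The sketch below is an assumption-laden toy: the generic area and the classes circle and square are invented for illustration and do not exist in base R.

```r
# Hypothetical generic: like median, its whole body is a call to UseMethod().
area <- function(shape, ...) UseMethod("area")

# Fallback used when no class-specific method matches.
area.default <- function(shape, ...) {
  stop("don't know how to compute the area of class ", class(shape)[1])
}

# Class-specific methods, named <generic>.<class>.
area.circle <- function(shape, ...) pi * shape$r^2
area.square <- function(shape, ...) shape$side^2

circle <- structure(list(r = 1), class = "circle")
square <- structure(list(side = 2), class = "square")

area(circle)    # dispatches to area.circle
area(square)    # dispatches to area.square
methods(area)   # lists the methods defined above
```

Calling area on an object with neither class falls through to area.default, which here signals an error.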
A much more interesting example is the print function, which has many different methods.
methods(print) #very exciting
Notice that some of the methods have *s next to their name. That means that they are not exported from their package's namespace. Use find on the function that creates the object to find out which package they are in, then use ::: to reach the hidden method. For example
find("acf") #it's in the stats package
stats:::print.acf
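If you don't know the package in advance, two functions from the utils package (attached by default) can look a hidden method up for you. A small sketch, using the same print.acf example:

```r
# Look the method up by generic name and class name.
m1 <- getS3method("print", "acf")

# Or search every loaded namespace for an object with this name.
m2 <- getAnywhere("print.acf")

is.function(m1)   # the method itself, ready to inspect or call
m2$where          # character vector saying where the object was found
```

Printing m1 at the prompt shows the method's source, just as stats:::print.acf does.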
I came to this question mostly wondering where the names came from. It appears from this Wikipedia article that the name refers to the version of the S programming language that R is based on. The method dispatching schemes described in the other answers come from S and are named after the version of S that introduced them.