Condition ( | ) in R formula
The general way it is used is dependent ~ independent | grouping.
You can read more here: http://talklab.psy.gla.ac.uk/KeepItMaximalR2.pdf
The symbol |
means different things depending on the context:
The general case
In general, | means OR. General modeling functions will see any | as a logical operator and carry it out. This is the equivalent of evaluating an arithmetic expression inside the formula, eg ^ wrapped in I() as in:
lm(y ~ x + I(x^2))
The operation is carried out first, and this new variable is then used to construct the model matrix and do the fitting. (Without I(), ^ is itself a formula operator and x^2 reduces to x.)
In your code, | also means OR. Keep in mind that R also interprets numeric values as logical when you use any logical operator: a 0 is seen as FALSE, anything else as TRUE.
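To make the coercion concrete, here is a minimal sketch with made-up vectors (x and z below are illustrative values, not the data from the question):

```r
# Hypothetical numeric vectors, chosen only to illustrate the coercion
x <- c(0, 1.5, -2, 0)
z <- c(0, 0, 3, 7)

# | works element-wise: 0 is treated as FALSE, any other number as TRUE
res <- x | z
res
# [1] FALSE  TRUE  TRUE  TRUE
```

Only the first position, where both values are exactly 0, gives FALSE.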
So your call to lm constructs a model of y as a function of x OR z. This doesn't make any sense. Given the values of x, this will just be y ~ TRUE. This is also the reason your model doesn't fit. Your model matrix has 2 columns of 1's, one for the intercept and one for the only value in x|z, being TRUE. The two columns are identical, hence your coefficient for x|z can't even be calculated, as the output shows:
> lm(y ~ x|z)
Call:
lm(formula = y ~ x | z)
Coefficients:
(Intercept) x | zTRUE
-0.01925 NA
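The behaviour can be reproduced with simulated data. This is only a sketch: the original x, y and z are not shown, so the vectors below are random stand-ins and the numbers will differ from the output above.

```r
set.seed(42)            # arbitrary seed, for reproducibility only
n <- 30
x <- rnorm(n)           # continuous, so effectively never exactly 0
z <- rnorm(n)
y <- rnorm(n)

# The term x | z is evaluated before the model matrix is built
mm <- model.matrix(y ~ x | z)
colnames(mm)            # intercept plus one column for the logical term
all(mm[, 2] == 1)       # the second column duplicates the intercept

fit <- lm(y ~ x | z)
coef(fit)               # the coefficient of the duplicated column is NA
```

Because every value of x is nonzero, x | z is TRUE everywhere, which is what makes the second column redundant.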
Inside formulas for mixed models
In mixed models (eg the lme4 package), | is used to indicate a random effect. A term like + (1|X) means: "fit a random intercept for every category in X". The parentheses are needed because | has lower precedence than +. You can translate the | as "given". So you can see the term as "fit an intercept, given X". If you keep this in mind, the use of | in specifications of correlation structures in eg the nlme or mgcv packages will make more sense to you.
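As an illustration, here is a small sketch of a random-intercept model. It assumes the lme4 package is installed and uses its bundled sleepstudy data; the choice of dataset and variables is mine, not part of the question.

```r
library(lme4)  # assumed to be installed

# sleepstudy ships with lme4: reaction times of 18 subjects over 10 days
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# One estimated intercept deviation per subject ("an intercept, given Subject")
ri <- ranef(fit)$Subject
head(ri)
nrow(ri)
```

Note the parentheses around 1 | Subject: without them the | would apply to everything on the right-hand side of the formula.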
You still have to be careful, as the exact way | is interpreted depends largely on the package you use. So the only way to really know what it means in the context of the modeling function you use is to check the manual of that package.
Other uses
There are some other functions and packages that use the | symbol in a formula interface. Here too it pretty much boils down to indicating some kind of group. One example is the use of | in the lattice graphics system. There it is used for faceting, as shown by the following code:
library(lattice)
densityplot(~ Sepal.Width | Species,
            data = iris,
            main = "Density Plot by Species",
            xlab = "Sepal width")