How can I replace one term in an R formula with two?
You can use the substitute() function for this:
substitute(y ~ b*x + z, list(x=quote(x_part1 + x_part2)))
# y ~ b * (x_part1 + x_part2) + z
Here we use the named list to tell R to replace the variable x with the expression x_part1 + x_part2.
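One thing to keep in mind is that substitute() returns an unevaluated call rather than a formula object. A minimal sketch of converting the result and fitting a model, assuming the built-in mtcars data, with mpg, disp and hp standing in for x, x_part1 and x_part2 (these variable choices are illustrative only):

```r
# substitute() does not evaluate its first argument,
# so the formula is written inline here.
f_call <- substitute(qsec ~ wt * mpg + cyl, list(mpg = quote(disp + hp)))
class(f_call)   # "call", not "formula"

# Converting the call yields a proper formula object.
f <- as.formula(f_call)
class(f)        # "formula"

# The expanded formula can then be passed to a modelling function.
fit <- lm(f, data = mtcars)
names(coef(fit))
```

Because the first argument is not evaluated, this approach suits formulas typed literally in the call; a formula already stored in a variable needs a different route.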
You can write a recursive function to modify the expression tree of the formula:
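replace_term <- function(f, old, new) {
  n <- length(f)
  if (n > 1) {
    # A call or formula: recurse into each element of the expression tree.
    for (i in seq_len(n)) f[[i]] <- Recall(f[[i]], old, new)
    return(f)
  }
  # A leaf (symbol or constant): swap it if it matches `old`.
  if (identical(f, old)) new else f
}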
You can use this to modify, e.g., interactions:
> replace_term(y~x*a+z - x, quote(x), quote(x1 + x2))
y ~ (x1 + x2) * a + z - (x1 + x2)
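Since the recursive approach takes the formula as an ordinary argument, it should also work on a formula stored in a variable. The sketch below restates the function so that it runs on its own (comparing leaves with identical() is a choice made for this sketch) and applies it to an illustrative mtcars formula:

```r
# Restated so the example is self-contained.
replace_term <- function(f, old, new) {
  if (length(f) > 1) {
    # Walk every element of the call / formula.
    for (i in seq_along(f)) f[[i]] <- replace_term(f[[i]], old, new)
    return(f)
  }
  if (identical(f, old)) new else f
}

# A formula held in a variable (illustrative choice of variables).
f <- qsec ~ wt * mpg + cyl
new_f <- replace_term(f, quote(mpg), quote(disp + hp))

class(new_f)                         # still a formula
all.vars(new_f)                      # variables after replacement
attr(terms(new_f), "term.labels")    # expanded model terms
```

The result keeps its formula class, so it can be handed to lm() and similar functions directly.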
How about working with the formula as a string? Many base R modelling functions such as lm() accept a formula given as a string (and you can always convert it with formula() otherwise). In this case, you can use something like gsub():
f1 <- "y ~ x + z"
f2 <- "y ~ b*x + z"
gsub("x", "(x_part1 + x_part2)", f1)
#> [1] "y ~ (x_part1 + x_part2) + z"
gsub("x", "(x_part1 + x_part2)", f2)
#> [1] "y ~ b*(x_part1 + x_part2) + z"
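A caveat with plain gsub("x", ...) is that it replaces every "x" in the string, including those inside other names. A small sketch of restricting the match to the standalone variable with word boundaries (the formula f3 and its variable names are made up for illustration):

```r
# Hypothetical formula in which "x" also occurs inside other names.
f3 <- "y ~ x + exp(z) + x_max"

# Naive replacement also rewrites the x in exp() and x_max.
naive <- gsub("x", "(x_part1 + x_part2)", f3)

# \\b anchors the match at word boundaries, so only the bare x is replaced.
safe <- gsub("\\bx\\b", "(x_part1 + x_part2)", f3, perl = TRUE)
safe
#> [1] "y ~ (x_part1 + x_part2) + exp(z) + x_max"
```

Note that a dot is not a word character in a regular expression, so a name such as x.1 would still be matched by this pattern.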
For example, with the mtcars data set, say we want to replace mpg (x) with disp + hp (x_part1 + x_part2):
f1 <- "qsec ~ mpg + cyl"
f2 <- "qsec ~ wt*mpg + cyl"
f1 <- gsub("mpg", "(disp + hp)", f1)
f2 <- gsub("mpg", "(disp + hp)", f2)
lm(f1, data = mtcars)
#>
#> Call:
#> lm(formula = f1, data = mtcars)
#>
#> Coefficients:
#> (Intercept)         disp           hp          cyl
#>    22.04376      0.01017     -0.02074     -0.56571
lm(f2, data = mtcars)
#>
#> Call:
#> lm(formula = f2, data = mtcars)
#>
#> Coefficients:
#> (Intercept)           wt         disp           hp          cyl
#>   20.421318     1.554904     0.026837    -0.056141    -0.876182
#>     wt:disp        wt:hp
#>   -0.006895     0.011126
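For functions that insist on a formula object rather than a string, the modified string can be converted first. A short sketch, assuming the same replacement as above:

```r
# Build the modified formula string, then parse it into a formula object.
f1 <- gsub("mpg", "(disp + hp)", "qsec ~ mpg + cyl")
f1_formula <- formula(f1)

class(f1_formula)      # "formula"
all.vars(f1_formula)   # variables referenced by the formula
```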