Is `if` faster than ifelse?

This is more of an extended comment building on Roman's answer, but I need the code utilities to expound:

Roman is correct that if is faster than ifelse, but I am under the impression that the speed boost of if isn't particularly interesting since it isn't something that can easily be harnessed through vectorization. That is to say, if is only advantageous over ifelse when the cond/test argument is of length 1.

Consider the following function which is an admittedly weak attempt at vectorizing if without having the side effect of evaluating both the yes and no conditions as ifelse does.

ifelse2 <- function(test, yes, no){
 result <- rep(NA, length(test))
 for (i in seq_along(test)){
   result[i] <- `if`(test[i], yes[i], no[i])
 }
 result
}

ifelse2a <- function(test, yes, no){
  sapply(seq_along(test),
         function(i) `if`(test[i], yes[i], no[i]))
}

ifelse3 <- function(test, yes, no){
  result <- rep(NA, length(test))
  logic <- test
  result[logic] <- yes[logic]
  result[!logic] <- no[!logic]
  result
}


set.seed(pi)
x <- rnorm(1000)

library(microbenchmark)
microbenchmark(
  standard = ifelse(x < 0, x^2, x),
  modified = ifelse2(x < 0, x^2, x),
  modified_apply = ifelse2a(x < 0, x^2, x),
  third = ifelse3(x < 0, x^2, x),
  fourth = c(x, x^2)[1L + ( x < 0 )],
  fourth_modified = c(x, x^2)[seq_along(x) + length(x) * (x < 0)]
)

Unit: microseconds
            expr     min      lq      mean  median       uq      max neval cld
        standard  52.198  56.011  97.54633  58.357  68.7675 1707.291   100 ab 
        modified  91.787  93.254 131.34023  94.133  98.3850 3601.967   100  b 
  modified_apply 645.146 653.797 718.20309 661.568 676.0840 3703.138   100   c
           third  20.528  22.873  76.29753  25.513  27.4190 3294.350   100 ab 
          fourth  15.249  16.129  19.10237  16.715  20.9675   43.695   100 a  
 fourth_modified  19.061  19.941  22.66834  20.528  22.4335   40.468   100 a 

SOME EDITS: Thanks to Frank and Richard Scriven for noticing my shortcomings.

As you can see, the process of breaking up the vector to be suitable to pass to if is a time consuming process and ends up being slower than just running ifelse (which is probably why no one has bothered to implement my solution).

If you're really desperate for an increase in speed, you can use the ifelse3 approach above. Or better yet, Frank's less obvious* but brilliant solution.

  • by 'less obvious' I mean, it took me two seconds to realize what he did. And per nicola's comment below, please note that this works only when yes and no have length 1, otherwise you'll want to stick with ifelse3

if is a primitive (complied) function called through the .Primitive interface, while ifelse is R bytecode, so it seems that if will be faster. Running some quick benchmarks

> microbenchmark(`if`(TRUE, "a", "b"), ifelse(TRUE, "a", "b"))
Unit: nanoseconds
                   expr  min   lq    mean median     uq   max neval cld
 if (TRUE) "a" else "b"   46   54  372.59   60.0   68.0 30007   100  a 
 ifelse(TRUE, "a", "b") 1212 1327 1581.62 1442.5 1617.5 11743   100   b

> microbenchmark(`if`(FALSE, "a", "b"), ifelse(FALSE, "a", "b"))
Unit: nanoseconds
                    expr  min   lq    mean median   uq   max neval cld
 if (FALSE) "a" else "b"   47   55   91.64   61.5   73  2550   100  a 
 ifelse(FALSE, "a", "b") 1256 1346 1688.78 1460.0 1677 17260   100   b

It seems that if not taking into account the code that is in actual branches, if is at least 20x faster than ifelse. However, note that this doesn't account the complexity of expression being tested and possible optimizations on that.

Update: Please note that this quick benchmark represent a very simplified and somewhat biased use case of if vs ifelse (as pointed out in the comments). While it is correct, it underrepresents the ifelse use cases, for that Benjamin's answer seems to provided more fair comparison.