gsub: return an empty string when no match is found

I'd probably go a different route, since the sapply doesn't seem necessary to me as these functions are vectorized already:

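fun <- function(x){
    # grepl gives a logical index, which stays correct even when nothing matches
    # (with grep, x[-integer(0)] selects nothing, so no element would be blanked)
    ind <- grepl(".*(Ref. (\\d+)).*", x)
    x <- gsub(".*(Ref. (\\d+)).*", "\\1", x)
    x[!ind] <- ""
    x
}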

fun(data)
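
A note on the indexing step when blanking non-matches by position: negative indexing with an empty index vector selects nothing, so if no element matches at all, nothing gets blanked. A logical index from grepl avoids that. The snippet below is only a small sketch; the input vector and the simplified pattern "Ref" are made up for illustration.

```r
# Hypothetical input in which no element contains a reference
x <- c("no reference here", "nor here")

# Integer positions of matches: integer(0) because nothing matches
ind <- grep("Ref", x)

y <- x
y[-ind] <- ""     # -integer(0) is still integer(0), so nothing is assigned
y                 # same as x: the non-matches were NOT blanked

# Logical index instead: FALSE FALSE
hit <- grepl("Ref", x)

z <- x
z[!hit] <- ""     # every non-matching element is blanked
z
```

So the logical version behaves the same whether there are zero, some, or only matches.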

According to the documentation, this is expected behaviour: when the pattern does not match, gsub returns the input string unchanged.
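
To see this quickly, here is a minimal check. The sample vector is an assumption, since the original data is not shown in the question.

```r
# Hypothetical sample input (the original `data` is not shown)
data <- c("blah (Ref. 12) blah", "no reference here")

# The first element matches and is reduced to capture group 1;
# the second does not match and comes back unchanged.
res <- gsub(".*(Ref. (\\d+)).*", "\\1", data)
res
```

The second element is returned as-is, which is why an extra step is needed to turn non-matches into "".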

Here, I use grepl first to get a logical vector indicating whether the pattern is present in each string:

ifelse(grepl(".*(Ref. (\\d+)).*", data), 
      gsub(".*(Ref. (\\d+)).*", "\\1", data), 
      "")

Embedding this in a function:

mygsub <- function(x){
     ans <- ifelse(grepl(".*(Ref. (\\d+)).*", x), 
              gsub(".*(Ref. (\\d+)).*", "\\1", x), 
              "")
     return(ans)
}

mygsub(data)

Another option is to run gsub and then blank out every element that came back unchanged:

xs <- sapply(data, function(x) gsub(".*(Ref. (\\d+)).*", "\\1", x))
xs[xs==data] <- ""
xs
#[1] "Ref. 12" ""       
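
One caveat with comparing the result to the input: if an element consists of nothing but the match itself, gsub returns it unchanged too, so the comparison blanks it even though it matched. A small sketch with made-up data (and gsub called directly, since it is vectorized):

```r
# Hypothetical input; the second element is exactly the reference
data <- c("see Ref. 12 for details", "Ref. 12", "nothing")
pat  <- ".*(Ref. (\\d+)).*"

# Comparison approach: blanks anything that equals its input
xs <- gsub(pat, "\\1", data)
xs[xs == data] <- ""
xs    # the second element is lost

# Testing for a match with grepl keeps it
ok <- ifelse(grepl(pat, data), gsub(pat, "\\1", data), "")
ok
```

If the data can never consist of the bare reference, the comparison approach is fine; otherwise a grepl-based test is the safer choice.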

Tags: Regex, R, Gsub