First entry from string split

If you need to extract the first (or nth) entry from each split, use:

word <- c('apple-orange-strawberry','chocolate')

sapply(strsplit(word,"-"), `[`, 1)
#[1] "apple"     "chocolate"

Or, faster and more explicitly:

vapply(strsplit(word,"-"), `[`, 1, FUN.VALUE=character(1))
#[1] "apple"     "chocolate"

Both versions can select any position in the split result, and both return NA when the requested index is beyond the number of pieces:

vapply(strsplit(word,"-"), `[`, 2, FUN.VALUE=character(1))
#[1] "orange" NA  
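
If you do this often, the vapply call can be wrapped in a small helper. This is only a sketch: the name nth_piece and its arguments are made up for illustration, and fixed = TRUE is an added assumption so the separator is matched literally rather than as a regular expression.

```r
# Hypothetical helper (not part of the answer above): return the n-th piece
# of each string after splitting on a literal separator.
nth_piece <- function(x, n, sep = "-") {
  # fixed = TRUE treats sep as a plain string, not a regex
  pieces <- strsplit(x, sep, fixed = TRUE)
  # `[` returns NA when n is past the end, so short strings yield NA
  vapply(pieces, `[`, n, FUN.VALUE = character(1))
}

word <- c('apple-orange-strawberry', 'chocolate')
nth_piece(word, 1)
#[1] "apple"     "chocolate"
nth_piece(word, 3)
#[1] "strawberry" NA
```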

For a single string, for example:

word <- 'apple-orange-strawberry'

strsplit(word, "-")[[1]][1]
#[1] "apple"

or, equivalently:

unlist(strsplit(word, "-"))[1]

The idea is that strsplit returns a list with one element per input string, so the pieces have to be reached either by indexing into the list with [[1]] (the former case) or by flattening it with unlist (the latter).
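
A quick way to see this is to inspect the result directly. The variable name pieces is just for illustration:

```r
word <- 'apple-orange-strawberry'
pieces <- strsplit(word, "-")

is.list(pieces)   # TRUE: the result is a list, even for a single string
length(pieces)    # 1: one list element per input string
pieces[[1]]       # the character vector of pieces for the first string
#[1] "apple"      "orange"     "strawberry"
```

Note that with several input strings, unlist flattens all pieces into one vector, so taking [1] after unlist only returns the first piece of the first string.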

If you want to apply the method to an entire column:

first.word <- function(my.string){
    unlist(strsplit(my.string, "-"))[1]
}

words <- c('apple-orange-strawberry', 'orange-juice')

sapply(words, first.word)
apple-orange-strawberry            orange-juice 
                "apple"                "orange"

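For a data frame column, a minimal sketch along the same lines could look like this. The data frame df and its column names are invented for the example; vapply is used here instead of sapply so the result is an unnamed character vector.

```r
# Hypothetical data frame with a column of hyphen-separated strings
df <- data.frame(fruit = c('apple-orange-strawberry', 'orange-juice'),
                 stringsAsFactors = FALSE)

# Split every entry and keep the first piece of each
df$first <- vapply(strsplit(df$fruit, "-"), `[`, 1, FUN.VALUE = character(1))

df$first
#[1] "apple"  "orange"
```
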