Extract string between /

as.numeric(gsub("^.*D([0-9]+).*$", "\\1", mystrings))


Using str_extract from the stringr package:

as.numeric(str_extract(mystrings, perl('(?<=/[A-Z])[0-9]+(?=/)')))

@Arun stole my thunder, so I'm giving my initial long-winded example.

cut.to.pieces <- strsplit(mystrings, split = "/")
got.second <- lapply(cut.to.pieces, "[", 2)
get.numbers <- unlist(got.second)
as.numeric(gsub(pattern = "[[:alpha:]]", replacement = "", x = get.numbers, perl = TRUE))
[1]  2  9 22 22

> gsub("(^.+/[A-Z]+)(\\d+)(/.+$)", "\\2", mystrings)
[1] "2"  "9"  "22" "22"

You would "read" (or "parse") that regex pattern as splitting any matched string into three parts:

1) anything up to and including the first forward slash followed by a sequence of capital letters,

2) any digits(= "\d") in a sequence before the next slash and ,

3) from the next slash to the end.

And then only returning the second part....

Non-matched character strings would be returned unaltered.

Tags:

Regex

R