Using grep to help subset a data frame in R
It's pretty straightforward using [
to extract:
grep
will give you the position in which it matched your search pattern (unless you use value = TRUE
).
grep("^G45", My.Data$x)
# [1] 2
Since you're searching within the values of a single column, that actually corresponds to the row index. So, use that with [
(where you would use My.Data[rows, cols]
to get specific rows and columns).
My.Data[grep("^G45", My.Data$x), ]
# x y
# 2 G459 2
The help-page for subset
shows how you can use grep
and grepl
with subset
if you prefer using this function over [
. Here's an example.
subset(My.Data, grepl("^G45", My.Data$x))
# x y
# 2 G459 2
As of R 3.3, there's now also the startsWith
function, which you can again use with subset
(or with any of the other approaches above). According to the help page for the function, it's considerably faster than using substring
or grepl
.
subset(My.Data, startsWith(as.character(x), "G45"))
# x y
# 2 G459 2