Using regexp to select rows in R dataframe
Here you go.
First recreate your data:
dat <- read.table(text="
aName bName pName call alleles logRatio strength
AX-11086564 F08_ADN103 2011-02-10_R10 AB CG 0.363371 10.184215
AX-11086564 A01_CD1919 2011-02-24_R11 BB GG -1.352707 9.54909
AX-11086564 B05_CD2920 2011-01-27_R6 AB CG -0.183802 9.766334
AX-11086564 D04_CD5950 2011-02-09_R9 AB CG 0.162586 10.165051
AX-11086564 D07_CD6025 2011-02-10_R10 AB CG -0.397097 9.940238
AX-11086564 B05_CD3630 2011-02-02_R7 AA CC 2.349906 9.153076
AX-11086564 D04_ADN103 2011-02-10_R2 BB GG -1.898088 9.872966
AX-11086564 A01_CD2588 2011-01-27_R5 BB GG -1.208094 9.239801
", header=TRUE)
Next, use grepl to construct a logical index of matches:
index1 <- with(dat, grepl("ADN", bName))
index2 <- with(dat, grepl("2011-02-10_R2", pName))
Now subset using the & operator:
dat[index1 & index2, ]
aName bName pName call alleles logRatio strength
7 AX-11086564 D04_ADN103 2011-02-10_R2 BB GG -1.898088 9.872966
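One caveat worth sketching: grepl matches the pattern anywhere in the string, so "2011-02-10_R2" would also match a longer value such as "2011-02-10_R20". The values below are made up for illustration (they are not from the data above), and anchoring with ^ and $ is just one way to force a whole-string match:

```r
# Hypothetical pName values, chosen only to illustrate the difference
pName <- c("2011-02-10_R2", "2011-02-10_R20", "2011-02-10_R10")

# Unanchored: TRUE for any string that contains the pattern
loose <- grepl("2011-02-10_R2", pName)

# Anchored: ^ and $ require the whole string to match
strict <- grepl("^2011-02-10_R2$", pName)

print(loose)
print(strict)
```

If the value is known exactly, a plain == comparison does the same job as the anchored pattern.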
Corrected according to Andrie's advice. This should work, with df standing for your data frame. :)
df[grepl("ADN", df$bName),]
df[grepl("ADN", df$bName) & df$pName == "2011-02-10_R2",]
subset(dat, grepl("ADN", bName) & pName == "2011-02-10_R2" )
Note "&" (and not "&&", which is not vectorized) and "==" (and not "=", which is assignment).
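A small sketch of the difference, using made-up logical vectors x and y rather than the real index columns:

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

# Elementwise AND: one result per element, suitable for indexing rows
both <- x & y

# && is meant for length-one operands (e.g. inside if ()).
# On longer vectors, older R versions silently use only the first elements,
# and R 4.3.0 and later signal an error, so only single values are used here.
single <- x[1] && y[1]

print(both)
print(single)
```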
Note that you could have used:
dat[ with(dat, grepl("ADN", bName) & pName == "2011-02-10_R2" ) , ]
... and that might be preferable when used inside functions. However, unlike subset(), it will return rows of NA values for any lines where dat$pName is NA. That defect (which some regard as a feature) could be removed by adding & !is.na(dat$pName) to the logical expression.
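To make the NA behaviour concrete, here is a minimal sketch on a small made-up data frame d (not the data from the question) with one missing pName. Using which() is shown as one more option; it is an addition here, not something suggested above:

```r
# Made-up example data: the second row has a missing pName
d <- data.frame(
  bName = c("F08_ADN103", "D04_ADN103", "A01_CD1919"),
  pName = c("2011-02-10_R2", NA, "2011-02-10_R2"),
  stringsAsFactors = FALSE
)

# The condition is TRUE, NA, FALSE for the three rows
cond <- with(d, grepl("ADN", bName) & pName == "2011-02-10_R2")

with_na   <- d[cond, ]                    # keeps an all-NA row for the NA condition
guarded   <- d[cond & !is.na(d$pName), ]  # the NA condition becomes FALSE
via_which <- d[which(cond), ]             # which() ignores NA as well
via_sub   <- subset(d, grepl("ADN", bName) & pName == "2011-02-10_R2")

print(nrow(with_na))
print(nrow(guarded))
```

The guarded, which() and subset() variants all keep only the row where the condition is actually TRUE.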