Complex non-equi merge in R

So I was very close. I had two problems. The first was a bad installation of the data.table package, which caused an obscure error (could not find function ".").

After fixing that, I got closer and found that:

dt1[dt2, on=.(sp=sp, dbh>=dbh_min, dbh<=dbh_max), nomatch=0]

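gave me what I wanted, but with a bad dbh column: in a non-equi join, the join columns of the outer table are filled with the values from the table passed inside the brackets, so dbh ended up holding dbh_min. Inverting the join with: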

dt2[dt1, on=.(sp=sp, dbh_min<=dbh, dbh_max>=dbh)]

fixed the problem, leaving only one useless extra column.
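If the leftover column is a nuisance, one option is to pick the output columns directly in j. The sketch below uses small made-up tables (dt1 and dt2 here are hypothetical stand-ins, not the data from the question) and the x. prefix, which refers to the column of the outer table as it was before the join:

```r
library(data.table)

# Hypothetical sample data, for illustration only
dt1 <- data.table(sp  = c("SAB", "SAB", "EPN"),
                  dbh = c(10L, 16L, 12L))
dt2 <- data.table(sp       = c("SAB", "SAB", "EPN"),
                  dbh_min  = c(10L, 15L, 10L),
                  dbh_max  = c(14L, 19L, 14L),
                  dhb_clas = c("s", "m", "s"))

# Same non-equi join as the first attempt, but j selects the columns:
# x.dbh is the original dbh from dt1, not the dbh_min value from dt2.
res <- dt1[dt2,
           on = .(sp = sp, dbh >= dbh_min, dbh <= dbh_max),
           nomatch = 0,
           .(sp, dbh = x.dbh, dhb_clas)]
print(res)
```

This keeps the first form of the join and avoids having to drop or rename columns afterwards.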


For "between" joins like this one, one could also use data.table::foverlaps, which joins two data.tables on overlapping ranges, instead of using a non-equi join.

Taking the same example, the following code would produce the desired outcome.

# foverlaps tests whether two ranges overlap. Create a second column,
# dbh2, as the end point of the (zero-width) range [dbh, dbh2].
dt1[, dbh2 := dbh]

# foverlaps requires the second argument (y) to be keyed
setkey(dt1, sp, dbh, dbh2)

# find rows where dbh falls between dbh_min and dbh_max, and drop unnecessary
# columns afterwards
foverlaps(dt2, dt1, by.x = c("sp", "dbh_min", "dbh_max"), by.y = key(dt1),
          nomatch = 0)[
  ,
  -c("dbh2", "dbh_min", "dbh_max")
]

#      sp dbh gr_sp dhb_clas
#  1: SAB  10   RES        s
#  2: SAB  12   RES        s
#  3: SAB  16   RES        m
#  4: SAB  22   RES        l
#  5: EPN  12   RES        s
#  6: EPN  16   RES        m
#  7: BOP  10   DEC        s
#  8: BOP  12   DEC        s
#  9: BOP  14   DEC        s
# 10: BOP  20   DEC        m
# 11: BOP  26   DEC        l
# 12: PET  12   DEC        s
# 13: PET  16   DEC        s
# 14: PET  18   DEC        s
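As a quick sanity check, the foverlaps approach and the non-equi join can be run side by side. This is only a sketch on small made-up tables (the values below are hypothetical, not the question's data), with integer range columns on both sides so that the interval types match:

```r
library(data.table)

# Hypothetical sample data, for illustration only
dt1 <- data.table(sp  = c("SAB", "SAB", "EPN"),
                  dbh = c(10L, 16L, 12L))
dt2 <- data.table(sp       = c("SAB", "SAB", "EPN"),
                  dbh_min  = c(10L, 15L, 10L),
                  dbh_max  = c(14L, 19L, 14L),
                  dhb_clas = c("s", "m", "s"))

# Non-equi join, taking dbh from dt1 (the i table) via the i. prefix
res_ne <- dt2[dt1,
              on = .(sp = sp, dbh_min <= dbh, dbh_max >= dbh),
              nomatch = 0,
              .(sp, dbh = i.dbh, dhb_clas)]

# foverlaps: turn each dbh into a zero-width range and key dt1 on it
dt1[, dbh2 := dbh]
setkey(dt1, sp, dbh, dbh2)
res_fo <- foverlaps(dt2, dt1,
                    by.x = c("sp", "dbh_min", "dbh_max"),
                    by.y = key(dt1),
                    nomatch = 0)[, -c("dbh2", "dbh_min", "dbh_max")]

# Sort both results the same way before comparing them
setorder(res_ne, sp, dbh)
setorder(res_fo, sp, dbh)
same <- identical(res_ne$dbh, res_fo$dbh) &&
  identical(res_ne$dhb_clas, res_fo$dhb_clas)
print(same)
```

Which of the two to use is mostly a matter of taste here; the non-equi join needs no helper column, while foverlaps needs the key but reads closer to "find the range this value falls in".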

Tags:

R

Data.Table