R: how to identify the distance from the last occurrence
I would suggest creating a grouping column that starts a new group at each TRUE value of light:
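# create group column: number the TRUE rows 1, 2, 3, ...
d[(light), group := cumsum(light)]
# rows where light is FALSE have no group yet; set them to 0
d[is.na(group), group := 0L]
# running total, so each FALSE row inherits the value of the preceding TRUE row
d[, group := cumsum(group)]
d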
Then simply tally by group, using cumsum and negating light:
d[, distance := cumsum(!light), by=group]
# remove the group column for cleanliness
d[, group := NULL]
Results:
d
         date light distance
1: 2013-06-01  TRUE        0
2: 2013-06-02 FALSE        1
3: 2013-06-03 FALSE        2
4: 2013-06-04  TRUE        0
5: 2013-06-05  TRUE        0
6: 2013-06-06 FALSE        1
7: 2013-06-07 FALSE        2
8: 2013-06-08  TRUE        0
(I added a few rows to the example data.)
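For anyone who wants to reproduce the output above, here is a minimal sketch. The construction of d is an assumption (the original data is not shown), and the three grouping steps are collapsed into a single cumsum(light) call, which should produce the same partition of rows.

```r
library(data.table)

# Hypothetical sample data, chosen to match the printed results
d <- data.table(
  date  = as.Date("2013-06-01") + 0:7,
  light = c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)
)

# Each TRUE row opens a new group; FALSE rows inherit the group of the last TRUE
d[, group := cumsum(light)]

# Within a group, count the FALSE rows seen so far
d[, distance := cumsum(!light), by = group]

# Drop the helper column
d[, group := NULL]

d$distance
```

The group ids differ from those produced by the three-step version, but only the partition matters for the by= calculation.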
This should do it:
d[, distance := 1:.N - 1, by = cumsum(light)]
or this:
d[, distance := .I - .I[1], by = cumsum(light)]
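To see why grouping by cumsum(light) works, it may help to look at the intermediate values. The following is a rough base R sketch on a hypothetical light vector; ave() stands in for the data.table by= grouping here.

```r
# Hypothetical input vector
light <- c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)

# cumsum(light) increases by one at every TRUE, so each TRUE row and the
# FALSE rows that follow it share one group id
grp <- cumsum(light)
grp

# Row position within each group, counted from 0 at the group's first row
distance <- ave(seq_along(light), grp, FUN = function(i) i - i[1])
distance
```

Note that any rows before the first TRUE fall into group 0 and are counted from the first row of the data.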
And if you want to count the number of days rather than the row distance, you could use:
d[, distance := as.numeric(as.POSIXct(date, format = "%m/%d/%Y") -
                           as.POSIXct(date[1], format = "%m/%d/%Y"),
                           units = 'days'),
  by = cumsum(light)]
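The difference between row distance and day distance only shows up when dates are missing. Here is a small sketch with a gap in the dates; the data and the column name days are hypothetical, and it assumes the date column is (or can be converted to) class Date rather than POSIXct, which can sidestep daylight-saving effects.

```r
library(data.table)

# Hypothetical data with no rows for 2013-06-03, 2013-06-04 and 2013-06-07
d <- data.table(
  date  = as.Date(c("2013-06-01", "2013-06-02", "2013-06-05",
                    "2013-06-06", "2013-06-08")),
  light = c(TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Days elapsed since the first row of each group (the most recent TRUE row)
d[, days := as.numeric(date - date[1], units = "days"), by = cumsum(light)]

d$days
```

The third row is two rows after the last TRUE but four days after it, which is the case the day-based version is meant to handle.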
An approach using run length encoding (rle) and sequence (which is a wrapper for unlist(lapply(nvec, seq_len))):
d[, distance := sequence(rle(light)$lengths)][(light), distance := 0L]
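The intermediate steps of the rle approach can be sketched in base R as follows. The light vector is a hypothetical example, and ifelse() replaces the second data.table assignment.

```r
# Hypothetical input vector
light <- c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)

# Run length encoding: lengths and values of consecutive runs
runs <- rle(light)
runs$lengths
runs$values

# 1-based position of each element inside its run
pos <- sequence(runs$lengths)
pos

# Positions inside TRUE runs are reset to 0; FALSE runs keep their count
distance <- ifelse(light, 0L, pos)
distance
```

Because the count restarts with every run of FALSE values, this gives the number of rows since the last TRUE without needing a grouping column.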