How to get frequency counts using column breaks by row?
As an alternative to rle(), you can use diff():
dat %>%
group_by(name) %>%
summarise(ever_inv = sum(diff(c(0, srvc_inv)) > 0))
# A tibble: 1 x 2
# name ever_inv
# <fct> <int>
# 1 Bob 2
Assuming that srvc_inv is either 0 or 1, diff(srvc_inv) == 1 only where x[i] is 1 and x[i-1] is 0; otherwise the difference is 0 or -1. I prepended 0 to srvc_inv to handle the case where the vector starts with a run of 1s.
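To see why the leading 0 matters, here is a small sketch on a made-up vector (x is hypothetical and not taken from the question's data; it is assumed to contain only 0s and 1s):

```r
# Hypothetical 0/1 vector that starts with a run of 1s
x <- c(1, 1, 0, 1, 0, 0, 1, 1)

# Without the leading 0, the run at the very start is missed
runs_without_pad <- sum(diff(x) > 0)

# With the leading 0, every 0 -> 1 transition is counted, including the first
runs_with_pad <- sum(diff(c(0, x)) > 0)

runs_without_pad  # 2
runs_with_pad     # 3
```

The padded version matches the three runs of 1s in x, while the unpadded one undercounts by one.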
And with rle(), in my opinion, there is an even simpler solution:
dat %>%
group_by(name) %>%
summarise(ever_inv = sum(rle(srvc_inv)$values))
# A tibble: 1 x 2
# name ever_inv
# <fct> <int>
# 1 Bob 2
Assuming that srvc_inv is either 0 or 1, it is enough to sum the values component of the rle object, which gives the number of runs of 1s.
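As a rough illustration of why the sum works, this sketch inspects the rle() result for a made-up vector (x is hypothetical, assumed to hold only 0s and 1s):

```r
# Hypothetical 0/1 vector
x <- c(1, 1, 0, 1, 0, 0, 1, 1)

r <- rle(x)
r$lengths  # 2 1 1 2 2
r$values   # 1 0 1 0 1

# Each run contributes its value once, so runs of 1s add 1 and runs of 0s add 0
n_runs_of_ones <- sum(r$values)
n_runs_of_ones  # 3
```

If srvc_inv could contain values other than 0 and 1, the sum would no longer be a run count, so this shortcut relies on that assumption.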
One possibility could be:
dat %>%
group_by(name) %>%
mutate(rleid = with(rle(srvc_inv), rep(seq_along(lengths), lengths))) %>%
summarise(ever_inv = n_distinct(rleid[srvc_inv == 1]))
name ever_inv
<fct> <int>
1 Bob 2
One more solution, based on base R rle():
library(dplyr)
dat %>% group_by(name) %>%
summarise(ever_inv = length(with(rle(srvc_inv), lengths[values==1])))
# A tibble: 1 x 2
name ever_inv
<fct> <int>
1 Bob 2