Conditionally pasting values from one column to another in R
You can try pivoting the data to long format, applying your transformation, and then pivoting it back to wide.
library(dplyr)
library(tidyr)
url_col_names <- c("home", "login", "product_page")
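df %>%
  # all_of() avoids the tidyselect warning about bare external vectors
  pivot_longer(all_of(url_col_names), names_to = "url", values_to = "url_duration") %>%
  # the indicator is 0 or 1, so the product is either 0 or the duration
  mutate(url_duration = url_duration * duration) %>%
  pivot_wider(names_from = "url", values_from = "url_duration")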
# A tibble: 6 x 7
     ID URL_visit URL_name     duration  home login product_page
  <dbl>     <dbl> <fct>           <dbl> <dbl> <dbl>        <dbl>
1     1         1 home               14    14     0            0
2     1         2 login              40     0    40            0
3     1         3 product_page      233     0     0          233
4     2         1 home                8     8     0            0
5     3         1 home               76    76     0            0
6     3         2 product_page      561     0     0          561
Another way, probably simpler, is to multiply the indicator columns by duration directly with across().
df %>%
  mutate(across(any_of(url_col_names), ~ . * duration))
  ID URL_visit     URL_name duration home login product_page
1  1         1         home       14   14     0            0
2  1         2        login       40    0    40            0
3  1         3 product_page      233    0     0          233
4  2         1         home        8    8     0            0
5  3         1         home       76   76     0            0
6  3         2 product_page      561    0     0          561
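As a quick sanity check, the two approaches above can be run side by side and compared. This is only a sketch: the data frame below is reconstructed from the printed output and is an assumption about what the original df looks like, and the names via_pivot and via_across are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Assumed sample data, reconstructed from the printed output above
df <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  URL_visit = c(1, 2, 3, 1, 1, 2),
  URL_name = c("home", "login", "product_page", "home", "home", "product_page"),
  duration = c(14, 40, 233, 8, 76, 561),
  home = c(1, 0, 0, 1, 1, 0),
  login = c(0, 1, 0, 0, 0, 0),
  product_page = c(0, 0, 1, 0, 0, 1)
)
url_col_names <- c("home", "login", "product_page")

# Approach 1: long format, multiply, back to wide
via_pivot <- df %>%
  pivot_longer(all_of(url_col_names), names_to = "url", values_to = "url_duration") %>%
  mutate(url_duration = url_duration * duration) %>%
  pivot_wider(names_from = "url", values_from = "url_duration")

# Approach 2: multiply the indicator columns in place
via_across <- df %>%
  mutate(across(all_of(url_col_names), ~ .x * duration))

# Compare as plain data frames, because the pivot route returns a tibble
all.equal(as.data.frame(via_pivot), via_across)
```

Both routes rely on the indicator columns holding only 0 and 1; with other values the multiplication would scale the duration rather than copy it.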
Edit
On another note, I imagine you created those indicator variables yourself? If you only want to replace them, you may not need to create them in the first place: you can call pivot_wider() from the start.
This assumes that each combination of ID and URL_visit identifies a unique row.
# Keep only the first four columns (ID, URL_visit, URL_name, duration)
df2 <- df[, 1:4]
df2 %>%
  pivot_wider(names_from = "URL_name", values_from = "duration", values_fill = 0)
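The uniqueness assumption can be checked before pivoting. The sketch below is an illustration under assumed sample data (taken from the question's example values); the names wide and result are hypothetical. It also shows one possible way to get URL_name and duration back, since pivot_wider() consumes both columns.

```r
library(dplyr)
library(tidyr)

# Assumed sample data without the indicator columns
df2 <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  URL_visit = c(1, 2, 3, 1, 1, 2),
  URL_name = c("home", "login", "product_page", "home", "home", "product_page"),
  duration = c(14, 40, 233, 8, 76, 561)
)

# anyDuplicated() returns 0 when no ID/URL_visit pair is repeated
stopifnot(anyDuplicated(df2[, c("ID", "URL_visit")]) == 0)

# One column per URL name, filled with the duration (0 where not visited)
wide <- df2 %>%
  pivot_wider(names_from = "URL_name", values_from = "duration", values_fill = 0)

# Optional: join back to keep the URL_name and duration columns as well
result <- left_join(df2, wide, by = c("ID", "URL_visit"))
```

If the check fails, pivot_wider() would have more than one duration per cell and would return list-columns with a warning, so it is worth running first.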
A simple multiplication should do the trick (this is equivalent to @Adam's tidyverse solution above, but in base R):
url_col_names <- c('home','login','product_page')
df[, url_col_names] <- df$duration * df[, url_col_names]
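To rename the columns, you can do the following (match() pairs each old name with its new name even if the columns are not in the same order as url_col_names):
names(df)[match(url_col_names, names(df))] <- paste0(url_col_names, '_', 'duration')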
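If the 0/1 indicator columns should stay untouched, a base R variant could write the products into new columns instead of overwriting and renaming. This is a minimal sketch with assumed sample data; new_names is a hypothetical helper variable.

```r
# Assumed sample data, matching the example values used in this thread
df <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  URL_visit = c(1, 2, 3, 1, 1, 2),
  URL_name = c("home", "login", "product_page", "home", "home", "product_page"),
  duration = c(14, 40, 233, 8, 76, 561),
  home = c(1, 0, 0, 1, 1, 0),
  login = c(0, 1, 0, 0, 0, 0),
  product_page = c(0, 0, 1, 0, 0, 1)
)
url_col_names <- c("home", "login", "product_page")

# Build the new column names, then assign the products to them;
# the duration vector is multiplied into each indicator column row by row
new_names <- paste0(url_col_names, "_duration")
df[new_names] <- df[url_col_names] * df$duration
```

After this, df holds both the original indicators and the home_duration, login_duration and product_page_duration columns.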
Similar to @Adam's answer, across() can be used with ifelse() to compute the variables using a structure similar to the one the user mentioned:
library(dplyr)
# Data
df <- data.frame("ID" = c(1, 1, 1, 2, 3, 3),
                 "URL_visit" = c(1, 2, 3, 1, 1, 2), # e.g. customer ID #1 has visited 3 pages
                 "URL_name" = c("home", "login", "product_page", "home", "home", "product_page"),
                 "duration" = c(14, 40, 233, 8, 76, 561),
                 "home" = c(1, 0, 0, 1, 1, 0),
                 "login" = c(0, 1, 0, 0, 0, 0),
                 "product_page" = c(0, 0, 1, 0, 0, 1)
)
# Code
df %>%
  mutate(across(c(home:product_page), ~ ifelse(. == 1, duration, .)))
Output:
  ID URL_visit     URL_name duration home login product_page
1  1         1         home       14   14     0            0
2  1         2        login       40    0    40            0
3  1         3 product_page      233    0     0          233
4  2         1         home        8    8     0            0
5  3         1         home       76   76     0            0
6  3         2 product_page      561    0     0          561
Also, if the original variables need to be kept, pass a named list to .fns so that across() creates new columns with a _duration suffix:
df %>%
  mutate(across(c(home:product_page), .fns = list(duration = ~ ifelse(. == 1, duration, .))))
Output:
  ID URL_visit     URL_name duration home login product_page home_duration login_duration
1  1         1         home       14    1     0            0            14              0
2  1         2        login       40    0     1            0             0             40
3  1         3 product_page      233    0     0            1             0              0
4  2         1         home        8    1     0            0             8              0
5  3         1         home       76    1     0            0            76              0
6  3         2 product_page      561    0     0            1             0              0
  product_page_duration
1                     0
2                     0
3                   233
4                     0
5                     0
6                   561
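The same result can probably also be reached with the .names argument of across(), which sets the new column names through a glue-style template instead of a named list. A sketch under the same sample data (out is a hypothetical name):

```r
library(dplyr)

# Same sample data as in this answer
df <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  URL_visit = c(1, 2, 3, 1, 1, 2),
  URL_name = c("home", "login", "product_page", "home", "home", "product_page"),
  duration = c(14, 40, 233, 8, 76, 561),
  home = c(1, 0, 0, 1, 1, 0),
  login = c(0, 1, 0, 0, 0, 0),
  product_page = c(0, 0, 1, 0, 0, 1)
)

# "{.col}" is replaced by each selected column's name, so the originals
# are kept and the new columns get a "_duration" suffix
out <- df %>%
  mutate(across(home:product_page,
                ~ ifelse(.x == 1, duration, .x),
                .names = "{.col}_duration"))
```

The template form can be handy when the suffix or prefix needs to vary without renaming the function in a list.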