bind_rows of different data types
We can use rbindlist
from data.table
library(data.table)
rbindlist(list(ds_a, ds_b))
# x
#1: 1
#2: 2
#3: 3
#4: 4
#5: 5
#6: 6
#7: z1
#8: z2
Recently I switched to an approach that keeps all columns as strings initially(when converting from plain-text to a data.frame), then stacks, and finally converts the columns to an appropriate data type after it has all the rows to make a decision (using readr::type_convert()
).
It mimics this example. I haven't done any performance comparisons but there wasn't a noticeable difference (the internet is the real bottleneck). Also, I kinda like the idea of reducing the number of data type conversions.
library(magrittr)
col_types <- readr::cols(.default = readr::col_character())
raw_a <- "x,y\n1,21\n2,22\n3,23\n4,24\n5,25\n6,26"
raw_b <- "x,y\nz1,31\nz2,32"
ds_a <- readr::read_csv(raw_a, col_types=col_types)
ds_b <- readr::read_csv(raw_b, col_types=col_types)
list(ds_a, ds_b) %>%
dplyr::bind_rows() %>%
readr::type_convert()
#> Parsed with column specification:
#> cols(
#> x = col_character(),
#> y = col_double()
#> )
#> # A tibble: 8 x 2
#> x y
#> <chr> <dbl>
#> 1 1 21
#> 2 2 22
#> 3 3 23
#> 4 4 24
#> 5 5 25
#> 6 6 26
#> 7 z1 31
#> 8 z2 32
Created on 2019-12-03 by the reprex package (v0.3.0)