Split string with repeated delimiters
# strip.white = TRUE drops the space left after each comma
read.csv(text = gsub("([^,]+,[^,]+),", "\\1\n", example),
         header = FALSE, stringsAsFactors = FALSE, strip.white = TRUE)
#              V1                  V2
# 1 namei1 namej1            surname1
# 2         name2 surnamei2 surnamej2
# 3         name3            surname3
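The input `example` is not defined in this section. The sketch below assumes it is a single comma-separated string, reconstructed from the printed output rather than taken from the original post, and shows the intermediate text that the `gsub` call hands to `read.csv`.

```r
# Assumed input, reconstructed from the output shown above
example <- "namei1 namej1, surname1, name2, surnamei2 surnamej2, name3, surname3"

# Match a "field,field" pair and turn the comma that follows it into a newline,
# so every pair ends up on its own line
rows <- gsub("([^,]+,[^,]+),", "\\1\n", example)

# One line per record; lines after the first keep a leading space
writeLines(rows)
```

Because the last pair has no trailing comma, it is left untouched and simply becomes the final line.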
We can split the string at `,` followed by zero or more spaces (`\\s*`), create a grouping variable from the occurrences of the 'name' string, split the vector (`v1`) into a `list` of `vector`s by that variable, `rbind` the `list` elements, and convert the result to a `data.frame`.
v1 <- strsplit(example, ",\\s*")[[1]]
setNames(do.call(rbind.data.frame, split(v1, cumsum(grepl('\\bname', v1)))),
         paste0("V", 1:2))
#             V1                  V2
#1 namei1 namej1            surname1
#2         name2 surnamei2 surnamej2
#3         name3            surname3
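To see why the grouping works, it may help to look at the intermediate values. This is a minimal sketch: the `example` string is an assumption reconstructed from the output, and the names `starts`, `grp`, and `groups` are introduced here only for illustration.

```r
# Assumed input, reconstructed from the output shown above
example <- "namei1 namej1, surname1, name2, surnamei2 surnamej2, name3, surname3"

v1 <- strsplit(example, ",\\s*")[[1]]

# TRUE for fields where 'name' begins a word; "surname..." does not match
# because \\b requires a word boundary right before 'name'
starts <- grepl("\\bname", v1)

# Running count of TRUEs: each name field opens a new record id
grp <- cumsum(starts)

# One character vector per record
groups <- split(v1, grp)
groups
```

Each surname inherits the record id of the name before it, which is what lets `split` pair them up.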
Another option is to read the fields with `scan` and convert the result to a two-column matrix:
as.data.frame(matrix(trimws(scan(text = example, sep = ",",
                                 what = "", quiet = TRUE)),
                     byrow = TRUE, ncol = 2))
#             V1                  V2
#1 namei1 namej1            surname1
#2         name2 surnamei2 surnamej2
#3         name3            surname3
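The `matrix(..., ncol = 2)` step relies on the fields strictly alternating between name and surname. As a minimal sketch, a length check can make that assumption explicit. Both the `example` string (reconstructed from the output) and the helper `pairs_to_df` are hypothetical illustrations, not part of the original answer.

```r
# Assumed input, reconstructed from the output shown above
example <- "namei1 namej1, surname1, name2, surnamei2 surnamej2, name3, surname3"

# Hypothetical helper: stop on an odd number of fields instead of
# letting matrix() recycle values to fill the last row
pairs_to_df <- function(x) {
  fields <- trimws(scan(text = x, sep = ",", what = "", quiet = TRUE))
  stopifnot(length(fields) %% 2 == 0)
  as.data.frame(matrix(fields, byrow = TRUE, ncol = 2))
}

res <- pairs_to_df(example)
res
```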
Another option is `gsub`: replace each `,` followed by a space and the string 'name' with `\n` followed by 'name', then pass the result to `read.csv`, which splits each line on the delimiter `,`.
# strip.white = TRUE drops the space left after the remaining commas
read.csv(text = gsub(", name", "\nname", example), header = FALSE,
         strip.white = TRUE)
#             V1                  V2
#1 namei1 namej1            surname1
#2         name2 surnamei2 surnamej2
#3         name3            surname3
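Another option is to split on ", " and let `split` recycle the grouping vector `c(0, 1)`, which sends alternating fields to two columns:
data.frame(split(unlist(strsplit(example, ", ")), c(0, 1)))
#             X0                  X1
#1 namei1 namej1            surname1
#2         name2 surnamei2 surnamej2
#3         name3            surname3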
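This last variant works because `split` recycles its grouping vector along the fields, so odd positions land in group 0 and even positions in group 1. A small sketch of that step, assuming an `example` string reconstructed from the output (the names `fields` and `parts` are illustrative):

```r
# Assumed input, reconstructed from the output shown above
example <- "namei1 namej1, surname1, name2, surnamei2 surnamej2, name3, surname3"

fields <- unlist(strsplit(example, ", "))

# c(0, 1) is recycled to 0, 1, 0, 1, 0, 1 across the six fields
parts <- split(fields, c(0, 1))

parts[["0"]]  # fields at odd positions (names)
parts[["1"]]  # fields at even positions (surnames)
```

`data.frame` then turns the list names `0` and `1` into the syntactically valid column names `X0` and `X1`. Like the matrix approach, this assumes names and surnames strictly alternate.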