Split columns by number in a dataframe

A simple gsub would be my choice: strip the digits to get the text part, and strip the letters to get the number part.

section <- c('View 500', 'V458', '453')

cbind(section = trimws(gsub('[0-9]', '', section)), 
      section_numbers = trimws(gsub('[a-zA-Z]', '', section)))

I use trimws to remove any leading or trailing whitespace left over after the substitution.

Output:

     section section_numbers
[1,] "View"  "500"          
[2,] "V"     "458"          
[3,] ""      "453"
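
If the result should end up back in a data frame rather than a character matrix, the same gsub idea can be applied column-wise. This is only a minimal sketch: the data frame df and the column name section_number are assumptions made for illustration, and the number column is converted with as.integer on the assumption that every row contains digits.

```r
# Hypothetical data frame; the column name `section` follows the example above.
df <- data.frame(section = c("View 500", "V458", "453"),
                 stringsAsFactors = FALSE)

# Keep the original strings so both new columns are derived from the same input.
raw <- df$section

# Text part: drop the digits, then trim leftover whitespace.
df$section <- trimws(gsub("[0-9]", "", raw))

# Number part: drop the letters, trim, and convert to integer.
df$section_number <- as.integer(trimws(gsub("[a-zA-Z]", "", raw)))

print(df)
```

Note that this only strips ASCII letters and digits; punctuation or accented characters in the input would stay in both columns.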

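You can use tidyr for this (df is assumed to be a data frame with a character column named section):

df <- data.frame(section = c('View 500', 'V458', '453'), stringsAsFactors = FALSE)
tidyr::extract(df, section, c("section", "section number"), 
               regex="([[:alpha:]]*)[[:space:]]*([[:digit:]]*)")

Output:

  section section number
1    View            500
2       V            458
3                    453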

You can also use extract from the tidyr package. It lets you specify the capture groups yourself; here both groups are made optional, so a missing part comes back as NA, which makes it flexible enough to handle the different cases:

library(tidyr)
df %>% extract(section, into = c("alpha", "numeric"), regex = "([a-zA-Z]+)?\\s?(\\d+)?")

#  alpha numeric
#1  View     500
#2     V     458
#3  <NA>     453
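
As a possible follow-up, extract also takes a convert argument, which should turn the digit column into an integer column instead of leaving it as character. A rough sketch, assuming df looks like the example data (the data frame below and the name out are made up for illustration):

```r
library(tidyr)

# Hypothetical input mirroring the example above.
df <- data.frame(section = c("View 500", "V458", "453"),
                 stringsAsFactors = FALSE)

# Same optional capture groups as before; convert = TRUE asks tidyr to
# run type conversion on the new columns.
out <- tidyr::extract(df, section, into = c("alpha", "numeric"),
                      regex = "([a-zA-Z]+)?\\s?(\\d+)?", convert = TRUE)

# Inspect the resulting column types.
str(out)
```

With convert left at its default of FALSE, both new columns stay character, so any arithmetic on the numbers would need an explicit as.integer or as.numeric afterwards.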
