Reordering columns in a large dataframe
The dplyr package's dplyr::relocate function, a new verb introduced in dplyr 1.0.0, does exactly what you are looking for, with highly readable syntax.
df %>% dplyr::relocate(b, c, .after = f)
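To make the one-liner above easier to try, here is a small self-contained sketch. The toy data frame is an assumption (it only stands in for the df used in this thread); relocate() and its .before and .after arguments are part of dplyr 1.0.0 and later.

```r
library(dplyr)

# Toy data frame (an assumption; stands in for the df used in this thread)
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15, f = 16:18)

# Move b and c so they sit directly after f
after_f <- df %>% relocate(b, c, .after = f)
names(after_f)   # "a" "d" "e" "f" "b" "c"

# Move f so it sits directly before b
before_b <- df %>% relocate(f, .before = b)
names(before_b)  # "a" "f" "b" "c" "d" "e"

# With neither .before nor .after, the selected columns go to the front
to_front <- df %>% relocate(e, f)
names(to_front)  # "e" "f" "a" "b" "c" "d"
```

Because the columns are referred to by name, this keeps working if other columns are added or removed later.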
You can refer to columns by position, e.g.:
df <- df[ ,c(1,4:6,2:3)]
> df
  a      d      e      f      b      c
1 1 Rabbit    Cat    Cat    Cat    Dog
2 2    Cat    Dog    Dog    Dog Rabbit
3 3    Dog    Dog    Dog Rabbit    Cat
4 4    Dog Rabbit Rabbit    Cat    Dog
5 5 Rabbit    Cat    Cat    Dog    Dog
If you're just moving certain columns to the end, you can create a little helper function like the following:
movetolast <- function(data, move) {
  data[c(setdiff(names(data), move), move)]
}
movetolast(df, c("b", "c"))
#   a      d      e      f      b      c
# 1 1 Rabbit    Cat    Cat    Cat    Dog
# 2 2    Cat    Dog    Dog    Dog Rabbit
# 3 3    Dog    Dog    Dog Rabbit    Cat
# 4 4    Dog Rabbit Rabbit    Cat    Dog
# 5 5 Rabbit    Cat    Cat    Dog    Dog
I would not recommend getting too into the habit of using column positions, especially not from a programmatic standpoint, since those positions might change.
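As a rough illustration of the name-based alternative, the positional reorder can be written with column names in base R. The toy data frame and the new_order vector below are assumptions made up for this sketch.

```r
# Toy data frame (an assumption; stands in for the df used in this thread)
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15, f = 16:18)

# Desired order, spelled out by name rather than by position
new_order <- c("a", "d", "e", "f", "b", "c")

# Fail early if a name is misspelled or a column would be dropped
stopifnot(setequal(new_order, names(df)))

df_by_name <- df[new_order]
names(df_by_name)  # "a" "d" "e" "f" "b" "c"
```

If a column is later inserted into df, the stopifnot() check flags the mismatch instead of silently reordering the wrong columns.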
"For fun" update
Here's an extended interpretation of the above function. It allows you to move columns to either the first or last position, or to be before or after another column.
moveMe <- function(data, tomove, where = "last", ba = NULL) {
  temp <- setdiff(names(data), tomove)
  x <- switch(
    where,
    first = data[c(tomove, temp)],
    last = data[c(temp, tomove)],
    before = {
      if (is.null(ba)) stop("must specify ba column")
      if (length(ba) > 1) stop("ba must be a single character string")
      data[append(temp, values = tomove, after = (match(ba, temp)-1))]
    },
    after = {
      if (is.null(ba)) stop("must specify ba column")
      if (length(ba) > 1) stop("ba must be a single character string")
      data[append(temp, values = tomove, after = (match(ba, temp)))]
    })
  x
}
Try it with the following.
moveMe(df, c("b", "c"))
moveMe(df, c("b", "c"), "first")
moveMe(df, c("b", "c"), "before", "e")
moveMe(df, c("b", "c"), "after", "e")
You'll need to adapt it to add some error checking. For instance, if you try to move columns "b" and "c" to "before c", you'll (obviously) get an error.
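One possible shape for that error checking is a separate validation step that runs before the move. This is only a sketch: validateMove is a hypothetical helper that is not part of the original function, and the toy data frame is likewise an assumption.

```r
# Hypothetical helper (not part of moveMe): checks arguments before a move
validateMove <- function(data, tomove, ba = NULL) {
  # Every column to be moved must exist in the data
  missing_cols <- setdiff(tomove, names(data))
  if (length(missing_cols) > 0) {
    stop("columns not found: ", paste(missing_cols, collapse = ", "))
  }
  if (!is.null(ba)) {
    if (length(ba) != 1) stop("ba must be a single character string")
    if (!ba %in% names(data)) stop("ba column not found: ", ba)
    # The anchor column cannot be one of the columns being moved
    if (ba %in% tomove) stop("ba cannot be one of the columns being moved")
  }
  invisible(TRUE)
}

# Toy data frame (an assumption; stands in for the df used in this thread)
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15, f = 16:18)

validateMove(df, c("b", "c"), ba = "e")    # passes silently
# validateMove(df, c("b", "c"), ba = "c")  # stops with a readable message
```

Calling a check like this at the top of moveMe would turn the failure in the "before c" case into a message that names the actual problem.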
To move specific columns to the beginning or end of a data.frame, use select from the dplyr package and its everything() function. In this example we are sending to the end:
library(dplyr)
df %>%
select(-b, -c, everything())
  a      d      e      f      b      c
1 1 Rabbit    Cat    Cat    Cat    Dog
2 2    Cat    Dog    Dog    Dog Rabbit
3 3    Dog    Dog    Dog Rabbit    Cat
4 4    Dog Rabbit Rabbit    Cat    Dog
5 5 Rabbit    Cat    Cat    Dog    Dog
Without the negation, the columns would be sent to the front.
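For completeness, here is a minimal sketch of the front-moving variant next to the end-moving one. The toy data frame is an assumption standing in for df.

```r
library(dplyr)

# Toy data frame (an assumption; stands in for the df used in this thread)
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15, f = 16:18)

# Without the negation: b and c come first, everything() fills in the rest
front <- df %>%
  select(b, c, everything())
names(front)  # "b" "c" "a" "d" "e" "f"

# With the negation: b and c are dropped first, then re-added at the end
end <- df %>%
  select(-b, -c, everything())
names(end)    # "a" "d" "e" "f" "b" "c"
```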