How does dplyr's select helper function everything() differ from copying?
Looking for references to everything() in ?select, there is an example that uses it to reorder columns:
# Reorder variables: keep the variable "Species" in the front
select(iris, Species, everything())
In this case the Species column is moved to the first position, all columns are kept, and no columns are duplicated.
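As a quick check of that claim, the following sketch compares the column names before and after the reordering. It assumes only that dplyr is installed; the variable name reordered is just for illustration.

```r
library(dplyr)

reordered <- select(iris, Species, everything())

# Species comes first; the remaining columns keep their original order.
names(reordered)

# Same set of columns as the original data frame.
setequal(names(reordered), names(iris))

# anyDuplicated() returns 0 when no name is repeated.
anyDuplicated(names(reordered)) == 0
```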
Select helpers are used for more than just the select() function. For example, in dplyr version 1.0 and greater, you may want to use everything() in across() to mutate or summarize all columns.
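A minimal sketch of that usage, assuming dplyr 1.0 or later. The numeric_iris subset is an illustrative choice so that mean() and round() apply to every column; it is not part of the original answer.

```r
library(dplyr)

# Keep only the four numeric columns of iris.
numeric_iris <- iris[, 1:4]

# Summarize every column with its mean (one row, four columns).
col_means <- summarise(numeric_iris, across(everything(), mean))

# Mutate every column: round each value to the nearest whole number.
rounded <- mutate(numeric_iris, across(everything(), round))
```

Here everything() plays the same role as in select(): it stands for all columns of the data frame that across() is working on.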
Since this question was asked, the select helpers have been broken out into their own package, tidyselect. The tidyselect page on CRAN has a lengthy list of reverse imports, so it's likely that many of the packages importing tidyselect have cases where everything() is useful.
Another example use case:
# Moves the variable Petal.Length to the end
select(iris, -Petal.Length, everything())
(I saw it here: https://stackoverflow.com/a/30472217/4663008)
Either way, I find both Gregor's example and mine confusing: I would have expected Species to be duplicated in Gregor's example, or Petal.Length to be removed in mine. For example, if you try something slightly more complicated based on the previous two examples, Petal.Length is not dropped:
> dplyr::select(iris, Petal.Width, -Petal.Length, everything())
  Petal.Width Sepal.Length Sepal.Width Petal.Length Species
1         0.2          5.1         3.5          1.4  setosa
2         0.2          4.9         3.0          1.4  setosa
3         0.2          4.7         3.2          1.3  setosa
Update: After a quick response from Hadley on GitHub, I found out that there is special behaviour when everything() is combined with a negative in the first position in select(): a leading negative makes select() start with all the variables and remove the named one, and everything() then adds it back at the end. A negative variable in any position other than the first does not work as one might expect.
I agree that the negative variable in first position and the everything() select helper need to be better explained in the documentation.
Update 2: the documentation for ?select has now been updated to state "Positive values select variables; negative values to drop variables. If the first expression is negative, select() will automatically start with all variables."
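The three cases discussed above can be compared side by side. This is only a sketch; the variable names dropped, moved, and kept are illustrative, and the exact behaviour may depend on your dplyr and tidyselect versions.

```r
library(dplyr)

# Negative in first position: start from all columns, then drop Petal.Length.
dropped <- select(iris, -Petal.Length)
names(dropped)

# Negative first, then everything(): Petal.Length is dropped and then
# added back by everything(), so it ends up in the last position.
moved <- select(iris, -Petal.Length, everything())
names(moved)

# Negative in a non-first position: the selection starts from Petal.Width
# only, so removing Petal.Length changes nothing, and everything() then
# appends all the remaining columns, including Petal.Length.
kept <- select(iris, Petal.Width, -Petal.Length, everything())
names(kept)
```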