Identifying duplicate columns in a dataframe

How about:

testframe[!duplicated(as.list(testframe))]
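
For anyone who wants to try this out, here is a minimal sketch. The contents of testframe are made up for illustration (the question's actual data is not shown here), and the names dup and deduped are just placeholders.

```r
# Hypothetical example data: columns a and c hold identical values.
testframe <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(1, 2, 3))

# duplicated() on a list flags each element that equals an earlier element.
dup <- duplicated(as.list(testframe))

# Keep only the first occurrence of each distinct column.
deduped <- testframe[!dup]
names(deduped)
```

Column names are not part of the comparison, so c is dropped even though it is named differently from a.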

You can do it with lapply:

testframe[!duplicated(lapply(testframe, summary))]

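summary summarizes the distribution of each column while ignoring the order of its values, so columns that contain the same values in a different order are also treated as duplicates. Be aware that two genuinely different columns with identical summary statistics would be flagged as well.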
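
A small sketch of what ignoring the order means in practice. The data frame df and its columns x and y are hypothetical, chosen only to show the difference from an exact comparison.

```r
# Hypothetical data: y holds the same values as x, in reverse order.
df <- data.frame(x = c(1, 2, 3), y = c(3, 2, 1))

# Exact comparison: the two columns are different vectors.
exact_dup <- duplicated(as.list(df))

# Summary-based comparison: min, quartiles, mean and max all match,
# so y is flagged as a duplicate of x.
summary_dup <- duplicated(lapply(df, summary))

exact_dup
summary_dup
```

Whether that behaviour is desirable depends on whether reordered columns should count as duplicates for your data.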

I'm not 100% sure, but I would use digest if the data is huge:

library(digest)
testframe[!duplicated(lapply(testframe, digest))]

A nice trick is to transpose your data frame and then check for duplicates. t() turns the columns into the rows of a matrix, and duplicated() on a matrix flags duplicate rows:

duplicated(t(testframe))
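
The result is a logical flag per column, so it can be used to drop the duplicates. A minimal sketch, again with made-up data for testframe and placeholder names dup and deduped:

```r
# Hypothetical example data: columns a and c hold identical values.
testframe <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(1, 2, 3))

# t() converts the data frame to a matrix whose rows are the original columns;
# duplicated() then flags rows that repeat an earlier row.
# as.vector() strips the dim/names attributes so we get a plain logical vector.
dup <- as.vector(duplicated(t(testframe)))

# Drop the flagged columns.
deduped <- testframe[!dup]
names(deduped)
```

One thing to keep in mind: t() goes through a matrix, so a data frame with mixed column types is coerced to a single common type before the comparison happens.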
