Aggregate a data frame based on unordered pairs of columns
One way is to create extra columns with pmax
and pmin
of id1
and id2
as follows. I'll use data.table
solution here.
require(data.table)
DT <- data.table(DF)
# Following mnel's suggestion, g1, g2 could be used directly in by
# and it could be even shortened by using `id1` and id2` as their names
DT.OUT <- DT[, list(size=sum(size)),
by=list(id1 = pmin(id1, id2), id2 = pmax(id1, id2))]
# id1 id2 size
# 1: 5400 5505 18
# 2: 5033 5458 1
# 3: 5452 2873 24
# 4: 5452 5213 2
# 5: 5452 4242 26
# 6: 4823 4823 4
an alternate method:
R> library(igraph)
R> DF
id1 id2 size
1 5400 5505 7
2 5033 5458 1
3 5452 2873 24
4 5452 5213 2
5 5452 4242 26
6 4823 4823 4
7 5505 5400 11
R> g <- graph.data.frame(DF, directed=F)
R> g <- simplify(g, edge.attr.comb="sum", remove.loops=FALSE)
R> DF <- get.data.frame(g)
R> DF
id1 id2 size
1 5400 5505 18
2 5033 5458 1
3 5452 2873 24
4 5452 5213 2
5 5452 4242 26
6 4823 4823 4