How to aggregate categorical data in R?

As mentioned in the comments, table is standard for this, like

table(stack(DT))

         ind
values    Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

Click to copy

table(value = unlist(DT), cat = names(DT)[col(DT)])

         cat
value     Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

Click to copy

with(reshape(DT, direction = "long", varying = 1:2), 
  table(value = Category, cat = time)
)

         cat
value     x y
  Better  2 2
  Similar 1 2
  Worse   1 0

Click to copy

sapply(df1, function(x) sapply(unique(unlist(df1)), function(y) sum(y == x)))
#        Category.x Category.y
#Better           2          2
#Similar          1          2
#Worse            1          0

One dplyr and tidyr possibility could be:

Click to copy

df %>%
 gather(var, val) %>%
 count(var, val) %>%
 spread(var, n, fill = 0)

  val     Category.x Category.y
  <chr>        <dbl>      <dbl>
1 Better           2          2
2 Similar          1          2
3 Worse            1          0

It, first, transforms the data from wide to long format, with column "var" including the variable names and column "val" the corresponding values. Second, it counts per "var" and "val". Finally, it spreads the data into the desired format.

Or with dplyr and reshape2 you can do:

Click to copy

df %>%
 mutate(rowid = row_number()) %>%
 melt(., id.vars = "rowid") %>%
 count(variable, value) %>%
 dcast(value ~ variable, value.var = "n", fill = 0)

    value Category.x Category.y
1  Better          2          2
2 Similar          1          2
3   Worse          1          0

How to aggregate categorical data in R?

Tags:

R

Aggregate

Related

Recent Posts