Frequency tables with weighted data in R
Just for the sake of completeness, using base R:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
aggregate(x = list("wt" = df$wt), by = list("var" = df$var), FUN = sum)
var wt
1 A 40
2 B 60
Or with the less cumbersome formula notation:
aggregate(wt ~ var, data = df, FUN = sum)
var wt
1 A 40
2 B 60
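If a table object is preferable to a data frame, base R's xtabs can do the same summation: with a variable on the left-hand side of the formula, it sums that variable within each level of the right-hand side. A minimal sketch, reusing the same df as above and adding prop.table for weighted proportions:

```r
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

# Sum the weights within each level of var; the result is a table object
tab <- xtabs(wt ~ var, data = df)
tab

# Convert the weighted counts to weighted proportions
prop.table(tab)
```

Because the result is a regular table, it can be passed on to functions such as addmargins or barplot.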
Another solution, from the expss package:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(expss)
fre(df$var, weight = df$wt)
| df$var | Count | Valid percent | Percent | Responses, % | Cumulative responses, % |
| ------ | ----- | ------------- | ------- | ------------ | ----------------------- |
| A | 40 | 40 | 40 | 40 | 40 |
| B | 60 | 60 | 60 | 60 | 100 |
| #Total | 100 | 100 | 100 | 100 | |
| <NA> | 0 | | 0 | | |
You can use the function svytable from the survey package, or wtd.table from rgrs.
EDIT: rgrs is now called questionr:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(questionr)
wtd.table(x = df$var, weights = df$wt)
# A B
# 40 60
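The svytable route mentioned above needs a survey design object first. A minimal sketch, assuming a simple design with no clusters or strata (hence ids = ~1) and treating wt as the sampling weight; a real survey would need its actual design specified here:

```r
library(survey)

df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

# Assumption: no clusters or strata, so ids = ~1; wt holds the weights
des <- svydesign(ids = ~1, weights = ~wt, data = df)

# Weighted frequency table of var
tab <- svytable(~var, design = des)
tab
```

This is more setup than wtd.table, but the design object can be reused for other survey functions such as svymean.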
That's also possible with dplyr:
library(dplyr)
count(x = df, var, wt = wt)
# # A tibble: 2 x 2
# var n
# <fctr> <dbl>
# 1 A 40
# 2 B 60
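If percentages are wanted alongside the weighted counts, the count call can be followed by a mutate. A small sketch on the same data; the column name pct is just an illustrative choice:

```r
library(dplyr)

df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

res <- df %>%
  count(var, wt = wt) %>%          # weighted count per level of var
  mutate(pct = 100 * n / sum(n))   # share of the total weight, in percent
res
```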
Using data.table you could do:
# using the same data as Victorp; setDT() converts df to a data.table by reference
library(data.table)
setDT(df)[, .(n = sum(wt)), by = var]
var n
1: A 40
2: B 60