Filtering data in R

You can drop any row containing a missing value using na.omit(); however, that alone is not what you want. Moreover, the currently accepted answer is wrong: it gives you complete columns, but it does not drop the rows that have one or more missing values, which is what was asked for. The correct answer can be obtained as:

> a <- data.frame(a=c(1,2),b=c(NA,1), c=c(3,4))
> a
  a  b c
1 1 NA 3
2 2  1 4
> na.omit(a)[,colSums(is.na(a))==0]
  a c
2 2 4

To see that the accepted answer is wrong:

> a[ ,apply(a, 2, function(z) !any(is.na(z)))]
  a c
1 1 3
2 2 4

Row 1 should be dropped because of the NA in column 2.
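
The difference between the two results comes down to filtering columns versus filtering rows. As a rough sketch on the same toy data (the names complete_cols, complete_rows, cols_only, rows_only and both are made up for this illustration), the two steps can be separated like this:

```r
# Same toy data as in the example above
a <- data.frame(a = c(1, 2), b = c(NA, 1), c = c(3, 4))

# TRUE for each column that contains no NA in the original data
complete_cols <- colSums(is.na(a)) == 0

# TRUE for each row that contains no NA in any column
complete_rows <- rowSums(is.na(a)) == 0

# Dropping incomplete columns only: both rows survive
cols_only <- a[, complete_cols, drop = FALSE]

# Dropping incomplete rows only: all three columns survive
rows_only <- a[complete_rows, , drop = FALSE]

# Applying both filters selects the same rows and columns as
# na.omit(a)[, colSums(is.na(a)) == 0]
both <- a[complete_rows, complete_cols, drop = FALSE]
```

drop = FALSE is used so that the result stays a data.frame even if only a single column or row is left.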

If x is your data.frame (or matrix), then the following keeps only the columns that contain no missing values:

x[ ,apply(x, 2, function(z) !any(is.na(z)))]

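Since your example uses NULL, is.na(·) would be replaced by is.null(·). Note that is.null() tests a whole object rather than each element, so for a list it has to be applied element by element, e.g. with sapply().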

Alternatively you can look at subset(·).
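
For instance, a minimal sketch of subset() on the toy data from the first answer might look like this (dat, rows_kept and cols_kept are names chosen for this illustration, and filtering on column b is just one possible condition):

```r
# Toy data; named dat here on purpose (see the note below)
dat <- data.frame(a = c(1, 2), b = c(NA, 1), c = c(3, 4))

# Keep only the rows where column b is not missing
rows_kept <- subset(dat, !is.na(b))

# Keep every column except b
cols_kept <- subset(dat, select = -b)
```

The data frame is called dat rather than a because subset() evaluates its arguments inside the data frame first, so with a data frame named a that also has a column a, the name a would refer to the column.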

a <- data.frame(a=c(1,2,0,1),b=c(NA,1,NA,1), c=c(3,4,5,1))

na.omit(a)
  a b c
2 2 1 4
4 1 1 1

a[rowSums(is.na(a))==0,]
  a b c
2 2 1 4
4 1 1 1

a[complete.cases(a),]
  a b c
2 2 1 4
4 1 1 1
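
As a quick check that the three approaches agree, here is a small sketch (by_na_omit, by_row_sums and by_complete are hypothetical names used only here):

```r
a <- data.frame(a = c(1, 2, 0, 1), b = c(NA, 1, NA, 1), c = c(3, 4, 5, 1))

by_na_omit  <- na.omit(a)
by_row_sums <- a[rowSums(is.na(a)) == 0, ]
by_complete <- a[complete.cases(a), ]

# The two indexing approaches select rows with the same logical vector
same_index <- identical(by_row_sums, by_complete)

# na.omit() also records the dropped rows in an "na.action" attribute,
# so compare the values only and ignore attributes
same_values <- isTRUE(all.equal(by_na_omit, by_complete,
                                check.attributes = FALSE))

# Positions of the rows that na.omit() removed
dropped <- as.integer(attr(by_na_omit, "na.action"))
```

The extra attribute is the main practical difference: na.omit() remembers which rows it removed, while plain indexing does not.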
