backtransform `scale()` for plotting

For a data frame or matrix:

set.seed(1)
x = matrix(sample(1:12), ncol= 3)
xs = scale(x, center = TRUE, scale = TRUE)

x.orig = t(apply(xs, 1, function(r)r*attr(xs,'scaled:scale') + attr(xs, 'scaled:center')))

print(x)
     [,1] [,2] [,3]
[1,]    4    2    3
[2,]    5    7    1
[3,]    6   10   11
[4,]    9   12    8

print(x.orig)
     [,1] [,2] [,3]
[1,]    4    2    3
[2,]    5    7    1
[3,]    6   10   11
[4,]    9   12    8

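Be careful when comparing the result with identical(). The round trip introduces floating-point error, and x.orig is stored as double while x is integer, so the two are not bit-for-bit identical: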

print(x - x.orig)
     [,1] [,2]         [,3]
[1,]    0    0 0.000000e+00
[2,]    0    0 8.881784e-16
[3,]    0    0 0.000000e+00
[4,]    0    0 0.000000e+00

identical(x, x.orig)
# FALSE
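
If the goal is to check that the round trip recovered the original values, a tolerance-based comparison is usually the better fit. A minimal sketch using base R's all.equal(), reusing the x and x.orig construction from above (the variable names same_bits and same_values are only illustrative):

```r
set.seed(1)
x  <- matrix(sample(1:12), ncol = 3)
xs <- scale(x, center = TRUE, scale = TRUE)

# Same back-transformation as above: rescale and recenter each row
x.orig <- t(apply(xs, 1, function(r) {
  r * attr(xs, "scaled:scale") + attr(xs, "scaled:center")
}))

# identical() demands exact equality, including storage type
same_bits <- identical(x, x.orig)

# all.equal() compares numerically within a small tolerance;
# wrap it in isTRUE() because it returns a description on mismatch
same_values <- isTRUE(all.equal(x, x.orig, check.attributes = FALSE))

print(same_bits)
print(same_values)
```

Here same_bits comes out FALSE while same_values comes out TRUE, which is the behaviour you normally want when validating an unscaling step.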

tl;dr:

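unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')

where xs is a scaled object created by scale(x) and x is a vector or single-column matrix (for a multi-column matrix, apply the factors column-wise, e.g. with sweep()).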

For those trying to make sense of this:

How R scales:

The scale function performs both scaling and centering by default.

Of the two, the function performs centering first.

Centering is achieved by default by subtracting the mean of the non-NA input values from each value:

data - mean(data, na.rm = TRUE)

Scaling is achieved via:

sqrt(sum(x^2) / (n - 1))

where x is the set of all !is.na values to scale and n = length(x).

Importantly, though, when center = TRUE in scale, x is not the original set of data, but the already centered data.

So if center = TRUE (the default), the scaling function is really calculating:
```
sqrt(sum((data - mean(data, na.rm = TRUE))^2) / (n - 1))
```
- Note: [when center = TRUE] this is the same as taking the standard deviation: sd(data, na.rm = TRUE).
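
The relation in the note can be checked numerically. The sketch below uses a small made-up vector (data and the other variable names are illustrative only) and compares the hand-computed factor against sd() and against the 'scaled:scale' attribute that scale() stores, for both the centered and the uncentered case:

```r
data <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up example values, no NAs
n <- length(data)

# center = TRUE (default): the factor is computed on the centered data
centered  <- data - mean(data)
manual_sd <- sqrt(sum(centered^2) / (n - 1))
scaled    <- scale(data)

matches_sd   <- isTRUE(all.equal(manual_sd, sd(data)))
matches_attr <- isTRUE(all.equal(manual_sd, attr(scaled, "scaled:scale")))

# center = FALSE: the factor is the root mean square of the raw data
manual_rms  <- sqrt(sum(data^2) / (n - 1))
uncentered  <- scale(data, center = FALSE)
matches_rms <- isTRUE(all.equal(manual_rms, attr(uncentered, "scaled:scale")))

print(c(matches_sd, matches_attr, matches_rms))
```

All three comparisons should hold, and manual_rms differs from manual_sd, which is why the scaling factor only equals the standard deviation when centering is on.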

How to Unscale:

Explanation:

First multiply the scaled values xs by the scaling factor, computed from the original data x:

y = xs * sqrt(sum((x - mean(x, na.rm = TRUE))^2, na.rm = TRUE) / (sum(!is.na(x)) - 1))

Then add back the mean:
```
y + mean(x, na.rm = TRUE)
```

Obviously you need to know the mean and scaling factor of the original data for this manual approach to be useful, but I place it here for the sake of the concept.

Luckily, as previous answers have shown, both the centering value (i.e., the mean) and the scaling value are stored in the attributes of a scaled object, so this approach can be simplified to:

How to do in R:

unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')

where xs is a scaled object created by scale(x).
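
One caveat worth spelling out: the one-liner relies on R's vector recycling, so it is only safe when xs has a single column. With several columns, the per-column factors get recycled down the rows instead of across the columns. A small sketch with made-up data (x, naive and columnwise are illustrative names), contrasting the plain arithmetic with a column-wise sweep():

```r
# Made-up matrix whose columns have clearly different means and spreads
x <- cbind(a = c(1, 2, 3, 4),
           b = c(10, 20, 30, 50),
           c = c(100, 300, 200, 700))
xs <- scale(x)

sc <- attr(xs, "scaled:scale")   # one scaling factor per column
ce <- attr(xs, "scaled:center")  # one center per column

# Plain arithmetic recycles sc and ce in column-major order,
# so rows (not columns) get matched to the factors
naive <- xs * sc + ce

# sweep() with MARGIN = 2 applies each factor to its own column
columnwise <- sweep(sweep(xs, 2, sc, `*`), 2, ce, `+`)

naive_ok      <- isTRUE(all.equal(naive, x, check.attributes = FALSE))
columnwise_ok <- isTRUE(all.equal(columnwise, x, check.attributes = FALSE))

print(c(naive = naive_ok, columnwise = columnwise_ok))
```

Only the sweep() version recovers x here; R gives no warning for the naive version because 12 is a multiple of 3, which makes the mistake easy to miss.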

I felt like this should be a proper function; here is my attempt at it:

#' Reverse a scale
#'
#' Computes x = sz+c, which is the inverse of z = (x - c)/s 
#' provided by the \code{scale} function.
#' 
#' @param z a numeric matrix(like) object
#' @param center either NULL or a numeric vector of length equal to the number of columns of z  
#' @param scale  either NULL or a numeric vector of length equal to the number of columns of z
#'
#' @seealso \code{\link{scale}}
#' @examples
#'  mtcs <- scale(mtcars)
#'  
#'  all.equal(
#'    unscale(mtcs), 
#'    as.matrix(mtcars), 
#'    check.attributes=FALSE
#'  )
#'  
#' @export
unscale <- function(z, center = attr(z, "scaled:center"), scale = attr(z, "scaled:scale")) {
  if(!is.null(scale))  z <- sweep(z, 2, scale, `*`)
  if(!is.null(center)) z <- sweep(z, 2, center, `+`)
  structure(z,
    "scaled:center"   = NULL,
    "scaled:scale"    = NULL,
    "unscaled:center" = center,
    "unscaled:scale"  = scale
  )
}

Take a look at:

attributes(d$s.x)

You can use the attributes to unscale:

d$s.x * attr(d$s.x, 'scaled:scale') + attr(d$s.x, 'scaled:center')

For example:

> x <- 1:10
> s.x <- scale(x)

> s.x
            [,1]
 [1,] -1.4863011
 [2,] -1.1560120
 [3,] -0.8257228
 [4,] -0.4954337
 [5,] -0.1651446
 [6,]  0.1651446
 [7,]  0.4954337
 [8,]  0.8257228
 [9,]  1.1560120
[10,]  1.4863011
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765

> s.x * attr(s.x, 'scaled:scale') + attr(s.x, 'scaled:center')
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]    5
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]    9
[10,]   10
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
