backtransform `scale()` for plotting
For a data frame or matrix:
set.seed(1)
x = matrix(sample(1:12), ncol= 3)
xs = scale(x, center = TRUE, scale = TRUE)
x.orig = t(apply(xs, 1, function(r)r*attr(xs,'scaled:scale') + attr(xs, 'scaled:center')))
print(x)
     [,1] [,2] [,3]
[1,]    4    2    3
[2,]    5    7    1
[3,]    6   10   11
[4,]    9   12    8
print(x.orig)
     [,1] [,2] [,3]
[1,]    4    2    3
[2,]    5    7    1
[3,]    6   10   11
[4,]    9   12    8
Be careful when using functions like identical(): the back-transformed values can differ from the originals by floating-point round-off.
print(x - x.orig)
     [,1] [,2]         [,3]
[1,]    0    0 0.000000e+00
[2,]    0    0 8.881784e-16
[3,]    0    0 0.000000e+00
[4,]    0    0 0.000000e+00
identical(x, x.orig)
# FALSE
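Since the mismatch is only round-off, a tolerance-based comparison is usually the better check. A minimal sketch using base R's all.equal(); the matrix is typed in by hand from the printout above so the result does not depend on the RNG version:

```r
# Same values as the printed x above, stored as integers like sample(1:12)
x  <- matrix(c(4L, 5L, 6L, 9L, 2L, 7L, 10L, 12L, 3L, 1L, 11L, 8L), ncol = 3)
xs <- scale(x)

# Back-transform row by row, as in the snippet above
x.orig <- t(apply(xs, 1, function(r) {
  r * attr(xs, "scaled:scale") + attr(xs, "scaled:center")
}))

# identical() demands exact equality, including storage type:
# x is integer while x.orig is double, so this is FALSE regardless of round-off
exact <- identical(x, x.orig)

# all.equal() compares numerically within a small tolerance
close_enough <- isTRUE(all.equal(x, x.orig, check.attributes = FALSE))

print(exact)
print(close_enough)
```

Wrapping all.equal() in isTRUE() matters because all.equal() returns a character description of the differences, not FALSE, when the objects differ.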
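tl;dr:
unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')
- where xs is a scaled object created by scale(x). This one-liner is for a vector or single-column x; a multi-column matrix needs column-wise arithmetic, e.g. sweep().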
Just for those trying to make a bit of sense about this:
How R scales:
The scale function performs both scaling and centering by default.
- Of the two, the function performs centering first.
Centering is achieved by default by subtracting the mean of all !is.na input values from each value:
data - mean(data, na.rm = TRUE)
Scaling is achieved by dividing by:
sqrt(sum(x^2) / (n - 1))
where x is the set of all !is.na values to scale and n = length(x).
Importantly, though, when center = TRUE in scale, x is not the original set of data, but the already centered data. So if center = TRUE (the default), the scaling function is really calculating:
sqrt(sum((data - mean(data, na.rm = TRUE))^2) / (n - 1))
- Note: [when center = TRUE] this is the same as taking the standard deviation: sd(data, na.rm = TRUE).
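The description above can be checked against what scale() actually stores in its attributes. A small sketch with made-up values (the vector `data` and the helper names `obs`, `manual_sd`, and `manual_rms` are illustrative, not from the original answer):

```r
data <- c(2, 4, 4, NA, 7, 9)   # made-up values, one of them missing
obs  <- data[!is.na(data)]      # the !is.na values that scale() works with
n    <- length(obs)

# center = TRUE (default): centering first, then dividing by the sample SD
s1        <- scale(data)
manual_sd <- sqrt(sum((obs - mean(obs))^2) / (n - 1))
center_ok <- isTRUE(all.equal(attr(s1, "scaled:center"), mean(data, na.rm = TRUE)))
sd_ok     <- isTRUE(all.equal(attr(s1, "scaled:scale"), manual_sd)) &&
             isTRUE(all.equal(manual_sd, sd(data, na.rm = TRUE)))

# center = FALSE: the factor is a root mean square of the raw values
s2         <- scale(data, center = FALSE)
manual_rms <- sqrt(sum(obs^2) / (n - 1))
rms_ok     <- isTRUE(all.equal(attr(s2, "scaled:scale"), manual_rms))

print(c(center_ok, sd_ok, rms_ok))
```

So the scaling factor equals sd() only when the data were centered first; with center = FALSE it is generally a different number.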
How to Unscale:
Explanation:
First multiply the scaled values (xs) by the scaling factor, computed from the non-missing original values (x):
y = xs * sqrt(sum((x - mean(x, na.rm = TRUE))^2) / (length(x) - 1))
Then add back the mean:
y + mean(x, na.rm = TRUE)
Obviously you need to know the mean of the original set of data for this manual approach to truly be useful, but I place it here for conceptual sake.
Luckily, as previous answers have shown, the "centering" value (i.e., the mean) and the scaling factor are both stored in the attributes of a scale object, so this approach can be simplified to:
How to do it in R:
unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')
- where xs is a scaled object created by scale(x).
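One caveat worth illustrating: plain `*` and `+` recycle the attribute vectors down the columns, so the one-liner is only safe for a vector or single-column input. A sketch of the difference on a multi-column matrix, using sweep() for the column-wise version (the names `naive` and `by_col` are just for illustration):

```r
# Matrix typed in by hand so the example does not depend on the RNG
x  <- matrix(c(4, 5, 6, 9, 2, 7, 10, 12, 3, 1, 11, 8), ncol = 3)
xs <- scale(x)
sc <- attr(xs, "scaled:scale")    # one factor per column
ce <- attr(xs, "scaled:center")   # one mean per column

# Element-wise arithmetic recycles the length-3 vectors along the rows of
# each column, so most cells get another column's factor and mean
naive <- xs * sc + ce

# sweep() with MARGIN = 2 applies each factor to its own column
by_col <- sweep(sweep(xs, 2, sc, `*`), 2, ce, `+`)

naive_ok  <- isTRUE(all.equal(naive,  x, check.attributes = FALSE))
by_col_ok <- isTRUE(all.equal(by_col, x, check.attributes = FALSE))
print(c(naive = naive_ok, by_col = by_col_ok))
```

R raises no warning in the naive case here because the matrix length is a multiple of the attribute length, which makes the error easy to miss.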
I felt this should be a proper function; here is my attempt at it:
#' Reverse a scale
#'
#' Computes x = sz+c, which is the inverse of z = (x - c)/s
#' provided by the \code{scale} function.
#'
#' @param z a numeric matrix(like) object
#' @param center either NULL or a numeric vector of length equal to the number of columns of z
#' @param scale either NULL or a numeric vector of length equal to the number of columns of z
#'
#' @seealso \code{\link{scale}}
#' @examples
#' mtcs <- scale(mtcars)
#'
#' all.equal(
#' unscale(mtcs),
#' as.matrix(mtcars),
#' check.attributes=FALSE
#' )
#'
#' @export
unscale <- function(z, center = attr(z, "scaled:center"), scale = attr(z, "scaled:scale")) {
  # Undo the division first, then the subtraction, column by column
  if (!is.null(scale)) z <- sweep(z, 2, scale, `*`)
  if (!is.null(center)) z <- sweep(z, 2, center, `+`)
  # Drop the scale() attributes and record what was undone
  structure(z,
    "scaled:center" = NULL,
    "scaled:scale" = NULL,
    "unscaled:center" = center,
    "unscaled:scale" = scale
  )
}
Take a look at:
attributes(d$s.x)
You can use the attributes to unscale:
d$s.x * attr(d$s.x, 'scaled:scale') + attr(d$s.x, 'scaled:center')
For example:
> x <- 1:10
> s.x <- scale(x)
> s.x
[,1]
[1,] -1.4863011
[2,] -1.1560120
[3,] -0.8257228
[4,] -0.4954337
[5,] -0.1651446
[6,] 0.1651446
[7,] 0.4954337
[8,] 0.8257228
[9,] 1.1560120
[10,] 1.4863011
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
> s.x * attr(s.x, 'scaled:scale') + attr(s.x, 'scaled:center')
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
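For plotting, the same attributes can back-transform any values on the scaled axis, not just the scaled object itself. A sketch, assuming you have a grid of scaled x positions (for example, where model predictions were computed); `grid_scaled` and `grid_orig` are hypothetical names:

```r
x   <- 1:10
s.x <- scale(x)

# Hypothetical grid of positions on the scaled axis
grid_scaled <- seq(min(s.x), max(s.x), length.out = 5)

# Map the grid back to the original units using the stored attributes
grid_orig <- grid_scaled * attr(s.x, "scaled:scale") + attr(s.x, "scaled:center")
print(grid_orig)
```

grid_orig can then be used as the x values in plot() or lines(), or as axis labels placed at the grid_scaled positions.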