Generating multidimensional data
I have some functions for hypercube and n-sphere selection that generate dataframes with cartesian coordinates and guarantee a uniform distribution through the hypercube or n-sphere for an arbitrary amount of dimensions :
GenerateCubiclePoints <- function(nrPoints,nrDim,center=rep(0,nrDim),l=1){
x <- matrix(runif(nrPoints*nrDim,-1,1),ncol=nrDim)
x <- as.data.frame(
t(apply(x*(l/2),1,'+',center))
)
names(x) <- make.names(seq_len(nrDim))
x
}
is in a cube/hypercube of nrDim
dimensions with a center
and l
the length of one side.
For an n-sphere with nrDim
dimensions, you can do something similar, where r
is the radius :
GenerateSpherePoints <- function(nrPoints,nrDim,center=rep(0,nrDim),r=1){
#generate the polar coordinates!
x <- matrix(runif(nrPoints*nrDim,-pi,pi),ncol=nrDim)
x[,nrDim] <- x[,nrDim]/2
#recalculate them to cartesians
sin.x <- sin(x)
cos.x <- cos(x)
cos.x[,nrDim] <- 1 # see the formula for n.spheres
y <- sapply(1:nrDim, function(i){
if(i==1){
cos.x[,1]
} else {
cos.x[,i]*apply(sin.x[,1:(i-1),drop=F],1,prod)
}
})*sqrt(runif(nrPoints,0,r^2))
y <- as.data.frame(
t(apply(y,1,'+',center))
)
names(y) <- make.names(seq_len(nrDim))
y
}
in 2 dimensions, these give :
From code :
T1 <- GenerateCubiclePoints(10000,2,c(4,3),5)
T2 <- GenerateSpherePoints(10000,2,c(-5,3),2)
op <- par(mfrow=c(1,2))
plot(T1)
plot(T2)
par(op)
Also check out the copula package. This will generate data within a cube/hypercube with uniform margins, but with correlation structures that you set. The generated variables can then be transformed to represent other shapes, but still with relations other than independent.
If you want more complex shapes but are happy with uniform and idependent within the shape then you can just do rejection sampling: generate data within a cube that contains your shape, then test if the points are within your shape, reject them if not, then keep doing this until there are enough points.