meaning of m34 of CATransform3D
You can find the full details here. Note that Apple uses the reversed multiplication order for projection (relative to the given link) so all matrix multiplications are reversed and all matrices are transposed.
A short description of the meaning:
- m34 = 1/z, where z is the distance to the projection plane (the 1/ez term in the reference link)
- the positive z axis points toward the viewer, resulting in a "looking in the mirror" feel when using -
- projection center is (0,0,0) plus any translations you set up
I read some articles including this one: https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/CoreAnimation_guide/AdvancedAnimationTricks/AdvancedAnimationTricks.html#//apple_ref/doc/uid/TP40004514-CH8-SW13
My solution is here:
Entities:
- eye: distance from screen to eye
- scale: visual scale of transformed object
- distance: distance to transformed object
Connecting formulas:
scale = eye / (eye + distance)
distance = eye * (1.0/scale - 1.0)
eye = distance / (1.0/scale - 1.0)
Example of computing the z-distance for a desired scale at a chosen eye distance:
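CATransform3D transformByScaleAndEye(CGFloat scale, CGFloat eye) {
    CATransform3D t = CATransform3DIdentity;
    // Perspective term: m34 = -1 / (distance from eye to screen)
    t.m34 = -1.0 / eye;
    // Solve scale = eye / (eye + distance) for distance;
    // a negative z moves the layer away from the viewer
    CGFloat distance = -eye * (1.0 / scale - 1.0);
    return CATransform3DTranslate(t, 0, 0, distance);
}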
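As a quick sanity check of the relation scale = eye / (eye + distance), here is a minimal sketch in plain Swift (no Core Animation needed). The helper names distanceForScale and apparentScale are hypothetical, made up for this illustration. It solves for the distance and then confirms that the perspective divide with m34 = -1 / eye gives the requested scale back.

```swift
// Hypothetical helper: how far behind the screen plane a layer must sit
// to appear at `scale`, solved from scale = eye / (eye + distance).
func distanceForScale(_ scale: Double, eye: Double) -> Double {
    return eye * (1.0 / scale - 1.0)
}

// Hypothetical helper: apparent scale of a point at z (negative z is away
// from the viewer) after the perspective divide with m34 = -1 / eye.
func apparentScale(z: Double, eye: Double) -> Double {
    let m34 = -1.0 / eye
    return 1.0 / (1.0 + m34 * z)
}

let eye = 500.0
let distance = distanceForScale(0.5, eye: eye)      // 500.0
let scale = apparentScale(z: -distance, eye: eye)   // 0.5
```

So with an eye distance of 500, a layer pushed 500 units back appears at half size.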
Here is some background that I think readers should know before getting to the answer:
iOS coordinate system: Imagine you are holding your phone vertically with the screen facing you. For each view, the coordinate system has its origin at the view's center. The x-axis runs from left to right, the y-axis from top to bottom, and the z-axis from the back of the phone toward your face.
Homogeneous coordinates: when doing 3D transformations on iOS, you work with homogeneous (projective) coordinates instead of traditional Cartesian coordinates. In short, the new coordinate system uses one more dimension, w, than the old one. The beauty of this system is that it allows rotation, translation, and scaling to be done by vector-matrix multiplication.
To convert a vector from homogeneous coordinates to Cartesian coordinates, divide x, y, and z by w.
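The divide-by-w step can be sketched in a few lines of plain Swift. The function name toCartesian and the array representation of a point are assumptions made for this illustration; Core Animation does this step internally.

```swift
// Hypothetical helper: convert a homogeneous point [x, y, z, w] to
// Cartesian [x, y, z] by dividing each component by w.
func toCartesian(_ p: [Double]) -> [Double] {
    precondition(p.count == 4 && p[3] != 0, "expected [x, y, z, w] with w != 0")
    return [p[0] / p[3], p[1] / p[3], p[2] / p[3]]
}

let point = toCartesian([100, 50, 200, 2])   // [50.0, 25.0, 100.0]
```

Note that w = 1 leaves the point unchanged, which is why ordinary points start out as [x, y, z, 1].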
Now, let's get to the answer. Consider the following example:
var transform = CATransform3DIdentity
transform.m34 = -1 / 500
transform = CATransform3DRotate(transform, Double.pi/4, 0, 1, 0)
transform = CATransform3DTranslate(transform, 0, 0, 200)
imageLayer.transform = transform
To know what the code above does, you must read it in reverse order. First, the image is moved 200 points along the z-axis toward you (a positive sign means the direction is from the screen toward your face). Second, the image is rotated 45 degrees about the y-axis (tilted to your right). If you stopped here, you would have an image with a smaller width, shifted to your right. But a third step is applied that "magically" gives the image perspective. Here lies the mystery of the m34 element.
Here is the sequence of operations expressed in terms of matrix multiplications:
[x' y' z' w'] = ([x y z w] * translation_matrix * rotation_matrix) * perspective_matrix
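To make the order of operations concrete, here is a rough sketch that carries a single point through the same three steps using plain 4x4 arrays. Everything here (Matrix, apply, translation, rotationY, perspective) is a hypothetical stand-in written for this illustration, not Core Animation API. The rotation matrix is the usual rotation about y in the row-vector convention, which I assume matches what CATransform3DMakeRotation produces.

```swift
import Foundation  // for sin and cos

typealias Matrix = [[Double]]   // 4x4, row-major: m[0][0] is m11, m[2][3] is m34

let identity: Matrix = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

// Row vector times matrix: v' = v * m
func apply(_ v: [Double], _ m: Matrix) -> [Double] {
    var out = [Double](repeating: 0, count: 4)
    for col in 0..<4 {
        for row in 0..<4 {
            out[col] += v[row] * m[row][col]
        }
    }
    return out
}

func translation(z: Double) -> Matrix {
    var m = identity
    m[3][2] = z                                    // m43
    return m
}

func rotationY(_ angle: Double) -> Matrix {
    var m = identity
    m[0][0] = cos(angle); m[0][2] = -sin(angle)    // m11, m13
    m[2][0] = sin(angle); m[2][2] = cos(angle)     // m31, m33
    return m
}

func perspective(m34: Double) -> Matrix {
    var m = identity
    m[2][3] = m34                                  // the only non-identity entry
    return m
}

let p = [100.0, 0.0, 0.0, 1.0]                     // a point on the layer, w = 1
let moved = apply(p, translation(z: 200))          // [100.0, 0.0, 200.0, 1.0]
let rotated = apply(moved, rotationY(Double.pi / 4))
let projected = apply(rotated, perspective(m34: -1.0 / 500))
// x, y, z are untouched by the last step; only w changes: w' = 1 + m34 * z
```

Translation and rotation leave w at 1; only the perspective step writes something into w, and that is what the later divide by w turns into a depth-dependent scale.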
Transformation with perspective matrix
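Now, focus on the operation with the perspective_matrix, which is the identity matrix with only the m34 element set. For an input vector with w = 1:
[x y z 1] * perspective_matrix = [x y z (1 + m34 * z)]
Convert to Cartesian coordinates by dividing by the new w:
[x / (1 + m34 * z)   y / (1 + m34 * z)   z / (1 + m34 * z)]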
Transformation with scale factor
To model 3D perspective on a 2D screen, you want objects closer to your eye to appear bigger than objects further away. To do that, you project objects from 3D space onto the screen, i.e., draw imaginary lines from your eye through the objects and see where they intersect the screen. Let's calculate the scaling factor for that projection:
z = the object's z-coordinate, i.e. its signed distance from the screen
eye2screen = distance from your eyes to the screen
scaleFactor = eye2screen / (eye2screen - z) #Thales's intercept theorem
A quick calculation confirms our intuition: as z gets smaller (i.e. the object moves further away from the eye), scaleFactor gets smaller.
Reconciling our scaleFactor with the perspective_matrix multiplication above, we have:
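A few sample values make this easy to check. This is a small sketch in plain Swift; the function name scaleFactor mirrors the formula above, and the eye distance of 500 is just an assumed value.

```swift
// Scale from the intercept theorem: eye2screen / (eye2screen - z).
// z is positive toward the viewer and must stay in front of the eye.
func scaleFactor(z: Double, eye2screen: Double) -> Double {
    precondition(z < eye2screen, "object must be in front of the eye")
    return eye2screen / (eye2screen - z)
}

let nearer = scaleFactor(z: 200, eye2screen: 500)     // 500 / 300, about 1.67
let onScreen = scaleFactor(z: 0, eye2screen: 500)     // 1.0
let farther = scaleFactor(z: -500, eye2screen: 500)   // 0.5
```

An object on the screen plane keeps its size, one in front of it grows, and one behind it shrinks.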
scaleFactor = 1 / (1 + m34 * z)
1 / (1 - z / eye2screen) = 1 / (1 + m34 * z)
-1 / eye2screen = m34
m34 = -1 / eye2screen
Pick a reasonable distance between the eye and the screen (like 500) and you can calculate m34, which gives you a new perspective_matrix.
Conclusions:
- The perspective matrix is a math trick to scale objects based on their distance from your eyes.
- Computing m34 = -1 / eye2screen gives you a perspective_matrix.
- The perspective_matrix operation must be applied only once, at the end of your transform pipeline.
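One way to see why the perspective step should happen only once: the perspective matrix leaves z alone and adds m34 * z to w, so applying it twice adds that term twice. The sketch below (plain Swift, with a hypothetical helper perspectiveW and an assumed eye distance of 500) shows that two applications behave like a single one with the eye at half the distance.

```swift
// Hypothetical helper: the w component after multiplying [x, y, z, w]
// by a perspective matrix (identity with only m34 set).
func perspectiveW(z: Double, w: Double, m34: Double) -> Double {
    return w + m34 * z
}

let m34 = -1.0 / 500
let z = 200.0
let once = perspectiveW(z: z, w: 1, m34: m34)        // 1 + m34 * z
let twice = perspectiveW(z: z, w: once, m34: m34)    // 1 + 2 * m34 * z
let halfEye = perspectiveW(z: z, w: 1, m34: -1.0 / 250)
// twice and halfEye agree: doubling up is the same as eye2screen = 250
```

That stronger, unintended foreshortening is the kind of distortion you get when perspective sneaks into the pipeline more than once.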