What does it mean to transform as a scalar or vector?

There are a number of ways of mathematically formalizing the notions "transforming as a vector" or "transforming as a scalar" depending on the context, but in the context you're considering, I'd recommend the following:

Consider a finite number of types of objects $o_1, \dots, o_n$, each of which lives in some set $O_i$ of objects, and each of which is defined to transform in a particular way under rotations. In other words, given any rotation $R$, for each object $o_i$ we have a mapping that, acting on objects in $O_i$, tells us what happens to them under the rotation $R$: \begin{align} o_i \mapsto o_i^R = \text{something we specify} \end{align} For example, if $o_1$ is just a vector $\mathbf r$ in three dimensional Euclidean space $\mathbb R^3$, then one would typically take \begin{align} \mathbf r \mapsto \mathbf r^R = R\mathbf r. \end{align} Each mapping $o_i\mapsto o_i^R$ is what a mathematician would call a group action of the group of rotations on the set $O_i$ (there are more details in defining a group action which we ignore here). Once we have specified how these different objects $o_i$ transform under rotations, we can make the following definition:

Definition. Scalar under rotations

Let any function $f:O_1\times O_2\times\cdots \times O_n\to \mathbb R$ be given. We say it is a scalar under rotations provided \begin{align} f(o_1^R, \dots, o_n^R) = f(o_1, \dots, o_n). \end{align} This definition is intuitively just saying that if you "build" an object $f$ out of a bunch of other objects $o_i$ whose transformation under rotations you have already specified, then the new object $f$ which you have constructed is considered a scalar if it doesn't change when you apply a rotation to all of the objects it's built out of.

Example. The dot product

Let $n=2$, and let $o_1 = \mathbf r_1$ and $o_2 = \mathbf r_2$ both be vectors in $\mathbb R^3$. We define $f$ as follows: \begin{align} f(\mathbf r_1, \mathbf r_2) = \mathbf r_1\cdot \mathbf r_2. \end{align} Is $f$ a scalar under rotations? Well let's see: \begin{align} f(\mathbf r_1^R, \mathbf r_2^R) = (R\mathbf r_1)\cdot (R\mathbf r_2) = \mathbf r_1\cdot (R^TR\mathbf r_2) = \mathbf r_1\cdot \mathbf r_2 = f(\mathbf r_1, \mathbf r_2) \end{align} so yes it is!
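If you'd like to see this with numbers, here is a minimal sketch that checks the invariance for one rotation and one pair of vectors. The angle, the choice of a rotation about the $z$-axis, and the two vectors are arbitrary illustrative assumptions, and the helper names are made up for this example.

```python
import math

def rotation_z(theta):
    """3x3 matrix for a rotation by angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matvec(M, v):
    """Matrix-vector product M v for small dense lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    """Euclidean dot product."""
    return sum(x * y for x, y in zip(a, b))

R = rotation_z(0.7)            # arbitrary illustrative angle
r1 = [1.0, 2.0, 3.0]
r2 = [-0.5, 4.0, 0.25]

before = dot(r1, r2)                        # f(r1, r2)
after = dot(matvec(R, r1), matvec(R, r2))   # f(r1^R, r2^R)

# The two values agree up to floating-point rounding.
print(math.isclose(before, after, abs_tol=1e-12))
```

A single numerical check is of course not a proof; the algebra above (using $R^TR = I$) is what establishes the result for every rotation.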

Now what about a field of scalars? How do we define such a beast? Well we just have to slightly modify the above definition.

Definition. Field of scalars

Let any function $f:O_1\times\cdots \times O_n\times\mathbb R^3\to \mathbb R$ be given, and write its value at the point $\mathbf x$ as $f(o_1, \dots, o_n)(\mathbf x)$. We call $f$ a field of scalars under rotations provided \begin{align} f(o_1^R, \dots, o_n^R)(R\mathbf x) = f(o_1, \dots, o_n)(\mathbf x). \end{align} You can think of this as simply saying that the rotated version of $f$ evaluated at the rotated point $R\mathbf x$ agrees with the unrotated version of $f$ evaluated at the unrotated point. Notice that this is formally the same as the equation you wrote down, namely $\bar T(\bar x, \bar y) = T(x,y)$.

Example. Divergence of a vector field

Consider the case that $\mathbf v$ is a vector field. Rotations are conventionally defined to act on vector fields as follows (I'll try to find another post on physics.SE that explains why): \begin{align} \mathbf v^R(\mathbf x) = R\mathbf v(R^{-1}\mathbf x) \end{align} Is its divergence a scalar field? Well to make contact with the definition we give above, let $f$ denote the divergence, namely \begin{align} f(\mathbf v)(\mathbf x) = (\nabla\cdot \mathbf v)(\mathbf x) \end{align} Now notice that using the chain rule we get (we use Einstein summation notation) \begin{align} (\nabla\cdot\mathbf v^R)(\mathbf x) &= \nabla\cdot\big(R\mathbf v(R^{-1}\mathbf x)\big)\\ &= \partial_i\big(R_{ij}v_j(R^{-1}\mathbf x)\big) \\ &= R_{ij} \partial_i\big(v_j(R^{-1}\mathbf x)\big) \\ &= R_{ij}(R^{-1})_{ki}(\partial_k v_j)(R^{-1}\mathbf x)\\ &= \delta_{kj}(\partial_k v_j)(R^{-1}\mathbf x)\\ &= (\nabla\cdot \mathbf v)(R^{-1}\mathbf x) \end{align} where the second-to-last step uses $(R^{-1})_{ki}R_{ij} = (R^{-1}R)_{kj} = \delta_{kj}$. Replacing $\mathbf x$ by $R\mathbf x$, this implies that \begin{align} (\nabla\cdot\mathbf v^R)(R\mathbf x) = (\nabla\cdot \mathbf v)(\mathbf x), \end{align} but the left hand side is precisely $f(\mathbf v^R)(R\mathbf x)$ and the right side is $f(\mathbf v)(\mathbf x)$ so we have \begin{align} f(\mathbf v^R)(R\mathbf x) = f(\mathbf v)(\mathbf x). \end{align} This is precisely the condition that $f$ (the divergence of a vector field) be a scalar field under rotations.
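The same statement can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the vector field `v`, the rotation angle, the sample point, and the finite-difference step are all arbitrary choices of mine, and the divergence is estimated by central differences rather than computed exactly.

```python
import math

def rotation_z(theta):
    """3x3 matrix for a rotation by angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matvec(M, v):
    """Matrix-vector product M v for small dense lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose; for a rotation matrix this is also its inverse."""
    return [list(row) for row in zip(*M)]

def v(x):
    """Example vector field (arbitrary choice); its divergence is 2xy + z + 2xz."""
    return [x[0] ** 2 * x[1], x[1] * x[2], x[0] * x[2] ** 2]

def rotated_field(field, R):
    """Return the field v^R(x) = R v(R^{-1} x)."""
    R_inv = transpose(R)
    return lambda x: matvec(R, field(matvec(R_inv, x)))

def divergence(field, x, h=1e-5):
    """Central-difference estimate of the divergence of field at x."""
    total = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (field(xp)[i] - field(xm)[i]) / (2 * h)
    return total

R = rotation_z(0.7)
x = [0.3, -1.2, 0.8]

lhs = divergence(rotated_field(v, R), matvec(R, x))   # (div v^R)(R x)
rhs = divergence(v, x)                                # (div v)(x)

# Agreement up to finite-difference and rounding error.
print(abs(lhs - rhs) < 1e-6)
```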

Extension to vectors and vector fields.

To define a vector under rotations, and a field of vectors under rotations, we follow a very similar procedure, but instead we have functions $\mathbf f:O_1\times O_2\times\cdots \times O_n\to \mathbb R^3$ and $\mathbf f:O_1\times O_2\times\cdots \times O_n\times\mathbb R^3\to \mathbb R^3$ respectively (in other words, the right hand side of the arrow gets changed from $\mathbb R$ to $\mathbb R^3$), and the defining equations for a vector and a field of vectors become \begin{align} \mathbf f(o_1^R, \dots, o_n^R) = R\,\mathbf f(o_1, \dots, o_n) \end{align} and \begin{align} \mathbf f(o_1^R, \dots, o_n^R)(R\mathbf x) = R \,\mathbf f(o_1, \dots, o_n)(\mathbf x) \end{align} respectively. In other words, there is an extra $R$ multiplying the right hand side.
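As an illustration of the vector case (this example is my own addition, not something from the question), take $\mathbf f(\mathbf r_1, \mathbf r_2) = \mathbf r_1\times\mathbf r_2$. For proper rotations the cross product satisfies $(R\mathbf r_1)\times(R\mathbf r_2) = R(\mathbf r_1\times\mathbf r_2)$, which is exactly the defining equation for a vector. A rough numerical check, with an arbitrary angle and arbitrary vectors:

```python
import math

def rotation_z(theta):
    """3x3 matrix for a rotation by angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matvec(M, v):
    """Matrix-vector product M v for small dense lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

R = rotation_z(0.7)            # arbitrary illustrative angle
r1 = [1.0, 2.0, 3.0]
r2 = [-0.5, 4.0, 0.25]

lhs = cross(matvec(R, r1), matvec(R, r2))   # f(r1^R, r2^R)
rhs = matvec(R, cross(r1, r2))              # R f(r1, r2)

# Component-wise agreement up to rounding: note the extra R on the right.
is_vector = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(lhs, rhs))
print(is_vector)
```

Compare this with the dot product, where the two sides agree without any extra $R$.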

I think you have the right idea, but I'll try to write it in a more elucidating notation.

The first thing to make clear is that for this discussion we are only ever working at a single point. We only care about transforming the coordinates that describe the domain to the extent this induces changes in the associated directions at a point. That is, each point in space can have vectors defined on it, and a very convenient basis for the vector space at that point is the set of directional derivatives, e.g. $$ \mathcal{B} = \{\vec{\partial}_{(x)}, \vec{\partial}_{(y)}, \ldots\}. $$ $\vec{\partial}_{(x)}$ points in the $x$-direction; call it $\vec{e}_x$ if you want. Changing $\{x, y, \ldots\} \to \{\bar{x}, \bar{y}, \ldots\}$ will give us a new natural basis $$ \bar{\mathcal{B}} = \{\vec{\partial}_{(\bar{x})}, \vec{\partial}_{(\bar{y})}, \ldots\} $$ at each point.

The point of that discussion is that transformations are local. What numbers you use to identify the point in space are irrelevant, so don't get caught up on whether we're calling the point $(x,y)$ or $(\bar{x},\bar{y})$. Okay, enough allusions to differential geometry.

Let's look at scalars. A scalar is just a single number from your favorite mathematical field.¹ What's more, it doesn't transform when the direction vectors change, since it carries no direction information anyway. If I have a scalar $f$, I could say its representation in either basis is the same: $$ f \stackrel{\mathcal{B},\mathcal{\bar{B}}}{\to} f. $$

Now consider a vector $\vec{A}$. Since a vector can always be written uniquely as a linear combination of basis vectors, let's do that: $$ \vec{A} = A^x \vec{\partial}_{(x)} + A^y \vec{\partial}_{(y)} + \cdots. $$ But there is another basis floating around, and so I have another decomposition available: $$ \vec{A} = A^\bar{x} \vec{\partial}_{(\bar{x})} + A^\bar{y} \vec{\partial}_{(\bar{y})} + \cdots. $$ For simplicity, I can just write the coefficients when the basis is understood: \begin{align} \vec{A} & \stackrel{\mathcal{B}}{\to} (A^x, A^y, \ldots) \\ \vec{A} & \stackrel{\mathcal{\bar{B}}}{\to} (A^\bar{x}, A^\bar{y}, \ldots). \end{align}

The numbers $A^x$, $A^y$, $A^\bar{x}$, etc. are just scalars in the mathematical sense, but often we avoid calling them scalars. Instead, we call them components of a vector, and we expect them to collectively transform as a vector when we change basis. That is, if I switch from $\mathcal{B}$ to $\bar{\mathcal{B}}$, I should rewrite $(A^x, A^y, \ldots)$ as $(A^\bar{x}, A^\bar{y}, \ldots)$ so that the collection of numbers still refers to the same abstract vector.

The actual transformation is simple enough to find. I can always express an element from one basis in terms of the other basis. Suppose for $j \in \{x, y, \ldots\}$ and $\bar{\imath} \in \{\bar{x}, \bar{y}, \ldots\}$ we have coefficients ${\Lambda^\bar{\imath}}_j$ such that $$ \vec{\partial}_{(j)} = \sum_\bar{\imath} {\Lambda^\bar{\imath}}_j \vec{\partial}_{(\bar{\imath})}. $$ Then \begin{align} \sum_\bar{\imath} A^\bar{\imath} \vec{\partial}_{(\bar{\imath})} & = \vec{A} \\ & = \sum_j A^j \vec{\partial}_{(j)} \\ & = \sum_j A^j \sum_\bar{\imath} {\Lambda^\bar{\imath}}_j \vec{\partial}_{(\bar{\imath})} \\ & = \sum_\bar{\imath} \sum_j {\Lambda^\bar{\imath}}_j A^j \vec{\partial}_{(\bar{\imath})}. \end{align} Because basis decompositions are unique, we can read off $$ A^\bar{\imath} = \sum_j {\Lambda^\bar{\imath}}_j A^j. $$ In matrix notation, this is $$ \begin{pmatrix} A^\bar{x} \\ A^\bar{y} \\ \vdots \end{pmatrix} = \begin{pmatrix} {\Lambda^\bar{x}}_x & {\Lambda^\bar{x}}_y & \cdots \\ {\Lambda^\bar{y}}_x & {\Lambda^\bar{y}}_y & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} A^x \\ A^y \\ \vdots \end{pmatrix}. $$
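To make the component rule concrete, here is a small two-dimensional sketch. Everything specific in it is an assumption chosen for illustration: the new basis (deliberately not orthogonal, to stress that the derivation above is pure linear algebra), the component values, and the helper names. Both bases are written out in a fixed reference frame only so that the reconstructed vectors can be compared.

```python
def matvec(M, v):
    """Matrix-vector product for small dense lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def combine(coeffs, basis):
    """Linear combination sum_k coeffs[k] * basis[k] of 2D basis vectors."""
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(2)]

# Old basis B and an illustrative new basis B_bar.
B = [[1, 0], [0, 1]]        # d_(x), d_(y)
B_bar = [[1, 0], [1, 1]]    # d_(xbar), d_(ybar)

# Lambda[ibar][j] is the coefficient of d_(ibar) in the expansion of d_(j):
#   d_(x) =  1*d_(xbar) + 0*d_(ybar)
#   d_(y) = -1*d_(xbar) + 1*d_(ybar)
Lambda = [[1, -1],
          [0, 1]]

A = [2, 3]                  # components in B
A_bar = matvec(Lambda, A)   # A^ibar = sum_j Lambda^ibar_j A^j

print(A_bar)                    # [-1, 3]
print(combine(A, B))            # [2, 3]
print(combine(A_bar, B_bar))    # [2, 3]
```

The components change from `[2, 3]` to `[-1, 3]`, but both decompositions rebuild the same abstract vector, which is the whole point of the transformation rule.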

When a physicist checks that $\vec{A}$ transforms as a vector, what is usually meant is that we have one set of formulas for $A^x, A^y, \ldots$ in $\mathcal{B}$, and another set for calculating $A^\bar{x}, A^\bar{y}, \ldots$ in $\bar{\mathcal{B}}$, and we want to make sure that the components are describing the same abstract vector $\vec{A}$. This is the case if the sets of components transform according to the rule given above.

In your case, you may be handed the scalar $T$ (i.e. $f$ above). You can calculate the values $\partial T/\partial x$, $\partial T/\partial y$, etc. (Here is where the dependence on other points comes in, since you are often given $T$ as a function of the coordinates so that you can calculate its partial derivatives.) You can assemble the (column) vector $(\partial T/\partial x, \partial T/\partial y, \ldots)$. You could do the same in another basis, with other partial derivatives, assembling $(\partial T/\partial\bar{x}, \partial T/\partial\bar{y}, \ldots)$. It is not a priori clear, however, that these two sets of components will obey the above transformation law. Fortunately, though, for the rotations considered here the gradient $\nabla T$ (i.e. $\vec{A}$) defined this way is a true vector and it transforms correctly: the chain rule gives $\partial T/\partial\bar{\imath} = \sum_j (\partial j/\partial\bar{\imath})\,\partial T/\partial j$, and for a rotation the coefficients $\partial j/\partial\bar{\imath}$ coincide with ${\Lambda^\bar{\imath}}_j$ because the inverse of a rotation matrix is its transpose.
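You can also test this directly for a rotation of the coordinates. In the sketch below the scalar field `T`, the rotation angle, and the sample point are arbitrary assumptions of mine, the barred coordinates are taken to be $\bar{\mathbf p} = R\mathbf p$, and the partial derivatives are estimated by central differences.

```python
import math

def rotation(theta):
    """2x2 rotation matrix by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(M, v):
    """Matrix-vector product for small dense lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose; for a rotation matrix this is also its inverse."""
    return [list(row) for row in zip(*M)]

def gradient(func, p, h=1e-5):
    """Central-difference partial derivatives of func at the point p."""
    grads = []
    for i in range(2):
        pp, pm = list(p), list(p)
        pp[i] += h
        pm[i] -= h
        grads.append((func(pp) - func(pm)) / (2 * h))
    return grads

def T(p):
    """Example scalar field in the unbarred coordinates (arbitrary choice)."""
    x, y = p
    return x ** 2 * y + math.sin(y)

R = rotation(0.4)
R_inv = transpose(R)

def T_bar(p_bar):
    """The same field in barred coordinates: T_bar(R p) = T(p)."""
    return T(matvec(R_inv, p_bar))

p = [0.5, -1.3]
p_bar = matvec(R, p)

grad_unbarred = gradient(T, p)            # (dT/dx, dT/dy)
grad_barred = gradient(T_bar, p_bar)      # (dT/dxbar, dT/dybar)
expected = matvec(R, grad_unbarred)       # vector rule with Lambda = R

ok = all(abs(a - b) < 1e-6 for a, b in zip(grad_barred, expected))
print(ok)
```

The barred partial derivatives match what the vector transformation rule predicts from the unbarred ones, up to finite-difference error.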

¹$\mathbb{R}$ or $\mathbb{C}$ or whatever. Note when we say "field" in physics we often mean "function mapping the entire space in question to some sort of mathematical object." So a scalar field assigns a scalar to each point in space, a vector field assigns a vector, etc. But since we're only discussing what happens at a single point, the physicists' notion of "field" is not important here. If you really want to transform an entire scalar field or vector field, just take what's done here and apply it to every point in space.
