Is a one-by-one matrix just a number (scalar)?
It's just a scalar in the sense that the ring of $1\times 1$ matrices over a field $K$ is isomorphic to $K$ (by the map $[x]\mapsto x$), but, as you observed, when you're considering the interaction of matrices of different sizes, then you have to treat them differently.
Any matrix $A$ carries with it a type $(m,n)$ with $m$, $n\in{\mathbb N}_{\geq1}$. In fact such an $A$ is nothing else but a map $$A:\quad[m]\times[n]\to K\ ,\qquad (i,k)\mapsto a_{ik}\ ,$$ where $[m]:=\{1,\ldots,m\}$. When ${\rm type}(A)={\rm type}(B)$ then the sum $A+B$ is defined, and if ${\rm type}(A)=(m,n)$, ${\rm type}(B)=(n,p)$ then the product $AB$ is defined and has type $(m,p)$.
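To make the typing rules concrete, here is a minimal sketch in plain Python. It assumes a matrix is stored as a list of rows; the helper names `shape`, `add` and `matmul` are made up for this illustration and do not come from any library.

```python
def shape(A):
    """Return the type (m, n) of a matrix stored as a list of m rows of length n."""
    m, n = len(A), len(A[0])
    if any(len(row) != n for row in A):
        raise ValueError("rows have unequal lengths")
    return (m, n)


def add(A, B):
    """A + B is defined only when type(A) == type(B)."""
    if shape(A) != shape(B):
        raise ValueError(f"cannot add types {shape(A)} and {shape(B)}")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def matmul(A, B):
    """AB is defined only when type(A) = (m, n) and type(B) = (n, p)."""
    (m, n), (n2, p) = shape(A), shape(B)
    if n != n2:
        raise ValueError(f"cannot multiply types {(m, n)} and {(n2, p)}")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[1, 2, 3], [4, 5, 6]]        # type (2, 3)
B = [[1, 0], [0, 1], [1, 1]]      # type (3, 2)
print(shape(matmul(A, B)))        # the product has type (2, 2)
```

Calling `add(A, B)` with these two matrices raises a `ValueError`, because their types differ.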
When $m=n=1$ then $A=[a]$ for a single number $a$ in the ground field $K$, e.g., $a\in{\mathbb R}$. Unfortunately there is no established notation to extract this $a$ out of the matrix $A$, just as there is no notation to extract the element $a$ out of the one-element set $\{a\}$. At any rate the map $[a]\mapsto a$ is well defined.
In the reverse direction things are more worrying. With any $c\in K$ we can form the $(1,1)$-matrix $[c]$ in a unique way. But note that the product $[c] \,A$ is only defined if $A$ has just one row (i.e., is of type $(1,n)$), and the product $A\, [c]$ is only defined if $A$ has just one column (i.e., is of type $(m,1)$).
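As a small worked example (the numbers are chosen only for illustration), take $c=2$: $$[2]\begin{bmatrix}1&3\end{bmatrix}=\begin{bmatrix}2&6\end{bmatrix}\ ,\qquad \begin{bmatrix}1\\3\end{bmatrix}[2]=\begin{bmatrix}2\\6\end{bmatrix}\ .$$ By contrast, $[2]\begin{bmatrix}1&3\\5&7\end{bmatrix}$ is not defined, because the left factor has type $(1,1)$ while the right factor has type $(2,2)$, and $1\ne2$.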
Contrasting with this is the fact that the scalar multiple $c\,A$ is defined for all $c\in K$ and any matrix $A$, whatever its type. The effect of left-multiplying $A$ by the scalar $c$ is that all entries of $A$ are multiplied by $c$. If you want to realize this by means of a matrix product, you have to replace the scalar $c$ by the square diagonal matrix ${\rm diag}(c,c,\ldots, c)$ of the appropriate size: size $m$ if it multiplies a matrix of type $(m,n)$ from the left, size $n$ if it multiplies from the right.
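The difference between the scalar multiple and the matrix product can be checked with a short sketch in plain Python. Matrices are again assumed to be lists of rows, and `matmul`, `scale` and `diag` are hypothetical helper names introduced only for this example.

```python
def matmul(A, B):
    """Matrix product; defined only if A has as many columns as B has rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def scale(c, A):
    """Scalar multiple c*A: every entry of A is multiplied by c."""
    return [[c * a for a in row] for row in A]


def diag(c, size):
    """The square matrix diag(c, ..., c) of the given size."""
    return [[c if i == j else 0 for j in range(size)] for i in range(size)]


A = [[1, 2, 3], [4, 5, 6]]        # type (2, 3)
c = 7

left = matmul(diag(c, 2), A)      # diag of size m = 2 acts from the left
right = matmul(A, diag(c, 3))     # diag of size n = 3 acts from the right
print(scale(c, A) == left == right)

try:
    matmul([[c]], A)              # the (1,1)-matrix [c] times a (2,3)-matrix
except ValueError as err:
    print("[c] A is undefined:", err)
```

Both diagonal products agree with `scale(c, A)`, whereas the product with the $(1,1)$-matrix `[[c]]` is rejected because its single column does not match the two rows of `A`.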