Why can't linear maps map to higher dimensions?
You can indeed have a linear map from a "low-dimensional" space to a "high-dimensional" one - you've given an example of such a map, and there are others (e.g. $x\mapsto (x, 0)$).
However, such a map will "miss" most of the target space. Specifically, given a linear map $f: V\rightarrow W$, the range or image of $f$ is the set of vectors in $W$ that are actually hit by something in $V$: $$im(f)=\{w\in W: \exists v\in V(f(v)=w)\}.$$ This is in contrast to the codomain, which is just $W$. (The distinction between range/image and codomain can feel slippery at first; see here.)
The point is that $im(f)$ is a subspace of $W$, and always has dimension $\le$ that of $V$. (Proof hint: show that if $\{f(v_1),\dots,f(v_k)\}\subseteq im(f)$ is linearly independent in $W$, then $\{v_1,\dots,v_k\}$ is linearly independent in $V$.) So in this sense, linear maps can't "increase dimension".
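If you like to see this numerically, here is a small sketch (the use of numpy and the particular random map are my own choices for illustration, not part of the argument above):

```python
import numpy as np

# A linear map f: R^2 -> R^5 can be represented by a 5x2 matrix A acting as
# f(v) = A @ v.  Its image is the column space of A, whose dimension is rank(A).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))  # some linear map R^2 -> R^5

print(np.linalg.matrix_rank(A))  # at most 2 = dim(R^2), even though the codomain is R^5
```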
This is a perfectly respectable linear map from $\mathbb{R}$ to $\mathbb{R}^2$.
Why do you think the dimension of the codomain can't be larger than the dimension of the domain? The dimension of the range (the actual image) can't be larger.
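For a quick numerical illustration (a sketch of my own, assuming numpy; the map $x\mapsto(x,0)$ is the one mentioned above):

```python
import numpy as np

# The map x -> (x, 0) as a 2x1 matrix: domain R, codomain R^2.
A = np.array([[1.0],
              [0.0]])

# The rank (dimension of the image) is 1: the image is a line in R^2,
# so the codomain is bigger, but the range is not.
print(np.linalg.matrix_rank(A))  # 1
```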
Although the essence has already been stated, let me try to give you a more graphic approach to linear maps. Often, when you get the right mental picture of a construct, the properties fall right into place.
PS: Whoops, that turned out to be a lot. I hope it's not a bad thing that I only really answer your question in the last paragraph. I hope this is helpful for somebody, though.
Also, I hope I didn't make any false statements here regarding the infinite-dimensional case.
The definition
Let $V$, $W$ be vector spaces over a field $F$. A map $f: V → W$ is called linear if:
$∀x, y\in V: f(x+y) = f(x)+f(y)$
$∀x \in V, λ \in F: f(λx) = λf(x)$.
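(As a small side note of mine: you can spot-check these two conditions numerically for a candidate map. The map and the use of numpy below are purely illustrative assumptions, and random spot checks of course don't prove linearity.)

```python
import numpy as np

# A candidate map f: R^2 -> R^3, chosen only for illustration.
def f(v):
    x, y = v
    return np.array([x + y, 2.0 * x, -y])

rng = np.random.default_rng(1)
x, y = rng.standard_normal(2), rng.standard_normal(2)
lam = rng.standard_normal()

print(np.allclose(f(x + y), f(x) + f(y)))   # True: additivity at these points
print(np.allclose(f(lam * x), lam * f(x)))  # True: homogeneity at these points
```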
What do linear maps map?
The first important thing here is that they map vector spaces into vector spaces. These spaces can be anything, so this alone doesn't tell us much. Could they be something other than vector spaces? Well, if they weren't, our conditions wouldn't make much sense – they use scalar multiplication and addition, which are operations only defined on vector spaces. So far nothing interesting here.
You can, however, immediately ask: “What does the image of a linear map look like?”, or, “in what way does $f$ change/transform the space $V$ into $W$?”. What can this subset of $W$ look like? For instance, if $V=ℝ^3, W=ℝ^3$, can the image be a sphere? It obviously cannot, since for every vector $w = f(v)$ in the image, we can scale the argument $v$ and get a scaled version $f(λv) = λf(v) = λw$, which must also lie in the image. This greatly restricts what the image qualitatively looks like!
In fact, if you follow a similar argument for the preservation of addition, you might conjecture: the image itself is a vector space!
Proof (For the sake of completeness)
Let $x, y\in f[V], λ\in F$. Then we can find $v\in V: x=f(v)$ and $w\in V: y = f(w)$. Now, $x+y=f(v)+f(w)=f(v+w)$, thus $x+y$ is in the image. Similarly, we get $λx = λf(v) = f(λv)$, thus $λx$ is in the image. QED.
And now?
The fact that the image is a vector space that is a subset of the vector space $W$, i.e. a (vector) subspace of $W$, helps with the intuition: e.g. in $ℝ^3$, the vector subspaces are $\{0\}$, lines and planes through the origin, and $ℝ^3$ itself. So somehow, $f$ transforms the vector space $V$ into a subspace of $W$. At the moment, however, we don't know an important thing: how “big” is this subspace? Can we say something about its dimension? If not, can we find some restriction like an upper/lower bound?
The trick: Don't look at the whole space
Let's just assume $V$ and $W$ have bases and, to make writing sums easier, are finite-dimensional. We can then express elements of these spaces as a sum of the basis vectors scaled by certain amounts, i.e. by the “coordinate tuple” consisting of said amounts. The (unique, bijective) map from the coordinate tuples to the vectors is called the “basis isomorphism”.
Let's look at a vector $x=f(v)$ in the image of $f$. Choosing any ordered basis $(b_i)_{i=1}^n$ of $V$, we can write it as: $x = f(v) = f\left(\sum_{i=1}^n v_i b_i\right)$.
We “expanded” the vector $v$ in the preimage by looking at the basis vectors separately (the $v_i$ are the coefficients with respect to our basis $(b_i)$).
Now, the preservation of addition and scalar multiplication comes in handy: we can pull the summation out of $f$! $$x = f(v) = \cdots = \sum_{i=1}^n v_i f(b_i)$$ This is actually a big deal! We now know that any element of the image can be described as a linear combination of the images of the basis elements of $V$ (or: it lies in the span of the image of the basis) – or, to put it differently: if you know the images of the basis elements, you know the image of the whole space.
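If you want to see this with concrete numbers, here is a small sketch (numpy and the particular random map are my own illustrative choices):

```python
import numpy as np

# A linear map f: R^3 -> R^3, given by some matrix A (chosen arbitrarily).
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
f = lambda v: A @ v

# Standard basis of R^3 and an arbitrary vector v with coordinates v_i.
basis = np.eye(3)
v = np.array([2.0, -1.0, 4.0])

# Reassemble f(v) purely from the images of the basis vectors: sum_i v_i * f(b_i).
reassembled = sum(v[i] * f(basis[i]) for i in range(3))
print(np.allclose(f(v), reassembled))  # True
```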
Once I got this, I pictured every (finite-dimensional, well, to be honest, 3-dimensional) linear map by picturing a basis on the left side and the image of that basis on the right side.
This immediately gives you one constraint: the dimension of the image cannot be larger than $\dim V$, since it is spanned by $\dim V$ (not necessarily linearly independent) vectors. Can it be less? Yes, if the images of the basis vectors are linearly dependent: consider e.g. the map $$f: ℝ^3→ℝ^3, (x, y, z)↦ (x+z, y+z, 0)$$
It maps $e_x, e_y$ to themselves, but $f(0, 0, 1)=(1, 1, 0)$. So the basis of the domain maps to three vectors, each lying in the $x$-$y$-plane – in other words, they are linearly dependent and span a subspace not of dimension 3, but of dimension 2.
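Numerically (a sketch of mine, assuming numpy), you can read off this dimension as the rank of the matrix whose columns are the images of the basis vectors:

```python
import numpy as np

# f(x, y, z) = (x + z, y + z, 0): the columns are f(e_x), f(e_y), f(e_z).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])

print(np.linalg.matrix_rank(A))  # 2: the image is a plane, not all of R^3
```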
Your Question
To answer your question: yes, linear maps can indeed map into higher-dimensional spaces. For instance, take $f: ℝ^n→ℝ^{n+k}, (x_1, …, x_n)↦(0, …, 0, x_1, …, x_n)$ (with $k$ leading zeros).
The dimension of the image (also called the “rank”), however, cannot exceed the dimension of the domain. Thus, if you map into a higher-dimensional space, your map cannot be surjective anymore.
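Here is the padding map above for $n=3$, $k=2$, as a small numerical sketch (numpy is my own assumption here):

```python
import numpy as np

# f: R^3 -> R^5, (x_1, x_2, x_3) -> (0, 0, x_1, x_2, x_3), as a 5x3 matrix.
n, k = 3, 2
A = np.vstack([np.zeros((k, n)), np.eye(n)])

print(A.shape)                   # (5, 3): codomain R^5, domain R^3
print(np.linalg.matrix_rank(A))  # 3 = dim of the domain, so f cannot be surjective onto R^5
```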
Matrix and determinant
You might notice that whether or not the images of the basis vectors are linearly independent is a major factor in qualitatively determining the nature of this function (let the word sink in for a moment: determin-e… rings a bell?). Consider injectivity: if an $n$-dimensional space is transformed into an $m<n$-dimensional one, can the map still be injective? The intuition screams “no”! But let me omit a proof here.
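(Just to hint at it numerically rather than prove it – a sketch of mine, reusing the squashing map from above: two different inputs land on the same output, so that map is not injective.)

```python
import numpy as np

# The squashing map f(x, y, z) = (x + z, y + z, 0) from above.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])

# (1, 1, -1) is sent to the zero vector, just like (0, 0, 0) itself:
print(A @ np.array([1.0, 1.0, -1.0]))  # [0. 0. 0.] -> f is not injective
```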
Let's pick a basis for each of $V$ and $W$, with $\dim V=n$ and $\dim W=m$, and just care about the tuple representations of the vectors (lying in $F^n$ and $F^m$, respectively). The images of the basis vectors of $V$ can now be written as such tuples. View each of these tuples as a column of numbers and put them next to each other – you now have a thing of height $m$ and width $n$ – actually, an $m \times n$ matrix. The idea of matrices is that they are, vaguely speaking, coordinate representations of finite-dimensional linear maps.
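A tiny sketch of this construction (numpy and the particular map are my own illustrative choices): apply $f$ to the standard basis vectors and stack the results as columns.

```python
import numpy as np

# An arbitrary linear map f: R^3 -> R^2, chosen only for illustration.
def f(v):
    x, y, z = v
    return np.array([x + 2.0 * y, y - z])

basis = np.eye(3)
M = np.column_stack([f(b) for b in basis])  # the 2x3 matrix of f (height m=2, width n=3)

v = np.array([1.0, 2.0, 3.0])
print(np.allclose(M @ v, f(v)))  # True: the matrix reproduces the map on coordinate tuples
```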
If you now consider $V$ and $W$ to be of the same dimension, our matrix becomes square.
We can now think of the determinant as, hold on, the $n$-dimensional signed volume of the image of the unit hypercube. If a 3-dimensional unit cube is “squashed” in the image to vectors lying in a plane, its 3-dimensional volume is zero – which hopefully gives you some intuition as to why seemingly every theorem in linear algebra is equivalent to $\det M = 0$. But check out this answer and especially this answer – they do an excellent job in making the determinant more accessible.
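To see the “squashing means zero volume” picture in numbers (my own sketch, assuming numpy):

```python
import numpy as np

# The squashing map f(x, y, z) = (x + z, y + z, 0) flattens the unit cube into
# the x-y plane, so the signed volume of its image is 0:
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])
print(np.linalg.det(A))  # 0.0

# By contrast, a map that scales the axes by 2, 3 and 4 multiplies volumes by 24:
B = np.diag([2.0, 3.0, 4.0])
print(np.linalg.det(B))  # 24.0
```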