Why is a random variable called so despite being a function?
If you're in elementary probability instead of measure-theoretic probability, the following will make very little sense. My apologies if this is the case.
I'd say one reason is that we really don't look at the properties of random variables as functions from the underlying set of the probability space (usually denoted $\Omega$). You could change $\Omega$ all around while leaving the structure of $X$ intact, and we wouldn't really care. When I say the structure of $X$, I mean the measure it induces on $\mathbb{R}$, i.e. $\mu(I) = \mathbb{P} \left ( \left \{ \omega \in \Omega : X(\omega) \in I \right \} \right )$ for Borel sets $I \subseteq \mathbb{R}$. We arguably care a little bit more about $X$ as a function when we start talking about collections of dependent random variables, but really we don't care about it then, either, since then we are just inducing a measure on $\mathbb{R}^n$ instead of $\mathbb{R}$.
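To make the "change $\Omega$ all around" point concrete, here is a minimal sketch on finite sample spaces. The two spaces (a coin and a die), the names `induced_distribution`, `X_coin`, and `X_die`, and the probabilities are all illustrative assumptions, not anything canonical. It only shows that two quite different functions on two different $\Omega$'s can induce the same measure on the values.

```python
from collections import defaultdict
from fractions import Fraction


def induced_distribution(prob, X):
    """Measure induced by X on its values: mu({v}) = P({omega : X(omega) = v})."""
    mu = defaultdict(Fraction)  # Fraction() is 0, so missing values start at 0
    for omega, p in prob.items():
        mu[X(omega)] += p
    return dict(mu)


# Sample space 1 (assumed for illustration): one fair coin flip.
prob_coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def X_coin(omega):
    return 1 if omega == "H" else 0


# Sample space 2 (assumed for illustration): one fair six-sided die.
prob_die = {face: Fraction(1, 6) for face in range(1, 7)}


def X_die(omega):
    return 1 if omega % 2 == 0 else 0  # 1 on even faces, 0 on odd faces


mu_coin = induced_distribution(prob_coin, X_coin)
mu_die = induced_distribution(prob_die, X_die)

# Different Omega, different functions, same induced measure on {0, 1}.
print(mu_coin == mu_die)  # True
```

As functions, `X_coin` and `X_die` don't even share a domain, yet anything you would compute from the distribution (expectation, variance, and so on) comes out the same for both.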
Because intuitively we think of it as a variable that takes a random value. Formally, it is a function. It's just like why we call a sequence a sequence, or call an arithmetical function an arithmetical function, when they are actually the same thing formally speaking. Just to add to the issue, calling it a variable also matches the notation. For example, $X=Y+Z$ is NOT the usual function addition, but they are "added" in a way that makes sense when we think of them as variables.
I'm 4 years late :) but will answer anyway (mostly for my own sake).
In math you have what they call in school a variable $x$. You use this $x$ as a placeholder for some number. But what number? You can't say for sure; all you can say is that $x$ can vary in some interval, say $I=[0,1]$. That's basically all you know about $x$. Well, it's not much, but it works 99% of the time when you need to model some phenomenon.
For example, let $x=\text{bees}/\text{crows}$. You go out and count the bees and crows in some park, and then you can say that $x$ is equal to some number. BUT before going out you CAN'T guess the value $x$ will take; the best you can do is say:
"There could be no crows, or no bees, or some number of each in the park. Yeah, and there is no such thing as a negative number of bees, so negative numbers won't do. So I can say that $x \in [0,\infty] \cup \{?\}$, where $?$ will be used when I find no bees and no crows, i.e. $x=0/0=?$."
That's it! Now comes the random variable $X$. It is used the same way as $x$, but you can say A LOT MORE about $X$!!! You can expect $X$ to take certain values, which you couldn't do with $x$. Why? Because if there is an $X$ in the book, then there is a probability function $\mathbb{P}$ somewhere; you can't have one without the other, otherwise you end up with the dull $x$ instead.
For example, let $X$ be a random variable defined the same way as above, $X=\text{bees}/\text{crows}$. Now you need to go to the park and count the bees and the crows. But before doing so, what can you say about $X$?
Well, $X$ belongs to the same set, $X \in [0,\infty] \cup \{?\}$, but now we have a tool, the probability $\mathbb{P}$, which we can use to GUESS the values $X$ will take. This is much more than we could do with the simple variable $x$! Without $\mathbb{P}$ you can't even guess the value of $x$, but having defined $\mathbb{P}$ you can say, for instance, that $X$ will take the value 0.5 with probability 90%, and thus you can save your energy and not go to the park at all!!!
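Here is a rough sketch of the bees-and-crows example as code, just to show what "having $\mathbb{P}$" buys you. The sample space of (bees, crows) counts and every probability in it are invented for illustration (chosen so that the 90% figure comes out), and the names `P`, `X`, and `prob_X_equals` are hypothetical helpers.

```python
import math
from fractions import Fraction

# Hypothetical sample space: each outcome is a (bees, crows) count.
# These probabilities are made up purely for illustration.
P = {
    (1, 2): Fraction(6, 10),
    (2, 4): Fraction(3, 10),
    (3, 0): Fraction(1, 20),
    (0, 0): Fraction(1, 20),
}


def X(outcome):
    """The random variable bees/crows, taking values in [0, inf] plus '?'."""
    bees, crows = outcome
    if crows == 0:
        # 0/0 gets the special symbol '?'; bees with no crows gives infinity.
        return "?" if bees == 0 else math.inf
    return Fraction(bees, crows)


def prob_X_equals(value):
    """P(X = value): add up the probabilities of all outcomes that X maps to value."""
    return sum((p for omega, p in P.items() if X(omega) == value), Fraction(0))


print(prob_X_equals(Fraction(1, 2)))  # 9/10
```

Without the dictionary `P`, all you have is the function `X` and its set of possible values, which is the "dull $x$" situation; with it, you can make the guess before going to the park.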
P.S. So with a random variable you have additional information about the possible values $x$ can assume. As a mnemonic you can make up an "equation" like $$\mathbb{P} + x = X$$ Thus it would be better to call $x$ an unguessable variable and $X$ a guessable variable. So you can see that using the same letter for both of them, $X$ and $x$, is justified, since they follow the same "logic". Having $X$ explained as a function is not a bizarre thing, because you can explain $x$ in the same way: $x$ is a function from the set of hippos to the set of reals $\mathbb{R}$, i.e. $x:\text{Hippos}\rightarrow\mathbb{R}$, so you use $x$ as the number of hippos in the park.