Why is the tensor product important when we already have direct and semidirect products?
There are literally dozens of independent reasons to invent the tensor product, and just about every area of mathematics needs the tensor product for its own reasons (often several reasons). Here are a couple of examples.
Suppose $X$ and $Y$ are topological spaces (metric spaces are fine if you like them better) and consider the rings $C(X)$ and $C(Y)$ of continuous real-valued functions. If you are convinced that products are worthy of consideration, then perhaps you are convinced that it is useful to look at $C(X \times Y)$. It is natural to ask if this can be expressed in terms of $C(X)$ and $C(Y)$; the answer (modulo largely irrelevant technical details) is that $C(X \times Y) = C(X) \otimes C(Y)$.
Let $V$ be a vector space over $\mathbb{R}$. It is often desirable to construct a complex vector space naturally associated to $V$ (the "complexification" of $V$). Here by "naturally" I mean in a way which is coordinate free and transparently compatible with linear maps. The solution is to set $V_{\mathbb{C}} = V \otimes \mathbb{C}$ (tensor product over $\mathbb{R}$). This is a special case of the more general phenomenon of "extension of scalars". As a fancy example demonstrating that this really is as useful as I claim, you might check out the Wikipedia page on "Pontryagin classes" (though it might be over your head if you haven't learned much algebraic topology).
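To make the complexification a little more concrete, here is a sketch of the standard description. The symbols $\lambda$, $z$, $W$ and $f$ are introduced only for this illustration: $\lambda, z \in \mathbb{C}$, and $f: V \to W$ is any real-linear map. Complex scalars act on the second factor, every element splits uniquely into a "real part" and an "imaginary part", and $f$ extends by acting on the first factor:

$$\lambda \cdot (v \otimes z) = v \otimes (\lambda z), \qquad V_{\mathbb{C}} = \{\, v_1 \otimes 1 + v_2 \otimes i \;:\; v_1, v_2 \in V \,\}, \qquad f_{\mathbb{C}}(v \otimes z) = f(v) \otimes z.$$

No basis of $V$ was chosen anywhere, and $(g \circ f)_{\mathbb{C}} = g_{\mathbb{C}} \circ f_{\mathbb{C}}$ is immediate from the last formula, which is what "transparently compatible with linear maps" amounts to.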
One of the reasons why direct sums are important is that they let you turn strange objects into groups. For example, if $G$ is a group and $V$ and $W$ are two representations of $G$ (vector spaces on which $G$ acts nicely), then $V \oplus W$ is also a representation of $G$. So the set of all representations of $G$ has an additive structure, and with a little algebraic magic one can upgrade this structure to a group (don't spend too much time worrying about how you subtract representations). Groups are nice and have lots of their own invariants, but rings are even nicer and have even more invariants. So it would be great if we could define a natural product of representations. You guessed it: the product of $V$ and $W$ is just $V \otimes W$. The set of all representations of $G$ with this structure is the infamous "representation ring" of $G$. This product structure is apparently of paramount importance in quantum mechanics (I don't know why). As another example where the tensor product turns a group into a ring, you might check out the Wikipedia page on "topological K-theory".
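Here is a small sketch of why $V \otimes W$ is again a representation, using only the Python standard library. Everything in it is an illustrative assumption rather than part of the discussion above: the group is $\mathbb{Z}/3$, $V$ is the 2-dimensional representation generated by an integer matrix of order 3, $W$ is the 3-dimensional permutation representation, and the helper names (`kron`, `rho_V`, and so on) are made up for the example. In coordinates, $g$ acts on $V \otimes W$ by the Kronecker product of the two matrices.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def power(a, n):
    """n-th power of a square matrix, for n >= 0."""
    result = identity(len(a))
    for _ in range(n):
        result = matmul(result, a)
    return result

def kron(a, b):
    """Kronecker product: the matrix of a (x) b in the basis e_i (x) f_k."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Two representations of Z/3 = {0, 1, 2} (illustrative choices).
A = [[0, -1], [1, -1]]                   # integer matrix with A^3 = I
P = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]    # cyclic permutation matrix, P^3 = I
rho_V = {g: power(A, g) for g in range(3)}
rho_W = {g: power(P, g) for g in range(3)}

# The tensor product representation: g acts by rho_V(g) (x) rho_W(g).
rho_VW = {g: kron(rho_V[g], rho_W[g]) for g in range(3)}

# It is still a homomorphism from Z/3 to invertible matrices ...
is_homomorphism = all(
    matmul(rho_VW[g], rho_VW[h]) == rho_VW[(g + h) % 3]
    for g in range(3) for h in range(3)
)

# ... and its character is the product of the two characters.
characters_multiply = all(
    trace(rho_VW[g]) == trace(rho_V[g]) * trace(rho_W[g]) for g in range(3)
)

print(is_homomorphism, characters_multiply)
```

The second check is the reason the operation deserves to be called a product: on characters, $\oplus$ becomes addition of functions on $G$ and $\otimes$ becomes multiplication.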
There are many more examples. If you know about functional analysis, the Schwartz kernel theorem is a tool used to investigate existence questions and regularity properties of partial differential equations, and it can be formulated purely in terms of Grothendieck's theory of topological tensor products. I can't give you any deep reason why the same algebraic gadget has such a diverse array of applications, but I guess that's the way it is. You'll undoubtedly learn more as you keep studying math.
ADDED: I just noticed the other part of your question, in which you ask about the "lifting" property of the tensor product. If I were forced to give a short explanation of what the tensor product really is, it would be the following. Given two $R$-modules $A$ and $B$, we want to convert $R$-bilinear maps on $A \times B$ into linear maps on some other object. We want to do this because for many purposes it reduces the structure theory of bilinear maps to the (extensive!) structure theory for linear maps. The lifting property that you describe tells us that the tensor product does the job.
But it does more than just "do the job": it does the job in the absolute best way possible. When you learn about most mathematical objects, such as the direct sum of two vector spaces, it is typical to define the object as some set equipped with some structure and then prove that it has certain nice properties. With the tensor product, you should go about it backwards: you should think of the tensor product as an object with certain nice properties and then prove that there actually is an object with all of those properties. This is because the actual construction of the tensor product of two modules is completely unenlightening and completely irrelevant to how you actually use the idea in practice.
I'll be a little less vague and outline how the tensor product should be developed from scratch. Given two $R$-modules $A$ and $B$, define a tensor product of $A$ and $B$ to be a pair $(T, t)$ where $T$ is an $R$-module and $t: A \times B \to T$ is a bilinear map with the property that, given any $R$-module $C$ and any bilinear map $Q: A \times B \to C$, there exists a unique linear map $L: T \to C$ such that $Q = L \circ t$.
Lemma 1: If the tensor product exists, it is unique up to unique isomorphism.
Lemma 2: The tensor product exists.
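Lemma 1 needs nothing beyond the universal property itself. A sketch of the argument, where the second tensor product $(T', t')$ and the map $L'$ are names introduced just for this purpose: apply the property of $(T, t)$ to the bilinear map $t'$, and the property of $(T', t')$ to the bilinear map $t$, to get unique linear maps $L: T \to T'$ and $L': T' \to T$ with

$$t' = L \circ t, \qquad t = L' \circ t'.$$

Then

$$(L' \circ L) \circ t = L' \circ t' = t = \mathrm{id}_T \circ t,$$

so both $L' \circ L$ and $\mathrm{id}_T$ factor the bilinear map $t$ through $t$, and the uniqueness clause forces $L' \circ L = \mathrm{id}_T$. The same argument with the roles swapped gives $L \circ L' = \mathrm{id}_{T'}$. So $L$ is an isomorphism, and it is the only one compatible with $t$ and $t'$. For Lemma 2, one standard construction (for commutative $R$) takes the free $R$-module on the set $A \times B$ and quotients by exactly the relations needed to make $(a, b) \mapsto a \otimes b$ bilinear.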
Tensor products are useful for two reasons:
- they allow you to study certain nonlinear maps (bilinear maps) by first transforming them into linear ones, to which you can apply linear algebra;
- they allow you to change the ring over which a module is defined.
There are many, many ways in which these two abilities show up in nature.
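The first of these two abilities can be seen in coordinates. The following is a minimal sketch in plain Python, under assumptions chosen only for illustration: the modules are $\mathbb{R}^2$ and $\mathbb{R}^3$, the bilinear map is $Q(x, y) = x^{T} M y$ for an arbitrary matrix $M$, and the function names are hypothetical. The point is that $Q$ is not linear on pairs $(x, y)$, but it is the composite of $(x, y) \mapsto x \otimes y$ with an honest linear functional on $\mathbb{R}^2 \otimes \mathbb{R}^3 \cong \mathbb{R}^6$.

```python
def tensor(x, y):
    """Coordinates of x (x) y in the basis e_i (x) f_j, in row-major order."""
    return [xi * yj for xi in x for yj in y]

def bilinear(M, x, y):
    """The bilinear map Q(x, y) = x^T M y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def linear(M, t):
    """The linear functional L on the tensor product with L(e_i (x) f_j) = M[i][j]."""
    coeffs = [entry for row in M for entry in row]
    return sum(c * ti for c, ti in zip(coeffs, t))

M = [[1, 2, 0],
     [0, -1, 3]]          # arbitrary illustrative matrix
x = [2, 5]
y = [1, 0, -4]

q_value = bilinear(M, x, y)
l_value = linear(M, tensor(x, y))

# Q factors through the tensor product: Q(x, y) = L(x (x) y).
print(q_value, l_value)

# Q is not linear on pairs: doubling (x, y) multiplies Q by 4, not 2.
x2 = [2 * xi for xi in x]
y2 = [2 * yj for yj in y]
q_doubled = bilinear(M, x2, y2)

# L is linear: doubling its input doubles its output.
l_doubled = linear(M, [2 * ti for ti in tensor(x, y)])
```

Here all the nonlinearity has been pushed into the one fixed map $(x, y) \mapsto x \otimes y$; what remains is a linear functional, determined by its values on the basis vectors $e_i \otimes f_j$.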
We define tensor products for the same reason we define any other abstract mathematical structure: it's a structure that shows up a lot in mathematics, so it's worth having a name for. I don't see why this reason applies any less to tensor products than to direct sums. As the Wikipedia article says,
"Tensor products are important in areas of abstract algebra, homological algebra, algebraic topology and algebraic geometry"
and tensor products of vector spaces are also important in differential geometry and physics. I think it is better to learn about these applications thoroughly than to have someone attempt to summarize them.
Tim Gowers has written a short introduction to tensor products here, but it does not, in my opinion, give a good sense of the wide range of applicability of this notion.