What exactly is a tangent vector?
The answer, in my opinion, depends on your general philosophy about mathematics, that is, on whether you lean towards intuitionism, or logicism, or formalism, or other flavours. For these see e.g. the article in the Stanford Encyclopedia of Philosophy, or the Introduction of Lindström et al. (2009), or even better read some of their history in Kline's (1982) fascinating book.
The answer I give here has a markedly strong intuitionist flavour.
There are many concepts in mathematics and geometry which we understand intuitively but are difficult to nail down with a definition. The very notion of number is one of these. What is the number "2", for example? Its Zermelo-Fraenkel set-theoretic definition is this: $$\{\{\},\{\{\}\}\}.$$ Well, this definition is quite far from what I intuit as "2". And I'm not alone; see Benacerraf's (1965) article for example. But this definition, in the whole theory it is part of, captures many important properties of what we understand as "2".
Another example is the notion of function. It's defined in terms of a subset of a Cartesian product, and this definition captures its properties. My intuition of it, though, is more in terms of a procedure or rule associating this with that; probably this is your intuition of it too.
Another example is the notion of (local) orientation of a manifold. It can be defined as an equivalence class of charts and coordinate orderings (Choquet-Bruhat et al.) or of $n$-forms (Marsden et al. 2007). Well, again, my intuition of orientation is somehow more immediate than such definitions, but they capture its properties well.
Still another example is the notion of outer-oriented vector, also called "twisted vector". If you've never heard about such an object, I warmly invite you to take a look at Burke's (1983, 1987, 1995, 1995b), Bossavit's (1991, 1998), Schouten & van Dantzig's (1940), and Schouten's (1989) brilliant works (the last two are more technical but have very neat pictures). It's a beautiful and geometrically very intuitive object: it's like a vector with an orientation not along its own line, but "around" it. A picture can give you a good intuition: within a two-dimensional Euclidean space it is something like this: [figure: outer-oriented vector in two dimensions]; within a three-dimensional one it is something like this: [figure: outer-oriented vector in three dimensions]. This object and its cousins can be added, subtracted, etc. just like vectors; they have many useful applications in mechanics, electromagnetism, and the theory of integration on manifolds. The definition of an outer-oriented vector involves equivalence classes of pairs of orientations and vectors (Bossavit 1991, 1998), and captures all the properties of this notion. But my intuition of this object is much more immediate than such a definition.
The point is that saying what something is is different from saying how something is defined. In mathematics and geometry we have many notions that we understand intuitively. When we axiomatize a theory, we aim for economy: we choose a small set of such intuitive notions as primitives – and therefore undefined – and define the remaining ones in terms of these. There are good reasons for proceeding this way: for example, we are less likely to create logical clashes in the properties of the primitives. But the price for this economy is that the definitions of other intuitive notions often end up being very convoluted.
The point of this long preamble is that I don't think that your question can be answered by a definition. Now imagine that someone asks you "what exactly is the number 2?". You could tell them "it's $\{\{\},\{\{\}\}\}$", but such a definition would probably leave them unsatisfied. What would be your answer? The problem, as you understand, is that they lack an intuitive understanding of "2" and its complex uses. They need to build such an understanding. It can only be built by considering many different definitions, examples, and especially by working with the concept.
So my (non-)answer to your question is this: you probably just need to be patient and build an intuitive understanding of it, by considering alternative definitions and especially by using it in practice.
I found the different definitions and perspectives by the following authors very helpful:
The great Cartan (1983) apparently thought of tangent vectors as traditional "arrows" in $R^n$, "tangent" to a manifold in the traditional sense of the word – just like you do! He imagined the manifold embedded in some $R^n$, a conceptual device which we can always use for any manifold (Hirsch 1961). (This view of Cartan's was discussed by a mathematician or a philosopher of mathematics, but I can't remember who and have been unsuccessful in finding this reference, sorry.)
Chevalley (1946), followed by Penrose & Rindler (1987), sees tangent vectors as differentiation operators; what's interesting is that his definition of manifold doesn't involve atlases of coordinates.
Burke's (1987) definition, § II.8, is similar to the traditional one in terms of an equivalence class of curves, but he offers a more pictorial understanding; see his discussion about "addition of curves".
Choquet-Bruhat et al. (1996), § III.B.1, discuss and compare several traditional definitions.
Kennington (2018) offers many, many insightful and well thought-out remarks and historical notes on the notion of tangent vectors. In particular, search for "Remark: Curve-classes and differential operators are unsatisfactory as tangent vectors", for "Styles of representation of tangent vectors", and for "The true nature of tangent vectors" (section numbers aren't reliable, as this book is very frequently updated). I recommend doing a text search for the string "tangent vec" to find other interesting passages.
If I may add a personal note, in my concrete experience I've always had to deal with parameterized curves – concrete, particular curves on particular manifolds – and therefore I've never needed to consider equivalence classes of curves. This has also been possible because many objects usually represented as tangent vectors, e.g. electric fields or forces, can in fact be better represented as forms, which in my opinion are much easier to understand intuitively and picture geometrically (see Burke's and Bossavit's works above).
References
Benacerraf, P. (1965): What numbers could not be https://doi.org/10.2307/2183530.
Bossavit, A. (1991): Differential Geometry: for the student of numerical methods in Electromagnetism https://www.researchgate.net/publication/200018385_Differential_Geometry_for_the_student_of_numerical_methods_in_Electromagnetism, especially § 3.2.
Bossavit, A. (1998): On the geometry of electromagnetism https://www.researchgate.net/publication/254470625_On_the_geometry_of_electromagnetism, especially part (2).
Burke, W. L. (1983): Manifestly parity invariant electromagnetic theory and twisted tensors https://doi.org/10.1063/1.525603.
Burke, W. L. (1987): Applied Differential Geometry (Cambridge University Press), especially chaps IV–V.
Burke, W. L. (1995): Div, Grad, Curl Are Dead http://people.ucsc.edu/~rmont/papers/Burke_DivGradCurl.pdf.
Burke, W. L. (1995b): Twisted forms: twisted differential forms as they should be http://www.ucolick.org/~burke/forms/tdf.ps.
Cartan, E. (1983): Geometry of Riemannian Spaces (translation of 2nd French ed., Math Sci Press).
Chevalley, C. (1946): Theory of Lie Groups I (Princeton University Press), especially chap III.
Choquet-Bruhat, Y., C. DeWitt-Morette, M. Dillard-Bleick (1996): Analysis, Manifolds and Physics. Part I: Basics (rev. ed., Elsevier), § IV.B.1.
Hirsch, M. W. (1961): On Imbedding Differentiable Manifolds in Euclidean Space https://doi.org/10.2307/1970318.
Kennington, A. U. (2018): Differential geometry reconstructed: a unified systematic framework http://www.geometry.org/.
Kline, M. (1982): Mathematics: The Loss of Certainty (Oxford University Press).
Lindström, S., E. Palmgren, K. Segerberg, V. Stoltenberg-Hansen (eds) (2009): Logicism, Intuitionism, and Formalism: What Has Become of Them? (Springer).
Marsden, J. E., T. Ratiu (2007): Manifolds, Tensor Analysis, and Applications (3rd ed., Springer), § 7.5.
Penrose, R., W. Rindler (1987): Spinors and Space-Time. Vol. 1: Two-spinor calculus and relativistic fields (Cambridge University Press), especially chap 4.
Schouten, J. A. (1989): Tensor Analysis for Physicists (2nd ed., Dover).
Schouten, J. A., D. van Dantzig (1940): On ordinary quantities and $W$-quantities. Classification and geometrical applications http://www.numdam.org/item/CM_1940__7__447_0.
The equivalence relation on the (parametrised) curves is exactly that they give the same differential operator at the point $p$; this reconciles those two definitions. See below for some more details.
As for intuition, that can be a little harder. The thing to recognise is that we need a definition of tangent vectors without reference to an ambient space. In other words, we may have a manifold defined abstractly, rather than embedded in some Euclidean space (in general relativity, this is often the case).
Let $\{y^i\}$ be coordinates on $n$-dimensional Euclidean space. Notice that if we take a 'vector' $\vec{v} = (v^1, \ldots, v^n)$, we can think of this as giving the derivatives of the coordinate functions along the straight line parametrised by $y^i(t) = y^i_0 + v^i t$. Indeed, the directional derivative of any function $f$ along this line is given by $$ \frac{df}{dt} = \sum_i v^i \frac{\partial f}{\partial y^i} $$ So we can associate $v$ with the differential operator $\sum_i v^i\partial/\partial y^i$. You can easily check that this gives a linear isomorphism between the vector space $\mathbb{R}^n$ and the space of operators spanned by $\{\partial/\partial y^i\}$.
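If you'd like to see this identification at work numerically, here is a minimal sketch in plain Python. The function `f`, the point `y0`, the vector `v`, and the helper names are all my own choices for illustration, and the derivatives are only estimated by central differences, so the agreement is up to numerical error.

```python
import math

def f(y):
    """A sample smooth function on R^3 (hypothetical, chosen for illustration)."""
    y1, y2, y3 = y
    return math.sin(y1) * y2 + y3 ** 2

def partial(func, y, i, h=1e-6):
    """Central-difference estimate of the partial derivative of func along y^i at y."""
    up = list(y)
    up[i] += h
    down = list(y)
    down[i] -= h
    return (func(up) - func(down)) / (2 * h)

def derivative_along_line(func, y0, v, h=1e-6):
    """Central-difference estimate of d/dt func(y0 + v t) at t = 0."""
    def point(t):
        return [a + b * t for a, b in zip(y0, v)]
    return (func(point(h)) - func(point(-h))) / (2 * h)

def apply_operator(func, y0, v):
    """Apply the operator sum_i v^i d/dy^i to func at the point y0."""
    return sum(v[i] * partial(func, y0, i) for i in range(len(v)))

y0 = [0.3, -1.2, 2.0]   # base point of the line
v = [1.0, 0.5, -2.0]    # components v^i of the 'vector'

lhs = derivative_along_line(f, y0, v)   # df/dt along the straight line
rhs = apply_operator(f, y0, v)          # sum_i v^i (df/dy^i)
print(lhs, rhs)
```

The two printed numbers should agree to within the finite-difference error, which is the sense in which the 'arrow' $\vec{v}$ and the operator $\sum_i v^i\partial/\partial y^i$ carry the same information.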
Let me return briefly to the 'equivalence classes of curves' bit. Choose some other parametrised curve $\gamma : (-\epsilon,\epsilon) \to \mathbb{R}^n$ passing through the same point at $t=0$, i.e. $y^i(\gamma(0)) = y^i_0$. Then the directional derivative of $f$ along the curve at the point $(y^1_0,\ldots, y^n_0)$ is $$ \left.\frac{df(\gamma(t))}{dt}\right\vert_{t=0} = \sum_i \left.\frac{d y^i(\gamma(t))}{d t}\frac{\partial f}{\partial y^i}\right\vert_{t=0} $$ We define two curves $\gamma, \gamma'$ to be equivalent iff they give the same directional derivative for every function $f$, i.e. the same differential operator. This gives our isomorphism between equivalence classes of parametrised curves passing through some point and first-order differential operators at the point. Note that rescaling the parameter $t$ will rescale the differential operator, which is why we talk about parametrised curves.
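A small numerical sketch may make the equivalence relation more concrete. The curves `line`, `bent`, and `fast` below, and the test function `f`, are hypothetical choices of mine: `bent` passes through the same point with the same velocity as `line` at $t=0$ but differs at second order, while `fast` is `line` with the parameter rescaled by 2. A single test function can of course only illustrate, not prove, an equivalence that is required to hold for every $f$.

```python
import math

def f(y):
    """A sample smooth function on R^2 (hypothetical, chosen for illustration)."""
    y1, y2 = y
    return math.exp(y1) * math.cos(y2)

def derivative_along(curve, func, h=1e-5):
    """Central-difference estimate of d/dt func(curve(t)) at t = 0."""
    return (func(curve(h)) - func(curve(-h))) / (2 * h)

y0 = (0.5, 1.0)    # common point of all curves at t = 0
v = (2.0, -1.0)    # common velocity of `line` and `bent` at t = 0

def line(t):
    """Straight line through y0 with velocity v."""
    return (y0[0] + v[0] * t, y0[1] + v[1] * t)

def bent(t):
    """Same point and velocity at t = 0, but with extra second-order terms."""
    return (y0[0] + v[0] * t + 3.0 * t ** 2,
            y0[1] + v[1] * t - math.sin(t) ** 2)

def fast(t):
    """The straight line traversed twice as fast (parameter rescaled by 2)."""
    return line(2 * t)

d_line = derivative_along(line, f)
d_bent = derivative_along(bent, f)
d_fast = derivative_along(fast, f)
print(d_line, d_bent, d_fast)
```

Up to numerical error, `d_line` and `d_bent` should coincide, so the two curves fall in the same class, whereas `d_fast` should come out as twice `d_line`: the rescaled curve traces the same path but represents a different tangent vector.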
Now suppose we have some embedded manifold $X$, with local coordinates $x^\mu$, and embedding functions $y^i(x^\mu)$. A tangent vector (in the familiar sense) to $X$ just gives the infinitesimal change in the coordinates $y^i$ when we change the coordinates $x^\mu$ by an arbitrary infinitesimal amount $\delta x^\mu$. We have $$ \delta y^i = \sum_\mu \delta x^\mu \frac{\partial y^i}{\partial x^\mu} $$ So the corresponding tangent vector, using the notation we introduced before, is $$ \sum_{i, \mu} \delta x^\mu \frac{\partial y^i}{\partial x^\mu} \frac{\partial}{\partial y^i} $$ The $\delta x^\mu$ are arbitrary, and as we vary them, we map out a linear subspace spanned by the 'vectors' $$ \sum_i \frac{\partial y^i}{\partial x^\mu} \frac{\partial}{\partial y^i} = \frac{\partial}{\partial x^\mu} $$
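As a worked example (my own choice of manifold and coordinates, purely for illustration), take $X$ to be the unit circle in the plane, with the single local coordinate $x^1 = \theta$ and embedding functions $y^1 = \cos\theta$, $y^2 = \sin\theta$. The formula above gives $$ \frac{\partial}{\partial\theta} = \sum_i \frac{\partial y^i}{\partial \theta} \frac{\partial}{\partial y^i} = -\sin\theta\,\frac{\partial}{\partial y^1} + \cos\theta\,\frac{\partial}{\partial y^2} $$ whose components $(-\sin\theta, \cos\theta)$ are exactly the familiar arrow tangent to the circle, perpendicular to the radius. Acting on, say, $f = y^1 y^2$, this operator gives $$ -\sin\theta\, y^2 + \cos\theta\, y^1 = \cos^2\theta - \sin^2\theta = \cos 2\theta $$ which agrees with differentiating the restriction $f = \cos\theta\sin\theta$ directly with respect to $\theta$.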
Now we realise that the operators $\partial/\partial x^\mu$ don't actually depend on the embedding at all, and we have successfully defined tangent vectors in an intrinsic way on any differentiable manifold!
This has become a very long answer, but I hope it's helpful.