Why can't differentiability be generalized as nicely as continuity?
There "is" a way: in algebraic geometry we do not work over the real numbers in general, yet we use techniques inspired by differentiation all the time. It is not the approach preferred by most differential geometry textbooks, which stick to charts and differentiable structures, but it still works.
A ringed space $(X,\mathcal O_X)$ is a topological space together with a sheaf of rings $\mathcal O_X$ on it. A locally ringed space is a ringed space such that the stalks $\mathcal O_{X,p}$ are local rings for each $p \in X$. If $(X,\mathcal O_X)$ is a locally ringed space, we can define the cotangent space at $p$ via $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2$ where $\mathfrak m_{X,p}$ is the unique maximal ideal of $\mathcal O_{X,p}$. This is a vector space over the field $k(p) \overset{def}= \mathcal O_{X,p}/\mathfrak m_{X,p}$, so we can define the tangent space as the dual $k(p)$-vector space to $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2$, or in other words, the set of all linear maps $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2 \to k(p)$.
The idea is that if you have a linear map $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2 \to k(p)$, and you are given a "function" $f \in \mathfrak m_{X,p}$, the value of the linear map at the class of $f$ should give you a "directional derivative" of $f$.
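To see that no limits are involved, here is a minimal worked example; the choice of the affine line over an arbitrary field $k$ is my own illustration, not something taken from the discussion above. Take $X = \mathbb A^1_k$ and let $p$ be the origin, so that $\mathcal O_{X,p} = k[x]_{(x)}$, $\mathfrak m_{X,p} = (x)$ and $k(p) = k$. For a polynomial $f = a_1 x + a_2 x^2 + \dots + a_n x^n \in \mathfrak m_{X,p}$ with $n \ge 2$ we get $$ f = a_1 x + x^2\,(a_2 + a_3 x + \dots + a_n x^{n-2}) \equiv a_1 x \pmod{\mathfrak m_{X,p}^2}. $$ The quotient $\mathfrak m_{X,p}/\mathfrak m_{X,p}^2$ is one-dimensional over $k$, spanned by the class of $x$, and the linear map sending the class of $x$ to $1$ sends the class of $f$ to $a_1$, which is the formal derivative of $f$ at $0$. Nothing here uses the topology of $\mathbb R$, so the same computation makes sense over a finite field.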
Of course, this level of abstraction removes any reference to coordinate patches and such, so it is hard to see what's going on. To remove all the algebro-geometric nonsense and get a particular example, take a smooth manifold $M$ (which is in particular a topological space) and consider the sheaf $\mathcal O_M$ of $C^{\infty}$-functions on $M$; that is, if $U \subseteq M$ is an open set, $\mathcal O_M(U)$ is the ring of smooth functions $U \to \mathbb R$. Then $\mathcal O_{M,p}$ consists of all germs of smooth functions at $p$, and $\mathfrak m_{M,p}$ is the maximal ideal of those germs whose value at $p$ is zero. The ideal $\mathfrak m_{M,p}^2$ consists of all finite sums of products of two germs in $\mathfrak m_{M,p}$; in particular, such germs vanish at $p$ to order $\ge 2$, meaning that their first derivatives at $p$ vanish as well (this is the product rule for differentiation). It is usually shown that the dual of $\mathfrak m_{M,p}/\mathfrak m_{M,p}^2$ corresponds to the space of all derivations at $p$. Note that in this case we have $k(p) \simeq \mathbb R$, so the linear maps $\mathfrak m_{M,p}/\mathfrak m_{M,p}^2 \to k(p)$ actually take values in the real numbers.
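A minimal worked example, assuming the simplest case $M = \mathbb R$ and $p = 0$ (my choice, for illustration): if $f$ is smooth near $0$ with $f(0) = 0$, then Hadamard's lemma gives a smooth $g$ with $$ f(x) = x\,g(x), \qquad g(x) = \int_0^1 f'(tx)\,dt, \qquad g(0) = f'(0). $$ Hence $$ f(x) - f'(0)\,x = x\,\bigl(g(x) - g(0)\bigr) \in \mathfrak m_{M,p}^2, $$ because both $x$ and $g - g(0)$ vanish at $0$. So the class of $f$ in $\mathfrak m_{M,p}/\mathfrak m_{M,p}^2$ equals $f'(0)$ times the class of $x$, the quotient is one-dimensional, and the linear map sending the class of $x$ to $1$ is exactly $f \mapsto f'(0)$, the derivation $\frac{d}{dx}\big|_{x=0}$.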
Of course, the sheaf $\mathcal O_M$ needs to be defined, and this is usually done at some point using coordinate patches; you do not get away from dealing with charts when doing differential geometry. Sure, at some point you stop using them if you deal with coordinate-free approaches, but they lie somewhere in the treatment of the theory. My point is that the ideas of differentiation do generalize, and this is just a quick glance at how they do; algebraic geometry takes "differentiation" to a whole new level.
Continuity has been weakened and played with in many different ways (weak/weak-star convergence in functional analysis, upper and lower semicontinuity in optimization, etc.), and so has differentiation (Fréchet, Gâteaux, directional derivatives, semi-directional derivatives, Dini derivatives). Through all of this you should always remember one thing: yes, generalizations are useful, but you should never forget why you wanted to generalize in the first place. The reason may be that you want to focus on a class of problems you cannot yet solve and need a clearer point of view, or that you want to build stronger tools. Generalizing for the sake of generalizing, however, usually leads to confusion and a loss of intuition, which is not what you want. To this day I am still scared of the use of Dini derivatives...
Hope that helps.
It seems some of the frustration is with having to define manifolds from the inside out, by pasting together coordinate charts. It's possible to axiomatize a notion of "smooth object" in such a way that we can work with smooth functions on analogues of manifolds, function spaces, products, quotients, intersections, infinitesimally small spaces, and anything else you (or, at least, I) can imagine, and thus define spaces from the outside in, never using coordinates or charts in the definitions. This theory is called synthetic differential geometry (SDG).
SDG gives a very different definition of smooth function from the classical one, because one works in a logical framework in which it's impossible even to define non-smooth functions. So, synthetically, a smooth function is... any function between spaces in a model of SDG! This doesn't mean much until you know what it takes to be a model of SDG. What SDG axiomatizes is what an object $R$ has to do to act like a smooth line on which one can do differential geometry: it has some elements which square to zero, on which every function is linear (which leads to the definition of the derivative); a much bigger collection of nilpotent elements, on which every function is determined by its Taylor series; every function on it has an antiderivative; and so on. The other smooth objects then all have their maps determined, at least locally, by their relationship to this "line" and its infinitesimally small subspaces.
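To make the "square to zero" axiom concrete, here is a short sketch; the notation $D$, $a$, $b$ is introduced only for this illustration. Write $D = \{ d \in R : d^2 = 0 \}$. The axiom (usually called the Kock-Lawvere axiom) says that every function $g \colon D \to R$ has the form $g(d) = a + b\,d$ for unique $a, b \in R$. Given $f \colon R \to R$ and $x \in R$, apply this to $d \mapsto f(x+d)$ and define $f'(x)$ to be the resulting $b$, so that $$ f(x + d) = f(x) + f'(x)\,d \qquad \text{for all } d \in D. $$ For $f(x) = x^2$ this reads $$ (x + d)^2 = x^2 + 2x\,d + d^2 = x^2 + 2x\,d, $$ so $f'(x) = 2x$ by uniqueness of $b$, with no limit taken anywhere.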
So perhaps this sounds a bit too much like basing manifolds on $\mathbb{R}$ to you. I can assure you that $R$ can be wildly more exotic than $\mathbb{R}$, if that helps; and again objects in SDG are not by any means constructed locally out of finite products of $R$. A more compelling objection is probably that you've never heard of this, which is because, as you can already tell, it's very different from the geometry you know; and to actually develop the foundations requires large amounts of category theory that most geometers don't want to learn. However, the good news is that it's possible to do all of classical differential geometry within this framework, so that one can, at least in principle, think in synthetic terms and then translate proofs into language more familiar to the community.
Patrick Da Silva already gave a great answer; this one is just meant to complement it. I am also not answering your question exactly, since I am cheating: my definition uses the word "smooth", which already presupposes differentiability, and that is precisely what is bothering you. I just want to point out how differential operators can be defined without local definitions. If you combine this with Patrick Da Silva's answer, you might obtain a construction that avoids my abuse of the word smooth.
In fact, differential operators can be defined in a purely algebraic, abstract way without even requiring a manifold. People use this in commutative algebra. Here are some lecture notes with the more general construction, but grasping it may require a fairly mature level of algebra.
Here is a simplified version for the tangent bundle of a manifold $M$, to avoid using general vector bundles. The construction takes a while to digest, but it is very interesting. I apologise if the following does not make much sense as I write it; I am definitely skipping many details. I am summarising the presentation given here.
1) Define $\mathrm{Op}(TM)$ as the space of $\mathbb R$-linear operators $T:\Gamma(TM)\to \Gamma(TM)$ (here I use $\Gamma$ to mean smooth sections; note that a smooth section of $TM$ is by definition a smooth vector field).
2) For each smooth function $f\in C^\infty(M)$, define a map $\mathrm{ad}(f)$ with domain $\mathrm{Op}(TM)$ such that for each $T\in\mathrm{Op}(TM)$ you obtain a map $\mathrm{ad}(f)(T)\colon \Gamma(TM)\to\Gamma(TM)$ defined by $$ (\mathrm{ad}(f)T)u = [T,f]u:=T(fu)-f(Tu),\quad \forall u\in \Gamma(TM)$$
3) Now inductively define a sequence of subspaces $$ \mathrm{PDO}^{(0)}(TM) \subset \mathrm{PDO}^{(1)}(TM) \subset \cdots \subset \mathrm{PDO}^{(k)}(TM) \subset \cdots $$ by the following prescription: $$ \mathrm{PDO}^{(0)}(TM) = \mathrm{hom}(TM,TM)$$ (think of this as a collection of linear maps $T_xM\to T_xM$, one for each $x\in M$) and $$ \mathrm{PDO}^{(k+1)}(TM) = \{ T\in\mathrm{Op}(TM) : [T,f]\in \mathrm{PDO}^{(k)}(TM) \ \ \forall f\in C^\infty(M) \}.$$ The elements of $\mathrm{PDO}^{(k)}(TM)$ are called partial differential operators of order (at most) $k$. Your job here (not a super easy one) is to convince yourself that this notion agrees with your idea of what a differential operator should be.
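As a sanity check on this definition, here is a minimal worked example. It assumes $M=\mathbb R$ and uses the global frame $\partial_x$ to identify a vector field $u\,\partial_x$ with its coefficient function $u$; both choices are mine, for illustration only. Under this identification, $\mathrm{PDO}^{(0)}(TM)$ consists of the operators $u\mapsto a\,u$ with $a\in C^\infty(\mathbb R)$. Take $T = \frac{d}{dx}$. For $f\in C^\infty(\mathbb R)$ the product rule gives $$ [T,f]u = (fu)' - f\,u' = f'\,u, $$ so $[T,f]$ is multiplication by $f'$, which lies in $\mathrm{PDO}^{(0)}(TM)$, and hence $T\in\mathrm{PDO}^{(1)}(TM)$. Similarly, $$ [T^2,f]u = (fu)'' - f\,u'' = 2f'\,u' + f''\,u, $$ so $[T^2,f] = 2f'\,T + f''$. Taking one more commutator with $g\in C^\infty(\mathbb R)$ gives $[[T^2,f],g] = 2f'g'$, an element of $\mathrm{PDO}^{(0)}(TM)$, so $[T^2,f]\in\mathrm{PDO}^{(1)}(TM)$ and $T^2\in\mathrm{PDO}^{(2)}(TM)$. In this example each commutator with a function lowers the order by one, which is what the inductive definition encodes.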