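How is the epsilon-delta definition of continuity equivalent to the following statement? A function $f:\mathbb X\to\mathbb Y$ is continuous if and only if the inverse image $f^{-1}(U)$ of every open set $U\subseteq\mathbb Y$ is open in $\mathbb X$.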

Intuitively, if $U$ is open but $f^{-1}(U)$ is not, then $f^{-1}(U)$ contains a point $x_0$ such that every neighborhood of $x_0$, however small, contains points outside of $f^{-1}(U)$. In other words, one can choose a point $x$ arbitrarily close to $x_0$ such that $f(x)\notin U$, even though $f(x_0)\in U$. For such a point $x$ very, very close to $x_0$, the value of $f(x)$ abruptly “jumps” outside the open set $U$, which violates our intuitive concept of continuity: if $f$ were continuous, then one would expect that for a point $x$ very close to $x_0$, $f(x)$ should be very close to $f(x_0)$.
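As a concrete instance of such a jump, here is a standard example (not taken from the question itself): a step function on $\mathbb R$ with the usual distance, together with an open set $U$ whose inverse image fails to be open.

```latex
f(x)=\begin{cases}0, & x<0,\\ 1, & x\ge 0,\end{cases}
\qquad U=\left(\tfrac12,\tfrac32\right),
\qquad f^{-1}(U)=[0,\infty).
```

Here $x_0=0$ lies in $f^{-1}(U)$, but every interval $(-\delta,\delta)$ contains negative points $x$ with $f(x)=0\notin U$, so $f^{-1}(U)$ is not open.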


Formally, I assume we work in metric spaces $(\mathbb X, d_{\mathbb X})$ and $(\mathbb Y,d_{\mathbb Y})$.

The inverse-image definition implies the $\varepsilon$-$\delta$ definition.

Suppose that the inverse-image criterion is satisfied. Let $x_0\in\mathbb X$ and $\varepsilon>0$. Then, the ball $$B_{\mathbb Y}(\varepsilon, f(x_0))\equiv\{y\in\mathbb Y\,|\,d_{\mathbb Y}(y,f(x_0))<\varepsilon\}$$ of radius $\varepsilon$ about $f(x_0)$ is open in $\mathbb Y$, hence $f^{-1}(B_{\mathbb Y}(\varepsilon,f(x_0)))$ is open in $\mathbb X$. Since $x_0\in f^{-1}(B_{\mathbb Y}(\varepsilon,f(x_0)))$, there exists some ball of radius $\delta>0$ about $x_0$ such that $$B_{\mathbb X}(\delta,x_0)\subseteq f^{-1}(B_{\mathbb Y}(\varepsilon,f(x_0))).$$ This is exactly the $\varepsilon$-$\delta$ criterion: if $x\in \mathbb X$ is such that $d_{\mathbb X}(x,x_0)<\delta$, then $d_{\mathbb Y}(f(x),f(x_0))<\varepsilon$.
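To see this direction in a concrete case, here is a minimal numerical sketch. It assumes the illustrative function $f(x)=x^2$ on $\mathbb R$ and a point $x_0>0$; the helper names `delta_for` and `ball_maps_into_eps_ball` are made up for this example. The radius $\delta$ is read off from the inverse image of the $\varepsilon$-ball, and a sampling check then confirms that the $\delta$-ball is mapped into the $\varepsilon$-ball.

```python
import math


def f(x):
    # Illustrative function only; any continuous f with a known preimage would do.
    return x * x


def delta_for(x0, eps):
    """For f(x) = x^2 and x0 > 0, return a radius delta > 0 such that
    (x0 - delta, x0 + delta) lies inside the preimage of (f(x0) - eps, f(x0) + eps)."""
    upper = math.sqrt(x0 * x0 + eps) - x0                    # distance to the right end of the preimage
    lower = x0 - math.sqrt(max(x0 * x0 - eps, 0.0))          # distance to the left end (clamped at 0)
    return min(upper, lower)


def ball_maps_into_eps_ball(x0, eps, samples=1000):
    """Sample points strictly inside the delta-ball and test |f(x) - f(x0)| < eps."""
    delta = delta_for(x0, eps)
    for k in range(1, samples):
        x = x0 - delta + 2 * delta * k / samples
        if not abs(f(x) - f(x0)) < eps:
            return False
    return True


print(delta_for(1.5, 0.1) > 0, ball_maps_into_eps_ball(1.5, 0.1))
```

A sampling check of this kind is evidence rather than a proof; the proof is the argument given above.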

The $\varepsilon$-$\delta$ definition implies the inverse-image definition.

Suppose that the $\varepsilon$-$\delta$ criterion holds and let $U\subseteq\mathbb Y$ be open. By the definition of openness in metric spaces, there exists for each $y\in U$ some $\varepsilon_y>0$ such that $$B_{\mathbb Y}(\varepsilon_y,y)\subseteq U.$$ In fact, it is not difficult to check that $$U=\bigcup_{y\in U}B_{\mathbb Y}(\varepsilon_y,y).\tag{$\clubsuit$}$$

I now claim that $f^{-1}(U)$ is open in $\mathbb X$. Suppose that $x_0\in f^{-1}(U)$. Then $f(x_0)\in U$, so $f(x_0)\in B_{\mathbb Y}(\varepsilon_{y_0},y_0)$ for some $y_0\in U$ by ($\clubsuit$). [In fact, as @Dominik pointed out in a comment below, one can take $y_0\equiv f(x_0)$. This observation makes the derivation that follows a lot simpler.] That is, $d_{\mathbb Y}(f(x_0),y_0)<\varepsilon_{y_0}$. Define $$\xi\equiv\varepsilon_{y_0}-d_{\mathbb Y}(f(x_0),y_0)>0.\tag{$\star$}$$ By the $\varepsilon$-$\delta$ definition of continuity, there exists some $\delta>0$ such that $$\text{if }x\in\mathbb X\text{ and }d_{\mathbb X}(x,x_0)<\delta\text{, then }d_{\mathbb Y}(f(x),f(x_0))<\xi.\tag{$\diamondsuit$}$$

I now claim that $$B_{\mathbb X}(\delta,x_0)\subseteq f^{-1}(U),\tag{$\spadesuit$}$$ which will show that $f^{-1}(U)$ is open (since its generic element $x_0$ has a ball around it that is still contained in $f^{-1}(U)$), as desired. To this end, let $x\in B_{\mathbb X}(\delta,x_0)$. That is, $d_{\mathbb X}(x,x_0)<\delta$. Then, by ($\diamondsuit$), one has that $$d_{\mathbb Y}(f(x),f(x_0))<\xi.$$ In turn, the triangle inequality and ($\star$) imply that $$d_{\mathbb Y}(f(x),y_0)\leq d_{\mathbb Y}(f(x),f(x_0))+d_{\mathbb Y}(f(x_0),y_0)<\xi+d_{\mathbb Y}(f(x_0),y_0)=\varepsilon_{y_0}.$$ This means that $f(x)\in B_{\mathbb Y}(\varepsilon_{y_0},y_0)\subseteq U$, so that $x\in f^{-1}(U)$. Therefore, ($\spadesuit$) holds, as claimed.
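Following the bracketed remark, here is a sketch of the shorter argument obtained by choosing $y_0\equiv f(x_0)$. With this choice $\xi=\varepsilon_{f(x_0)}$ in ($\star$), and the triangle inequality is no longer needed:

```latex
x_0\in f^{-1}(U)
\;\Longrightarrow\;
B_{\mathbb Y}(\varepsilon_{f(x_0)},f(x_0))\subseteq U
\;\Longrightarrow\;
\exists\,\delta>0:\ f\bigl(B_{\mathbb X}(\delta,x_0)\bigr)\subseteq B_{\mathbb Y}(\varepsilon_{f(x_0)},f(x_0))\subseteq U
\;\Longrightarrow\;
B_{\mathbb X}(\delta,x_0)\subseteq f^{-1}(U).
```

The first implication uses the openness of $U$ at the point $f(x_0)$, the second applies the $\varepsilon$-$\delta$ criterion at $x_0$ with $\varepsilon=\varepsilon_{f(x_0)}$, and the third is just the definition of the inverse image.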


triple_sec has already written a detailed explanation as to why both definitions are equivalent, so I will try to give reasons as to why the other definition is useful.

The definition with the open sets doesn't need any concept of distance, only a concept of what an open set is in each space. This can be used to generalize the definition of continuous functions to general topological spaces.

More specifically, for a set $X$ we call a collection of subsets $\tau \subset \mathcal{P}(X)$ a topology if the following three conditions hold:

  1. $\emptyset \in \tau$ and $X \in \tau$.
  2. If $(A_i)_{i \in I}$ is a family of sets from $\tau$, then $\bigcup \limits_{i \in I} A_i \in \tau$.
  3. If $A_1, \ldots, A_n$ is a finite collection of elements from $\tau$, then $\bigcap \limits_{i = 1}^n A_i \in \tau$.

The pair $(X, \tau)$ is called a topological space and the sets $A \in \tau$ are called open sets.
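For a finite set, the three conditions can be checked mechanically. The following is a minimal sketch, with `is_topology` a name made up for this illustration. It relies on the fact that, for a finite collection, closure under pairwise unions and pairwise intersections already gives closure under all unions and all finite intersections.

```python
from itertools import combinations


def is_topology(X, tau):
    """Check the three topology axioms for a finite set X and a finite collection tau of subsets."""
    X = frozenset(X)
    tau = {frozenset(A) for A in tau}
    # Every member of tau must be a subset of X.
    if any(not A <= X for A in tau):
        return False
    # Condition 1: the empty set and X itself belong to tau.
    if frozenset() not in tau or X not in tau:
        return False
    # Conditions 2 and 3: for finite tau, pairwise closure is enough (by induction).
    for A, B in combinations(tau, 2):
        if (A | B) not in tau or (A & B) not in tau:
            return False
    return True


print(is_topology({1, 2}, [set(), {1}, {1, 2}]))               # a topology on {1, 2}
print(is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}]))    # fails: {1} ∪ {2} is missing
```

On infinite sets no such brute-force check is available, and the axioms have to be verified by argument.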

We can call a set $A$ in a metric space $(X, d_X)$ $\epsilon$-open* if for every point $x_0 \in A$ there is an $\epsilon > 0$ for which $B_X(\epsilon, x_0) \subset A$. This is probably the definition of an open set that you've seen so far. It is easy to check that $\{A \subset X \;|\; A \text{ is $\epsilon$-open}\}$ is a topology on $X$. In this way, every metric on a set $X$ induces a topology on $X$.
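Of the checks left to the reader here, closure under finite intersections is the one that shows why condition 3 is stated for finite collections only. A sketch, with the radii $\epsilon_i$ introduced for illustration:

```latex
x_0\in\bigcap_{i=1}^n A_i,\quad B_X(\epsilon_i,x_0)\subset A_i\ \ (i=1,\ldots,n)
\;\Longrightarrow\;
B_X(\epsilon,x_0)\subset\bigcap_{i=1}^n A_i
\quad\text{with }\epsilon\equiv\min\{\epsilon_1,\ldots,\epsilon_n\}>0.
```

The minimum of finitely many positive numbers is positive, but an infimum of infinitely many need not be: in $\mathbb R$, the intersection $\bigcap_{n\ge 1}(-1/n,1/n)=\{0\}$ is not open.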

Now, using the definition of continuity that only needs a concept of open sets, we can generalize the notion of a continuous function between two metric spaces to a continuous function between two topological spaces. This is a pretty big generalization, as not every topology is induced by a metric [for example, take any non-Hausdorff space].
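On finite topological spaces, the open-set definition of continuity can be tested directly, with no distance in sight. The sketch below uses the two-point Sierpiński space as an example of a non-Hausdorff (hence non-metrizable) space. The helper names `preimage`, `is_continuous` and `is_hausdorff` are made up for this illustration, and maps are represented as dictionaries.

```python
def preimage(f, domain, U):
    """Inverse image of U under the map f (given as a dict) restricted to domain."""
    return frozenset(x for x in domain if f[x] in U)


def is_continuous(f, X, tau_X, tau_Y):
    """f is continuous iff the preimage of every open set of Y is open in X."""
    open_X = {frozenset(A) for A in tau_X}
    return all(preimage(f, X, U) in open_X for U in tau_Y)


def is_hausdorff(X, tau):
    """Any two distinct points must have disjoint open neighborhoods."""
    opens = [frozenset(A) for A in tau]
    for x in X:
        for y in X:
            if x != y and not any(
                x in U and y in V and not (U & V) for U in opens for V in opens
            ):
                return False
    return True


X = {0, 1}
sierpinski = [set(), {1}, {0, 1}]      # non-Hausdorff topology on {0, 1}
discrete = [set(), {0}, {1}, {0, 1}]   # the topology induced by the discrete metric
identity = {0: 0, 1: 1}

print(is_continuous(identity, X, discrete, sierpinski))   # every preimage is open in the discrete space
print(is_continuous(identity, X, sierpinski, discrete))   # preimage of {0} is not open in the Sierpinski space
print(is_hausdorff(X, sierpinski), is_hausdorff(X, discrete))
```

The same underlying map is continuous in one direction and not in the other, which shows that continuity depends only on which sets are declared open.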

The definition via open sets can also be used to show that certain sets are open or closed. In a topological space $(X, \tau)$ we call a set $A$ closed iff its complement $A^c$ is an open set. It is again easy to see that the notion of a closed set in the metric setting is the same as the notion of a closed set in the corresponding topological setting. Now consider the function $f: \mathbb{R}^n \to \mathbb{R}$, $x \mapsto \|x\|_2$. It is easy to see that this function is continuous if we endow $\mathbb{R}^n$ and $\mathbb{R}$ with their standard topology [i.e. the topology induced by the Euclidean distance]. Since $\{1\}^c$ is open in $\mathbb{R}$, the definition of continuity that uses open sets shows immediately that $f^{-1}(\{1\}^c)$ is open, so $\mathbb{S}^{n - 1} = f^{-1}(\{1\}) = (f^{-1}(\{1\}^c))^c$ is a closed set.

Another important observation is that two different metrics can induce the same topology. For example, the $p$-norms on $\mathbb{R}^n$ with $1 \le p \le \infty$ all induce the same topology. [For $0 < p < 1$ the same formula no longer defines a norm, because the triangle inequality fails.] Now if we endow $\mathbb{R}^n$ and $\mathbb{R}^m$ with arbitrary $p$-norms, we can see that the continuity of a function $f: \mathbb{R}^n \to \mathbb{R}^m$ doesn't depend on which $p$ you choose in each respective space.
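The reason the $p$-norms give the same open sets is the standard sandwich inequality $\|x\|_\infty \le \|x\|_p \le n^{1/p}\,\|x\|_\infty$ for $1 \le p < \infty$, which lets balls of one norm fit inside balls of the other. Below is a small numerical sanity check of that inequality on random vectors. It is a sketch, not a proof; the helper names `p_norm` and `sandwich_holds` and the tolerance are choices made for this illustration.

```python
import random


def p_norm(x, p):
    """The p-norm of a vector x (a list of floats); p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def sandwich_holds(x, p, tol=1e-9):
    """Check ||x||_inf <= ||x||_p <= n^(1/p) * ||x||_inf up to a floating-point tolerance."""
    n = len(x)
    sup = p_norm(x, float("inf"))
    val = p_norm(x, p)
    return sup <= val * (1 + tol) + tol and val <= n ** (1.0 / p) * sup * (1 + tol) + tol


random.seed(0)
vectors = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(200)]
print(all(sandwich_holds(x, p) for x in vectors for p in (1, 2, 3.5)))
```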

A frequently used result in this context is that on a finite-dimensional real vector space all norms are equivalent. This implies that if we endow two finite-dimensional real vector spaces $V, W$ with topologies that are induced by norms, the continuity of a function $f: V \to W$ doesn't depend at all on the specific choice of our norms.
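To spell out why equivalent norms give the same topology, here is a short sketch. The names $\|\cdot\|_a$, $\|\cdot\|_b$, the constants $c, C$ and the ball notation $B_a$, $B_b$ are introduced for illustration, following the (radius, center) convention used above.

```latex
c\,\|x\|_a\le\|x\|_b\le C\,\|x\|_a\quad\text{for all }x\in V\quad(0<c\le C)
\;\Longrightarrow\;
B_a\!\left(\tfrac{\varepsilon}{C},x_0\right)\subseteq B_b(\varepsilon,x_0)
\quad\text{and}\quad
B_b(c\,\varepsilon,x_0)\subseteq B_a(\varepsilon,x_0).
```

So every ball of one norm contains a ball of the other with the same center, which means a set is open with respect to $\|\cdot\|_a$ exactly when it is open with respect to $\|\cdot\|_b$.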

*Please note that the term "$\epsilon$-open" is not a generally used mathematical term. I only used it here to differentiate between the two notions of open sets I've used in this post.