What is the Wilsonian definition of renormalizability?
This is a very good question which, however, shows the extent of the reigning confusion about renormalization even four decades after Wilson's Nobel Prize winning theory on the matter. I essentially answered the OP's question, and much more, about constructing continuum QFTs in Wilson's framework in my expository article "QFT, RG, and all that, for mathematicians, in eleven pages" but in a very condensed fashion (one needs to do computations on the side to follow what is being said). So let me give more details pertaining to the OP's specific question. I should preface this by saying that what follows is a "cartoon" for renormalization. I will oversimplify things by ignoring anomalous dimensions, marginal operators, and nonlocal terms generated by the RG. You will not find technical details but hopefully the conceptual picture and logical structure of renormalization will become clearer.
The OP is right to point out that in the setting of ODEs and dynamical systems a first order equation can be run backwards in time. So let me start by recalling some important terminology from that area. Consider a first order nonautonomous ODE of the form $$ \frac{dX}{dt}=f(t,X)\ . $$ It generates a flow (groupoid morphism from time pairs to diffeomorphisms of phase space) I will denote by $U[t_2,t_1]$ which sends the initial value $X(t_1)$ to the value of the solution at time $t_2$. It trivially satisfies $\forall t, U[t,t]={\rm Id}$ and the semigroup property $$ \forall t_1,t_2,t_3,\ \ U[t_3,t_2]\circ U[t_2,t_1]=U[t_3,t_1]\ . $$ This time-dependent situation is to be distinguished from the autonomous ODE case $$ \frac{dX}{dt}=f(X) $$ where $U[t_2,t_1]=U[t_2-t_1,0]=:U[t_2-t_1]$.
In Wilson's RG, time is scale or more precisely, $t=-\log\Lambda$ where the UV cutoff is implemented in momentum space by a condition like $|p|\le\Lambda$ or in position space by $\Delta x\ge \Lambda^{-1}=e^t$. The high energy physics literature usually works in a nonautonomous setting while it is essential to translate the equation to autonomous form for a proper understanding of Wilson's RG. The latter imported tools and concepts from dynamical system theory like fixed points, stable and unstable manifolds etc. It is possible to do some contortions to try to make sense of these concepts in the nonautonomous setting, but these truly are notions that are congenial to autonomous dynamical systems.
Let $\mu=:\mu_{-\infty,\infty}$ denote the probability measure corresponding to the free Euclidean theory.
Its propagator is
$$
\int \phi(x)\phi(y)\ d\mu_{-\infty,\infty}(\phi)=\langle \phi(x)\phi(y)\rangle_{-\infty,\infty}=
\int\frac{dp}{(2\pi)^D} \frac{e^{ip(x-y)}}{p^{D-2\Delta}}
$$
where $\Delta$ is the scaling dimension of the field $\phi$. Normally, $\Delta=\frac{D-2}{2}$ but I will allow more general $\Delta$'s
in this discussion.
Now let me introduce a mollifier, i.e., a smooth function of fast decay $\rho(x)$ such that
$\int \rho(x)\ dx=\widehat{\rho}(0)=1$.
For any $t$, let me set $\rho_t(x)=e^{-Dt}\rho(e^{-t}x)$, so in particular $\rho_0=\rho$.
Let $\mu_{t,\infty}$ be the law of the field $\rho_t\ast\phi$ where $\phi$ is sampled according to $\mu_{-\infty,\infty}$ and we used a convolution with the rescaled mollifier. In other words, $\mu_{t,\infty}$ is the free cutoff measure at $\Lambda_H=e^{-t}$ and propagator
$$
\int \phi(x)\phi(y)\ d\mu_{t,\infty}(\phi)=\langle \phi(x)\phi(y)\rangle_{t,\infty}=
\int\frac{dp}{(2\pi)^D} \frac{|\widehat{\rho}_t(p)|^2\ e^{ip(x-y)}}{p^{D-2\Delta}}\ .
$$
Note that $\widehat{\rho}_t(p)=\widehat{\rho}(e^t p)$ which we assume to have decreasing modulus with respect to $t$.
We have $\widehat{\rho}_{-\infty}=1$ and $\widehat{\rho}_{\infty}=0$ and $|\widehat{\rho}_{t_1}(p)|^2-|\widehat{\rho}_{t_2}(p)|^2\ge 0$ whenever $t_1\le t_2$. One can thus define a more general family of modified free/Gaussian theories $\mu_{t_1,t_2}$ with $t_1\le t_2$ by the propagator
$$
\int \phi(x)\phi(y)\ d\mu_{t_1,t_2}(\phi)=\langle \phi(x)\phi(y)\rangle_{t_1,t_2}=
\int\frac{dp}{(2\pi)^D} \frac{\left(|\widehat{\rho}_{t_1}(p)|^2-|\widehat{\rho}_{t_2}(p)|^2\right)\ e^{ip(x-y)}}{p^{D-2\Delta}}\ .
$$
One has the semigroup property for convolution of (probability) measures
$$
\mu_{t_1,t_2}\ast\mu_{t_2,t_3}=\mu_{t_1,t_3}
$$
when $-\infty\le t_1\le t_2\le t_3\le \infty$. This means that for any functional $F(\phi)$,
$$
\int F(\phi)\ d\mu_{t_1,t_3}=\int\int d\mu_{t_1,t_2}(\zeta)\ d\mu_{t_2,t_3}(\psi)\ F(\zeta+\psi)\ .
$$
The other key players are scale transformations $S_t$. Their action on fields is given by $(S_t \phi)(x)=e^{-\Delta t}\phi(e^{-t}x)$
and obviously satisfies $S_{t_1}\circ S_{t_2}=S_{t_1+t_2}$.
Using the notion of push-forward/direct image of measures, one has $(S_t)_{\ast}\mu_{t_1,t_2}=\mu_{t_1+t,t_2+t}$, i.e.,
$$
\int d\mu_{t_1,t_2}(\phi)\ F(S_t\phi)=\int d\mu_{t_1+t,t_2+t}(\phi)\ F(\phi)\ .
$$
Since these are centered Gaussian measures, it is enough to check the last property on propagators, i.e., $F(\phi)=\phi(x)\phi(y)$
where this follows from a simple change of momentum integration variable from $p$ to $q=e^{-t}p$ in the above formula for the propagator.
This also covers the infinite endpoint case with the conventions $t+\infty=\infty$, $t-\infty=-\infty$ for finite $t$.
The high energy physics Wilsonian RG is the transformation of functionals $RG[t_2,t_1]$ for pairs $t_1\le t_2$ obtained as follows. Using the convolution semigroup property $$ \int e^{-V(\phi)} d\mu_{t_1,\infty}(\phi)=\int e^{-V(\zeta+\psi)} d\mu_{t_1,t_2}(\zeta)\ d\mu_{t_2,\infty}(\psi) $$ $$ =\int e^{-(RG[t_2,t_1](V))(\phi)} d\mu_{t_2,\infty}(\phi) $$ after renaming the dummy integration variable $\psi\rightarrow\phi$ and introducing the definition $$ (RG[t_2,t_1](V))(\phi):=-\log \int e^{-V(\zeta+\phi)} d\mu_{t_1,t_2}(\zeta)\ . $$ If $V$ is the functional of $\phi$ corresponding to the bare action/potential with UV cutoff $\Lambda_H=e^{-t_1}$, then $RG[t_2,t_1](V)$ is the effective potential at momentum/mass scale $\Lambda_L=e^{-t_2}$. Trivially (Fubini plus associativity of convolution of probability measures) one has, for $t_1\le t_2\le t_3$, $$ RG[t_3,t_2]\circ RG[t_2,t_1]=RG[t_3,t_1] $$ which is indicative of a nonautonomous dynamical system structure, to be remedied shortly. At this point one can already state the main goal of renormalization/taking continuum limits of QFTs: finding a correct choice of cutoff-dependent potentials/actions/integrated Lagrangians, $(V_t^{\rm bare})_{t\in\mathbb{R}}$ such that $$ \forall t_2,\ \lim_{t_1\rightarrow -\infty} RG[t_2,t_1](V_{t_1}^{\rm bare})\ =:\ V_{t_2}^{\rm eff}\ {\rm exists}. $$ The OP's intuition is correct in seeing this as a backwards shooting problem: choosing the right initial condition at $\Lambda_{H}$ to arrive where we want at $\Lambda_{L}$. A difficulty here (related to scattering in classical dynamical systems) is this involves an IVP at $t=-\infty$ instead of finite time. Note that the continuum QFT, its correlations, etc. should be completely determined by the collection of its effective theories indexed by scales $(V_{t}^{\rm eff})_{t\in\mathbb{R}}$. This is most easily seen when considering correlations smeared with test functions with compact support in Fourier space and with a sharp cutoff $\widehat{\rho}(p)$ given by the indicator function of the condition $|p|\le 1$ (or at least one which satisfies $\widehat{\rho}(p)=1$ in a neighborhood of zero momentum).
Switching to an autonomous setting involves some twisting by the scaling maps $S_t$. Given a potential V (bare or effective) which "lives at" scale $t_1$, one has $$ \int e^{-V(\phi)}\ d\mu_{t_1,\infty}(\phi)=\int e^{-V(S_{t_1}\phi)}\ d\mu_{0,\infty}(\phi)= \int e^{-(S_{-t_1}V)(\phi)}\ d\mu_{0,\infty}(\phi) $$ where we now define the action of rescaling maps on functionals by $$ (S_t V)(\phi):=V(S_{-t}\phi)\ . $$ As maps on functionals, one has the identity $$ RG[t_2,t_1]=S_{t_1}\circ RG[t_2-t_1,0]\circ S_{-t_1}\ . $$
Wilson's Wilsonian RG is $WRG[t]:=S_{-t}\circ RG[t,0]$, for $t\ge 0$. It acts on the space of "unit lattice theories" (I put quotes because I am using Fourier rather than lattice cutoffs). Thus the previous identity becomes $$ RG[t_2,t_1]=S_{t_2}\circ WRG[t_2-t_1]\circ S_{-t_1}\ . $$ The identity can be derived as follows (note the orgy of parentheses due to the increase of abstraction from functions to functionals, then to maps on functionals): $$ [(RG[t_2-t_1,0]\circ S_{-t_1})(V)](\phi)=-\log\int d\mu_{0,t_2-t_1}(\zeta) \exp[-(S_{-t_1}V)(\phi+\zeta)] $$ $$ =-\log\int d\mu_{0,t_2-t_1}(\zeta) \exp[-V(S_{t_1}\phi+S_{t_1}\zeta)] $$ $$ =-\log\int d\mu_{t_1,t_2}(\xi) \exp[-V(S_{t_1}\phi+\xi)] $$ where we changed variables to $\xi=S_{t_1}\zeta$. From this one gets $$ [(S_{t_1}\circ RG[t_2-t_1,0]\circ S_{-t_1})(V)](\phi)=[(RG[t_2,t_1,]\circ S_{-t_1})(V)](S_{t_1}\phi) $$ and the identity follows from the trivial fact $S_{t_1}(S_{-t_1}\phi)=\phi$.
Note that $(V_t)_{t\in\mathbb{R}}$ is trajectory of $RG$, i.e., $$ \forall t_1\le t_2,\ V_{t_2}=RG[t_2,t_1](V_{t_1}) $$ if and only if $W_t:=S_{-t}V_t$ is a trajectory of $WRG$, i.e., $$ \forall t_1\le t_2,\ W_{t_2}=WRG[t_2-t_1](W_{t_1})\ . $$ The semigroup property for $RG$ readily implies that for $WRG$, namely, $$ \forall t_1, t_2\ge 0,\ WRG[t_1+t_2]=WRG[t_1]\circ WRG[t_2]\ . $$ Now define $W_{t}^{\rm start}:=S_{-t} \circ V_t^{\rm bare}$. Then assuming continuity of all these RG maps one has $$ V_{t_2}^{\rm eff}=\lim_{t_1\rightarrow -\infty} RG[t_2,t_1](V_{t_1}^{\rm bare})=S_{t_2}(W_{t_2}^{\rm eff}) $$ where $$ W_{t_2}^{\rm eff}:=\lim_{t_1\rightarrow -\infty} WRG[t_2-t_1](W_{t_1}^{\rm start})\ . $$ The definiteness of the continuum QFT can also be rephrased as the existence of the potentials $W_{t}^{\rm eff}$. A common source of confusion is the failure to see that while $(W_{t}^{\rm eff})_{t\in\mathbb{R}}$ is (by definition, the semigroup property and continuity) a trajectory of $WRG$, the family of bare potentials $(W_{t}^{\rm bare})_{t\in\mathbb{R}}$ is not. The same statement is true, by undoing the "moving frame change of coordinates", when replacing $W$'s with $V$'s and $WRG$ with $RG$.
For concreteness, we need coordinates on the space where the RG acts. Assume the bare potential $V_t^{\rm bare}$ is determined by a collection of coordinates or couplings $(g_i)_{i\in I}$ via $$ V_{t}^{\rm bare}(\phi)=\sum_{i\in I} g_i^{\rm bare}(t)\ \int \mathcal{O}_i(x)\ dx $$ for local operators of the form $$ \mathcal{O}_i(x)= :\partial^{\alpha_1}\phi(x)\cdots \partial^{\alpha_k}\phi(x):_t\ . $$ The Wick/normal ordering is with respect to the free cutoff measure $\mu_{t,\infty}$. More precisely, for every functional $F$, $$ :F(\phi):_t\ \ :=\exp\left[-\frac{1}{2} \int dxdy\ \frac{\delta}{\delta\phi(x)}\ C_{t,\infty}(x,y)\ \frac{\delta}{\delta\phi(y)} \right]\ F(\phi) $$ where we denoted the propagator by $C_{t,\infty}(x,y):=\langle\phi(x)\phi(y)\rangle_{t,\infty}$. Note that changing $-\frac{1}{2}$ to $+\frac{1}{2}$ followed by setting $\phi=0$ is integration with respect to $\mu_{t,\infty}$. For instance $:\phi(x)^2:_t=\phi(x)^2-C_{t,\infty}(x,x)$ and $:\phi(x)^4:_t=\phi(x)^4-6C_{t,\infty}(x,x)\phi(x)^2+3C_{t,\infty}(x,x)^2$. An easy change of variables $y=e^{-t}x$ shows that $$ (S_{-t}V_{t}^{\rm bare})(\phi)=\sum_{i\in I} g_i^{\rm start}(t) \int :\partial^{\alpha_1}\phi(y)\cdots \partial^{\alpha_k}\phi(y):_0\ dy $$ where $g_i^{\rm start}(t):=e^{(D-\Delta_i)t}\ g_i^{\rm bare}(t)$ and I used the notation $\Delta_i=k\Delta+|\alpha_1|+\cdots+|\alpha_k|$ for the scaling dimension of the local operator $\mathcal{O}_i$. The switch $g_i^{\rm bare}\rightarrow g_i^{\rm start}$ corresponds to that from dimensionful to dimensionless coupling constants. The indexing set splits as $I=I_{\rm rel}\cup I_{\rm mar}\cup I_{\rm irr}$, respectively corresponding to the three possibilities for operators: $D-\Delta_i>0$ or relevant, $D-\Delta_i=0$ or marginal, $D-\Delta_i<0$ or irrelevant.
$W=0$ is a fixed point of the autonomous dynamical system $WRG$. The behavior near this (trivial/Gaussian/free) fixed point is governed by the linearization or differential at $W=0$, i.e., the maps $\mathcal{D}WRG[t]$ given by $$ [\mathcal{D}WRG[t](W)](\phi):=\int W(S_t\phi+\zeta)\ d\mu_{0,t}(\zeta) $$ as follows from the definition $$ [WRG[t](W)](\phi)=-\log \int e^{-W(S_t\phi+\zeta)}\ d\mu_{0,t}(\zeta) $$ and the crude approximations $e^z\simeq 1+z$ and $\log(1+z)\simeq z$. If $W$ has coordinates $(g_i)_{i\in I}$ (with $:\bullet :_0$ Wick ordering), then one can show (good not so trivial exercise) that $\mathcal{D}WRG[t](W)$ has coordinates given exactly by $(e^{(D-\Delta_i)t}g_i)_{i\in I}$, in the same frame, i.e., with the same $t=0$ Wick ordering. If instead of flows one prefers talking in terms of the vector field $\mathcal{V}$ generating the dynamics, then a trajectory $(W_t)_{t\in\mathbb{R}}$ of $WRG$ satisfies $\frac{dW_t}{dt}=\mathcal{V}(W_t)$ with $\mathcal{V}:=\left.\frac{d}{dt} WRG[t]\right|_{t=0}$ admitting a linear plus nonlinear splitting $\mathcal{V}=\mathcal{D}+\mathcal{N}$. The linear part, in coordinates, is $$ \mathcal{D}(g_i)_{i\in I}=((D-\Delta_i) g_i)_{i\in I}\ . $$ Assume the existence of $W_{\rm UV}:=\lim_{t\rightarrow -\infty} W_{t}^{\rm eff}$, the UV fixed point, and $W_{\rm IR}:=\lim_{t\rightarrow \infty} W_{t}^{\rm eff}$, the infrared fixed point (they have to be fixed points by continuity). The discussion of perturbative renormalizability always refers to the situation where $W_{\rm UV}=0$ corresponding to continuum QFTs obtained as perturbations of the free CFT $\mu_{-\infty,\infty}$. By definition, the QFT or the trajectory $(W_t)_{t\in\mathbb{R}}$ of its "unit lattice"-rescaled effective theories lies on the unstable manifold $\mathcal{W}^{\rm u}$ of the $W=0$ fixed point. In what follows I will assume for simplicity there are no marginal operators so the fixed point is hyperbolic and there are no subtleties due to center manifolds. The tangent space $T\mathcal{W}^{\rm u}$ is then spanned by functionals $\phi\longmapsto \int \mathcal{O}_i$, for $i$ in $I_{\rm rel}$ which is typically finite.
Note that, in principle, knowing a QFT is the same as knowing a trajectory $(W_t^{\rm eff})_{t\in\mathbb{R}}$ and thus the same as knowing just one point of that trajectory say $W_0^{\rm eff}$ (if the $t=0$ IVP is well-posed forwards and backwards in time, which is another delicate issue as explained in Arnold's answer). The point $W_0^{\rm eff}$ can be made to sweep the unstable manifold which can be identified with the space of continuum QFTs obtained by perturbing the $W=0$ fixed point. On the other hand our control parameter is the choice of cut-off dependent starting points $(W_t^{\rm start})_{t\in\mathbb{R}}$. These belong to the bare surface $T\mathcal{W}^{\rm u}$. This is why when considering say the $\phi^4$ model only a small finite number of terms are put in the bare Lagrangian, otherwise we would be talking about some other (family of) model(s) like $\phi^6$, $\phi^8$, etc. So after all these explanations, it should be clear that renormalization in Wilson's framework can be seen as a parametrization of the nonlinear variety $\mathcal{W}^{\rm u}$ by the linear subspace $T\mathcal{W}^{\rm u}$. If we denote the stable manifold by $\mathcal{W}^{\rm s}$ and its tangent space by $T\mathcal{W}^{\rm s}$ then, assuming hyperbolicity of the trivial fixed point, the full space where the RG acts should be $T\mathcal{W}^{\rm u}\oplus T\mathcal{W}^{\rm s}$. The stable manifold theorem gives a representation of $\mathcal{W}^{\rm u}$ as the graph of a map from $T\mathcal{W}^{\rm u}$ into $T\mathcal{W}^{\rm s}$.
The main problem is to find $(W_t^{\rm start})_{t\in\mathbb{R}}$ so that the limit $W_0^{\rm eff}=\lim_{t\rightarrow -\infty} WRG[-t](W_t^{\rm start})$ exists. The stable manifold theorem is the $t=-\infty$ case of a mixed boundary problem where on a trajectory one imposes conditions (on coordinates) of the form $g_i^{\rm start}(t)=0$, $i\in I_{\rm irr}$, and $g_i^{\rm eff}(0)=\lambda_{i}^{\rm R}$, $i\in I_{\rm rel}$. Irwin's proof is a nice way to solve this and it works even if the RG is not reversible. This method can be applied for finite negative $t$, and this should produce a collection $(W_t^{\rm })_{t<0}$ (all that is needed in fact) dependent on the renormalized couplings $\lambda_{i}^{\rm R}$. Let us assume for instance that $I_{\rm rel}=\{1,2\}$ and $I_{\rm irr}=\{3,4,\ldots\}$. Consider the map $P_t$ given by $$ (\lambda_{1}^{\rm B},\lambda_{2}^{\rm B})\longmapsto (g_i\{WRG[-t](\lambda_{1}^{\rm B}, \lambda_{2}^{\rm B},0,0,\ldots)\})_{i=1,2} $$ where $g_i\{W\}$ denotes the $i$-th coordinate of $W$. A possible choice of starting points is thus $$ W_t^{\rm start}:=(P_t^{-1}(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R}),0,0,\ldots)\ . $$
The above is more like a road map for what needs to be done but it does not quite provide a recipe for doing it. In the perturbative setting, one trades numbers in $\mathbb{R}$ for formal power series in $\mathbb{R}[[\hbar]]$. The propagators of the $\mu$ measures get multiplied by $\hbar$ and there is now $\frac{1}{\hbar}$ in front of the $V$'s or $W$'s in the exponential. All the couplings $g_i$ now also become elements of $\mathbb{R}[[\hbar]]$. The invertibility of $P_t$ in this setting is easy and follows by analogues of the implicit/inverse function theorem for formal power series (e.g. in Bourbaki, Algebra II, Chapters 4-7, Berlin, Springer-Verlag, 1990).
All the work is in showing that for $i\ge 3$, the quantities
$$
f_i(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R}):=\lim_{t\rightarrow -\infty}
g_i\{WRG[-t](P_t^{-1}(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R}),0,0,\ldots)\}
$$
converge to finite values. This gives the wanted parametrization $(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R})
\mapsto(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R},f_3(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R}),f_4(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R}),\ldots)$ of $\mathcal{W}^{\rm u}$ by $T\mathcal{W}^{\rm u}$.
There are two ways of showing the above convergence statement. Underlying both ways is the fact (see Bourbaki above) the formal power series $P_t^{-1}(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R})\in \mathbb{R}[[\hbar]]^2$ exist and are unique.
Fans of combinatorics would prefer a two-step procedure consisting in 1) finding an explicit formula for $WRG[-t](P_t^{-1}(\lambda_{1}^{\rm R},\lambda_{2}^{\rm R}),0,0,\ldots)$ for finite $t$; then 2), with this formula in hand, analyze the limit $t\rightarrow -\infty$. The explicit formula in 1) is Zimmermann's forest formula. See this article by Hairer for a recent take on the delicate analytical estimates needed for step 2).
For those who abhor combinatorics, there is another method which avoids explicit formulas. Change the scale $0$ in the mixed boundary problem to an arbitrary scale $s>t$. Namely, impose $g_i(t)=0$ for $i\ge 3$ and $g_i(s)=\lambda_i^{\rm R}$ for $i=1,2$ and study the variation of $s$ from $s=t$ to $s=0$ by ODE techniques. This is the Wilson-Polchinski approach. The best rigorous account that I know for this second approach is in the book "Renormalization: An Introduction" by Salmhofer.
Finally, one could ask what would happen if one used $W_{s}^{\rm eff}$, for some fixed $s\neq 0$, to parametrize the QFTs instead of $W_{0}^{\rm eff}$. The answer is obtained by noticing that the maps $W_s^{\rm eff}\mapsto {\rm QFT}$ intertwine the action of $WRG$ on $\mathcal{W}^{\rm u}$ and that of the scaling maps $S_t$ on QFTs (simply rescale correlations, i.e., do $\phi\rightarrow S_t\phi$ inside correlations). This is the relation to the old Stueckelberg-Peterman-Gell-Mann-Low RG (i.e., change of scale can be absorbed in a change of renormalized coupling constants). In other words, the restriction of the nonreversible $WRG$ to the finite dimensional manifold $\mathcal{W}^{\rm u}$ should be reversible since $S_t$'s (on collections of correlations) are, or because of the remark I made about Irwin's proof even for noninvertible (discrete) dynamical systems.
There are two kinds of renormalization groups. Lots of pointers to the literature are given here.
The most common renormalization group definition is in the spirit of Kadanoff and Wilson. But this ''group'' is in spite of the name only a semigroup: The renormalization is not invertible, and in general one cannot run the equations backward. Thus being able to continue backwards (in this case this means to arbitrarily high energies) is a very stringent additional requirement.
This is already the rule for simpler systems, such as parabolic partial differential equations. For example, the initial value problem for the heat equation is well-posed, while that for the reverse heat equation is not. Most IVPs have no solution at all, and when there is a solution it is infinitely sensitive to changes in the initial conditions - arbitrarily small changes can be found with arbitrarily large consequences after arbitrarily tiny times. Thus nothing at all can be concluded from the initial conditions unless they are exact to infinitely many digits.
The other renormalization group definition is in the spirit of Bogoliubov & Stückelberg and is a true group.