BAC-CAB rule used on del operators
To the extent that the BAC-CAB rule makes sense for $\nabla\times(\nabla \times\vec A)$, it does hold. In particular, if you look at it in component notation, i.e. if you compare the BAC-CAB rule $$ \left[ \vec a \times(\vec b\times \vec c)\right ]_i = b_i \,c_j a_j - c_i b_j a_j $$ with the double curl, $$ \left[ \nabla\times(\nabla \times\vec A) \right]_i = \partial_i(\partial_j A_j) - \partial_j \partial_j A_i, $$ where I've used the Einstein summation convention and abbreviated partial derivatives as $\partial_i = \frac{\partial}{\partial x_i}$, then you get exactly the same structure.
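As a quick sanity check of the algebraic BAC-CAB rule itself, here is a minimal sketch using NumPy; the random vectors (and the seed) are my own arbitrary choices:

```python
import numpy as np

# Three arbitrary vectors in R^3 (seeded for reproducibility).
rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 3))

# a x (b x c) should equal b (a.c) - c (a.b).
lhs = np.cross(a, np.cross(b, c))
rhs = b * np.dot(a, c) - c * np.dot(a, b)
print(np.allclose(lhs, rhs))  # True
```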
The only difference now is that you need to be very careful with how you place things: where before you could place the real numbers $a_i$, $b_j$ and $c_k$ in any order you wanted, you now need to have all the $\partial_i$'s before the $A_j$'s for the expression to make sense. Once you do, though, everything works out as expected.
On the other hand, it's important to note that the $\nabla$ operator is still an operator, and that vector-calculus identities should never be assumed to hold without specific confirmation. (As an example, the BAC-CAB rule definitely doesn't hold for $\nabla \times (\vec A \times \vec B)$; instead, a version of it does hold, but you need to include two copies, one with the derivatives acting on $\vec A$ and one with the derivatives acting on $\vec B$.)
More generally, it really is easiest to work in component notation. In this case, for example, I would argue that the "true" (or, at least, the most useful) instantiation of the BAC-CAB rule is the (pretty trivial*) identity $$ \epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} $$ between the Levi-Civita symbol and the Kronecker delta. Why? Because the Levi-Civita symbol is the core of both cross products, $$ \left[ \vec a \times \vec b \right]_i = \epsilon_{ijk} a_j b_k, $$ and curls, $$ \left[ \nabla \times \vec A \right]_i = \epsilon_{ijk} \partial_j A_k, $$ so the left-hand side of $\epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$ gives you a concatenation of two cross-producty objects, while the right-hand side has two contractions: one of which becomes an inner product and the other of which transfers one internal index to the externally-queried $i$. Moreover, this makes it extremely simple to work through the tensor algebra, regardless of whether there are simple vectors at play or a more complicated calculus identity: thus, for the double curl you get \begin{align} \left[ \nabla\times(\nabla \times\vec A) \right]_i & = \epsilon_{ijk} \partial_j \left[ \nabla \times\vec A \right]_k \\ & = \epsilon_{ijk}\epsilon_{klm} \partial_j \partial_l A_m \\ & = (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}) \partial_j \partial_l A_m \\ & = \partial_j\partial_i A_j - \partial_j \partial_j A_i \\ & = \partial_i(\partial_j A_j) - \partial_j \partial_j A_i \\ & = \left[ \nabla (\nabla \cdot \vec A) - \nabla^2 \vec A \right]_i \end{align} as usual, but for the curl of a cross product it allows you to do much the same while still making it clear, at the stage \begin{align} \left[ \nabla\times(\vec A \times\vec B) \right]_i & = \epsilon_{ijk} \partial_j \left[ \vec A \times\vec B \right]_k \\ & = \epsilon_{ijk}\epsilon_{klm} \partial_j \left[ A_l B_m \right] \end{align} that the partial derivative needs to act in a
product-rule kind of way to give $$ \partial_j \left[ A_l B_m \right] = B_m \, \partial_j A_l + A_l \, \partial_j B_m, $$ an expansion that can then be fed into the rest of the calculation.
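If you want to confirm the double-curl identity $\nabla\times(\nabla\times\vec A) = \nabla(\nabla\cdot\vec A) - \nabla^2\vec A$ numerically, here is a rough sketch using central finite differences; the sample field $\vec A$, the evaluation point, and the step size `h` are all my own arbitrary choices:

```python
import numpy as np

# Arbitrary smooth sample field A(x, y, z).
A = lambda p: np.array([p[0]*p[1]**2, np.sin(p[2])*p[0], p[1]*p[2]**3])

h = 1e-3
def partial(f, i, p):
    # central difference for the i-th partial derivative of f at p
    e = np.zeros(3); e[i] = h
    return (f(p + e) - f(p - e)) / (2 * h)

def curl(F):
    def C(p):
        d = lambda c, i: partial(lambda q: F(q)[c], i, p)
        return np.array([d(2, 1) - d(1, 2),
                         d(0, 2) - d(2, 0),
                         d(1, 0) - d(0, 1)])
    return C

def div(F, p):
    return sum(partial(lambda q: F(q)[i], i, p) for i in range(3))

def vec_laplacian(F, p):
    # componentwise Laplacian via second central differences
    out = np.zeros(3)
    for c in range(3):
        f = lambda q, c=c: F(q)[c]
        for i in range(3):
            e = np.zeros(3); e[i] = h
            out[c] += (f(p + e) - 2*f(p) + f(p - e)) / h**2
    return out

p = np.array([0.3, -0.7, 0.5])
lhs = curl(curl(A))(p)
rhs = np.array([partial(lambda q: div(A, q), i, p) for i in range(3)]) \
      - vec_laplacian(A, p)
print(np.max(np.abs(lhs - rhs)))  # small, limited by finite-difference error
```

The two sides agree up to the truncation error of the nested differences, which is enough to catch any sign or index mistake in the derivation.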
Hopefully that will help in conceptualizing the result and placing it inside a more coherent context.
* Why trivial? Because the combinations that contribute to the Levi-Civita symbol with nonzero weight are very restricted. Basically, for $\epsilon_{ijk}$ to count at all, you need $i\neq j$, and that fixes $k$ to be whatever $i$ and $j$ are not, i.e. if $i=1$ and $j=2$ then the only $k$ that survives the summation is $k=3$. Then, to do the product, you first use the cyclic property on the second term, $\epsilon_{ijk}\epsilon_{klm} = \epsilon_{ijk}\epsilon_{lmk}$, and since $k$ is now fixed, the set $\{l,m\}$ needs to be the same as $\{i,j\}$, and this can only happen (a) if $i=l$ and $j=m$, in which case $\epsilon_{ijk}$ and $\epsilon_{klm}$ have matching signs, or (b) if the pairs $(i,j)$ and $(l,m)$ are swapped, in which case you accrue a minus sign. That train of thought is enough to reconstruct the full form of the symbol identity simply by matching indices on $\epsilon_{ijk}\epsilon_{lmk}$.
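That counting argument is small enough to check exhaustively. The following sketch brute-forces the $\epsilon$-$\delta$ identity over all $3^4$ external index combinations (indices running over $0,1,2$; the closed-form expression for $\epsilon_{ijk}$ is a standard trick, not something from the answer above):

```python
# epsilon(i, j, k): sign of the permutation (i, j, k) of (0, 1, 2),
# zero whenever an index repeats.
def eps(i, j, k):
    return (j - i) * (k - i) * (k - j) // 2

def delta(a, b):
    return int(a == b)

# Check eps_ijk eps_klm == delta_il delta_jm - delta_im delta_jl
# for every choice of the four free indices.
ok = all(
    sum(eps(i, j, k) * eps(k, l, m) for k in range(3))
    == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
    for i in range(3) for j in range(3)
    for l in range(3) for m in range(3)
)
print(ok)  # True
```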
It is safe as long as you are careful to track which terms each $\nabla$ acts on. In practice it is often better to work in components when deriving such identities, since then you can easily keep track manually. If you insist on doing it in vector form, it can be useful to annotate with arrows which factors the $\nabla$ acts on, and then work out the terms where the operator acts on several factors using the product rule.
For example: \begin{align*} \nabla \times (\vec a \times \vec b) &= \overset{\downarrow}{\vec a}(\nabla \cdot \overset{\downarrow}{\vec b}) - \overset{\downarrow}{\vec b} (\nabla \cdot \overset{\downarrow}{\vec a}) \\ &= (\vec b \cdot \nabla) \overset{\downarrow}{\vec a} + \vec a (\nabla \cdot \overset{\downarrow}{\vec b}) - (\vec a \cdot \nabla) \overset{\downarrow}{\vec b} - \vec b(\nabla \cdot \overset{\downarrow}{\vec a}). \end{align*} In the final result we can simply omit the arrows, since each operator now acts on exactly the factors it would act on under the standard convention.
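The final four-term identity can also be checked numerically. Here is a sketch using central finite differences on two smooth sample fields; the fields, the evaluation point, and the step size `h` are my own arbitrary choices:

```python
import numpy as np

# Two arbitrary smooth vector fields a(p), b(p).
a = lambda p: np.array([p[0]*p[1], np.cos(p[2]), p[1]**2])
b = lambda p: np.array([np.sin(p[0]), p[2]*p[0], p[1] + p[2]**2])

h = 1e-4
def partial(f, i, p):
    # central difference for the i-th partial derivative (works componentwise)
    e = np.zeros(3); e[i] = h
    return (f(p + e) - f(p - e)) / (2 * h)

def curl(F, p):
    d = lambda c, i: partial(lambda q: F(q)[c], i, p)
    return np.array([d(2, 1) - d(1, 2),
                     d(0, 2) - d(2, 0),
                     d(1, 0) - d(0, 1)])

def div(F, p):
    return sum(partial(lambda q: F(q)[i], i, p) for i in range(3))

def dirderiv(v, F, p):
    # (v . grad) F at the point p
    return sum(v[j] * partial(F, j, p) for j in range(3))

p = np.array([0.4, -0.2, 0.7])
lhs = curl(lambda q: np.cross(a(q), b(q)), p)
rhs = (dirderiv(b(p), a, p) + a(p) * div(b, p)
       - dirderiv(a(p), b, p) - b(p) * div(a, p))
print(np.max(np.abs(lhs - rhs)))  # ~0 up to finite-difference error
```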
Expansion in reply to a comment:
"Acting upon" means that the differentiation within that operator is applied to that function. For example, in $\partial_x (ab)$ the operator acts on $a$ and $b$, while in $a\,\partial_x b$ the operator only acts on $b$ (that is, the derivative is only taken for occurrences of $x$ in $b$, not for those in $a$). The arrow notation helps when writing down terms where the operator does not (or not only) act on the factors to the right of it.
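The distinction is easy to see numerically. A minimal sketch with central differences (the sample functions $a(x)=x^2$, $b(x)=x^3$ and the evaluation point are my own choices): $\partial_x(ab)$ differentiates both factors and so matches the product rule, while $a\,\partial_x b$ alone does not reproduce it.

```python
# a and b are arbitrary smooth sample functions of x.
a = lambda x: x**2
b = lambda x: x**3

h, x0 = 1e-5, 1.3
d = lambda f: (f(x0 + h) - f(x0 - h)) / (2 * h)  # central difference at x0

full = d(lambda x: a(x) * b(x))                   # d/dx (a b): acts on both
partial_only = a(x0) * d(b)                       # a * db/dx: acts on b only
product_rule = a(x0) * d(b) + b(x0) * d(a)        # product-rule expansion

print(abs(full - product_rule) < 1e-6)  # True: the product rule holds
print(abs(full - partial_only) > 1)     # True: dropping b * da/dx is wrong
```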
In the original term $\nabla \times (\vec a \times \vec b)$, both $\vec a$ and $\vec b$ are factors to the right of the differential operator, so it acts on both of them (since this is the usual convention). In our arrow notation we can make this explicit: $\nabla \times (\overset{\downarrow}{\vec a} \times \overset{\downarrow}{\vec b})$. When we then use the vector algebra rules to manipulate the expression, we must not change the factors that our operator acts on. Vector manipulations and operator applications are in this sense independent; this can be motivated by looking at the calculation in components. If several terms appear during the vector manipulation, then in each term the operator still has to apply to all the functions marked with an arrow in the original term. After a vector algebra manipulation the factors are no longer necessarily ordered so as to show where the differentiation acts without the arrows, so we have to reorder them to restore the conventions (and use the product rule where necessary).