Product rule for scalar-vector product
There are two ways of looking at this that sort it out: we can use indices, or we can treat the derivatives as linear maps and work everything out explicitly.
- In index notation, we have functions $G_i = \phi(y) F_i(y)$. All of these are scalars, so the usual product rule for scalar functions applies: $$ \frac{\partial G_i}{\partial y_j} = \phi \frac{\partial F_i}{\partial y_j} + \frac{\partial \phi}{\partial y_j} F_i. $$ Since $\phi$ is a scalar, to write down the matrix corresponding to this, we can reverse the order to put the $j$ terms on the right, i.e. $$ (DG)_{ij} = \phi (DF)_{ij} + F_i (\nabla\phi)_j = (\phi DF + F \otimes \nabla \phi)_{ij}, $$ $\otimes$ being the dyadic product $(A \otimes B)_{ij} = A_i B_j$.
- Approaching the problem in a coordinate-free way, the derivative $DG(y)$ is a linear map $\mathbb{R}^p \to \mathbb{R}^s$ determined uniquely by $$ G(y+h) = G(y) + DG(y)(h) + o(\lVert h \rVert). $$ Written this way, it doesn't matter how $h$ is incorporated, provided that the expression ends up in $\mathbb{R}^s$. One can show that the product rule (or a clever use of the chain rule) in this formalism gives you $$ DG(y)(h) = [\phi(y)] DF(y)(h) + [D\phi(y)(h)] F(y), $$ where the terms in brackets are both scalars (and hence we can push them about to end up with all the $h$s on the right if we wish).
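As a quick sanity check (not part of the argument above), here is a small NumPy sketch that compares the directional form $DG(y)(h) = \phi(y)\,DF(y)(h) + [D\phi(y)(h)]\,F(y)$ against a finite-difference approximation of $DG(y)(h)$. The particular $\phi$ and $F$ (with $p = 2$, $s = 3$) are made-up examples chosen only so the Jacobians are easy to write by hand.

```python
import numpy as np

# Hypothetical example: phi(y) = sin(y0) * y1 and F(y) = (y0^2, y0*y1, y1^3),
# so G = phi * F maps R^2 -> R^3.
def phi(y):
    return np.sin(y[0]) * y[1]

def F(y):
    return np.array([y[0]**2, y[0]*y[1], y[1]**3])

def G(y):
    return phi(y) * F(y)

def grad_phi(y):
    # Hand-computed gradient of phi (a row vector, here stored as a 1-D array).
    return np.array([np.cos(y[0]) * y[1], np.sin(y[0])])

def DF(y):
    # Hand-computed 3x2 Jacobian of F.
    return np.array([[2*y[0], 0.0],
                     [y[1],   y[0]],
                     [0.0,    3*y[1]**2]])

y = np.array([0.7, -1.3])
h = np.array([0.2, 0.5])

# Product rule in directional form:
# DG(y)(h) = phi(y) * DF(y)(h) + (Dphi(y)(h)) * F(y).
exact = phi(y) * (DF(y) @ h) + (grad_phi(y) @ h) * F(y)

# Central finite-difference approximation of the same directional derivative.
eps = 1e-6
approx = (G(y + eps*h) - G(y - eps*h)) / (2 * eps)

print(np.max(np.abs(exact - approx)))  # small (finite-difference error only)
```

Note that the scalar $D\phi(y)(h) = \nabla\phi(y) \cdot h$ multiplies the *vector* $F(y)$, exactly as the bracketed-scalar formula says.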
The important thing to understand is that since the derivative is a linear map, it has to have an argument fed into it somewhere. If it splits into terms that are products of derivatives with parts of the original function, the argument must be fed into the derivatives, not into the other parts of the function. (This is a good reason to think about functions $\mathbb{R}^n \supset U \to \mathbb{R}^m$: the argument of the function is restricted to a subset, but the tangent vectors that you feed into the derivative are not.)
Let $F_1,\dots,F_s$ denote the components of $F$. Then $G_i = \phi F_i$, and so $$(DG)_{ij}=\frac{\partial G_i}{\partial y_j} = \frac{\partial \phi}{\partial y_j}F_i+\phi\frac{\partial F_i}{\partial y_j} = \frac{\partial \phi}{\partial y_j}F_i+\phi (DF)_{ij}.$$ The $s\times p$ matrix $A$ with entries $A_{ij}=\frac{\partial \phi}{\partial y_j}F_i$ is equal to $F\nabla\phi$, where $F$ is thought of as an $s$-dimensional column vector and $\nabla \phi$ as a $p$-dimensional row vector. Therefore, we have $$DG = D(\phi F) = F\nabla \phi+\phi DF.$$ Keep in mind that on the RHS, the first product is matrix multiplication (column times row), and the second is scalar multiplication of the matrix $DF$ by the scalar function $\phi$.
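The matrix identity $DG = F\nabla\phi + \phi\,DF$ can also be checked numerically. The sketch below (my own illustration, with made-up $\phi$ and $F$ for $p = 2$, $s = 3$) builds each Jacobian column by column with central finite differences and compares the Jacobian of $G$ against the product-rule expression.

```python
import numpy as np

# Hypothetical example with p = 2, s = 3, matching the shapes in the text.
phi = lambda y: np.exp(y[0] - y[1])
F = lambda y: np.array([y[0]*y[1], y[1]**2, np.cos(y[0])])
G = lambda y: phi(y) * F(y)

def jacobian(f, y, eps=1e-6):
    # Column j of Df(y) is the central difference of f along the j-th axis.
    cols = []
    for j in range(len(y)):
        e = np.zeros_like(y)
        e[j] = eps
        cols.append((f(y + e) - f(y - e)) / (2 * eps))
    return np.column_stack(cols)

y = np.array([0.4, 1.1])

grad_phi = jacobian(lambda t: np.array([phi(t)]), y)[0]  # 1 x p row vector
DF = jacobian(F, y)                                      # s x p matrix

# DG = F (nabla phi) + phi DF, with F a column and nabla phi a row:
# the column-times-row product is exactly np.outer.
product_rule = np.outer(F(y), grad_phi) + phi(y) * DF

print(np.max(np.abs(jacobian(G, y) - product_rule)))  # small
```

Here `np.outer(F(y), grad_phi)` is the rank-one $s\times p$ matrix $F\nabla\phi$, i.e. the dyadic product $F \otimes \nabla\phi$ from the index-notation derivation.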