Question about the Dirac notation for partial trace
I believe that an example will help clarify your confusion about the notation (as examples usually do). Consider a system of two qubits, $A$ and $B$, with Hilbert spaces $V_A$ and $V_B$ spanned by the orthonormal eigenbases of $\sigma_z$: $|0\rangle_A$ and $|1\rangle_A$ for $A$, and $|0\rangle_B$ and $|1\rangle_B$ for $B$. Now suppose that we have a Bell state, $$|\Psi\rangle_{AB} = \frac{1}{\sqrt{2}} (|0\rangle_A \otimes |0\rangle_B + |1\rangle_A \otimes |1\rangle_B).$$ This state corresponds to the density matrix $$\begin{aligned}\rho_{AB}&=|\Psi\rangle_{AB}\langle\Psi|_{AB}\\&=\frac{1}{2}\big(|0\rangle_A \otimes |0\rangle_B \langle0|_A \otimes \langle0|_B+|0\rangle_A \otimes |0\rangle_B\langle1|_A \otimes \langle1|_B\\&\qquad+|1\rangle_A \otimes |1\rangle_B\langle0|_A \otimes \langle0|_B+|1\rangle_A \otimes |1\rangle_B\langle1|_A \otimes \langle1|_B\big).\end{aligned}$$ Now suppose that we wish to get the reduced density matrix for system $A$. We use your definition for the partial trace over system $B$ with $|e_1 \rangle=|0\rangle_B$ and $|e_2 \rangle=|1\rangle_B$, together with the fact that $\langle \phi' |_B(|\psi \rangle_A \otimes |\phi\rangle_B)=(\langle \phi' |_B|\phi\rangle_B)|\psi \rangle_A $ (which is just the inner product of $|\phi'\rangle_B$ and $|\phi\rangle_B$, a number, times $|\psi \rangle_A$) as well as orthonormality, to get $$\rho_{A}=\frac{1}{2}(|0\rangle_A\langle0|_A+|1\rangle_A\langle1|_A), $$ a maximally (completely) mixed state.
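To spell out the step that uses orthonormality, here is how two representative terms behave under the partial trace. This assumes the definition in question has the form $\mathrm{Tr}_B\,\rho_{AB}=\sum_j \langle e_j|_B\,\rho_{AB}\,|e_j\rangle_B$ (an assumption, since the definition is not restated here). A diagonal term survives, $$\sum_{j}\langle e_j|_B\Big(|0\rangle_A \otimes |0\rangle_B \langle0|_A \otimes \langle0|_B\Big)|e_j\rangle_B = \sum_j \langle e_j|0\rangle_B\,\langle 0|e_j\rangle_B\; |0\rangle_A\langle 0|_A = |0\rangle_A\langle0|_A,$$ while a cross term vanishes, $$\sum_{j}\langle e_j|_B\Big(|0\rangle_A \otimes |0\rangle_B \langle1|_A \otimes \langle1|_B\Big)|e_j\rangle_B = \sum_j \langle e_j|0\rangle_B\,\langle 1|e_j\rangle_B\; |0\rangle_A\langle 1|_A = \langle 1|0\rangle_B\,|0\rangle_A\langle 1|_A = 0.$$ The other two terms work the same way, which leaves only $|0\rangle_A\langle0|_A$ and $|1\rangle_A\langle1|_A$, each with prefactor $\frac{1}{2}$.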
Incidentally, this appearance of a maximally mixed state is the reason there is no FTL signalling in Bell experiments: the state $\rho_A$ represents complete ignorance about what is going on with $B$ if we only study $A$ locally.
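As a rough numerical check of both claims, the following NumPy sketch builds the Bell state, traces out $B$, and then repeats the calculation after a local unitary has been applied to $B$. The helper name `trace_out_B`, the rotation angle `theta`, and the choice of a real rotation as the local operation are all illustrative assumptions, not part of the argument above.

```python
import numpy as np

# Computational basis and the Bell state (|00> + |11>)/sqrt(2), ordered as A (x) B.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
psi = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)
rho_AB = np.outer(psi, psi.conj())

def trace_out_B(rho, dim_A=2, dim_B=2):
    """Partial trace over the second tensor factor (hypothetical helper)."""
    # Split row and column indices into (A, B) pairs, then sum over equal B indices.
    rho4 = rho.reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum("ajbj->ab", rho4)

rho_A = trace_out_B(rho_AB)

# Illustrative local operation on B only: a rotation by an arbitrary angle.
theta = 0.7
U_B = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
U = np.kron(np.eye(2), U_B)  # identity on A, U_B on B
rho_A_after = trace_out_B(U @ rho_AB @ U.conj().T)

print(np.round(rho_A, 12))
print(np.allclose(rho_A, rho_A_after))
```

The reduced state is unchanged by the operation on $B$. Note that this invariance under local unitaries on $B$ holds for any joint state, not only for the Bell state; the Bell state is simply the case discussed here.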
A basis for the Hilbert space $A \otimes B$ could be written :
$$|e_{i_1 i_2}\rangle = |e_{i_1}\rangle \otimes |e_{i_2}\rangle \tag{1}$$
We may use the notation $I$ for a composite index, $ I = (i_1 i_2)$, so that $|e_{I}\rangle = |e_{i_1}\rangle \otimes |e_{i_2}\rangle $
The density matrix could be written :
$$\rho_{I'I} = \rho_{ \large (i'_1 i'_2)(i_1 i_2)} \tag{2}$$
For instance, the action of the density matrix on a state $|\psi \rangle = \sum_{i_1,i_2} \psi_{i_1 i_2} |e_{i_1 i_2}\rangle$ is $|\psi' \rangle = \rho |\psi \rangle$, that is ${\psi'}_{I'} = \sum_{I} \rho_{I'I} \psi_I$, or, in a more detailed way :
$${\psi'}_{i'_1 i'_2} = \sum_{i_1,i_2} \rho_{ \large (i'_1 i'_2)(i_1 i_2)} \psi_{i_1 i_2} \tag{3}$$
The partial trace over system $2$, denoted $\rho_1$, is defined by :
$$(\rho_1)_{i_1' i_1} = \sum_{i_2}\rho_{ \large (i'_1 i_2)(i_1 i_2)} \tag{4}$$
Now, we may write $\rho_{I'I} = \langle e_{I'}|\rho|e_{I}\rangle$, that is : $$\rho_{ \large (i'_1 i'_2)(i_1 i_2)} = \langle e_{i'_1 i'_2}|\rho |e_{i_1 i_2}\rangle = ( \langle e_{i'_1}| \otimes \langle e_{i'_2}|) ~~\rho ~~(|e_{i_1}\rangle \otimes |e_{i_2}\rangle) \tag{5}$$
So, finally, we may write :
$$(\rho_1)_{i_1' i_1} = \sum_{i_2} \langle e_{i'_1 i_2}|\rho |e_{i_1 i_2}\rangle= \sum_{i_2} ( \langle e_{i'_1}| \otimes \langle e_{i_2}|) ~~\rho ~~(|e_{i_1}\rangle \otimes |e_{i_2}\rangle)\tag{6}$$
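In components, equation $(4)$ amounts to reshaping the matrix so that each composite index is split into its two parts, and then summing over the repeated index of system $2$. Here is a minimal NumPy sketch; the function name `partial_trace_over_2` is hypothetical, and it assumes the composite index is ordered as $I = i_1 d_2 + i_2$ (the convention used by `np.kron`), which the text above does not fix.

```python
import numpy as np

def partial_trace_over_2(rho, d1, d2):
    """Return rho_1 with (rho_1)[i1p, i1] = sum over i2 of rho[(i1p, i2), (i1, i2)]."""
    # Split each composite index I = (i1 i2) into its two components.
    rho4 = rho.reshape(d1, d2, d1, d2)
    # Use the same index for system 2 on both sides and sum over it, as in eq. (4).
    return np.einsum("ajbj->ab", rho4)

# Sanity check on a product state: rho = rho_1 (x) rho_2 with Tr(rho_2) = 1
# should reduce back to rho_1.
rho_1 = np.array([[0.75, 0.25],
                  [0.25, 0.25]])
rho_2 = np.diag([0.2, 0.3, 0.5])
rho = np.kron(rho_1, rho_2)

reduced = partial_trace_over_2(rho, 2, 3)
print(np.allclose(reduced, rho_1))
```

The `einsum` string `"ajbj->ab"` mirrors the index structure $\rho_{(i'_1 i_2)(i_1 i_2)}$ directly: `a` and `b` play the roles of $i'_1$ and $i_1$, and the repeated `j` is the summed index $i_2$.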
This is indeed slightly confusing.
It is probably a little bit easier to understand starting with the bra. If $\langle \varphi|$ is a bra on the $A$ side of an $\mathcal H_A\otimes \mathcal H_B$ tensor product, then it is a linear transformation from $\mathcal H_A$ into $\mathbb C$: $$\langle \varphi|:\mathcal H_A\rightarrow \mathbb C.$$ As with operators $\hat V_A:\mathcal H_A\rightarrow \mathcal H_A$, when you want to talk about its action on the tensor product $\mathcal H_A\otimes \mathcal H_B$, you should always tensor it with an identity on the $B$ side. Thus, on the composite system, the usual convention is that $$\langle \varphi|\text{ is shorthand for }\langle\varphi|\otimes \mathbb I_B:\mathcal H_A\otimes\mathcal H_B\rightarrow \mathbb C\otimes\mathcal H_B=\mathcal H_B,$$ in the same way that $\hat V_A$ is shorthand for $\hat V_A \otimes \mathbb I_B:\mathcal H_A\otimes\mathcal H_B\rightarrow \mathcal H_A\otimes\mathcal H_B$.
In an analogous way, vectors in a single tensor factor must also be tensored with an identity on the rest (unless, of course, they're part of a proper tensor product with vectors on all factors, like $$|\varphi_A\rangle\otimes|\varphi_B\rangle\in \mathcal H_A\otimes\mathcal H_B).$$ Thus, when doing a partial trace, the ket $|\varphi\rangle$ on the right is shorthand for $|\varphi\rangle\otimes \mathbb I_B$. It is slightly unnatural to see this as an operator, but you can view it as a map from $\mathcal H_B=\mathbb C\otimes \mathcal H_B$ into $\mathcal H_A\otimes\mathcal H_B$ that sends $|\chi\rangle$ to $|\varphi\rangle\otimes|\chi\rangle$.
Thus, for a density matrix $\rho_{AB}:\mathcal H_A\otimes\mathcal H_B\rightarrow\mathcal H_A \otimes \mathcal H_B$, the partial trace over $A$ is a sum of terms of the form $$ \rho_B\sim \langle\varphi|\rho_{AB}|\varphi\rangle \text{ which is shorthand for } (\langle\varphi|\otimes\mathbb I_B)\rho_{AB}(|\varphi\rangle\otimes\mathbb I_B) ,$$ and is an operator on $\mathcal H_B$.
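The shorthand can be made concrete with Kronecker products. The sketch below builds $|e_i\rangle\otimes\mathbb I_B$ as a rectangular matrix, takes its adjoint for $\langle e_i|\otimes\mathbb I_B$, and sums the sandwiched terms over an orthonormal basis of $\mathcal H_A$. The function name `partial_trace_over_A` and the choice of the standard basis are assumptions made for illustration.

```python
import numpy as np

def partial_trace_over_A(rho_AB, dim_A, dim_B):
    """Sum of (<e_i| (x) I_B) rho_AB (|e_i> (x) I_B) over the standard basis of H_A."""
    identity_B = np.eye(dim_B)
    rho_B = np.zeros((dim_B, dim_B), dtype=complex)
    for i in range(dim_A):
        # Column vector |e_i> in H_A.
        e_i = np.zeros((dim_A, 1))
        e_i[i, 0] = 1.0
        # |e_i> (x) I_B maps H_B into H_A (x) H_B: shape (dim_A*dim_B, dim_B).
        ket_op = np.kron(e_i, identity_B)
        # <e_i| (x) I_B is its adjoint and maps H_A (x) H_B back into H_B.
        bra_op = ket_op.conj().T
        rho_B += bra_op @ rho_AB @ ket_op
    return rho_B

# Sanity check on a product state: tracing out A from rho_A (x) rho_B
# should return rho_B when Tr(rho_A) = 1.
rho_A_in = np.diag([0.3, 0.7])
rho_B_in = np.array([[0.6, 0.2],
                     [0.2, 0.4]])
rho_B_out = partial_trace_over_A(np.kron(rho_A_in, rho_B_in), 2, 2)
print(np.allclose(rho_B_out, rho_B_in))
```

Each term in the loop is a `dim_B` by `dim_B` matrix, which is the numerical counterpart of the statement that $\langle\varphi|\rho_{AB}|\varphi\rangle$ is an operator on $\mathcal H_B$.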