Diagonalizable subgroups of a connected linear algebraic group
No. For example, $\mathrm{PGL}_n$ (with $n \geq 2$ and the characteristic not dividing $n$) contains a subgroup $G$ isomorphic to the product of two cyclic groups of order $n$, generated by the classes of the diagonal matrix whose entries are the powers of a fixed primitive $n^{\rm th}$ root of 1, and the permutation matrix corresponding to a cycle of length $n$. The inverse image of this subgroup in $\mathrm{GL}_n$ is not commutative, since the two matrices commute only up to a nontrivial scalar. On the other hand, the inverse image of a maximal torus in $\mathrm{PGL}_n$ is a maximal torus in $\mathrm{GL}_n$, hence commutative, so $G$ is not contained in a torus.
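As a quick numerical sanity check of the example above, here is a minimal sketch over the complex numbers. The names `clock_and_shift`, `relation_holds` and `lifts_commute` are chosen here for illustration and are not from the answer. It checks that the two matrices $D$ and $P$ satisfy $DP = \zeta PD$, so they fail to commute in $\mathrm{GL}_n$ while their classes commute in $\mathrm{PGL}_n$, and that $D^n = P^n = 1$.

```python
import cmath


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def power(a, k):
    """Compute a**k by repeated multiplication (k >= 0)."""
    result = identity(len(a))
    for _ in range(k):
        result = matmul(result, a)
    return result


def close(a, b, tol=1e-9):
    """Entrywise comparison up to floating-point error."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def clock_and_shift(n):
    """Return (zeta, D, P) for the primitive n-th root of unity exp(2*pi*i/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)
    # D = diag(1, zeta, ..., zeta^(n-1))
    d = [[zeta ** i if i == j else 0 for j in range(n)] for i in range(n)]
    # P is the permutation matrix sending e_j to e_(j+1 mod n)
    p = [[1 if i == (j + 1) % n else 0 for j in range(n)] for i in range(n)]
    return zeta, d, p


def relation_holds(n):
    """Check D P = zeta P D and D^n = P^n = 1."""
    zeta, d, p = clock_and_shift(n)
    return (close(matmul(d, p), scale(zeta, matmul(p, d)))
            and close(power(d, n), identity(n))
            and close(power(p, n), identity(n)))


def lifts_commute(n):
    """Check whether D and P commute as elements of GL_n."""
    _, d, p = clock_and_shift(n)
    return close(matmul(d, p), matmul(p, d))


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, relation_holds(n), lifts_commute(n))
```

Floating-point arithmetic is used only for convenience; the relation itself is exact and holds over any field containing a primitive $n^{\rm th}$ root of 1.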
To reinforce Angelo's example, it's worthwhile to point out the broader setting for this kind of question: the study of centralizers and connectedness properties in a semisimple (or more generally reductive) algebraic group. An older but very useful source is part II of the extensive notes by T.A. Springer and R. Steinberg on conjugacy classes, part of an IAS seminar (Lect. Notes in Math. 131, Springer, 1970). A crucial question is whether a given connected semisimple group is simply connected or not; this shows up in the standard example, where the adjoint group $\mathrm{PGL}_n$ fails to be simply connected. Here you have the deep theorem: If $G$ is a connected, simply connected algebraic group over an algebraically closed field, then all centralizers of semisimple elements are connected. (It's elementary, on the other hand, to prove that all centralizers in a general linear group are connected.) The role of the characteristic of the field is also discussed in depth by Springer and Steinberg, as well as the role of "torsion primes" (treated more fully in Steinberg, Torsion in reductive groups, Advances in Math., 1975).
Some of the results are written up in later textbooks and in the first two chapters of my 1995 AMS book Conjugacy Classes in Semisimple Algebraic Groups (with the relevant example for the question here given in 1.12).
ADDED: To answer the added question, in any connected algebraic group it's true that an arbitrary semisimple element and hence the cyclic subgroup it generates lies in some maximal torus. This is part of the standard development of Borel-Chevalley structure theory (see for example Section 22.3 of my book Linear Algebraic Groups), though it does take a while to get that far into the theory.
Here is another example similar to Angelo's construction of a non-toral diagonalizable subgroup of a reductive group. I'll suppose that the characteristic is not 2. Let $G = SO(V) = SO(V,\beta)$ for $\dim V > 2$, and write $V$ as an orthogonal sum $V = U \perp W$ for $0 < \dim U < \dim V$ with $\dim U$ even, such that the restriction of $\beta$ to $U$ and $W$ is non-degenerate.
Let $t \in G$ act as the identity on $W$ and as $-1$ on $U$. Then the centralizer $M=C_G(t)$ identifies with the subgroup $\{(x,y) \in O(U) \times O(W) \mid \det(x) = \det(y)\}$. In particular, this centralizer is not connected: $M/M^0$ has order 2.
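One can evidently choose an involution $s \in M \setminus M^0$, and then $D = \langle t,s\rangle$ is a diagonalizable subgroup of $G$ which is contained in no maximal torus: any torus containing $t$ is connected and centralizes $t$, so it lies in $M^0$, which does not contain $s$.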
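To make the construction concrete, here is a small sketch for the smallest case $\dim V = 3$. It assumes the standard form $\beta = x_1^2 + x_2^2 + x_3^2$, with $U$ spanned by $e_1, e_2$ and $W$ by $e_3$; the particular choice $s = \mathrm{diag}(1,-1,-1)$ and all function names are illustrative, not taken from the answer. The code checks that $t$ and $s$ are commuting involutions in $SO(V)$, that $s$ has determinant $-1$ on both $U$ and $W$ (so $s \in M \setminus M^0$), and that $D$ has order 4. Since a maximal torus of $SO_3$ has rank 1 and so contains only one element of order 2, a group with three involutions cannot lie in one.

```python
def matmul(a, b):
    """Multiply two square matrices given as tuples of rows."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))


def transpose(a):
    return tuple(zip(*a))


def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det(tuple(row[:j] + row[j + 1:] for row in a[1:]))
               for j in range(len(a)))


def diag(*entries):
    n = len(entries)
    return tuple(tuple(entries[i] if i == j else 0 for j in range(n))
                 for i in range(n))


def block(a, idx):
    """Submatrix of a on the rows and columns listed in idx."""
    return tuple(tuple(a[i][j] for j in idx) for i in idx)


IDENTITY = diag(1, 1, 1)
T = diag(-1, -1, 1)   # acts as -1 on U = <e1, e2> and as 1 on W = <e3>
S = diag(1, -1, -1)   # illustrative choice with det -1 on both U and W


def in_so(m):
    """Membership in SO(V) for the standard form: m^T m = 1 and det m = 1."""
    return matmul(transpose(m), m) == IDENTITY and det(m) == 1


def generated_group(gens):
    """Close {1} under right multiplication by the (finite-order) generators."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in gens:
            gh = matmul(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group


D = generated_group([T, S])

if __name__ == "__main__":
    print(len(D), all(in_so(g) for g in D))
    print(det(block(S, [0, 1])), det(block(S, [2])))
```

The entries are integers, so the same matrices make sense over any field of characteristic not 2.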
Part of this construction can be made in char. 2. Instead of $t$, you have to take a non-smooth subgroup $\mu \simeq \mu_2$, essentially given by the action of a semisimple element $X \in \operatorname{Lie}(G)$ ($X$ should act as $1$ on $U$ and $0$ on $W$). Then $M=C_G(\mu) = C_G(X)$ is again disconnected (well, now you can't argue by determinants) with component group of order $2$. But this doesn't seem to lead to a non-toral diagonalizable subgroup (any finite order element representing the non-trivial coset of $M/M^0$ has a non-trivial unipotent part).