'Leftover' voltage when using constant current
This circuit is particularly useful for a variety of reasons. So I wanted to do it a little more justice than just two sentences' worth.
OVERVIEW
In the following circuit, I've intentionally boxed up the circuit so that it appears to have three pins. There is a good reason for this choice, as will become clearer later on (if not already.)
[Schematic, created using CircuitLab: the current-sink circuit boxed up as a three-pin block (GND, SINK, ENA), with an LED load on the SINK pin.]
The circuit requires a GND pin, of course. This needs to be connected as the lowest voltage of the three pins and should be tied to a low-impedance node so that the current can sink to that node without moving the node voltage around. This is typically attached to the overall circuit ground reference node.
The SINK pin is controlled such that the current specified by \$R_\text{SET}\$ is sunk at this pin. A load may be attached between this pin and a positive voltage supply (relative to whatever the GND pin is tied to, anyway.) I show an LED here. But this could be other kinds of loads, as well.
The burden voltage for the SINK pin, relative to GND, is at least \$1\:\text{V}\$. (The circuit can often be designed to tolerate some saturation of \$Q_1\$, so that its \$V_\text{CE}\$ can be as little as \$300\:\text{mV}\$.) But for some designs this may be somewhat higher -- perhaps as high as around \$1.6\:\text{V}\$ in some more extreme cases with high sinking currents set by \$R_\text{SET}\$.
The ENA pin is usually connected to the same positive voltage supply (relative to GND) that the load uses. This "enables" the circuit. But it need not be tied directly to the positive supply, as I'll discuss towards the very end. It may instead be connected to an MCU and used to turn the circuit ON and OFF. For this reason, I show it as a separate pin and call it ENA to signal that this is an active-HI enable pin, of sorts. You can tie it HI (as shown above) or you can use an I/O pin to control it.
OPERATION
Let's first look at the circuit to see how it does what it does.
The main idea is to use the fact that a silicon BJT has a relatively stable \$V_\text{BE}\approx 700\:\text{mV}\$. (Assuming the temperature of \$Q_2\$ isn't moving around -- more on that, later.) All of the current sinking into the SINK pin, plus a little more from the base of \$Q_1\$ and less a little bit that goes into the base of \$Q_2\$, is driven through \$R_\text{SET}\$ and therefore develops a voltage across it. If \$Q_1\$'s collector were to try and increase the current it was sinking (Early Effect?) then this added collector current would cause the voltage across \$R_\text{SET}\$ to exceed the nominal \$V_\text{BE}\$ of \$Q_2\$. In response, \$Q_2\$'s collector will sink more current (its collector current is controlled by its base-emitter voltage.) This added \$Q_2\$ collector current has to come through \$R_\text{B}\$ and therefore must increase the voltage drop across it. This lowers the voltage at the base of \$Q_1\$, which therefore lowers its emitter voltage. In effect, the overall response of the circuit to an attempted increase in the current through the load (LED?) is to oppose that change and return the base voltage of \$Q_2\$ to the nominal \$V_\text{BE}\approx 700\:\text{mV}\$. Similar logic also applies should the collector current of \$Q_1\$ decline. The circuit response to a lowering in \$Q_1\$'s collector current is to raise the base voltage of \$Q_1\$ and therefore oppose that decline.
There is negative feedback in this system to oppose changes in the sinking current at the SINK pin. And the sinking current is nominally:
$$I_\text{SINK}=\frac{700\:\text{mV}}{R_\text{SET}}$$
This entire circuit works to control the voltage at the collector of \$Q_1\$ by observing the current sinking into the collector of \$Q_1\$. Other than the voltage across \$R_\text{SET}\$ and across the load (the LED in this case), all of the rest of the voltage difference between the positive supply and GND is taken up by the \$V_\text{CE}\$ of \$Q_1\$. But this detail is under management by the circuit so that the load's voltage is just what is necessary for the current set by \$R_\text{SET}\$.
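As a quick numeric check of the nominal relationship \$I_\text{SINK}=\frac{700\:\text{mV}}{R_\text{SET}}\$, here is a minimal Python sketch. The function name and the resistor values in the loop are only illustrative assumptions, and it ignores the small base-current corrections discussed in the design section.

```python
V_BE_NOMINAL = 0.700  # volts; nominal V_BE of Q2, as assumed in the text


def sink_current(r_set_ohms, v_be=V_BE_NOMINAL):
    """Nominal sink current in amps: I_SINK = V_BE / R_SET."""
    if r_set_ohms <= 0:
        raise ValueError("R_SET must be positive")
    return v_be / r_set_ohms


# Illustrative resistor values only (not taken from the text).
for r_set in (35.0, 70.0, 350.0):
    i_ma = sink_current(r_set) * 1e3
    print(f"R_SET = {r_set:5.1f} ohm -> I_SINK ~ {i_ma:.1f} mA")
```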
A DESIGN
You must be sure that you have enough extra voltage for the positive supply. It must be at least the burden voltage of the circuit's SINK pin plus whatever burden voltage is required by the load. For a typical LED that uses somewhere from \$2\:\text{V}\$ to \$3.5\:\text{V}\$, this means that the positive supply voltage must be at least \$5\:\text{V}\$. But if you are using a red LED with, say, \$1.9\:\text{V}\$ and if you can accept a design with a little saturation in \$Q_1\$, then you could get away with as little as \$3\:\text{V}\$ for the positive supply. (There is no hard and fast rule here and there are many tradeoffs to consider.)
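The headroom rule above is just an addition, but it is easy to get wrong when juggling several LED colors. A small sketch, assuming a default SINK burden of \$1\:\text{V}\$ (the function names are mine, not part of any library):

```python
def min_supply_voltage(v_load, v_burden=1.0):
    """Smallest supply (volts) that covers the load drop plus the SINK burden."""
    return v_load + v_burden


def has_headroom(v_cc, v_load, v_burden=1.0):
    """True if v_cc leaves enough room for both the load and the sink."""
    return v_cc >= min_supply_voltage(v_load, v_burden)


# Cases built from the figures quoted in the text.
print(has_headroom(5.0, 3.5))  # 3.5 V LED on a 5 V supply
print(has_headroom(3.0, 1.9))  # red LED on a 3 V supply
print(has_headroom(3.0, 3.5))  # 3.5 V LED cannot run from 3 V
print(min_supply_voltage(3.5, v_burden=1.6))  # extreme burden case
```

With the more extreme \$1.6\:\text{V}\$ burden, a \$3.5\:\text{V}\$ LED needs slightly more than \$5\:\text{V}\$, which is one of the tradeoffs mentioned above.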
So let's assume that \$V_\text{CC}=5\:\text{V}\$ and that this is a red LED that is to be operated at \$I_\text{LED}=20\:\text{mA}\$. (Also, obviously, we will use \$V_\text{ENA}=V_\text{CC}\$.)
$$\begin{align*} R_\text{SET}&=\frac{V_{\text{BE}_2}}{I_\text{LED}\cdot\left(1+\frac{1}{\beta_1}-\frac{1}{10\:\beta_2}\right)}\\\\ R_\text{B}&=10\cdot\frac{V_\text{ENA}-V_{\text{BE}_1}-V_{\text{BE}_2}}{I_\text{LED}} \end{align*}$$
We could assume that \$V_{\text{BE}_1}=V_{\text{BE}_2}=700\:\text{mV}\$ and that is what most folks would do. Feel free to do that, in fact. But if you want to be a little "techy" about it, you might realize that I'm going to set the collector current of \$Q_2\$ to be \$\frac{1}{10}\$th the value of \$I_\text{LED}\$. (You can see that much in the above equations, if you are sly about reading them.) That means there will be a \$60\:\text{mV}\$ difference in the base-emitter voltages between the two BJTs. Also, because we are dealing with \$20\:\text{mA}\$ and since I happen to know that \$V_\text{BE}\$ in small signal BJTs is roughly \$700\:\text{mV}\$ when the collector current is somewhere in the vicinity of \$3\:\text{mA}\$, I can work out the fact that:
$$\begin{align*} V_{\text{BE}_1}&=700\:\text{mV}+26\:\text{mV}\cdot\operatorname{ln}\left(\frac{20\:\text{mA}}{3\:\text{mA}}\right)\approx 750 \:\text{mV}\label{note}\tag{See appendix}\\\\ V_{\text{BE}_2}&=700\:\text{mV}+26\:\text{mV}\cdot\operatorname{ln}\left(\frac{20\:\text{mA}}{10\cdot 3\:\text{mA}}\right)\approx 690 \:\text{mV} \end{align*}$$
I'm doing this to illustrate a few things. One is about the fact that \$V_\text{BE}\$ isn't actually constant. (We are still ignoring temperature.) Instead, it does depend on the currents involved. So if you choose very different currents than I show here, you may need to know this detail. Another is to show how little it really matters in the case of dealing with an LED. We could just assume and be "close enough." It helps to see it done both ways to make that point.
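If you'd like to reproduce the two base-emitter estimates above, here is a minimal Python sketch. It assumes the same reference point used in the text (\$700\:\text{mV}\$ at \$3\:\text{mA}\$), \$V_T=26\:\text{mV}\$, and an emission coefficient of 1; the function name is hypothetical.

```python
import math

V_T = 0.026  # volts; thermal voltage near room temperature


def v_be_at(i_c, v_be_ref=0.700, i_c_ref=3e-3, eta=1.0):
    """Estimate V_BE (volts) at collector current i_c (amps),
    scaled logarithmically from a known reference point."""
    return v_be_ref + eta * V_T * math.log(i_c / i_c_ref)


I_LED = 20e-3
v_be1 = v_be_at(I_LED)       # Q1 carries the LED current
v_be2 = v_be_at(I_LED / 10)  # Q2 carries one tenth of it

print(f"V_BE1 ~ {v_be1 * 1e3:.0f} mV")
print(f"V_BE2 ~ {v_be2 * 1e3:.0f} mV")
print(f"difference ~ {(v_be1 - v_be2) * 1e3:.0f} mV")
```

The difference depends only on the 10:1 current ratio, not on the reference point chosen.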
Designing this for a low burden voltage of \$1\:\text{V}\$ (\$Q_1\$ going into saturation, so let's say \$\beta_1=20\$) and assuming a semi-worst case \$\beta_2=100\$, we find that
$$\begin{align*} R_\text{SET}\approx 33\:\Omega\\\\ R_\text{B}\approx 1.8\:\text{k}\Omega \end{align*}$$
And you will find that both of those values hold whether or not you use my fancy calculated values for the base-emitter voltages. Either way, the results are very much the same. (So this points out one reason why the base-emitter voltages of BJTs can often be taken as a given value.)
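To see the "either way" claim numerically, the two design equations can be evaluated with both sets of base-emitter voltages. This is only a sketch of the equations given above, with a hypothetical function name and the 10:1 current ratio as a parameter:

```python
def design_resistors(i_led, v_ena, v_be1, v_be2, beta1, beta2, ratio=10.0):
    """Return (R_SET, R_B) in ohms from the two design equations in the text."""
    r_set = v_be2 / (i_led * (1.0 + 1.0 / beta1 - 1.0 / (ratio * beta2)))
    r_b = ratio * (v_ena - v_be1 - v_be2) / i_led
    return r_set, r_b


I_LED, V_ENA = 20e-3, 5.0
BETA1, BETA2 = 20.0, 100.0

cases = {
    "both 700 mV": (0.700, 0.700),
    "750 mV / 690 mV": (0.750, 0.690),
}
for label, (v_be1, v_be2) in cases.items():
    r_set, r_b = design_resistors(I_LED, V_ENA, v_be1, v_be2, BETA1, BETA2)
    print(f"{label:>16}: R_SET ~ {r_set:.1f} ohm, R_B ~ {r_b:.0f} ohm")
```

Both cases land close enough to the same standard values (\$33\:\Omega\$ and \$1.8\:\text{k}\Omega\$) that the choice doesn't matter for an LED.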
TEMPERATURE AND THE EARLY EFFECT
I've avoided these topics until now. But they are both important enough that it is time to bring them up. One of the huge benefits of this circuit is that it solves both these problems (to a reasonable degree) by adding \$Q_2\$.
\$Q_1\$ suffers from the Early Effect in a circuit like this, because its \$V_\text{CE}\$ can vary widely as the load drops different voltages. BJTs have, in effect, a resistor that goes from their collector to their emitter. This resistor is sometimes called \$r_o\$ and its value is \$r_o=\frac{VA+V_\text{CE}}{I_\text{C}}\$, where \$VA\$ is the Early Voltage for the device. This is often around \$100\:\text{V}\$ but it can be quite a bit less (the D45H11 can be as little as \$10\:\text{V}\$.) For \$I_\text{LED}=20\:\text{mA}\$ and \$VA\approx 100\:\text{V}\$, \$r_o\$ is only about \$5\:\text{k}\Omega\$ (and less still for a lower \$VA\$) and, unchecked, would increase the current in the LED as \$V_\text{CE}\$ rises.
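To get a feel for the size of the problem, here is a rough first-order sketch. It treats \$r_o\$ as fixed at its starting value, and the starting \$V_\text{CE}\$ and the change in \$V_\text{CE}\$ are hypothetical numbers chosen only for illustration.

```python
def early_resistance(v_a, v_ce, i_c):
    """Output resistance r_o = (VA + V_CE) / I_C, in ohms."""
    return (v_a + v_ce) / i_c


def unchecked_current_rise(delta_v_ce, v_a, v_ce, i_c):
    """First-order extra collector current (amps) for a rise in V_CE,
    treating r_o as fixed at its starting value."""
    return delta_v_ce / early_resistance(v_a, v_ce, i_c)


I_LED = 20e-3     # amps, from the text
V_CE_START = 1.0  # volts; hypothetical starting point
DELTA_V_CE = 1.5  # volts; hypothetical change in the load's drop

for v_a in (100.0, 10.0):
    r_o = early_resistance(v_a, V_CE_START, I_LED)
    d_i = unchecked_current_rise(DELTA_V_CE, v_a, V_CE_START, I_LED)
    print(f"VA = {v_a:5.1f} V: r_o ~ {r_o:5.0f} ohm, "
          f"rise ~ {d_i * 1e3:.2f} mA ({100 * d_i / I_LED:.1f} %)")
```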
\$Q_1\$ also suffers from heating. It's the work-horse device here and it may have to dissipate power and, in the process, heat up. If we were depending upon its \$V_\text{BE}\$ voltage as a way of programming the LED current, we could be in some trouble because this voltage changes by somewhere between \$-1.8\:\frac{\text{mV}}{^\circ\text{C}}\$ and \$-2.4\:\frac{\text{mV}}{^\circ\text{C}}\$. With some BJTs in TO-92 packages with thermal resistances of \$200\:\frac{^\circ\text{C}}{\text{W}}\$, you can easily see a fairly large change in the base-emitter voltage due to heating caused by their normal operations.
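Here is a minimal sketch of that self-heating estimate for \$Q_1\$, assuming the earlier design example (\$5\:\text{V}\$ supply, a \$1.9\:\text{V}\$ red LED, about \$700\:\text{mV}\$ across \$R_\text{SET}\$, and \$20\:\text{mA}\$). It is a steady-state estimate only, and the function name is hypothetical.

```python
def self_heating_shift(v_ce, i_c, theta_ja, tempco_mv_per_c):
    """Return (power in W, temperature rise in degC, V_BE shift in mV)."""
    power = v_ce * i_c
    rise = power * theta_ja
    return power, rise, tempco_mv_per_c * rise


V_CC, V_LED, V_RSET = 5.0, 1.9, 0.7  # volts; from the design example
I_LED = 20e-3                        # amps
THETA_JA = 200.0                     # degC/W; TO-92 figure from the text

v_ce_q1 = V_CC - V_LED - V_RSET      # what is left over for Q1

for tempco in (-1.8, -2.4):          # mV/degC; range from the text
    p, rise, shift = self_heating_shift(v_ce_q1, I_LED, THETA_JA, tempco)
    print(f"P ~ {p * 1e3:.0f} mW, rise ~ {rise:.1f} degC, "
          f"V_BE shift ~ {shift:.1f} mV at {tempco} mV/degC")
```

A shift of this size is a few percent of \$700\:\text{mV}\$, which would show up directly as a current error if \$Q_1\$'s own \$V_\text{BE}\$ were setting the current.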
But this circuit moves both problems away from \$Q_1\$.
First, \$Q_2\$ operates with a nearly fixed \$V_\text{CE}\$ in this circuit and therefore the Early Effect on \$Q_2\$ doesn't change as the circuit operates. But more importantly, the Early Effect on \$Q_1\$ is neutralized because \$Q_2\$ will adjust its operations to automatically compensate for such variations by observing the voltage drop across \$R_\text{SET}\$. If \$Q_1\$'s Early Effect attempts to make any changes in the LED current, \$Q_2\$ will immediately adjust its own collector so as to counter it. The result is that the Early Effect in \$Q_1\$ is neutralized by the same negative feedback loop that is controlling the LED current in the first place.
Second, since \$Q_2\$ is the BJT measuring the LED current, it's important that \$Q_2\$'s base-emitter voltage remain as constant as possible. And because \$Q_2\$ is operating with far less current (on the order of the base current needed by \$Q_1\$) and because it has a very small \$V_\text{CE}\$ voltage, the dissipation in \$Q_2\$ is very modest. This means it is only barely affected by self-heating, which helps to ensure that its base-emitter voltage depends only on the ambient temperature, without the additional problems caused by having to dissipate the far greater power being handled by \$Q_1\$. So this helps to minimize temperature effects.
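For a sense of scale, here is a rough sketch of \$Q_2\$'s dissipation in the design example. It assumes \$Q_2\$'s collector sits two base-emitter drops above GND, a collector current of \$\frac{1}{10}\$th of \$I_\text{LED}\$, and (as assumptions carried over from the discussion of \$Q_1\$) a \$200\:\frac{^\circ\text{C}}{\text{W}}\$ package and a mid-range \$-2\:\frac{\text{mV}}{^\circ\text{C}}\$ coefficient.

```python
def q2_self_heating(i_led, v_be1, v_be2, theta_ja, tempco_mv_per_c, ratio=10.0):
    """Return (power in W, temperature rise in degC, V_BE shift in mV) for Q2."""
    i_c2 = i_led / ratio   # Q2 runs at a fraction of the LED current
    v_ce2 = v_be1 + v_be2  # Q2's collector is tied to Q1's base
    power = v_ce2 * i_c2
    rise = power * theta_ja
    return power, rise, tempco_mv_per_c * rise


p, rise, shift = q2_self_heating(
    i_led=20e-3, v_be1=0.75, v_be2=0.69,
    theta_ja=200.0,        # degC/W; assumed same package as Q1
    tempco_mv_per_c=-2.0,  # mV/degC; assumed mid-range value
)
print(f"P ~ {p * 1e3:.2f} mW, rise ~ {rise:.2f} degC, V_BE shift ~ {shift:.2f} mV")
```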
It doesn't completely eliminate them, since the ambient temperature of the circuit will still have an effect. But it does help to separate these problems into two devices.
So using the 2nd BJT here is helping in two ways: nullifying the Early Effect in \$Q_1\$ and removing the self-heating portion of the problem in \$Q_1\$ from impacting the programmed current.
FINAL NOTES
There is another use for this circuit I'd mentioned earlier. The ENA input can be managed by an MCU I/O pin, instead of tying it to the positive supply. If so, the value of \$R_\text{B}\$ might be adjusted per the \$V_\text{CC}\$ of the MCU itself (which may be different than that used by the load.) This allows ON/OFF control of this circuit.
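As a sketch of that adjustment, the same \$R_\text{B}\$ equation from the design section can be re-evaluated for a different ENA voltage. The \$3.3\:\text{V}\$ MCU rail here is only an assumed example, and the function name is hypothetical.

```python
def r_b_for_enable(v_ena, i_led, v_be1=0.75, v_be2=0.69, ratio=10.0):
    """R_B (ohms) from the design equation, for a given ENA drive voltage."""
    drop = v_ena - v_be1 - v_be2  # voltage left across R_B
    if drop <= 0:
        raise ValueError("ENA voltage is too low to turn the circuit on")
    return ratio * drop / i_led


I_LED = 20e-3
for v_ena in (5.0, 3.3):  # 3.3 V is an assumed MCU supply
    r_b = r_b_for_enable(v_ena, I_LED)
    print(f"V_ENA = {v_ena:.1f} V -> R_B ~ {r_b:.0f} ohm")
```

Keep in mind that the I/O pin has to source whatever current flows through \$R_\text{B}\$, so it's worth checking that against the pin's drive capability.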
So it's really quite a useful circuit pattern to know about and use.
As if that weren't enough already, the above circuit can also be modified to use a MOSFET:
[Schematic: the same circuit using a MOSFET (NFET) in place of \$Q_1\$.]
Here, the value of \$R_\text{SET}\$ is set up much the same way as before. However, as the MOSFET doesn't require any gate current, the value of \$R_\text{B}\$ can be much higher -- say \$22\:\text{k}\Omega\$ or still more, if desired. Setting \$R_\text{B}\$ higher or lower changes the collector current of \$Q_2\$, which in turn has some effect on its \$V_\text{BE}\$ and therefore on the voltage across \$R_\text{SET}\$. Also, there is a threshold voltage required for the MOSFET (NFET), and this voltage requirement must be met by whatever voltage is supplied to ENA.
For example, the BSS123 has a threshold voltage of about \$1.7-2.0\:\text{V}\$. This is added to the already needed base-emitter voltage for \$Q_2\$. So the voltage difference across \$R_\text{B}\$ will be \$\approx V_\text{ENA}-700\:\text{mV}-1.7\:\text{V}\$. From that, you can work out the collector current for \$Q_2\$.
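Working that out numerically, as a minimal sketch: this assumes \$V_\text{ENA}=5\:\text{V}\$ and \$R_\text{B}=22\:\text{k}\Omega\$, and it treats the gate as sitting at exactly the threshold voltage above \$Q_2\$'s base-emitter drop, which is only a rough approximation.

```python
def q2_collector_current(v_ena, r_b, v_th, v_be2=0.700):
    """Approximate Q2 collector current (amps) in the NFET variant:
    I_C2 ~ (V_ENA - V_BE2 - V_TH) / R_B."""
    drop = v_ena - v_be2 - v_th
    if drop <= 0:
        raise ValueError("ENA voltage does not reach V_BE2 + V_TH")
    return drop / r_b


V_ENA = 5.0  # volts; assumed
R_B = 22e3   # ohms; example value from the text

for v_th in (1.7, 2.0):  # BSS123 threshold range quoted in the text
    i_c2 = q2_collector_current(V_ENA, R_B, v_th)
    print(f"V_TH = {v_th:.1f} V -> I_C2 ~ {i_c2 * 1e6:.0f} uA")
```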
So there are a few slightly different calculations involved in this new pattern using an NFET with an NPN BJT. But this also shows the versatility of the pattern.
Appendix
Start with the usual equation for a BJT collector current when in active mode:
$$\begin{align*} I_\text{C}&=I_\text{SAT}\left( e^{^\frac{V_\text{BE}}{\eta\,V_T}}- 1 \right) \end{align*}$$
Solve for \$V_\text{BE}\$:
$$\begin{align*} V_\text{BE}&=\eta\,V_T\,\operatorname{ln}\left( \frac{I_\text{C}}{I_\text{SAT}}+ 1 \right) \end{align*}$$
At this point, we can simplify the above equation, removing the \$+1\$ term, because \$I_\text{SAT}\$ is many orders of magnitude smaller than any practical \$I_\text{C}\$. So, without any harm we can re-write it as:
$$\begin{align*} V_\text{BE}&=\eta\,V_T\,\operatorname{ln} \frac{I_\text{C}}{I_\text{SAT}} \end{align*}$$
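If you'd like to convince yourself that dropping the \$+1\$ term is harmless, here is a quick numeric sketch. The saturation current of \$10\:\text{fA}\$ is purely an assumed, illustrative value (real devices vary a lot), and the function names are hypothetical.

```python
import math

V_T = 0.026    # volts; thermal voltage near room temperature
I_SAT = 1e-14  # amps; assumed, illustrative saturation current


def v_be_exact(i_c, i_sat=I_SAT, eta=1.0):
    """V_BE from the full equation, keeping the +1 term."""
    return eta * V_T * math.log(i_c / i_sat + 1.0)


def v_be_approx(i_c, i_sat=I_SAT, eta=1.0):
    """V_BE from the simplified equation, without the +1 term."""
    return eta * V_T * math.log(i_c / i_sat)


for i_c in (1e-6, 1e-3, 20e-3):
    error = v_be_exact(i_c) - v_be_approx(i_c)
    print(f"I_C = {i_c:.0e} A: V_BE ~ {v_be_approx(i_c) * 1e3:.0f} mV, "
          f"error from dropping +1 ~ {error:.1e} V")
```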
Now. Suppose we happened to take a reference measurement of \$V_{\text{BE}_0}\$ at a collector current \$I_{\text{C}_0}\$. What would we then expect for a new \$V_{\text{BE}_1}\$ at a new collector current, \$I_{\text{C}_1}\$?
$$\begin{align*} \Delta\, V_\text{BE}&=\left[\eta\,V_T\,\operatorname{ln} \frac{I_{\text{C}_1}}{I_\text{SAT}}\right]-\left[\eta\,V_T\,\operatorname{ln} \frac{I_{\text{C}_0}}{I_\text{SAT}}\right]\\\\ &=\eta\,V_T\,\left[\operatorname{ln} \frac{I_{\text{C}_1}}{I_\text{SAT}}-\operatorname{ln} \frac{I_{\text{C}_0}}{I_\text{SAT}}\right]\\\\ &=\eta\,V_T\,\left[\operatorname{ln} I_{\text{C}_1} - \operatorname{ln} I_\text{SAT}-\left(\operatorname{ln}I_{\text{C}_0}-\operatorname{ln} I_\text{SAT}\right)\right]\\\\ &=\eta\,V_T\,\left[\operatorname{ln} I_{\text{C}_1} -\operatorname{ln}I_{\text{C}_0}\right]\\\\ &=\eta\,V_T\,\operatorname{ln} \frac{ I_{\text{C}_1}}{I_{\text{C}_0}} \end{align*}$$
So, this suggests that:
$$\begin{align*} V_{\text{BE}_1}&=V_{\text{BE}_0}+\Delta\, V_\text{BE}\\\\ &=V_{\text{BE}_0}+\eta\,V_T\,\operatorname{ln} \frac{ I_{\text{C}_1}}{I_{\text{C}_0}} \end{align*}$$
For a small signal BJT, it is almost always the case that \$\eta\approx 1\$ and at room temperature \$V_T\approx 26\:\text{mV}\$. This is where I got the "\$\ref{note}\$" equation, above.
If you are only interested in small scale changes rather than large, non-linear changes, then you can take a differential approach instead (sometimes called "local linearization about a point.") Below, I'll use a differential operator approach:
$$\begin{align*} I_\text{C}&=I_\text{SAT}\left( e^{^\frac{V_\text{BE}}{\eta\,V_T}}- 1 \right) \tag{non-linear Shockley}\label{nonlinear}\\\\ \text{d}\,I_{\text{C}}&=\text{d}\left[I_\text{SAT}\left( e^{^\frac{V_\text{BE}}{\eta\,V_T}}- 1 \right)\right] \\\\ &=I_\text{SAT}\cdot \text{d}\left[e^{^\frac{V_\text{BE}}{\eta\,V_T}}- 1\right] \\\\ &=I_\text{SAT}\cdot e^{^\frac{V_\text{BE}}{\eta\,V_T}} \cdot \text{d}\left[\frac{V_\text{BE}}{\eta\,V_T}\right] \\\\ &=I_\text{SAT}\cdot e^{^\frac{V_\text{BE}}{\eta\,V_T}} \cdot \frac{\text{d}\,V_\text{BE}}{\eta\,V_T}\tag{linearized}\label{linear} \end{align*}$$
At this point, you may notice that the first two factors in \$\ref{linear}\$ are very similar to the right-side of \$\ref{nonlinear}\$. The difference, the \$-1\$ term, isn't important enough to worry about, as the exponential value is almost always huge (\$\gg 1\$.) So this means we can re-write (with negligible error) \$\ref{linear}\$ as:
$$\begin{align*} \text{d}\,I_{\text{C}}&=I_\text{C}\cdot \frac{\text{d}\,V_\text{BE}}{\eta\,V_T} \end{align*}$$
Solving for \$\text{d}\,V_\text{BE}\$:
$$\begin{align*} \text{d}\,V_\text{BE}&=\eta\,V_T\cdot \frac{\text{d}\,I_{\text{C}}}{I_\text{C}}\tag{small signal $V_\text{BE}$ change}\label{sschange} \end{align*}$$
The interpretation of \$\frac{\text{d}\,I_{\text{C}}}{I_\text{C}}\$, its meaning in effect, is the "infinitesimal fractional (percent) variation of the collector current." Compare the above \$\ref{sschange}\$ with what was developed still earlier above:
$$\Delta\, V_\text{BE}=\eta\,V_T\cdot \operatorname{ln} \frac{ I_{\text{C}_1}}{I_{\text{C}_0}}\tag{finite $V_\text{BE}$ change}\label{lschange}$$
There is a difference, as you can see. If all you want are nearby change estimates, all you need to worry about is estimating the %-change in the collector current and you can easily get the estimated change in the base-emitter voltage without having to worry about taking the logarithm of the current ratios.
If you remember 1st year calculus, then you may remember seeing that \$\text{d}\,\operatorname{ln}\,x=\frac{1}{x}\,\text{d}\, x\$. This means that \$\text{d}\,\operatorname{ln} \frac{ I_{\text{C}_1}}{I_{\text{C}_0}}=\frac{I_{\text{C}_0}}{I_{\text{C}_1}}\cdot\frac{\text{d}\,I_{\text{C}_1}}{I_{\text{C}_0}}=\frac{\text{d}\,I_{\text{C}_1}}{I_{\text{C}_1}}\$. And now you can easily see, in broad daylight, the relationship between \$\ref{sschange}\$ and \$\ref{lschange}\$.
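To see where the small-signal estimate stops being "close enough," here is a short comparison sketch. It assumes \$\eta=1\$ and \$V_T=26\:\text{mV}\$; the function names are hypothetical.

```python
import math

V_T = 0.026  # volts; thermal voltage near room temperature


def delta_v_be_finite(fractional_change, eta=1.0):
    """Finite change: eta * V_T * ln(I_C1 / I_C0), with I_C1 = I_C0 * (1 + x)."""
    return eta * V_T * math.log(1.0 + fractional_change)


def delta_v_be_small(fractional_change, eta=1.0):
    """Small-signal change: eta * V_T * (dI_C / I_C)."""
    return eta * V_T * fractional_change


for x in (0.01, 0.10, 0.50, 1.00):
    finite = delta_v_be_finite(x) * 1e3
    small = delta_v_be_small(x) * 1e3
    print(f"{x * 100:5.0f} % change: finite ~ {finite:6.3f} mV, "
          f"small-signal ~ {small:6.3f} mV")
```

For changes of a few percent the two agree closely; by the time the current doubles, the linearized estimate is noticeably too large.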
The remaining voltage will be dropped across Q1. Or, the voltage across Q1 will vary to ensure that 20 mA flows through the LED and R2.