What is the difference between $E[X\mid Y]$ vs $E[X\mid Y=y]$ and some of the properties of $E[X \mid Y]$?

If $\varphi(y)=E[X\mid Y=y\,]$ denotes the (non-random) function of $y$, then $E[X\mid Y\,]$ is defined to be the random variable $\varphi(Y)$.


Since $Z=E[X\mid Y]$ is a random variable, it has a distribution function: $F_{Z}(z)=P(Z\leq z)=P(Y\in\{y\in\mathbb{R}:E[X\mid Y=y]\leq z\})$.

Intuitively it is very simple: to generate a value of $Z$, first draw $Y$ according to its own distribution, obtaining $Y=y$; the value $Z$ then takes is $E[X\mid Y=y]$.
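The two-step recipe above can be sketched in code. This is a minimal toy example of my own (not from the original post): $Y$ is a fair coin in $\{0,1\}$ and, given $Y=y$, $X$ is $y$ plus an independent fair coin, so $\varphi(y)=E[X\mid Y=y]=y+0.5$.

```python
import random

def phi(y):
    """The non-random function y -> E[X | Y = y] for this toy model."""
    return y + 0.5

def sample_Z():
    # Step 1: draw Y according to its own distribution (a fair coin).
    y = random.randint(0, 1)
    # Step 2: Z takes the deterministic value phi(y).
    return phi(y)

random.seed(0)
samples = [sample_Z() for _ in range(10_000)]
# Z = phi(Y) only ever takes the two values phi(0) = 0.5 and phi(1) = 1.5.
print(sorted(set(samples)))  # [0.5, 1.5]
```

Note that no draw of $X$ appears anywhere: once $\varphi$ is known, $Z$ is a function of $Y$ alone.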

In fact, this is just a special case of a general construction: if $f$ is a real Borel measurable function, then $Z=f(Y)$ is the random variable with $F_{Z}(z)=P(Z\leq z)=P(Y\in f^{-1}((-\infty,z]))$. Here the role of $f$ is played by the real function that takes $y$ and returns $E[X\mid Y=y]$.

The intuitive picture is slightly complicated only because the function need not be injective. For example, suppose $X$ and $Y$ are independent. Then $Z$ takes the value $E[X]$ with probability $1$, regardless of how $Y$ is distributed: this is because $E[X\mid Y=y]$ is a constant function of $y$.
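The independence case can be checked exactly on a small discrete example of my own devising: $X$ uniform on $\{1,2,3\}$ and $Y$ uniform on $\{0,1\}$, independent, so the joint pmf factors and $E[X\mid Y=y]$ should equal $E[X]=2$ for every $y$.

```python
from fractions import Fraction

# Independent marginals; the joint pmf is the product (independence).
px = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
py = {0: Fraction(1, 2), 1: Fraction(1, 2)}
joint = {(x, y): px[x] * py[y] for x in px for y in py}

def cond_exp_X_given(y):
    """E[X | Y = y] = sum_x x * P(X=x, Y=y) / P(Y=y)."""
    p_y = sum(p for (x, yy), p in joint.items() if yy == y)
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y

EX = sum(x * p for x, p in px.items())       # E[X] = 2
print([cond_exp_X_given(y) for y in py])     # both equal Fraction(2, 1)
```

So here $Z=E[X\mid Y]$ is the constant random variable $2$, even though $Y$ itself is non-degenerate.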


There are fundamental differences between $E[X\mid Y=y]$ and $E[X\mid Y]$. You've got the first one down, in that we are conditioning on a specific event, so the result is a number. In the latter case, you are merely told the likelihood of the various events, not which one actually happens, so the result is itself a random variable.

Calculating the distribution of $E[X\mid Y]$ is exactly the same as calculating the distribution of $f(Y)$ for a measurable function $f$, given that you know the distribution of $Y$, since $E[X\mid Y]$ is just a special case of a function of $Y$.
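For a discrete $Y$ this pushforward computation is mechanical: $P(Z=z)=P(Y\in \varphi^{-1}(\{z\}))$, i.e. sum $P(Y=y)$ over the preimage of each $z$. A small sketch with made-up numbers, where $\varphi$ is deliberately non-injective so two $y$-values collapse to one $z$-value:

```python
from collections import defaultdict

# Hypothetical pmf of Y and a non-injective phi(y) = E[X | Y = y]:
# phi(-1) == phi(1), so their probabilities merge in the pmf of Z.
p_Y = {-1: 0.25, 0: 0.5, 1: 0.25}
phi = {-1: 3.0, 0: 1.0, 1: 3.0}

p_Z = defaultdict(float)
for y, p in p_Y.items():
    p_Z[phi[y]] += p  # accumulate P(Y = y) over the preimage of each z

print(dict(p_Z))  # {3.0: 0.5, 1.0: 0.5}
```

The merging step is exactly the set $\{y:E[X\mid Y=y]\leq z\}$ picture from above, specialized to point preimages.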

Bottom line: don't think of it as an expectation; think of it as the distribution of a function of a random variable.