How is it that you can guess if one of a pair of random numbers is larger with probability > 1/2?

After Bill's latest clarifications in the commentary on Critch's answer, I think the question is interesting again. My take:

One thing that always seemed to fall through the cracks when I learned about probability theory is that probability is intricately tied to information, and probabilities are only defined in the context of information. Probabilities aren't absolute; two people who have different information about an event may well disagree on its probability, even if both are perfectly rational. Similarly, if you get new information relevant to a certain event, then you should probably reevaluate what you think is the probability that it will occur. Your particular problem is interesting because the new information you get isn't enough for you to revise that probability by purely mathematical considerations, but I'll get to that in good time.

With the previous paragraph in mind, let's compare two games:

G1. You are given two closed doors, A and B, with two numbers behind them, and your goal is to choose the door with the higher number. You are given no information about the doors or numbers.

G2. You are given two closed doors, A and B, with two numbers behind them, and your goal is to choose the door with the higher number. You are allowed to look behind one of the doors and then make your choice.

For the first game, by symmetry, you clearly can't do better than choosing a door randomly, which gives you a success probability of exactly 1/2. However, the second game has a chance of being better. You are playing for the same goal with strictly more information, so you might expect to be able to do somewhat better. [I had originally said that it was obviously better, but now I'm not so sure that it's obvious.] The tricky thing is quantifying how much better, since it's not clear how to reason about the relationship between two numbers if you know one of the numbers and have no information about the other one. Indeed, it isn't even possible to quantify it mathematically.

"But how can that be?" you may ask. "This is a mathematical problem, so how can the solution not be mathematically definable?" There's the rub: part of the issue is that the problem isn't formulated in a mathematically rigorous way. That can be fixed in multiple ways, and any way we choose will make the paradox evaporate. The problem is that we're asked to reason about "the probability of answering the question correctly," but it's not clear what context that probability should be computed in. (Remember: probabilities aren't absolute.) In common probability theory problems and puzzles, this isn't an issue because there is usually an unambiguous "most general applicable context": we should obviously assume exactly what's given in the problem and nothing else. We can't do that here because the most general context, in which we assume nothing about how the numbers $x$ and $y$ are chosen, does not define a probability space at all and thus the "probability of answering the question correctly" is not a meaningful concept.

Here's a simpler ostensible probability question that exhibits the same fallacy: "what's the probability that a positive integer is greater than 1,000,000?" In order to answer that, we have to pick a probability distribution on the positive integers; the question is meaningless without specifying that.
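To make this concrete, here is a hypothetical Monte Carlo sketch (the two distributions are purely illustrative choices of mine, not from the question): the "same" question gets completely different answers depending on which distribution you impose on the positive integers.

```python
import random

def prob_greater(sampler, threshold=1_000_000, trials=100_000, seed=0):
    """Monte Carlo estimate of P(N > threshold) for a given sampler."""
    rng = random.Random(seed)
    return sum(sampler(rng) > threshold for _ in range(trials)) / trials

def uniform(rng):
    # Uniform on {1, ..., 2,000,000}: exactly half the mass exceeds 1,000,000.
    return rng.randint(1, 2_000_000)

def geometric(rng):
    # Geometric(1/2) on {1, 2, ...}: virtually no mass above 1,000,000.
    n = 1
    while rng.random() < 0.5:
        n += 1
    return n

p_uniform = prob_greater(uniform)      # close to 0.5
p_geometric = prob_greater(geometric)  # essentially 0
```

Until a distribution is fixed, "the probability that a positive integer is greater than 1,000,000" has no value at all.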

As I said, there are multiple ways to fix this. Here are a couple:

I1. (Tyler's interpretation.) We really want the probability of answering the question correctly given a particular $x$ and $y$ to be greater than 1/2. (The exact probability will of course depend on the two numbers.)

I2. (Critch's interpretation.) More generally, we want the probability of answering correctly given a particular probability distribution for $(x,y)$ to be greater than 1/2. (The exact probability will of course depend on the distribution.)

(Those two are actually equivalent mathematically.) Clearly, if we knew what that distribution was, we could cook up strategies to get a success probability strictly above 1/2. That's pretty much obvious. It is not nearly as obvious that a single strategy (such as the one in the statement of the question) can work for all distributions of $(x,y)$, but it's true, as Bill's proof shows. It's an interesting fact, but hardly paradoxical now.

Let me summarize by giving proper mathematical interpretations of the informal statement "there is a strategy that answers the question correctly with probability strictly greater than 1/2," with quantifiers in place:

(1a) $\exists \text{ strategy } S: \forall x, y: \exists \delta > 0$: $S$ answers correctly on $x$, $y$ with probability at least $1/2 + \delta$.

(1b) $\exists \text{ strategy } S: \forall \text{ probability distributions } P \text{ on } \mathbb{N}^2: \exists \delta > 0$: $S$ answers correctly, when $x$, $y$ are chosen according to $P$, with probability at least $1/2 + \delta$.
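As a sanity check of (1a), here is a hypothetical simulation of one such strategy $S$: draw a random threshold $t$ from a distribution with full support on the reals (a Gaussian here; the distribution and the test pairs are my choices for illustration), peek behind a uniformly chosen door, and declare the peeked number the larger one iff it exceeds $t$.

```python
import random

def threshold_strategy_win_rate(x, y, trials=200_000, seed=1):
    """Simulate the random-threshold strategy on a fixed pair x != y.

    For fixed x < y, the win probability is
        1/2 + P(x < t < y) / 2,
    which is strictly above 1/2 because t has full support."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Which door we peek behind is a fair coin flip.
        peeked, hidden = (x, y) if rng.random() < 0.5 else (y, x)
        t = rng.gauss(0, 5)  # any full-support distribution works
        # We win iff our guess ("peeked is larger" iff peeked > t) is correct.
        wins += (peeked > t) == (peeked > hidden)
    return wins / trials
```

For the pair $(3, 7)$ this comes out near $0.60$; for a pair far out in the Gaussian's tail the edge $\delta$ is still positive but minuscule, which matches the point that no single $\delta > 0$ works for all pairs.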

I think with the proper quantifiers and the dependence on $x$ and $y$ explicit, it becomes a cool mathematical result rather than a paradox. Actually, based on my arguments at the beginning, it's not even that surprising: we should expect to do better than random guessing, since we are given information. However, simply knowing one number doesn't seem very useful in determining whether the other number is bigger, and that's reflected in the fact that we can't improve our probability by any fixed positive amount without more context.

Edit: It occurred to me that the last part of my discussion above has a nonstandard-analytical flavor. In fact, using the first version of the formula for simplicity (the two being equivalent), and the principle of idealisation, I think we immediately obtain:

(2) $\exists \text{ strategy } S: \exists \delta > 0: \forall \text{ standard }x, y:$ $S$ answers correctly on $x$, $y$ with probability at least $1/2 + \delta$.

(Please correct me if I'm wrong.) The number $\delta$ is not necessarily standard, and a basic argument shows that it must actually be smaller than all standard positive reals, i.e., infinitesimal. Thus, we can say that being able to look behind one door gives us an unquantifiably small, infinitesimal edge over random guessing. That actually meshes pretty well with my intuition! (It might still be a nontrivial observation that the strategy $S$ can be taken standard; I'm not sure about that...)



It might be useful to the intuition to consider the following simpler strategy: If the revealed number is positive, guess that it is the larger of the two. If the revealed number is negative, guess that it is the smaller of the two.

This simple strategy already guarantees you a chance of winning that is always at least 50% and sometimes greater. In particular, your chance of winning is 50% if the numbers are either both positive or both negative, and 100% if they have opposite signs.
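A quick simulation of this sign strategy (the particular number pairs below are my own illustrative choices):

```python
import random

def sign_strategy_win_rate(x, y, trials=100_000, seed=2):
    """Peek behind a uniformly chosen door; declare the peeked number
    the larger one iff it is positive."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        peeked, hidden = (x, y) if rng.random() < 0.5 else (y, x)
        # Guess "peeked is larger" iff peeked > 0; win iff that is correct.
        wins += (peeked > 0) == (peeked > hidden)
    return wins / trials
```

Running it on a same-sign pair like $(3, 7)$ gives a rate near $1/2$, while an opposite-sign pair like $(-2, 5)$ gives exactly $1$, just as claimed.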

That's not quite a solution to the original problem, but it works --- or comes close to working --- for exactly the same reason that the actual solution works, and it's easy to understand in an instant.


Thanks for writing out the proof in detail, so that answers can easily refer back to it! The math is fairly straightforward, but as a paradox I find this very interesting.

Spoiler alert: Paradoxes are awesome! If you like paradoxes, don't read below unless you've thought about it yourself first, or just don't care :)

EDIT (Dec. 17, after change in question statement): This answer interprets the question with the values $\{x,y\}$ varying over possible games. Mathematically speaking this is only an integral more general than the case where $\{x,y\}$ is fixed (which is the special case where the host's choice distribution $Q$ is supported uniformly on two fixed integers), but a new level of potentially paradoxical issues arises. The answer below is intended presuming the fixed case is understood, and addresses issues that arise in passing to the variable case. For a discussion of the fixed case, I recommend Darsh's excellent answer.


Logical/mathematical remarks:

(0) An easier method than "repeating until non-integral" is to directly fix a probability distribution $P$ on numbers of the form $n+0.5$ and choose one randomly via $P$.
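For concreteness, a hypothetical implementation of (0): sample $t = n + 0.5$ where $n$ is a two-sided geometric integer (the geometric weights are my arbitrary choice of $P$; any $P$ giving positive probability to every half-integer works), then guess that the peeked number is larger iff it exceeds $t$.

```python
import random

def half_integer_threshold(rng):
    """Sample t = n + 0.5 where n is a two-sided geometric integer,
    so every half-integer threshold has positive probability and ties
    with integer-valued x, y are impossible."""
    k = 0
    while rng.random() < 0.5:  # P(k) = (1/2)^(k+1), k = 0, 1, 2, ...
        k += 1
    sign = 1 if rng.random() < 0.5 else -1
    return sign * k + 0.5

def win_rate_half_integer(x, y, trials=200_000, seed=3):
    """Simulate the strategy on a fixed pair of distinct integers x, y."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        peeked, hidden = (x, y) if rng.random() < 0.5 else (y, x)
        t = half_integer_threshold(rng)
        wins += (peeked > t) == (peeked > hidden)
    return wins / trials
```

For integers $x < y$ the win rate is $1/2 + P(x < t < y)/2$, strictly above $1/2$ since some half-integer always lands between distinct integers.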

(1) The proof for this strategy is valid provided that you know the host is choosing his number according to SOME fixed, well-defined probability distribution $Q$ on the integers (but you don't have to know what $Q$ is). In this sense, you know that his choice is actually not completely arbitrary... you know he's following some system, even if you don't know which one.

For example:

(1.1) Specifically, consider the step "Averaging over all the cases" (after Case 3 in the question). The number $\epsilon$ in Case 3 depends on $x$, $y$, and your distribution $P$. "Average over all the cases" means that you integrate the function $(1+\epsilon)/2$ over the space of all possible pairs $(x,y)$. This requires a probability distribution on the pairs $(x,y)$, which is where $Q$ comes in. Since any $Q$ will give a result greater than $1/2$, it is tempting to think $Q$ is irrelevant to the conclusion. This is a fallacy! I cannot stress enough that treating an unknown as a random variable is a non-vacuous assumption that has real consequences, this scenario itself being an example. Without this assumption, the step "Averaging over all the cases" is meaningless and invalid.
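In symbols (my notation, using the quantities already named above): with $(x,y) \sim Q$ and $\epsilon(x,y) > 0$ the advantage from Case 3, the averaging step reads

$$\Pr[\text{win}] \;=\; \int \frac{1+\epsilon(x,y)}{2}\,dQ(x,y) \;=\; \frac{1}{2} + \frac{1}{2}\int \epsilon(x,y)\,dQ(x,y) \;>\; \frac{1}{2},$$

where the strict inequality uses $\epsilon > 0$ $Q$-almost surely. Without some $Q$, the integral, and hence "the probability of winning", simply isn't defined.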

(2) Although the result of your strategy is that you "know something" about the relation between the two numbers, your confidence in this knowledge is arbitrary. That is, you have no estimate, not even a probabilistic one, on how much bigger than $0.5$ your chances of winning are.


Resolving the paradox: (isolating what causes the "weird feeling" here)

In short, I'd say mixing "random" and "arbitrary" in the same scenario makes for a patchy idealization of reality.

Here, one unknown, the choice of ordered pair $(x,y)$, is being treated via a probability distribution $Q$, whereas a related unknown, the choice of the probability distribution $Q$ itself, is being treated as "arbitrary". This is a weird metaphysical mix of assumptions to make.

Why?

For example, in science and in everyday life, an estimate of "what's likely" can be accompanied, consciously or unconsciously, by an estimate of how accurate the first estimate is, and in principle one could have estimates of those accuracies as well, and so on. This capacity for self-reflection is a big part of being sentient (or at least the "illusion of sentience", whatever that means).

In this scenario, such self-reflective estimates are in principle impossible (because we don't assume that the host's distribution $Q$ was chosen according to any distribution on distributions), which makes it a very unfamiliar situation.

This is just one reason why mixing "random" and "arbitrary" can make for a weird-seeming model of reality, and I blame this mixture for allowing the scenario to appear paradoxical.