Find matching phone numbers without actually knowing them
`bcrypt` would be a somewhat better approach because it is designed to be (programmably) slow. Using a large enough salt and a reasonable complexityFactor, `bcrypt(salt + number, complexityFactor)` should yield a viable hash, and you avoid "rolling your own cryptography", which could otherwise turn out to be a difficult sell. To increase security you just crank up the complexityFactor.
An attacker would now have to compute the bcrypt hash not only of every 10-digit phone number (which could be feasible: there are only 10^10 of them, after all), but of every possible salted sequence. With a 10-character base64 salt (60 bits of entropy), the work factor grows by about eighteen orders of magnitude (2^60 ≈ 10^18).
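As a concrete sketch (assuming the Python `bcrypt` package; the phone number and cost below are made-up placeholders, not a spec):

```python
import bcrypt

# Minimal sketch of the scheme above. Alice publishes (salt, digest) pairs;
# anyone holding the salt can test a candidate number, but only at bcrypt
# speed.

def hash_number(number: str, salt: bytes) -> bytes:
    # The bcrypt salt string also encodes the cost (the "complexityFactor").
    return bcrypt.hashpw(number.encode("ascii"), salt)

salt = bcrypt.gensalt(rounds=12)           # raise rounds to increase security
digest = hash_number("5551234567", salt)   # placeholder number

# The other side, given the salt, can verify one of its own numbers:
assert hash_number("5551234567", salt) == digest
```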
Brute forcing
Suppose you have 1,000 contacts. The CPU of your average phone seems to be two orders of magnitude slower than a server core array. I think it's reasonable to say that it will be three orders of magnitude slower than a semi-dedicated GPU implementation of `bcrypt`, which is said not to be so efficient.
So we tune `bcrypt` to take 100 milliseconds for each encoding. This means that we need 1 minute 40 seconds to generate our 1,000 hashes, which is a reasonable time for one thousand contacts (a progress bar seems in order). If the user only has 100 contacts, he'll be done in 10 seconds.
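The tuning is easy to do empirically; a calibration sketch (timings obviously vary per device, and the number hashed is a placeholder):

```python
import time
import bcrypt

def pick_cost(target_ms: float = 100.0) -> int:
    """Return the smallest bcrypt cost whose runtime reaches target_ms here."""
    for rounds in range(4, 32):             # bcrypt accepts costs 4..31
        salt = bcrypt.gensalt(rounds=rounds)
        start = time.perf_counter()
        bcrypt.hashpw(b"5551234567", salt)  # placeholder number
        if (time.perf_counter() - start) * 1000 >= target_ms:
            return rounds
    return 31
```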
The attacker, given the salt of one number, has to generate perhaps 10^8 numbers to reasonably cover the mobile number space (the first digit, and possibly the first two, aren't really free to take 10 or 100 values, so I count them as 1). On the GPU rig this takes three orders of magnitude less than 10^8 times 100 milliseconds, i.e. than 10^7 seconds. That is down to 10^4 seconds, or around two and a half hours (or a whole day if the GPU optimization thingy turns out not to work).
In less than four months, the whole 1,000 contacts will have been decrypted - using one optimized server. Use ten such servers, and the attacker will be done in two weeks.
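A back-of-envelope check of those figures (the speedup factor is the assumption made above):

```python
space      = 10 ** 8   # plausible mobile numbers
t_phone    = 0.1       # seconds per bcrypt on a phone
gpu_factor = 1000      # assumed GPU-server speedup over a phone
contacts   = 1000

per_salt   = space * t_phone / gpu_factor  # 1e4 s, about 2.8 hours per salt
total_days = per_salt * contacts / 86400   # about 116 days for the whole list
```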
The problem, as pointed out by Ángel's answer and Neil Smithline's comments, is that the key space is small.
In practice UserA will produce something (a hash block, or whatever) to be made available somehow to UserB. UserB must have a method that works like

```
matches = (boolean|NumberArray) function(SomethingFromA, NumberFromB)
```
(Little changes if the second parameter is a set of N numbers, since UserB can build a set from one true number and N−1 numbers known to be fake or uninteresting; at most this lengthens the attack by a factor of N.)
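For illustration, assuming a hypothetical set-based API `matches_set`, the padding trick could look like this:

```python
import random

def probe_one(something_from_a, real_number: str, n: int) -> bool:
    """Reduce a set-based check to a single-number oracle by padding one
    real number with n-1 numbers Bob knows to be fake (hypothetical API)."""
    decoys = [f"5550{i:06d}" for i in range(n - 1)]  # known-fake numbers
    batch = decoys + [real_number]
    random.shuffle(batch)
    hits = matches_set(something_from_a, batch)      # hypothetical function
    return real_number in hits
```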
This function runs in some time T... actually, this function must run in a time T short enough that UserB, in a commercial real-world application, is satisfied.
Therefore one bound we can't easily dodge is that M numbers (all of UserB's contacts) must be checked in an acceptable time on an average smartphone. Another bound we can't reasonably dodge is that UserB can supply fake numbers to the algorithm (i.e. people who aren't really contacts, and possibly don't even exist).
Both bounds still apply if the checker runs on a third server; that only enforces a lower bound on execution time, which can thwart some scenarios, such as "decrypt all of UserA's numbers", but not others, such as "verify who has this number" (as in drewbenn's answer).
From these two bounds it follows that, using a smartphone (or a third-party server enforcing a minimum execution time), cycling through all 10^8 reasonable numbers takes about 10^8 × smartphoneTime per salt, which for a whole contact list is on the order of one thousand months.
Attack strategies to decrease this time are distributing the attack among several verifiers, or running it on a non-throttled and faster server (this requires the algorithm to be available, but assuming otherwise would be security through obscurity); both look feasible and affordable.
A loophole?
One possibility could be to introduce a small probability of false positives. That is, the above oracle function will occasionally (say once every ten thousand contacts), and deterministically on UserA's input, return true for one of UserB's numbers.
This means that the brute-force attack on 10^8 numbers will yield UserA's contacts mingled with 10^4 other numbers. Determinism on UserA's input means that two successive checks on those 10^4 found items won't whittle them down further. Unless UserB can grab a different copy of UserA's input, which would yield a different set of false positives and allow filtering out the true contacts as the intersection of the two sets, this may make the brute-forced answer less appealing. This comes at a cost: honest UserBs will get the occasional false hit.
We really can't win
If UserB must be able to answer the question "Is number X among UserA's contacts?" in a reasonable time with certainty, the time expenditure is linear, because the system cannot prevent two such requests from being made against numbers X1 and X2, and the time for request X2 will be the same as for request X1. Therefore, solving two numbers will require double that reasonable time; by induction, solving for N numbers will take N times that reasonable time (not, say, N^2).
The difference between a legitimate request and an attack is that the attack will work on a space ten to a hundred thousand times larger. Being linear, it will require a time up to one hundred thousand times longer... but it also may run on a machine or group of machines one hundred to one thousand times faster.
Therefore, our attacker will always be able to decrypt all of UserA's contacts in a "still not unreasonable" time. The only serious check to this would be for the checks to be run on a third, trusted machine with rate limiting and the means of detecting a likely attack.
To thwart an attacker we need some cost that grows with N, and since it can't be running time (which doesn't grow fast enough), I think the only resort left is the probability of false positives. The attacker will still get the answer, but we might still succeed in making a brute-forced answer less usable.
One simplistic implementation (poor man's Bloom filter)
To answer Mindwin's comment, the local algorithm can't work by hiding information - the information must be missing in the first place, otherwise we'd still be doing security through obscurity.
One method would be for UserA (Alice) to send over the `bcrypt` salt for her (say) 1000 contacts, followed by 1000 incomplete `bcrypt` hashes. If the hashes are truncated at the i-th byte, there will be pseudo-random collisions. Among UserB (Bob)'s contacts, which are few, collisions will be very rare (unless i is really small). Among the attacker (Eve)'s whole number space, collisions will be significant.
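A sketch of that exchange (reading the text as one published salt for the whole list, and truncating to the first i characters of the 31-character bcrypt hash body):

```python
import bcrypt

def truncated_hash(number: str, salt: bytes, i: int) -> bytes:
    """bcrypt the number under a known salt, keeping only the first i
    characters of the hash body (collisions get likelier as i shrinks)."""
    full = bcrypt.hashpw(number.encode("ascii"), salt)
    return full[29:29 + i]     # skip the "$2b$NN$" prefix and 22-char salt

# Alice publishes: salt, plus truncated_hash(n, salt, i) for each contact n.
# Bob recomputes the same truncation for each of his numbers; an equal
# prefix means "probably in Alice's list".
```

Each kept character of the bcrypt base64 body carries roughly 6 bits, so i = 2 already gives a collision probability of around 1/4096.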
Note that phone number distribution is not flat, so Eve may have ways of whittling down those collisions by removing, say, unused numbering sequences.
If every contact hash has a collision probability of one in a thousand, Bob, checking his one thousand contacts, has a probability of (1 − 1/1000)^1000 of having no collisions at all - that's about 37%, not so good. If the collision probability is 1/10000, Bob with one thousand contacts has a 90% chance of not getting even one collision. With only a hundred contacts, the no-collision probabilities for Bob are 90% and 99% respectively.
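These figures are easy to check numerically:

```python
# Sanity check of the no-collision probabilities quoted above.
for p, n in [(1/1000, 1000), (1/10000, 1000), (1/1000, 100), (1/10000, 100)]:
    print(f"p={p:g}, contacts={n}: {(1 - p) ** n:.2%}")
# p=0.001, contacts=1000: 36.77%
# p=0.0001, contacts=1000: 90.48%
# p=0.001, contacts=100: 90.48%
# p=0.0001, contacts=100: 99.00%
```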
Eve, verifying 10^8 numbers, even with p = 1/10000, will always get ten thousand collisions no matter what.
Sending two or more hashes with higher collision probability does not change things much for either Bob or Eve, compared to sending a single hash with a collision probability equal to the product of the separate ones. For example, instead of 1 round with p = 1/10000, use two rounds with p = 1/100, since 1/100 × 1/100 = 1/10000.
So Alice sends two sets of unordered incomplete hashes, with different seeds and a higher collision probability of 1%. Bob tests his 1000 contacts and gets positive matches for the 100 contacts he has in common; the remaining 900 shouldn't match, but since the hash is incomplete, 1% of them will, i.e. 9 spurious contacts, so Bob ends up with 109 likely candidates after running 1000 tests. He now has to test those 109 against the second hash, which also has a 1% collision probability. The 100 true intersections will still match; of the remaining 9, likely none will pass. The chance of a contact passing two such rounds is 1% of 1%, i.e. 1 in 10000, and the chance of having not even one false positive on 1000 non-matching contacts is (1 − 1/10000)^1000, or 90.48%, exactly as before.
With the same numbers, Eve will get one million false positives on her first round, and will have to run one million extra tests. 1% of those will match the second round, leaving Eve with ten thousand false positives mixed up with the one thousand contacts from Alice.
Eve has had to run 101 million tests instead of 100 million, while Bob had to run 1109 tests instead of 1000. Proportionally, the double-hash scheme hits Bob harder than Eve. It would be better to use a single hash with higher complexity.
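The expected test counts behind that comparison:

```python
p = 1 / 100                      # per-round collision probability
eve_space, alice = 10 ** 8, 1000
bob_total, bob_common = 1000, 100

bob_survivors = bob_common + (bob_total - bob_common) * p  # 109 candidates
bob_tests = bob_total + bob_survivors                      # 1109, +10.9%

eve_survivors = alice + (eve_space - alice) * p            # ~1.001 million
eve_tests = eve_space + eve_survivors                      # ~101 million, +1%
```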
The privacy problem of answering the question "Does Alice know number N?" will remain unaddressed - the time to answer that is the same for Bob and Eve.
There are potentially other privacy issues you're not considering yet. By design your app makes it easy to see who is connected to a certain target. So an attacker creates one contact on their phone (the activist/informant/terrorist/victim they are interested in) and then connects to many other users through your app, to create a list of the target's contacts. So for example a DV abuser could use this app to make a list of people still in contact with his ex: even Google has had problems getting this right.
Yes, it is (a bit) flawed. The problem is that the space is too small, so even with multiple rounds and salts, it's relatively easy to brute-force.
Open Whisper Systems had a clever system where they provided an encrypted Bloom filter that could be queried locally using blind signatures. They explain the process (as well as giving a good discussion of the private information retrieval problem) at https://whispersystems.org/blog/contact-discovery/
Sadly, they had to discontinue this in TextSecure due to practical issues with a big user base. In your case, as you are sharing numbers between two end users, it should be feasible, either with their method or with one of the published protocols that Moxie mentions.