Proper credit card encryption for use in a blacklist
This is a pretty tall order, OP, and I don't think it is possible to do it perfectly. Nonetheless here are a couple ideas.
The attacker is unable to check if the card is on the list, even though they can insert values and can see the encrypted/hashed values.
The answer to this is to insert false positives. If a card is to be blacklisted, compute a hash (or hash + crypt-- see below) composed of PAN. If it is not, compute the hash as PAN + a nonce. In both cases, insert a record. To check to see if the card is blacklisted, recompose the hash without the nonce and check for it.
This way, a hacker can see that a record is inserted, but the insertion itself doesn't tell him anything.
Have the credit card numbers secured with either encryption or a hashing function so that if the data is compromised the numbers will not be usable.
Hash the PAN + long, systemwide cryptographic pepper (secret salt)
The pepper will add enough entropy to make brute force impossible without knowing the key. Store the pepper outside of the database.
To check for a match, perform the hash on the PAN of interest, then search the table. Because the secrets are all systemwide, you should be able to leverage an index, including the nice B-tree index that gives you the O(log(n)) performance you are looking for.
The biggest problem I see with your question is: Have the credit card numbers secured with either encryption or a hashing function so that if the data is compromised the numbers will not be usable.
This is a problem because as @Bobson stated in the comments Even hashing credit card numbers is problematic. Given the restricted nature of valid card numbers, it's likely that there's only one valid number which hashes to a given hash.
At first I thought that this was because the search space was limited to [0-9]{15-16} or 1.11 x 10^16 ... which is kinda bad but at an offline attack rate of 1.29 days per card number ... its not end of the world bad. But after doing a bit more research I found that the problem is actually much worse than this.
The first number on your card: If it starts with 3, it is an American Express, Diner’s Club, or Carte Blanche; 4 is reserved for Visa, 5 for MasterCard, and 6 for Discover. The next five digits will indicate the card issuer such as the bank or credit union, as well as the type of credit card. So for example, all Citi AAdvantage Executive MasterCards will begin with 546616 with the 5 representing MasterCard, and the 46616 representing Citi as well as the fact that it is one of their American Airlines AAdvantage Executive cards. The remainder of the 16 digits (or 15, in the case of American Express), represent both the cardholder’s account number as well as one or more “check digits.” source
For 16-digit numbers, start with the first number on the left, and double each alternating number on the card. (For 15-digit numbers, start with the second number from the left.) Add any double-digit numbers so that they reduce to a single-digit number. (For example, if one of the numbers is 8, it doubles to 16, then you add 1 + 6 to get 7.) Next, add those together with the alternating numbers that you did not double. If the total you get is divisible by 10, the credit card number is likely valid. source
So this means that in the number 1222-2233-4444-4444 your search space is only slightly more than [0-9]{8} which would mean:
- 1.29 days per card - Assuming one thousand hashes per second (online attack)
- 1.11 ms per card - Assuming one hundred billion hashes per second (offline attack)
source
this problem gets even worse when you factor in how many cards are actually in circulation
Type of Credit Card 2000 2003 2007 2008 2010 2012
Visa 255 283 321 304 240 261
Mastercard 200 267 279 260 171 174
Store 597 521 546 539 318 455
Oil Company 98 86 73 65 35 60
Discover 50 54 57 58 55 59
American Express 33 36 52 54 49 52
Other 192 185 166 160 124 106
Total 1,425 1,432 1,494 1,440 992 1,167
*cards in millions
source
This means that unless you have an additional source of entropy that is not stored in the database (aka not salt) nor in the program doing the hash (aka not pepper) besides the card-number itself ... storing the credit card hashes at all is not advisable.
But what if a hacker gets access to my live database and is able to arbitrarily enter information into it...
You could (actually should) argue that if a person can add things to your database that you have bigger problems than the security of your blacklist - let's start with privilege escalation, admin super user hacking etc (it's much easier to add/remove cards from blacklists through your admin interface than through your database).
And if we're talking about an offline dump of your database getting stolen then it's your duty to keep the PANs safe. According to the PCI DSS council "the good enough for PAN safekeeping is" defined in Requirement 3: Protect stored cardholder data of the PCI DSS' "Requirements and Security Assessment Procedures":
3.4 Render PAN unreadable anywhere it is stored (including on portable digital media, backup media, and in logs) by using any of the following approaches:
- One-way hashes based on strong cryptography, (hash must be of the entire PAN)
- Truncation (hashing cannot be used to replace the truncated segment of PAN)
- Index tokens and pads (pads must be securely stored)
- Strong cryptography with associated key-management processes and procedures.
Note: It is a relatively trivial effort for a malicious individual to reconstruct original PAN data if they have access to both the truncated and hashed version of a PAN. Where hashed and truncated versions of the same PAN are present in an entity’s environment, additional controls must be in place to ensure that the hashed and truncated versions cannot be correlated to reconstruct the original PAN.
I would argue, though, that the note in the bottom implies that the PCI DSS council expected people to not salt their hashes. Read more about hashes & their security later.
Please note that all of the options I'm listing here are "good enough" per PCI DSS standards, in varying levels of usability and paranoia.
Option 1: use only the masked PAN
(and a masked PAN is first 6 and last 4 chars of a PAN)
Use the masked PAN for blacklisting. I have seen precious little cards in my history where 2 users came with the same masked PAN and different full PANs, perhaps it's rare enough to make it a Helpdesk and not a programming issue.
Option 2: use the masked PAN and the encrypted PAN
If you want to guard against this eventuality, use masked pan to grab a number of rows (usually 1, almost never 2), and then compare against a salted hashed PAN, a global salt will do. But at this point you might as well use the encrypted PAN too (decrypt & compare) because 99.99% of the time you will be comparing with just 1 record.
- Q: but this doesn't work for me, my masked PAN is just the Luhn digit! What can I do?
- A: then sorry, obviously you can't use this approach, just remember that the PCI council considers first 6 and last 4 to be the "masked PAN". You can still use hashing though, keep reading
Option 3: use the hashed PAN
No, you don't need to check O(n) records to do a verification. Use a table-level pepper and a database index and ditch the salt.
Hashes security is a another story altogether, but remember:
- table-level strong pepper should be enough, salt isn't helpful so much because it gets stolen with the database
- perform the hashing operation repeatedly, a number of times, a practice called stretching
- store & use the pepper via an HSM
Take a look at this implementation to give you a general idea.
Does your HSM allow for such a scheme? Digests are generally poorly supported in hardware HSMs, so you can use normal 3DES/AES for the same thing.
- Q: I know this approach is probably stronger than the security of my admin's passwords but I still don't feel sure about this, it's software-stored peppers after all, and they're table-wide, it feels insecure
- A: OK - but remember - this is what PAN peppers are for - this approach is above and beyond what's required by PCI DSS - while row-level salts are something that makes rainbows attacks harder they're also something you steal together with the hashes themselves so their security is moot compared to something that's not stored in the database - like a database-level pepper - but OK, keep reading, you can extend this approach by applying your HSM to the problem
Option 3b: hashed + encrypted PAN
Use a 3DES or AES cipher (I just refer to them as "DES" in this section, don't be confused by this) in ECB mode or with fixed IV in CBC mode. For added security through obscurity you can encode something in the IV, such as parts of the PAN or, stronger - another cryptogram that's accessible only via the HSM (ECB encrypted PAN or its derivation comes to mind as a decent IV candidate).
Use the same technique as described before and as shown in the code sample I provided - with the global pepper - with or without the digest & under DES - this will make sure that if an attacker runs away with your database & your code base & your server setup that he will not be able to start a rainbow attack without breaking DES first and this is "impossible" because the DES keys are in the HSM.
Note that ECB and fixed IV CBC in this case is "good enough" because of the underlying randomness of the peppered and hashed material.
- Q: but I'm still not happy because ....
- A: well.. keep reading
But in the end...
I'm pretty sure you can't make the scheme more secure than this while still keeping the function it's supposed to do.
Security and usability are two polar opposites, don't go full paranoid mode, be smart. Full paranoid mode can backfire in interesting ways, from my experience, so far:
- I can already imagine that in your institution while you're looking for the perfect solution some lazy programmer implemented clear PANs in the database 6 months ago because the management was shouting - it's probably not like you're running without a blacklist all this time - I've seen this happen with a global e-Wallet provider - if this is your case - you're not alone
- a military intelligence institution and a bank with strong enforced passwords - passwords written on a post-it on the back-side of the keyboard and "not passwords.txt" file on the desktops, respectively
- a super-uber-PCI-DSS-got-nothing-on-this secured national bank in the EU with contactless PKCS11 wallets you carry on your person - being there as a third party contractor I found out they gave us "blank access" cards to all areas of the building and all computers because the "third party" module of the security software wasn't compatible with the "linux drivers" whatever that means
- a super-secure PCI DSS environment and a single "magic cable" "hidden" from the CSO sticking out of the secured box and into a router because the protocols didn't foresee mass hardware migrations in a secure closet with limited drawers
- same institution as the last point, a bug in the software in which first 8 chars of 10-20 PANs appeared in a log in a secure environment (by PCI standards). Someone would have encrypted the logs, but the CSO argued that it conflicts with some "ease of use clause" in the PCI specification and insisted that the issuers of the cards be contacted to blacklist the cards and reissue them because "they have been compromised", the backlash of the banks who were also partners was pretty understandable, this CSO lost his job eventually
- PIN cryptography, protection vs cloning Visa/MC cards is protected by 2-length DES keys with ECB, never broken cryptographicaly, but people write their PINs on cards because they're hard to remember and US still afaik by and large uses magstripe because EMV is hard to implement
- Mifare cards, single length DES encryption by-and-large - 10 billion chips, 150 million readers, only extended recently after successful breaches by AES with a key negotiation scheme which might as well have been written by a child (one party makes half a key which is concatenated to the other party's half-a-key instead of a proper full-length XORing scheme - and the initial key exchange when the master key is loaded into the card is still DES with a predefined 000..0000 key) - considered secure
- and let's remember https://xkcd.com/538/ :D
So yeah, if you're trying to be holier than Pope in relation to security, you do it at your own risk, don't be a fundamentalist, be smart. Focus your energy on securing your system where it matters most - and the blacklist almost certainly isn't it.
I'm guessing this because generally by the time you have a blacklist you already have a "secure card lookup mechanism" in place for the transactions, risk rules etc and the problem was, generally, solved months ago. And this isn't the case here.
PS get an online API and try a brute force "scan" (1$ authorizations) of 10.000 card numbers and see what the processor tells you about it. There are checks and balances against this in place, read:
- "VISA: Global Acquirer Risk Standards"
- "VISA: Global Brand Protection Program Guide for Acquirers"
to learn a bit more about them.