Introduction
Recent news in the world of compliance, security and cryptography has drawn attention to how things are encrypted, and whether that encryption can withstand an attack. Particular attention is being given to the credit card industry because it touches almost every American’s life. As the people who actually assess PCI environments, we QSAs need to ensure we have a solid understanding of what it actually means to encrypt something.
The PCI-DSS 2.0 standard, requirement 3 states: “Protect stored cardholder data”. Requirement 3.4 further refines this to “Use strong cryptography”, but what does that really mean?
The usual approach is to cite NIST standards which basically state “Don’t use DES. Use Triple DES or AES”. What the NIST standards we look to for guidance fail to do is define HOW to properly implement these approaches.
NIST does have fantastic guidance on implementation, but it is buried in an arcane indexing system that almost requires an advanced math degree just to navigate! Encryption is one of the few areas of security where you can be provably secure, and rely on something concrete for protection. Sadly, many developers and assessors don’t have a strong grasp of the intricacies, or even the basics, of cryptography. As such, the PCI world seems plagued both by poor implementations of crypto and by poor assessment of those implementations; encryption is the area of data protection where I see the biggest missteps and misunderstandings.
In this paper we will examine some of the real world implementations seen, and why they fail the test of “Strong Cryptography”. Our intent is not to educate the reader on the subtleties of ciphers and modes, but rather to examine some of the missteps taken in the implementations seen during various PCI-DSS assessments. We will focus on actual encryption of data here, and not on related concepts (such as hashing or tokenization) which often get mixed up with encryption and misapplied.
Cryptography is only as strong as its weakest element. The most secure, unbreakable cipher is easily compromised by a weak implementation. These missteps most commonly stem from either a default configuration of a software package like the DBMS, or in-house development without a clear understanding of how crypto should be configured or how keys should be managed.
Cipher Choice - An introduction to AES and Triple-DES
There are dozens of different ciphers available for use when encrypting data; however, PCI implementations tend to use either 3DES or AES as they are the most widely recognized, implemented, and portable ciphers. Other ciphers are allowed, or at least not prohibited, and the same principles applied to 3DES and AES are equally applicable to other ciphers.
Triple-DES (3DES or TDES) is an update of the older and now deprecated Data Encryption Standard (DES) cipher. Like DES, 3DES uses a 64 bit block size. Unlike DES, 3DES takes the plaintext through three passes of the cipher: it first encrypts, then decrypts, then encrypts the data again. When properly implemented, this is done with a 192 bit key that is used 64 bits at a time, one portion per pass.
The first 64 bits is used as a DES key to perform an encrypt operation on the 64 bit plaintext block, then the middle 64 bits is used to perform a decrypt against the ciphertext from the first operation, further scrambling the data. Finally, using the last 64 bits of the key, the product of the decrypt function is encrypted. This works because of a property of DES where encrypt and decrypt operations use the same sequence, with only a reversal of the key schedule to perform the decrypt.
The cryptographic strength of 3DES is highly dependent on how the key is constructed, which is demonstrated later. 3DES retains some DES properties, such as the reduction of each 64 bit key block to 56 bits by discarding every 8th bit. The repercussions of these design decisions are outside the scope of this paper; however, a number of well written discussions about the S-Box, key strength, and key scheduling can be found with a bit of research.
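The key handling described above can be sketched in a few lines of Python. This is an illustration only: the function names and the sample key are hypothetical, and no actual encryption is performed. It simply shows how a 192 bit key divides into three DES subkeys and how the discarded parity bits leave 56 usable bits in each.

```python
def split_3des_key(key: bytes):
    """Split a 24-byte (192 bit) 3DES key into its three 8-byte DES subkeys."""
    if len(key) != 24:
        raise ValueError("a three-key 3DES key is 24 bytes (192 bits)")
    return key[0:8], key[8:16], key[16:24]


def strip_parity(subkey: bytes) -> bytes:
    """Clear the 8th (least significant) bit of every byte, which DES ignores."""
    return bytes(b & 0xFE for b in subkey)


def effective_key_bits(subkey: bytes) -> int:
    """Each byte contributes 7 key bits once the parity bit is discarded."""
    return 7 * len(subkey)


key = bytes(range(1, 25))  # hypothetical key material, for illustration only
k1, k2, k3 = split_3des_key(key)
print(effective_key_bits(k1))                             # 56
print(sum(effective_key_bits(k) for k in (k1, k2, k3)))   # 168
```

The 168 bit figure only holds when the three subkeys are independent of one another, which is why the construction of the key matters so much.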
Advanced Encryption Standard (AES)
AES was announced by NIST as U.S. FIPS PUB 197 (FIPS 197) on November 26, 2001. The AES cipher was developed to address the weaknesses identified in other ciphers and provide more robust cryptography. Although 3DES is still allowed per NIST recommendations, AES is the preferred cipher today.
The AES cipher differs from 3DES in three important ways:
- It uses a stronger implementation of the substitution box with the Rijndael approach,
- It uses a 128 bit block size instead of 64 bit,
- It uses different key sizes (128 bit, 192 bit, 256 bit) depending on requirements.
While there has been some development on proposed versions of AES that expand the block size to 512 and 1024 bits, these implementations are experimental and not appropriate for general use. Like DES and 3DES, AES retains a strict block size, necessitating a padding scheme such as PKCS#7. A real-world advantage of AES is that many processors include a dedicated hardware AES engine, which substantially increases performance over a software-only implementation.
Cipher modes – ECB and CBC
Most ciphers used to encrypt data today are block ciphers, where data is divided into fixed-size chunks which are each encrypted. Block ciphers use a variety of methods, or modes, to string the different chunks of ciphertext together, and each of these has advantages and disadvantages.
While it may feel overly pedantic, choosing the right, or wrong, cipher mode can significantly affect the security of the encryption in practice. The key to success is to balance these advantages and disadvantages in a way that provides the cryptographic strength you need while not adding too much processing burden to your system. The two relevant modes for this discussion are ECB and CBC because they are the two most common block modes used to encrypt PANs (discussed later).
Electronic Code Book
The Electronic Code Book (ECB) mode was the first block mode utilized. ECB is the simplest and most straightforward block mode. Each block is independent from the others, and any given piece of plaintext will result in exactly the same ciphertext, every time. This makes the cipher highly parallelizable and results in the fastest implementations.
This property of ECB mode is also its greatest weakness. Any plaintext that fits within the block size of the cipher will always produce the same ciphertext. That may be an account number, a word, or any sequence that is in the same place in each message. Using this, a cryptanalyst can infer a great deal of information from the ciphertext without having to actually decrypt the data. The well-known image of a penguin encrypted in ECB mode, in which the outline of the penguin remains clearly visible, is a good example.
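To make the ECB weakness concrete without depending on any third-party library, the sketch below uses a toy 4-round Feistel construction built from HMAC-SHA256 as a stand-in for a real 64 bit block cipher. The toy cipher, the key, and the sample data are all assumptions made for illustration; it is not DES or AES and must never be used to protect data. Only the mode behaviour is the point.

```python
import hashlib
import hmac

BLOCK = 8  # bytes; the same block size as DES and 3DES


def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Round function: keyed hash of one half-block, truncated to 4 bytes."""
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:4]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher. Illustration only, NOT a real cipher."""
    left, right = block[:4], block[4:]
    for i in range(4):
        mixed = bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
        left, right = right, mixed
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> list:
    """ECB: every block is encrypted independently under the same key."""
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of the block size")
    return [toy_encrypt_block(key, plaintext[i:i + BLOCK])
            for i in range(0, len(plaintext), BLOCK)]


key = b"hypothetical-key"
blocks = ecb_encrypt(key, b"ACCT0001" + b"PAYLOAD1" + b"ACCT0001")
print(blocks[0] == blocks[2])  # True: the repeated plaintext block is visible
print(blocks[0] == blocks[1])  # False: different plaintext, different ciphertext
```

An observer who never learns the key can still see that the first and third blocks carry the same value, which is exactly the kind of inference described above.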
Chained Block Modes
Weaknesses resulting from the way ECB treats each block independently resulted in the development of other modes where each block was treated differently. Examples include Cipher Block Chaining (CBC) and Output Feedback (OFB). These use an additional piece of data, called the Initialization Vector (IV), as an input to the encryption process.
An IV is used as a seed value in block modes like Cipher Block Chaining (CBC), Cipher Feedback (CFB), Output Feedback (OFB), and Counter (CTR). Each of these needs the IV as a starting point, and then they have an internal mechanism that uses some preceding data, such as the last piece of ciphertext, as the IV. Some modes have a dependence on non-predictability as well, but they are outside the scope of this paper.
The IV is a piece of known data, the same size as the cipher block, which is XORed with the plaintext to be encrypted. The IV does not need to be kept secret; in fact, when used correctly it MUST be known to complete the decryption. The most important property of an IV is non-repetition. The properties of CBC discussed here are common across all of the chained and IV-dependent modes. Any given plaintext, if submitted more than once, should have only a negligible chance of producing identical ciphertext. Naturally, this depends on the quality of the random number generator (RNG) in use, but for the purposes of this discussion, we will assume a hypothetical perfect RNG.
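The effect of the IV in CBC mode can be sketched as follows. As a stand-in for a real block cipher, this example uses a toy 4-round Feistel construction built from HMAC-SHA256; the toy cipher, key, and record are assumptions for illustration and are not DES or AES. Note that decryption runs the same rounds in reverse order, and that the IV must be supplied again to recover the plaintext.

```python
import hashlib
import hmac
import os

BLOCK = 8  # bytes


def _round(key, i, half):
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:4]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key, block):
    """Toy Feistel cipher. Illustration only, NOT a real cipher."""
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key, block):
    """Same structure as encryption with the round order reversed."""
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key, iv, plaintext):
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        # XOR each plaintext block with the previous ciphertext (or the IV).
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += _xor(toy_decrypt_block(key, block), prev)
        prev = block
    return out


key = b"hypothetical-key"
record = b"ACCT0001ACCT0001"
iv1, iv2 = os.urandom(BLOCK), os.urandom(BLOCK)
c1, c2 = cbc_encrypt(key, iv1, record), cbc_encrypt(key, iv2, record)
print(c1 != c2)                             # same plaintext, different ciphertext
print(cbc_decrypt(key, iv1, c1) == record)  # the matching IV recovers the record
```

Because the IV is needed for decryption, it is normally stored alongside the ciphertext; its value is in being different for every record, not in being secret.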
Input Entropy - Why what you are encrypting matters
When you have a type of data you need to encrypt, it is important to understand the nature of that data and the impact it may have on the cryptographic process. The ideal cipher will make your data appear as much like random noise as possible. The appearance of randomness is affected by both the cipher, and the data you are putting through it.
The more possible values you have, the more random it can appear. For example, if you are encrypting something that only has two values, like “true” or “false”, you will only get two pieces of ciphertext. As an attacker I have a 50/50 chance of getting it right each time. The more complex your plaintext, the more complex your ciphertext will be. It is this property that makes credit card data so interesting from a cryptanalysis perspective.
The piece of data of most interest for encryption under PCI-DSS is the Primary Account Number or PAN. The PAN has properties that make it interesting and challenging from a cryptographic perspective.
PANs are not random
Many developers depend on the apparent randomness of PANs to be an element of the cryptographic strength. Although PANs appear random, they are not nearly random enough to be considered for use as a cryptographic element. The first part of the PAN indicates what the card brand and issuing bank are, and the last digit is a checksum. That leaves only the parts that the issuing bank generates, and that is simply too small to be useful as a source of randomness.
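A rough calculation shows how little randomness is left. The sketch below assumes a 16 digit PAN whose first 6 digits (brand and issuer) are known to the attacker and whose last digit is a Luhn check digit; the 6 digit prefix length and the function name are assumptions for illustration. The sample number is the widely published test PAN, not a real account.

```python
import math


def luhn_check_digit(partial: str) -> int:
    """Compute the Luhn check digit for a PAN that is missing its last digit."""
    total = 0
    # Walk right to left, doubling every second digit starting with the rightmost.
    for position, char in enumerate(reversed(partial)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


print(luhn_check_digit("411111111111111"))  # 1, giving 4111111111111111

# Assumption: 16 digits, minus a known 6 digit prefix, minus the check digit.
FREE_DIGITS = 16 - 6 - 1
candidates = 10 ** FREE_DIGITS
print(candidates)                        # 1000000000
print(round(math.log2(candidates), 1))   # 29.9
```

Roughly 30 bits of uncertainty is a search space that can be enumerated in full, which is why the PAN itself cannot be counted on to contribute anything to cryptographic strength.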
PANs are not long enough
Ciphers that use block chaining modes rely on data that exceeds the block size in order to be used properly. PANs are typically 16 characters long. This equates to 16 bytes or 128 bits of data: exactly two 3DES blocks or one AES block. Most cipher implementations will add a block filled with padding data, either all zeroes, PKCS#5, or PKCS#7. In any of these cases, the last block of data will consist of a known value. If CBC is in use, that value will be XORed with the ciphertext of the previous block, but because both values are known, this is of little use. In fact, it provides a potential crib (known plaintext used in cryptanalysis) depending on how the underlying implementation handles padding data.
Because PANs typically fall on a block boundary, unless CBC or some other chained mode is used with a proper IV implementation, the cryptography is at best only as strong as its ECB equivalent.
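The padding behaviour is easy to see with a small sketch. The helper below implements PKCS#7 padding as commonly specified (append N bytes, each of value N); the function name is hypothetical and the PAN is the widely published test number.

```python
def pkcs7_pad(data: bytes, block_size: int) -> bytes:
    """PKCS#7: append N bytes of value N, where N is between 1 and block_size."""
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


pan = b"4111111111111111"  # test PAN, 16 ASCII bytes

for name, size in (("3DES", 8), ("AES", 16)):
    padded = pkcs7_pad(pan, size)
    blocks = [padded[i:i + size] for i in range(0, len(padded), size)]
    print(name, len(blocks), blocks[-1].hex())
# 3DES 3 0808080808080808
# AES 2 10101010101010101010101010101010
```

In both cases the final block is made up entirely of padding, so its plaintext is known to anyone who knows the padding scheme.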
Key management - The dangers are in the details
When it comes to cryptographic key management there are many ways to do it right; here are some of the ways to do it WRONG. This is all far from an academic exercise; the examples cited below have been seen in real world production environments. The first place an attacker is going to look if they want to compromise your cryptography is in things which they have access to. Why guess it when you might be able to find it with a little digging?
Keys in source code
Embedding cryptographic materials in source or executable code is one of the oldest, easiest to avoid, and yet most prevalent mistakes made when implementing cryptography. An attacker will search in the source code if they can get it, and in the executable next. It is the embedding of keys and passwords in code that led to PCI-DSS 2.0 requirement 6.5 which, among other things, requires training of developers to avoid this critical mistake.
Some of the examples seen in the wild are:
- Actual plain text keys in source code, PHP, even HTML
- Formulas used to derive a key or meant to obfuscate the key embedded in source or PHP
- Static IV included in the source code
Copy and paste an example
Many applications have example configuration included or documented somewhere convenient. Often times a developer or DBA will simply grab that example and paste it into a running configuration and assume that all is well.
This is not so. It is these defaults and examples that an attacker will try first, and sadly, they often succeed.
Weak or NULL keys
It is often tempting to use easy to generate, short, or null keys, especially when embedding them in code. Shorter keys are easier to brute force, and in 3DES they defeat the purpose of how it uses the key (see the section on Triple-DES earlier). While these are easy mistakes to avoid, there are some other factors that can lead to a weak key.
Some systems allow you to use a short alphanumeric key and simply translate that into ASCII to get a hex value and then pad the remainder to get the 128, 192, or 256 bit key for AES or the 192 bit key for 3DES. The more prudent approach is to use a passphrase and then use a hashing function like MD5 or SHA to produce an appropriate length key and discard either the most or least significant digits to make it fit the key length.
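The two approaches can be compared with a short sketch. Several details here are assumptions made for illustration: zero bytes are used as the padding, SHA-256 is chosen as the hash, the leading bytes of the digest are kept, and the function names are hypothetical.

```python
import hashlib


def padded_ascii_key(passphrase: str, key_len: int) -> bytes:
    """The weak approach: raw ASCII bytes, zero-padded to the key length."""
    raw = passphrase.encode("ascii")
    return raw[:key_len].ljust(key_len, b"\x00")


def hashed_key(passphrase: str, key_len: int) -> bytes:
    """Hash the passphrase and keep the leading key_len bytes of the digest."""
    return hashlib.sha256(passphrase.encode("ascii")).digest()[:key_len]


weak = padded_ascii_key("letmein", 24)  # 192 bit 3DES key
print(weak.hex())
# 6c65746d65696e0000000000000000000000000000000000

k1, k2, k3 = weak[:8], weak[8:16], weak[16:]
print(k2 == k3)  # True: the decrypt and final encrypt passes cancel each other

strong = hashed_key("letmein", 24)
print(len(strong) * 8)  # 192
```

With the padded key, the second and third subkeys are identical, so the decrypt pass is undone by the final encrypt pass and only the first subkey does any work. Hashing spreads the passphrase across all three subkeys, although it cannot add entropy that the passphrase itself lacks.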
For example the password “letmein” becomes:
This is a much stronger key than the plain hex value:
If the plain text key above were used in a 3DES implementation, the result is the same as though DES were used. The first 8 bytes (64 bits) are used to encrypt, then a
Static or NULL IV
Using the built-in encryption without understanding how it works
Going forward - A minimum standard for PCI and crypto
- Start with a well-known and tested cryptographic library. NEVER ROLL YOUR OWN!!!
- Use AES because the PAN fits inside a single block. 3DES breaks it into two blocks.
- Use the largest key your system will support. 256 bit is best.
- Use Cipher Block Chaining mode, not ECB, because the IV component provides a significant improvement in cryptographic strength.
- Use an IV that is unique for each record. You can start with a random number and count up.
- Use randomly generated keys, and then hash them with SHA and use the hash output as the keys.
- Manage your keys like the thing they are: the one thing that can unlock all of your secrets.
- Test it.
- Break it.
- Fix it.
- Do this until you can’t break it.
- Then have someone else do the same thing to it.
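The IV and key generation recommendations in the list above could be sketched as follows. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, the IV is sized for a 128 bit AES block, and SHA-256 is assumed as the hash. Persisting the counter and storing the key safely are left out entirely.

```python
import hashlib
import secrets


class IVCounter:
    """Hands out a distinct IV per record: random starting value, then count up."""

    def __init__(self, size: int = 16):
        self.size = size
        self.value = int.from_bytes(secrets.token_bytes(size), "big")

    def next_iv(self) -> bytes:
        iv = self.value.to_bytes(self.size, "big")
        # Wrap around rather than overflow the fixed IV width.
        self.value = (self.value + 1) % (1 << (8 * self.size))
        return iv


def generate_key() -> bytes:
    """Random key material hashed with SHA-256 to yield a 256 bit key."""
    return hashlib.sha256(secrets.token_bytes(32)).digest()


ivs = IVCounter()
record_ivs = [ivs.next_iv() for _ in range(1000)]
print(len(set(record_ivs)) == len(record_ivs))  # True: one IV per record
print(len(generate_key()) * 8)                  # 256
```

The random material comes from the operating system's cryptographic generator via the secrets module rather than from a general-purpose random number function.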