AES-GCM is an API that takes 4 inputs. AES-GCM(key, nonce, additional_data, plaintext). The nonce is also called an initialization vector (IV). The key and nonce/IV are used to encrypt the plaintext using AES-CTR. A keyed hash, GHASH, is then computed over the additional data and the cipher text. That hash is encrypted with AES too, and you get an authentication tag.
AES-GCM-SIV has two big differences. Firstly, GHASH is replaced with POLYVAL, a hash that is almost exactly the same but reverses the order of some bytes because it turns out that's more efficient on most platforms. Secondly, rather than using the key and a nonce to initialize the AES-CTR encryption, a synthetic IV is computed from the nonce that the caller provides and an encrypted POLYVAL hash of the plaintext.
This difference means that even if the caller uses the same nonce twice, which is bad, that the synthetic IV will still be different as long as the plaintext messages are different. This is what makes the nonce resistance. This is great for cases where we're using the same key over and over in a distributed way, and all of the senders can't guarantee not to use the same nonces, they might collide. TLS session tickets and cookies are good examples of that.
Two small gotchas: the biggest is that when encrypting the message, we have to pass over it twice, once to compute the hash, and once to encrypt it. The second is that it's still possible to screw things up on the caller side, SIV doesn't magically make it ok not to make an attempt at setting a random nonce. For example if you use the null/all-zeroes nonce over and over, identical messages are going to be trivially finger-printable.
>> Firstly, GHASH is replaced with POLYVAL, a hash that is almost exactly the same but reverses the order of some bytes because it turns out that's more efficient on most platforms.
The GHASH/GCM bit twiddling is so awful and slow and side-channel insecure to do in software, that most people just blindly copy the obscure code full of clever optimization that smart specialists come up with [1].
Then other smart specialists [2] have to reverse engineer said clever tricks (now widely used everywhere), because no explanation exists and it is not clear why they work in the first place.
Once done, a third group of smart specialists tweak the algorithm in an incompatible way to gain 10% speed on specific platforms and forcing people to pull in even more obscure code.
Simply put, I wish AES-GCM-SIV had just used GHASH as GCM and helped keeping complexity of crypto libraries down...
Isn't the Shay Gueron presentation you're citing talking about cleverly using the CLMUL extensions that Intel basically added as hardware support for GCM? I feel like Intel did a pretty good job documenting how to use PCLMULQDQ to implement GHASH.
Related to gotcha #1 - it also means that you have to have the entire message available when you start encryption. Lightweight embedded systems may not be able to meet that requirement.
For that you need to divide the message in chunks and encrypt&authenticate each chunk. It's what GPG is going to do soon. There are some ad-hoc approaches for that procedure, but two interesting and well-constructed are STREAM and CHAIN: https://eprint.iacr.org/2015/189.pdf
the problem with gcm’s nonce reuse weakness is that the best way to generate a unique nonce is as a hash of the input message - otherwise you technically have to record every nonce you’ve ever used.
I guess an alternative would be to have a secret key and a counter and then have the nonce be something like aes(key, counter++) but obviously that has scalability issues
I’m sure @tqpf/@tqbf (I cant recall) has opinions :)
Nonces don't need to be secret. You can just use a counter value if you want, unencrypted. It gets harder when you have many uncoordinated senders. You can shard your nonce-space; give each sender a unique prefix, but this reduces the number of messages they can each send with the same key.
Bigger services, such as Public Cloud providers (I work at AWS, and it's a factor for us) actually end up running up on these limits.
> the best way to generate a unique nonce is as a hash of the input message - otherwise you technically have to record every nonce you’ve ever used.
That doesn't sound right.
I would wager that using a cryptographically secure RNG is less likely to produce the same number twice, or at the very least equally as likely. If you have a very strong hash, then the hash algorithm should act the same as an RNG.
You are supposed to combine the hash with the user-provided nonce. It's not purely a hash, that's not good enough. The purpose is to protect against the user-provided nonce being weak. Still expect a proper nonce if it can be done.
The problem with gcm is that you absolutely cannot ever reuse a nonce, hence gcm-siv using a hash of the message - assuming a cryptographically strong hash the hash collision rate should theoretically have lower odds than a pure ring (sqrt(hashsize) vs likelihood of non-random different data colliding in practice)
Of course all of this is on the assumption that any hashes and rngs you use are secure :)
In practice: many sites using gcm were using rng based nonces that end up colliding
I get why 'SIV' used a hash of the message in addition to the user provided nonce.
My point was I disagree on your sentence "the best way to generate a unique nonce is as a hash of the input message - otherwise you technically have to record every nonce you’ve ever used."
That implies that a hash is strong enough that you don't need to track messages. If you believe that to be the case then a CSRNG would be equally as strong of not stronger and less complex. Still sqrt(n) chance of collision.
The nonce shouldn’t be regarded as secret. Typically you’d generate a random nonce per encrypt action and pack it within the first few bytes of the output
> A keyed hash, GHASH, is then computed over the additional data and the cipher text. That hash is encrypted with AES too, and you get an authentication tag.
What is the purpose of encrypting the keyed hash? I'm assuming there's something clever that an attacker could do with it if it were not encrypted, but I can't figure out what that might be.
This is playing a little fast and loose but just to get an idea: the keyed hash is fast but only secure once. By encrypting it, you get most of the perf benefit (the tag is much smaller than the entire message), and you get multi-message security.
FWIW, Poly1305 in e.g. secretbox works the same way: it's really Poly1305-AES. The Poly1305 is equivalent to GHASH here; AES encrypts in both cases.
Whoops; you're right of course, but same idea with fewer primitives -- either way, obscure the output of the polynomial MAC with the output of a cipher, so you get the fast MAC that you can reuse multiple times.
> some AEADs (including AES-GCM) suffer catastrophic failures
of confidentiality and/or integrity when two distinct messages are
encrypted with the same key and nonce. While the requirements for
AEADs specify that the pair of (key, nonce) shall only ever be used
once, and thus prohibit this, this is a worry in practice.
Could anyone explain what this catastrophic failure entails? Do both messages suddenly get decryptable, or is the entire AES key compromised?
GCM/GHASH is a Carter-Wegman MAC. These all have in common that they tend to be highly efficient, but their problem is that they require a unique key for each message; when a key is reused, an attacker can forge MACs. This is unlike e.g. HMAC where you don't even have a per-message key/IV.
The same caveats thus apply to e.g. Chacha20-Poly1305 (keystream out of counter mode XOR plaintext for encryption, Carter-Wegman MAC).
Also because it's not part of the cipher text I always end up wondering if I can transmit it in plaintext -- thus, I search stackoverflow for some sketchy answer that says it can -- because nobody every documents this in a library..
The symmetric encryption functions are meant to be fast and efficient. Calls to the RNG are often not, which is why they're used for key generation but not for every encryption operation.
The nonce could be stored as an internal counter, but that would require the encryption operation to modify internal state, which would then require locking in order to be thread safe. That would introduce more complexity than requiring the caller to supply the nonce.
I would argue there would still be value in offering simple to use APIs/hard to misuse APIs and separate advanced APIs. If the calls to RNG are becoming a bottle-neck, you can still switch to the advanced API.
Calls to the symmetric encryption function typically are the bottleneck. It's optimizing the thing known to actually be the bottleneck in the majority of cases.
Moreover, RNGs require internal state to prevent giving the same random numbers twice, so in this context it's just a slower counter that would still introduce locking.
They might be the bottleneck in things where the load is crypto heavy sure.
I was more talking about a case where a developer would like to add crypto to their chat application or whatever where the amount of crypto being done is so low it shouldn't really matter.
That's always the trouble. Nothing is optimal for everything but you don't want a different API for every possible combination.
Particularly in a case like this, you would end up with two APIs, one that takes an immutable state plus a nonce and the other that takes a mutable state that is really the immutable state plus the nonce which it increments for you, with the separate question of whether it handles the locking for you or not (note that not all platforms even have threads). They're so similar that having both is confusing, but if you're to have only one then the applications that need good performance (which are common) want the one where the caller supplies the nonce, as does anything else that may need visibility into that, e.g. if messages are sent from more than one process or device.
Moreover, the performance implications creep up on you. You don't expect it to matter for your chat app until you start letting users attach files to messages and suddenly you're encrypting 10GB files instead of 10 byte messages. Then you start losing users to apps that use faster encryption, or worse, no encryption.
Because these are cryptographic primitives and you're supposed to be educated enough for using them correctly.
If you do not want to think about that, use a higher-level cryptographic libraries that make all these decisions for you: how IV is generated, how it is stored together with the ciphertext, how do you compute authentication token, how to you compute the encryption key, what algorithm you use, etc.
Thank you to the authors for being entirely explicit about the implementation to the point of providing pseudo code and not having any expectation that implementors or users will know anything about cryptography.
is XChaCha20-Poly1305 better if I don't need the performance of hardware accelerated AES GCM or am not guaranteed to always run on a device that has that hardware acceleration (AESNI, CLMUL)?
If you trust your CSPRNG and you wanted AES-GCM-SIV just to be allowed to use random nonces, yeah. If there is any reason you might still repeat a nonce, XChaCha will break.
S2V for XChaCha20-Poly1305 has actually been discussed on the CFRG mailing list (and a little bit off-list between myself and others interested in this topic). The only question that hasn't been addressed at the time of the last email was, where does it fit?
XChaCha20 uses HChaCha20 to derive a 256-bit subkey between the 256-bit key and first 128 bits of the nonce. Then it uses ChaCha20 with the subkey and the remaining 64 bits of the nonce. ChaCha20 has an internal counter that goes between 0 and 2^64 - 1.
AEAD_XChaCha20_Poly1305 uses the first 32 bytes of the XChaCha20 keystream to determine the Poly1305 key for that message, then begins the encryption starting at the next block. (The remaining 32 bytes of block_counter = 0 are discarded in XChaCha20, but not in XSalsa20; this is a subtle difference that probably only makes streaming APIs easier to design and has no significant security considerations.)
With that in mind: XChaCha20 is already deriving a key from the key and (extended) nonce. And it already has an internal block counter (which side-steps counter/nonce wrapping issue that was addressed with AES-CTR in the AES-GCM-SIV design).
The simplest solution is to, instead of just trusting your CSPRNG, do this:
function crypto_aead_xchacha20poly1305siv_encrypt(msg, key) {
let nonce = Buffer.alloc(24);
sodium.randombytes_buf(nonce);
// Synthetic nonce here:
s2v(nonce, msg);
let cipher = Buffer.alloc(msg.length + 16);
sodium.crypt_aead_xchacha20poly1305_ietf_encrypt(cipher, msg, nonce, key);
return [nonce, cipher];
}
But this would be, strictly speaking, indistinguishable from XChaCha20-Poly1305 without SIV.
For context: I authored the XChaCha20 RFC draft, implemented it in pure-PHP (twice; the second time was to support 32-bit systems).
Here's a TLDR in case you're curious:
AES-GCM is an API that takes 4 inputs. AES-GCM(key, nonce, additional_data, plaintext). The nonce is also called an initialization vector (IV). The key and nonce/IV are used to encrypt the plaintext using AES-CTR. A keyed hash, GHASH, is then computed over the additional data and the cipher text. That hash is encrypted with AES too, and you get an authentication tag.
AES-GCM-SIV has two big differences. Firstly, GHASH is replaced with POLYVAL, a hash that is almost exactly the same but reverses the order of some bytes because it turns out that's more efficient on most platforms. Secondly, rather than using the key and a nonce to initialize the AES-CTR encryption, a synthetic IV is computed from the nonce that the caller provides and an encrypted POLYVAL hash of the plaintext.
This difference means that even if the caller uses the same nonce twice, which is bad, that the synthetic IV will still be different as long as the plaintext messages are different. This is what makes the nonce resistance. This is great for cases where we're using the same key over and over in a distributed way, and all of the senders can't guarantee not to use the same nonces, they might collide. TLS session tickets and cookies are good examples of that.
Two small gotchas: the biggest is that when encrypting the message, we have to pass over it twice, once to compute the hash, and once to encrypt it. The second is that it's still possible to screw things up on the caller side, SIV doesn't magically make it ok not to make an attempt at setting a random nonce. For example if you use the null/all-zeroes nonce over and over, identical messages are going to be trivially finger-printable.
>> Firstly, GHASH is replaced with POLYVAL, a hash that is almost exactly the same but reverses the order of some bytes because it turns out that's more efficient on most platforms.
The GHASH/GCM bit twiddling is so awful and slow and side-channel insecure to do in software, that most people just blindly copy the obscure code full of clever optimization that smart specialists come up with [1].
Then other smart specialists [2] have to reverse engineer said clever tricks (now widely used everywhere), because no explanation exists and it is not clear why they work in the first place.
Once done, a third group of smart specialists tweak the algorithm in an incompatible way to gain 10% speed on specific platforms and forcing people to pull in even more obscure code.
Simply put, I wish AES-GCM-SIV had just used GHASH as GCM and helped keeping complexity of crypto libraries down...
[1] https://crypto.stanford.edu/RealWorldCrypto/slides/gueron.pd... [2] https://blog.quarkslab.com/reversing-a-finite-field-multipli...
Isn't the Shay Gueron presentation you're citing talking about cleverly using the CLMUL extensions that Intel basically added as hardware support for GCM? I feel like Intel did a pretty good job documenting how to use PCLMULQDQ to implement GHASH.
Related to gotcha #1 - it also means that you have to have the entire message available when you start encryption. Lightweight embedded systems may not be able to meet that requirement.
For that you need to divide the message in chunks and encrypt&authenticate each chunk. It's what GPG is going to do soon. There are some ad-hoc approaches for that procedure, but two interesting and well-constructed are STREAM and CHAIN: https://eprint.iacr.org/2015/189.pdf
the problem with gcm’s nonce reuse weakness is that the best way to generate a unique nonce is as a hash of the input message - otherwise you technically have to record every nonce you’ve ever used.
I guess an alternative would be to have a secret key and a counter and then have the nonce be something like aes(key, counter++) but obviously that has scalability issues
I’m sure @tqpf/@tqbf (I cant recall) has opinions :)
Nonces don't need to be secret. You can just use a counter value if you want, unencrypted. It gets harder when you have many uncoordinated senders. You can shard your nonce-space; give each sender a unique prefix, but this reduces the number of messages they can each send with the same key.
Bigger services, such as Public Cloud providers (I work at AWS, and it's a factor for us) actually end up running up on these limits.
> the best way to generate a unique nonce is as a hash of the input message - otherwise you technically have to record every nonce you’ve ever used.
That doesn't sound right.
I would wager that using a cryptographically secure RNG is less likely to produce the same number twice, or at the very least equally as likely. If you have a very strong hash, then the hash algorithm should act the same as an RNG.
You are supposed to combine the hash with the user-provided nonce. It's not purely a hash, that's not good enough. The purpose is to protect against the user-provided nonce being weak. Still expect a proper nonce if it can be done.
The problem with gcm is that you absolutely cannot ever reuse a nonce, hence gcm-siv using a hash of the message - assuming a cryptographically strong hash the hash collision rate should theoretically have lower odds than a pure ring (sqrt(hashsize) vs likelihood of non-random different data colliding in practice)
Of course all of this is on the assumption that any hashes and rngs you use are secure :)
In practice: many sites using gcm were using rng based nonces that end up colliding
I get why 'SIV' used a hash of the message in addition to the user provided nonce.
My point was I disagree on your sentence "the best way to generate a unique nonce is as a hash of the input message - otherwise you technically have to record every nonce you’ve ever used."
That implies that a hash is strong enough that you don't need to track messages. If you believe that to be the case then a CSRNG would be equally as strong of not stronger and less complex. Still sqrt(n) chance of collision.
The nonce shouldn’t be regarded as secret. Typically you’d generate a random nonce per encrypt action and pack it within the first few bytes of the output
Counters are already used in the generation of the keystream
> A keyed hash, GHASH, is then computed over the additional data and the cipher text. That hash is encrypted with AES too, and you get an authentication tag.
What is the purpose of encrypting the keyed hash? I'm assuming there's something clever that an attacker could do with it if it were not encrypted, but I can't figure out what that might be.
This is playing a little fast and loose but just to get an idea: the keyed hash is fast but only secure once. By encrypting it, you get most of the perf benefit (the tag is much smaller than the entire message), and you get multi-message security.
FWIW, Poly1305 in e.g. secretbox works the same way: it's really Poly1305-AES. The Poly1305 is equivalent to GHASH here; AES encrypts in both cases.
The later iterations of Poly1305 removed the need for AES by using a synthetic Salsa20 derived key that is unique per key-nonce input pair.
Whoops; you're right of course, but same idea with fewer primitives -- either way, obscure the output of the polynomial MAC with the output of a cipher, so you get the fast MAC that you can reuse multiple times.
A naked GHASH is not a secure MAC; the key can be reverse engineered from the value. So it's encrypted to make the tag.
You mean for 2 values under the same (key, nonce), right?
Yes :) Well, technically they need a different plaintext too.
> some AEADs (including AES-GCM) suffer catastrophic failures of confidentiality and/or integrity when two distinct messages are encrypted with the same key and nonce. While the requirements for AEADs specify that the pair of (key, nonce) shall only ever be used once, and thus prohibit this, this is a worry in practice.
Could anyone explain what this catastrophic failure entails? Do both messages suddenly get decryptable, or is the entire AES key compromised?
video:
https://www.youtube.com/watch?v=uxuXFK5XKEU
presentation:
https://www.blackhat.com/docs/us-16/materials/us-16-Devlin-N...
whitepaper:
https://www.blackhat.com/docs/us-16/materials/us-16-Devlin-N...
You immediately lose your integrity protections, which allows you to launch attacks against AES-CTR as if it had no authentication tag.
https://cryptologie.net/article/361/breaking-https-aes-gcm-o...
You also learn the XOR or the two plaintexts, which can be a catastrophic loss of confidentiality.
You can also determine the keyed hash function key if you collect enough plaintexts, which would let you forge authentication tags
https://csrc.nist.gov/csrc/media/projects/block-cipher-techn...
Its worse than that with AES-GCM, nonce reuse reveals the AES private key.
GCM/GHASH is a Carter-Wegman MAC. These all have in common that they tend to be highly efficient, but their problem is that they require a unique key for each message; when a key is reused, an attacker can forge MACs. This is unlike e.g. HMAC where you don't even have a per-message key/IV.
The same caveats thus apply to e.g. Chacha20-Poly1305 (keystream out of counter mode XOR plaintext for encryption, Carter-Wegman MAC).
Why is it that most implementation require me to specify the nonce/IV, and why not include it in the cipher text?
Most implementations could easily call an internal RNG and harvest an IV on my behalf, preventing me from being stupid. So why not?
Similarly, if the specification said that the IV must be prepended to the cipher text, then I would never need to specify the IV when decrypting.
(Is it wrong to send the IV in plaintext?)
It's interesting that even libsodium's box/secretbox have nonce as a separate part of its 'easy to use' interface.
I also would like to know why it isn't just an opaque part of the ciphertext.
I'm aware that nonces are not always random but sometimes a counter, which creates an easy way to track nonce re-use. Is that the only reason?
Also because it's not part of the cipher text I always end up wondering if I can transmit it in plaintext -- thus, I search stackoverflow for some sketchy answer that says it can -- because nobody every documents this in a library..
This is a historic API.
libhydrogen's API is different: https://github.com/jedisct1/libhydrogen/wiki/Secret-key-encr...
The symmetric encryption functions are meant to be fast and efficient. Calls to the RNG are often not, which is why they're used for key generation but not for every encryption operation.
The nonce could be stored as an internal counter, but that would require the encryption operation to modify internal state, which would then require locking in order to be thread safe. That would introduce more complexity than requiring the caller to supply the nonce.
I would argue there would still be value in offering simple to use APIs/hard to misuse APIs and separate advanced APIs. If the calls to RNG are becoming a bottle-neck, you can still switch to the advanced API.
To me that seems like premature optimization.
Calls to the symmetric encryption function typically are the bottleneck. It's optimizing the thing known to actually be the bottleneck in the majority of cases.
Moreover, RNGs require internal state to prevent giving the same random numbers twice, so in this context it's just a slower counter that would still introduce locking.
They might be the bottleneck in things where the load is crypto heavy sure.
I was more talking about a case where a developer would like to add crypto to their chat application or whatever where the amount of crypto being done is so low it shouldn't really matter.
That's always the trouble. Nothing is optimal for everything but you don't want a different API for every possible combination.
Particularly in a case like this, you would end up with two APIs, one that takes an immutable state plus a nonce and the other that takes a mutable state that is really the immutable state plus the nonce which it increments for you, with the separate question of whether it handles the locking for you or not (note that not all platforms even have threads). They're so similar that having both is confusing, but if you're to have only one then the applications that need good performance (which are common) want the one where the caller supplies the nonce, as does anything else that may need visibility into that, e.g. if messages are sent from more than one process or device.
Moreover, the performance implications creep up on you. You don't expect it to matter for your chat app until you start letting users attach files to messages and suddenly you're encrypting 10GB files instead of 10 byte messages. Then you start losing users to apps that use faster encryption, or worse, no encryption.
It could be present on cipher text, so that you don't have to supply it out of band when decrypting..
Then it becomes opaque even though the developer often needs it. For example, WireGuard uses a nonce window for replay detection.
Because these are cryptographic primitives and you're supposed to be educated enough for using them correctly.
If you do not want to think about that, use a higher-level cryptographic libraries that make all these decisions for you: how IV is generated, how it is stored together with the ciphertext, how do you compute authentication token, how to you compute the encryption key, what algorithm you use, etc.
Fair argument... We don't have many high-level algorithms. And to interoperability we often have to use standardized algorithms.
There is also HS1-SIV which uses Chacha20.
Thank you to the authors for being entirely explicit about the implementation to the point of providing pseudo code and not having any expectation that implementors or users will know anything about cryptography.
is XChaCha20-Poly1305 better if I don't need the performance of hardware accelerated AES GCM or am not guaranteed to always run on a device that has that hardware acceleration (AESNI, CLMUL)?
If you trust your CSPRNG and you wanted AES-GCM-SIV just to be allowed to use random nonces, yeah. If there is any reason you might still repeat a nonce, XChaCha will break.
Presumably you could form a similar IV construction with Poly1305 as your hash, no?
S2V for XChaCha20-Poly1305 has actually been discussed on the CFRG mailing list (and a little bit off-list between myself and others interested in this topic). The only question that hasn't been addressed at the time of the last email was, where does it fit?
XChaCha20 uses HChaCha20 to derive a 256-bit subkey between the 256-bit key and first 128 bits of the nonce. Then it uses ChaCha20 with the subkey and the remaining 64 bits of the nonce. ChaCha20 has an internal counter that goes between 0 and 2^64 - 1.
AEAD_XChaCha20_Poly1305 uses the first 32 bytes of the XChaCha20 keystream to determine the Poly1305 key for that message, then begins the encryption starting at the next block. (The remaining 32 bytes of block_counter = 0 are discarded in XChaCha20, but not in XSalsa20; this is a subtle difference that probably only makes streaming APIs easier to design and has no significant security considerations.)
With that in mind: XChaCha20 is already deriving a key from the key and (extended) nonce. And it already has an internal block counter (which side-steps counter/nonce wrapping issue that was addressed with AES-CTR in the AES-GCM-SIV design).
The simplest solution is to, instead of just trusting your CSPRNG, do this:
But this would be, strictly speaking, indistinguishable from XChaCha20-Poly1305 without SIV.For context: I authored the XChaCha20 RFC draft, implemented it in pure-PHP (twice; the second time was to support 32-bit systems).
HTML version:
https://tools.ietf.org/html/rfc8452
Thanks! The RFC markup is so respectful of text I don't feel bad for updating it from https://www.rfc-editor.org/rfc/rfc8452.txt.