A month ago, the team behind the SHA-3 has published an RFC for TurboSHAKE and KangarooTwelve: secure hash functions that employ the same primitive as SHA-3, but with reduced number of rounds to make it faster. K12 is basically a tree-based hash over TurboSHAKE.
Could someone explain the differences between these two points? They seem identical to me.
> [The hash function] also should be second-preimage resistant: For a given message M1, it must be virtually impossible to find a second message M2 where M1 != M2 that produces the same hash hash(M1) == hash(M2).
> These functions should be also collision resistant: it must be virtually impossible to find two messages M1 and M2 where hash(M1) == hash(M2).
The second objective is easier than the first. It may be easier to find any M1 and M2 that collide, than to find an M2 that collides with a specific M1.
The difference is "for a given message M1". In the 2nd requirement, you may choose both M1 and M2. For the 1st requirement, you are given M1 and must find M2.
Collision resistance implies second-preimage resistance, but second-preimage resistance does not imply collision resistance.
Some care is needed with the definitions. For any hash function, the adversary can compute a bunch of hashes, and those outputs obviously have known first preimages. And a hash function with a known collision has a known second preimage given one of the colliding inputs.
I’m disappointed that they didn’t discuss my favorite feature of BLAKE3: it’s a tree hash. If you have a file and the BLAKE3 hash of that file, you can generate a proof that a portion of the file is correct. And you can take a file, split it into pieces (of known length and offset), hash them as you receive them, and then assemble them into the full file and efficiently calculate the full file’s hash. The other options cannot do this, although you could certainly build this on top of them.
Imagine how much better S3 would be if it used BLAKE3 instead of MD5. (Hah, S3 and its competitors don’t even manage to fully support MD5 for multipart uploads, although they could support BLAKE3 very well with multipart uploads!)
https://keccak.team/2025/rfc9861.html
> [The hash function] also should be second-preimage resistant: For a given message M1, it must be virtually impossible to find a second message M2 where M1 != M2 that produces the same hash hash(M1) == hash(M2).
> These functions should be also collision resistant: it must be virtually impossible to find two messages M1 and M2 where hash(M1) == hash(M2).
Some care is needed with the definitions. For any hash function, the adversary can compute a bunch of hashes, and those outputs obviously have known first preimages. And a hash function with a known collision has a known second preimage given one of the colliding inputs.
I’m disappointed that they didn’t discuss my favorite feature of BLAKE3: it’s a tree hash. If you have a file and the BLAKE3 hash of that file, you can generate a proof that a portion of the file is correct. And you can take a file, split it into pieces (of known length and offset), hash them as you receive them, and then assemble them into the full file and efficiently calculate the full file’s hash. The other options cannot do this, although you could certainly build this on top of them.
Imagine how much better S3 would be if it used BLAKE3 instead of MD5. (Hah, S3 and its competitors don’t even manage to fully support MD5 for multipart uploads, although they could support BLAKE3 very well with multipart uploads!)