Learnings from the SHAttered Security Incident

What is SHA-1?

In short, SHA-1 is a widely used hash algorithm that accepts input of almost any length (up to 2^64 - 1 bits) and produces a fixed-length output of 160 bits.
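To make the fixed-length property concrete, here is a minimal Python sketch that uses the standard library's hashlib module. The helper name sha1_hex is only an illustrative choice and does not come from any particular tool.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()


# Inputs of very different sizes: empty, three bytes, one million bytes.
for sample in (b"", b"abc", b"a" * 1_000_000):
    digest = sha1_hex(sample)
    # The digest is always 160 bits, which is 40 hexadecimal characters.
    print(len(sample), len(digest), digest)
```

Whatever the size of the input, the printed digest length stays at 40 hexadecimal characters.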

Effects of SHAttered

Although Document [2] described a theoretical collision attack with a complexity below 2^69 as early as 2005, the result remained purely theoretical, with no practical demonstration, even after Document [5] in 2013 lowered the estimated complexity to 2^57.5. Settling the question required concrete evidence, and the paper titled “The first collision for full SHA-1” provided it by producing the first actual collision.
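To put these exponents in perspective, the short Python sketch below compares them with the generic birthday bound. For a 160-bit digest, a brute-force collision search needs roughly 2^80 hash evaluations; that baseline is a standard property of hash functions rather than a figure from the documents cited above, and the function name is only illustrative.

```python
import math

DIGEST_BITS = 160
# Birthday bound: a generic collision search costs about 2^(n/2) evaluations.
GENERIC_LOG2 = DIGEST_BITS / 2


def speedup_over_birthday(attack_log2: float) -> float:
    """Return how many times cheaper an attack is than the generic 2^80 search."""
    return 2.0 ** (GENERIC_LOG2 - attack_log2)


for label, attack_log2 in (("2005 estimate", 69.0), ("2013 estimate", 57.5)):
    factor = speedup_over_birthday(attack_log2)
    print(f"{label}: 2^{attack_log2} is about 2^{math.log2(factor)} "
          f"({factor:,.0f}x) cheaper than 2^{GENERIC_LOG2}")
```

An attack at 2^69 is 2^11 = 2048 times cheaper than brute force, and one at 2^57.5 is cheaper by a factor of 2^22.5, or roughly 5.9 million.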

Git Example

Git is a content-addressable file system: it stores data as key-value pairs, and content is retrieved by looking up its key. Any content submitted to Git is run through a hash algorithm to produce a key, and that key is then used to retrieve exactly the content that was stored. The key is unique only as long as the hash function produces no collisions, and the hash algorithm Git uses is SHA-1.
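The key-value idea can be sketched in a few lines of Python. Git derives a blob's key by hashing a short header ("blob", the content size in bytes, and a NUL byte) followed by the content itself. The in-memory dictionary below is only a stand-in for Git's object database, which also compresses objects and writes them to disk, and the function names are hypothetical.

```python
import hashlib

# Toy object store: maps a SHA-1 key to the stored content.
object_store: dict[str, bytes] = {}


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 key Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def put(content: bytes) -> str:
    """Store content under its hash and return the key."""
    key = git_blob_id(content)
    object_store[key] = content
    return key


def get(key: str) -> bytes:
    """Retrieve content by key."""
    return object_store[key]


key = put(b"hello world\n")
print(key)
print(get(key))
```

The key printed here should match what `git hash-object --stdin` reports for the same bytes. It also shows why a collision matters: two different contents with the same key would compete for a single slot in the store.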

HTTPS Example

Compared with Git, HTTPS seems better able to address the same problem through its use of certificates.

Historical Progress and Improvement Suggestions

Furthermore, this incident reveals the cost of a collision attack. An MD5 collision takes about 30 seconds on a mobile phone, whereas the SHA-1 collision took the equivalent of 110 GPUs running for one year. That cost is affordable for large companies but still too high for small and medium-sized enterprises and individual users. Do not forget, however, that Moore’s law still applies: we must assume that computing resources will keep becoming more efficient and more cost-effective, and that the cost of the attack will fall with them.
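As a rough illustration of the Moore’s law point, the Python sketch below projects how the 110 GPU-year figure might shrink over time. The two-year doubling period is an assumption made purely for the sake of the example, not a measured value, and the function name is hypothetical.

```python
def projected_gpu_years(baseline_gpu_years: float,
                        years_elapsed: float,
                        doubling_period_years: float = 2.0) -> float:
    """Estimate the GPU-years needed if hardware throughput keeps doubling.

    doubling_period_years is an assumed parameter, not a measured one.
    """
    return baseline_gpu_years / 2 ** (years_elapsed / doubling_period_years)


BASELINE = 110.0  # GPU-years quoted above for the SHA-1 collision

for years in (0, 2, 4, 10):
    cost = projected_gpu_years(BASELINE, years)
    print(f"after {years:2d} years: about {cost:.2f} GPU-years")
```

Under that assumption, the attack would need about 55 GPU-years after two years and under 4 GPU-years after ten, which is why a cost that is prohibitive today cannot be relied on as a long-term defense.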

Conclusion

In this article, we discussed the Secure Hash Algorithm SHA-1 in light of the news that researchers have produced a practical SHA-1 collision, meaning that two different inputs yield the same output. This result breaks the collision resistance that a hash function such as SHA-1 is expected to provide. We then looked at how the collision affects Git and HTTPS, reviewed the lifecycle of hash functions, and covered some best practices for users who rely on SHA algorithms.

References

[1] How would Git handle a SHA-1 collision on a blob? http://stackoverflow.com/questions/9392365/how-would-git-handle-a-sha-1-collision-on-a-blob
[2] Wang X, Yin Y L, Yu H. Finding collisions in the full SHA-1. In: Annual International Cryptology Conference. Springer Berlin Heidelberg, 2005: 17–36.
[3] Lifetimes of cryptographic hash functions http://valerieaurora.org/hash.html
[4] Why doesn’t Git use more modern SHA? http://stackoverflow.com/questions/28159071/why-doesnt-git-use-more-modern-sha
[5] Stevens M. New collision attacks on SHA-1 based on optimal joint local-collision analysis. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer Berlin Heidelberg, 2013: 245–261.
