What Is the Best Hashing Algorithm?
Hashing algorithms are used all over the internet. Learn what hashing algorithms are, explore their applications, and find out how to identify the best hashing algorithm for your specific needs.
When talking about hashing algorithms, usually people immediately think about password security. However, hashing algorithms can do much more than that — from data validation and search to file comparison to integrity checks.
With so many different applications and so many algorithms available, a key question arises: “What is the best hashing algorithm?” In this article, we’re going to talk about the numerous applications of hashing algorithms and help you identify the best hashing algorithms to meet your specific needs.
What Is the Best Hashing Algorithm? Let’s First Explain What a Hashing Algorithm Is…
We can’t really jump into answering the question “what is the best hashing algorithm?” without first at least explaining what a hashing algorithm is. Hashing is a process that allows you to take plaintext data or files and apply a mathematical formula (i.e., a hashing algorithm) to them to generate a value of a specific length. The value looks random, but it’s deterministic: the same input always produces the same output. In other words:
- A hashing algorithm is a one-way cryptographic function that generates an output of a fixed length (often shorter than the original input data).
- Once something is hashed, it’s virtually irreversible as it would take too much computational power and time to feasibly attempt to reverse engineer.
Hashing is one of the three basic elements of cryptography: encoding, encryption, and hashing. All three of these processes differ both in function and purpose. Let’s explore and compare each of these elements in the table below:
| |Encoding|Encryption|Hashing|
|---|---|---|---|
|What It Is|It’s a publicly available scheme that’s relatively easy to decode.|It’s a two-way function that’s reversible when the correct decryption key is applied.|It’s a one-way function that’s used for pseudonymization.|
|What It Does|This process transforms data so that it can be properly consumed by a different system.|This process transforms data to keep it secret so it can be decrypted only by the intended recipient.|This process generates a fixed-length hash value (output) that, for practical purposes, uniquely identifies your input data (like a fingerprint) to ensure data integrity without exposing said data.|
|How Secure It Is|Easily reversible.|Reversible only by the intended recipient.|Non-reversible (or nearly impossible to reverse).|
|Does It Require Keys?|No key required to decode data.|Key needed to decrypt data.|No decoding or decryption needed. The hashed data is compared with the stored (and hashed) one; if they match, the data in question is validated.|
|Use Cases and Applications|Data compression, data transfer, storage, file conversion and more.|Secure transfer (in-transit encryption) and storage (at-rest encryption) of sensitive information, emails, private documents, contracts and more.|Search, file organization, passwords, data and software integrity validation, and more.|
|Algorithm Examples|ASCII, Unicode, URL Encoding, Base64.|AES, Blowfish, RSA.|MD5, SHA-1, SHA-256.|
The ideal hashing algorithm:
- Is easy to compute.
- Is irreversible, or at least makes it extremely difficult to recover the original input.
- Produces an output that is, for practical purposes, unique to each input.
Hashing the same simple text with two different hashing algorithms (MD5 and SHA-256) produces two completely different values: a 128-bit digest in the first case and a 256-bit digest in the second.
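As a minimal sketch of that comparison, the snippet below hashes one sample string with both algorithms using Python’s standard `hashlib` module. The sample text is an assumption chosen for illustration; any input would do.

```python
import hashlib

text = "Hello, world!"  # hypothetical sample input
data = text.encode("utf-8")

# Both digests are returned as hexadecimal strings (4 bits per character).
md5_digest = hashlib.md5(data).hexdigest()
sha256_digest = hashlib.sha256(data).hexdigest()

print(f"MD5     ({len(md5_digest) * 4} bits): {md5_digest}")
print(f"SHA-256 ({len(sha256_digest) * 4} bits): {sha256_digest}")
```

Running it twice gives the same two values, which is what makes a hash usable as a fingerprint.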
What Is Hashing Used For?
In the digital world, hashing is virtually everywhere. A typical user comes across different forms of hashing every day without knowing it. You don’t believe it? Let me give you a few examples of the most popular hashing applications and usages:
Did you know that credit card providers like MasterCard, American Express (AMEX), Visa and JCB, as well as many government identification schemes, use a simple check-digit formula (the Luhn algorithm) as an easy way to validate the number you provided? Strictly speaking, it’s a checksum rather than a cryptographic hash: it’s designed to catch typing mistakes, not to stop attackers. The same method is also used for IMEI and SIM card numbers and Canadian Social Insurance numbers (just to name a few examples).
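The check-digit validation used for card numbers is commonly the Luhn algorithm. The sketch below is one straightforward way to write it; the function name is an assumption, and the sample numbers are well-known test values rather than real cards.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_is_valid("4111 1111 1111 1111"))  # a widely used test number
print(luhn_is_valid("4111 1111 1111 1112"))  # last digit mistyped
```

A single mistyped digit changes the total, so the check fails immediately without any database lookup.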
A digital signature is a type of electronic signature that relies on the use of hashing algorithms to verify the authenticity of digital messages or documents. When you send a digitally signed email, you’re using a hashing algorithm as part of the digital signing process.
- A hash value of the message is generated by applying a hashing algorithm to it.
- The hash value is then encrypted using the sender’s private key, and the result is attached to the message as its signature.
- The recipient decrypts the signature using the sender’s public key to recover the sender’s hash value, and applies the same hashing algorithm to the received message.
- The hash value generated by the recipient and the decrypted hash value from the sender are then compared. If the two values match, it means that the document or message has not been tampered with and the sender has been verified.
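The steps above can be sketched with a deliberately toy RSA key. Everything here is an assumption made for illustration: the key is built from two small, publicly known primes, there is no padding, and the function names are hypothetical. Real systems rely on vetted cryptographic libraries and much larger keys.

```python
import hashlib

# Toy RSA key built from two small, publicly known Mersenne primes.
# For illustration only: this offers no real security.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the digest so it fits below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: hash the message, then transform the hash with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: recover the signed hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"Pay Alice 10 dollars")
print(verify(b"Pay Alice 10 dollars", signature))   # untouched message
print(verify(b"Pay Alice 99 dollars", signature))   # tampered message
```

Note that only the short hash is signed, not the whole message, which is why the hashing algorithm’s collision resistance matters so much here.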
File or Software Integrity Checks
When you download a file from a website, you don’t know whether it’s genuine or if the file has been modified to contain a virus. How can you be sure? The best way to do that is to check its integrity by comparing the hash value published on the download page with the hash value you compute from the file you just downloaded. If they match, it means that the file has not been altered since the publisher computed the hash; thus, provided you trust the download page itself, you can trust the file.
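A minimal sketch of that check, assuming the publisher lists a SHA-256 value as a hexadecimal string. The function names and the chunk size are illustrative choices, not part of any particular tool.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the computed digest with the one shown on the download page."""
    computed = sha256_of_file(path)
    return hmac.compare_digest(computed, published_hex.strip().lower())
```

You would call `matches_published_hash("installer.exe", "<value copied from the page>")`, where both arguments are placeholders for your own file and the published digest.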
File or Software Code Signing Certificates
Software developers and publishers use code signing certificates to digitally sign their code, scripts, and other executables. This confirms to end-users the authenticity and the integrity of the file or application available to download on a website.
The developer or publisher’s digital signature is attached to the code with a code signing certificate to provide a verifiable identity. This way, users won’t receive an “Unknown Publisher” warning message during the download or installation. Once again, this is made possible by the usage of a hashing algorithm.
However, depending on the type of code signing certificate the signer uses, the software may (or may not) still trigger a Windows Defender SmartScreen warning window:
- A standard code signing certificate will result in a SmartScreen warning popping up, but it’ll display the organization’s verified information instead of an “Unknown Publisher” warning.
- An extended validation (EV) code signing certificate will bypass the Windows SmartScreen warnings because it makes your software automatically trusted by Microsoft browsers and Windows operating systems.
File Organization and Data Lookup
A few decades ago, when you needed to request a copy of your university marks or a list of the classes you enrolled in, the staff member had to look for the right papers in the physical paper-based archive. That process could take hours or even days!
Today, things have dramatically improved. Each university student has a unique number (or ID) linked to all their personal information stored in the university database (often organized as a hash table). When an authorized staff member needs to retrieve some of that information, they can do so in the blink of an eye!
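To make the idea concrete, here is a tiny hash table sketch keyed by student ID. The class name, bucket count and sample record are all hypothetical; in everyday Python you would simply use a `dict`, which applies the same principle internally.

```python
class StudentTable:
    """A tiny hash table with separate chaining (illustrative only)."""

    def __init__(self, bucket_count: int = 8) -> None:
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, student_id: str) -> list:
        # hash() maps the key to an integer; the modulo picks a bucket.
        return self._buckets[hash(student_id) % len(self._buckets)]

    def put(self, student_id: str, record: dict) -> None:
        bucket = self._bucket_for(student_id)
        for index, (existing_id, _) in enumerate(bucket):
            if existing_id == student_id:
                bucket[index] = (student_id, record)  # overwrite existing entry
                return
        bucket.append((student_id, record))

    def get(self, student_id: str):
        # Only one bucket is scanned, not the whole table.
        for existing_id, record in self._bucket_for(student_id):
            if existing_id == student_id:
                return record
        return None


table = StudentTable()
table.put("S1001", {"name": "Ada", "classes": ["Cryptography"]})
print(table.get("S1001"))
```

Because the hash of the ID points straight at one bucket, lookup time stays roughly constant no matter how many students are stored.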
Last but not least, hashing algorithms are also used for secure password storage. How? When you register on a website and create a password, the provider usually saves only the password’s hash value instead of your plaintext password. This means that when you log in to your account, usually the provider hashes the password you just typed and compares it with the one stored in its database. If the two hash values match, then you get access to your profile. Easy and much more secure, isn’t it?
However, hashing your passwords before storing them isn’t enough — you need to salt them to protect them against different types of tactics, including dictionary attacks and rainbow table attacks.
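The registration and login flow described above might look like the following sketch. It uses PBKDF2 from Python’s standard library with a random per-user salt; the function names and the iteration count are assumptions, and a production system should follow current guidance for its chosen algorithm.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor; tune it for your hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Registration: return (salt, digest) to store instead of the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Login: hash the typed password with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, stored_digest)
```

The plaintext password is never written anywhere; only the salt and the digest are kept.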
Why Hashing Algorithms Are Important
OK, now we know that hashing algorithms can help us to solve many problems, but why are hashing algorithms so important?
Quick Search Functionality
Hashing allows a quick search, faster than many other data retrieval methods (e.g., scanning through arrays or lists), which can make a big difference when searching through millions of records. Consider a library as an example. Have you ever asked yourself how a reference librarian can find the exact location of a book in a matter of seconds when it would have taken you ages to go through all the titles available on the library shelves? It’s all thanks to a catalog that works much like a hash table!
Database Password Security
Hashing increases password security in databases. If you store password hashes instead of plaintext passwords, your actual password doesn’t need to be stored at all, which makes it much more difficult for hackers to steal it. A properly hashed password cannot feasibly be reversed. Considering that, based on the data from Verizon’s 2021 Data Breach Investigations Report (DBIR), stolen credentials are still the top cause of data breaches, it’s easy to understand why using a hashing algorithm in password management has become paramount.
File & Code Integrity Checks
Hashing allows you to compare two files or pieces of data without opening them and know if they’re different. This method is often also used by file backup programs when running an incremental backup.
Needless to say, it’s a powerful ally in code/data integrity as it certifies the originality of a code or document.
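Here is a rough sketch of how a backup tool could use hashes to decide what to copy during an incremental backup. For simplicity it works on in-memory contents rather than real files, and the function names and sample data are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of some content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def changed_files(previous: dict, current_files: dict) -> list:
    """Return names whose content hash is new or differs from the last backup.

    previous maps file name -> digest recorded at the last backup;
    current_files maps file name -> current content (bytes).
    """
    return sorted(
        name
        for name, content in current_files.items()
        if previous.get(name) != fingerprint(content)
    )


last_backup = {
    "notes.txt": fingerprint(b"first draft"),
    "logo.png": fingerprint(b"\x89PNG..."),
}
today = {
    "notes.txt": b"second draft",   # edited since the last backup
    "logo.png": b"\x89PNG...",      # unchanged
    "todo.txt": b"buy milk",        # new file
}
print(changed_files(last_backup, today))
```

Only the short digests from the previous run need to be kept, not a second copy of every file.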
The Most Popular Types of Hashing Algorithms
Now that we know why hashing algorithms are so important and how we can use them, let’s have a closer look at the most popular types of hashing algorithms available:
- MD5 (message digest version 5): Designed in 1991, this hashing algorithm produces a 128-bit hash value. It’s still one of the most commonly used despite being one of the most insecure algorithms. (It’s susceptible to collision attacks and fast enough to make brute force attacks cheap. Stay away from it!)
- SHA (secure hashing algorithm) family:
  - SHA-1: This hashing algorithm generates a 160-bit hash value. Vulnerable to collision attacks, it’s no longer considered a secure hashing algorithm. As a result, Microsoft, Google and Mozilla no longer accept SHA-1 SSL certificates (since 2017).
  - SHA-256: This hashing algorithm is a variant of the SHA-2 hashing algorithm, recommended and approved by the National Institute of Standards and Technology (NIST). It generates a 256-bit hash value. Even if it’s somewhat slower than the previous algorithms, its longer digest and stronger design make it far more secure.
  - SHA-3: This hashing algorithm is the latest member of the SHA family. It’s based on a totally different approach (sponge construction) from SHA-1 and SHA-2.
- Whirlpool: This hashing algorithm is based on a modified version of the advanced encryption standard (AES) and produces a 512-bit hash digest.
- RIPEMD (RACE integrity primitives evaluation message digest): This hashing algorithm is based on the old MD4 algorithm. The version with the longest output is RIPEMD-320.
- CRC32 (cyclic redundancy check): This algorithm is used to detect accidental errors in network transmissions and in data written to storage devices. It focuses on speed rather than security and is not a cryptographic hash.
- Argon2: This hashing algorithm is specifically designed to protect credentials. Considered one of the most secure and recommended by the Open Web Application Security Project (OWASP), it provides a high level of defense against GPU-based attacks.
- Bcrypt: This password hashing function was built to slow down brute force attacks. Salt is included in the hashing process, which protects your stored hash values against rainbow table attacks. We will talk more about the usage of salted passwords and rainbow table attacks later in this article.
- PBKDF2 (password-based key derivation function): Recommended by NIST, this function is deliberately much slower than a single SHA hash because it applies an underlying hash many times over; therefore, it’s another suitable choice for password protection. It uses a salt to generate harder-to-guess password hashes and it produces an output of configurable size (e.g., 256 bits).
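To see the output sizes for yourself, the sketch below asks Python’s standard library for the digest length of several algorithms from the list. The sample message is an assumption; Whirlpool, RIPEMD, Argon2 and Bcrypt are left out because they are not guaranteed to be available in the standard library.

```python
import hashlib
import zlib

message = b"The quick brown fox"  # hypothetical sample input

# digest_size is reported in bytes, so multiply by 8 to get bits.
digest_bits = {
    name: hashlib.new(name, message).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha384", "sha3_256")
}

# CRC32 is a 32-bit checksum, not a cryptographic hash.
crc32_value = zlib.crc32(message)

for name, bits in digest_bits.items():
    print(f"{name:>8}: {bits} bits")
print(f"   crc32: 32 bits (value {crc32_value:#010x})")
```

A longer digest leaves more possible outputs, which is one reason the 128-bit and 160-bit algorithms were the first to fall to collision attacks.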
Traits of Strong Hashing Algorithms
Strong hash functions are those that embody certain characteristics:
Pre-image resistance: This means that it should be too computationally challenging and burdensome to compute a hash value back to its original input.
Avalanche effect: This property refers to the apparent randomness of every hash value. If you make even a tiny change to the input, the hash value output should change entirely.
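The effect of a tiny input change can be measured directly. This sketch hashes two strings that differ in one character and counts how many of the 256 output bits differ; the sample strings and the helper name are assumptions.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


first = hashlib.sha256(b"hashing").digest()
second = hashlib.sha256(b"hashinh").digest()  # last character changed

changed = differing_bits(first, second)
print(f"{changed} of {len(first) * 8} bits differ")
```

For a well-designed hash function you would expect roughly half of the bits to flip, with no visible relationship between the two outputs.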
Collision resistance: A hash collision occurs when two different inputs result in the same output. Collision resistance means that it should be as difficult as possible to find two inputs that produce the same hash.
Say, for example, you use a hashing algorithm on two different documents and they generate identical hash values: that’s a collision. MD5 and SHA-1 are vulnerable to this type of attack, as practical collisions have been demonstrated for both. The underlying problem is that the number of possible inputs is unlimited while the number of possible fixed-length outputs is finite, so it’s impossible to ensure that each hash output will be unique. Accidental collisions are extremely rare, though, and some hashing algorithms are much less risky than others.
However, hash collisions can be exploited by an attacker. How? If an attacker discovers two input strings with the same hash output (collision), they can replace a file available to download with a malicious file with the same hash. The user downloading the file will think that it’s genuine as the hash provided by the website is equal to the one included in the replaced file.
How Cybercriminals Use Weak Algorithms to Their Advantage
It’s no secret that cybercriminals are always looking for ways to crack passwords to gain unauthorized access to accounts. Let’s have a look at some of the ways that cybercriminals attack hashing algorithm weaknesses:
- Brute force: This is one of the easiest ways to crack passwords, above all with fast hashing algorithms like MD5 and SHA-1. Basically, the hacker tries every possible combination of characters until one produces the target hash.
- Rainbow table: Different from, and unfortunately often more effective than, a simple brute force attack, it’s based on a fast search-and-compare system over precomputed hashes. Often confused with full hash lookup tables, it requires less storage as it saves only the first and the last values of each chain. How can this be used to steal data? Once the hacker has managed to get access to the password hashes, they can quickly recover the matching passwords using the precomputed rainbow table.
- Birthday attack: This is another example of the limitation that hashing algorithms face because their inputs are unlimited while their outputs are not. It belongs to the brute force attacks family and it’s based on the birthday paradox in mathematics, where the probability of two people sharing the same birth date is higher than we think. For example, in a group of 23 people there is roughly a 50% chance that two of them share the same birthday. In the same way, hash collisions turn up after far fewer attempts than the number of possible outputs would suggest, so the success of the attack is directly linked to the likelihood of collisions.
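The birthday figure quoted above is easy to check. This sketch computes the probability that at least two people in a group share a birthday, assuming 365 equally likely days; the function name is hypothetical.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    if people > days:
        return 1.0
    all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        all_distinct *= (days - i) / days
    return 1.0 - all_distinct


for group in (10, 23, 50):
    print(group, round(shared_birthday_probability(group), 3))
```

The same arithmetic explains why an n-bit hash is expected to yield a collision after roughly 2^(n/2) attempts rather than 2^n.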
… And these are just a few examples. Needless to say, using a weak hashing algorithm can have a disastrous effect, not only because of the financial impact, but also because of the loss of sensitive data and the consequent reputational damage.
Just to give you an idea, PwC’s 2022 Global Digital Trust Insights shows that more than 25% of companies expect an increase of their cybersecurity expenses of up to 10% in 2022.
SpyCloud’s 2021 Annual Credential Exposure Report highlights the fact that there were 33% more breaches in 2020 compared to 2019. The company also reports that it recovered more than 1.4 billion stolen credentials.
Examples of Breaches That Have Resulted From Weak Hashing Algorithms
What does all this have to do with hashing algorithms? Let’s have a look at a few examples of data breaches caused by a weak hashing algorithm in the last few years:
- LinkedIn data breach (2012): In this breach, 117 million passwords were leaked. The passwords were hashed with SHA-1 and not salted.
- Yahoo! data breach (2013 – 2014): In September 2016, Yahoo! announced that 500 million account names and passwords were compromised two years prior in 2014. In December 2016, Yahoo! confirmed that another massive data breach occurred in 2013 that exposed one billion accounts. After the incident, Venafi’s data labs team analyzed many Yahoo! websites’ digital certificates in depth and found out that the company used a mix of old MD5 and SHA-1 certificates.
- Facebook (2019): In this data breach, it turns out that millions of passwords were incorrectly stored in plain text. Luckily enough, in this case, Facebook says they found out and informed the users before anything happened.
What can we do then if not even a hashing algorithm is enough to stop these attacks? The answer is… “season” your password with some salt and pepper!
Why You Should Always Salt Your Passwords Prior to Hashing Them
A good way to make things harder for a hacker is password salting.
Let’s say that you have two users in your organization who are using the same password. When hashed, their password hashes will look the same. However, if you add a different randomly generated string (salt) to each password before hashing it, the two hash values will look different even though the passwords still match. As the attacker won’t know the salts in advance, they won’t be able to precompute a table of hashes, and the attack will probably fail or end up being as slow as a traditional brute force attack.
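The two-user scenario can be reproduced in a few lines. This sketch uses plain SHA-256 only to keep the demonstration short (a real password store would use a slow algorithm); the function names and the sample password are assumptions.

```python
import hashlib
import os


def unsalted(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted(password: str, salt: bytes) -> str:
    # The salt is combined with the password before hashing.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


password = "correct horse"  # the same password chosen by two users
alice_salt, bob_salt = os.urandom(16), os.urandom(16)

unsalted_match = unsalted(password) == unsalted(password)
salted_match = salted(password, alice_salt) == salted(password, bob_salt)

print("Unsalted hashes identical:", unsalted_match)
print("Salted hashes identical:  ", salted_match)
```

With unique salts, an attacker who cracks one user’s hash learns nothing about other users with the same password.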
Clever, isn’t it? But adding a salt isn’t the only tool at your disposal.
For additional security, you can also add some pepper to the mix. It’s another random string that is added to a password before hashing. However, OWASP shares that while salt is usually stored together with the hashed password in the same database, pepper is usually stored separately (such as in a hardware security module) and kept secret.
When salt and pepper are used with hashing algorithms to secure passwords, it means the attacker can no longer rely on precomputed tables and, without the secret pepper, can’t even start testing guesses against the stolen hashes.
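One possible way to combine salt and pepper is sketched below. Mixing the pepper in with HMAC before a salted PBKDF2 pass is an assumption made for this example, as are the names and the iteration count; in practice the pepper would be loaded from a secret store rather than generated at startup.

```python
import hashlib
import hmac
import os

# Assumption: a real pepper is a long-lived secret loaded from an HSM or
# secret manager. It is generated here only so the sketch is runnable.
PEPPER = os.urandom(32)


def hash_with_salt_and_pepper(
    password: str, salt: bytes, pepper: bytes = PEPPER
) -> bytes:
    # Mix the secret pepper into the password first...
    peppered = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    # ...then apply a slow, salted hash to the result.
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, 100_000)


salt = os.urandom(16)  # stored next to the hash in the database
stored = hash_with_salt_and_pepper("s3cret!", salt)
print(stored.hex())
```

Someone who steals only the database gets the salt and the hash but not the pepper, so they cannot reproduce the computation.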
What Is the Best Hashing Algorithm? The Answer Depends on How You Want to Use It…
The best hashing algorithm is the one that makes it as hard as possible for attackers to find two values with the same hash output.
This also means, though, that the effectiveness of an algorithm strictly depends on how you want to use it. Different hashing speeds work best in different scenarios. For example:
- To protect passwords, experts suggest using a strong and slow hashing algorithm like Argon2 or Bcrypt, combined with salt (or even better, with salt and pepper). (Basically, avoid faster algorithms for this usage.)
- To verify file signatures and certificates, SHA-256 is among your best hashing algorithm choices. Considered one of the most secure hashing algorithms on the market by a high number of IT security companies, it’s used by U.S. government agencies and leading technology companies, and it’s integrated in the blockchain protocol used by Bitcoin. Why? Because it’s fast, has a very low probability of collision and even a minor change to the original data completely changes the hash value.
- For data lookup and file organization, you will want to opt for a fast hashing algorithm that allows you to search and find data quickly.
As you can see, one size doesn’t fit all. Each algorithm has its own purpose and characteristics, and you should always consider how you’re going to use it in the decision-making process.
Hashing Algorithm Speeds Matter
Some hashing algorithms, like MD5 and the SHA family, are mainly used for search, file comparison and data integrity checks. What do they have in common? Speed.
When you do a search online, you want to be able to view the outcome as soon as possible. The same applies when you are performing incremental backups or verifying the integrity of a specific application to download.
On the other hand, if you want to ensure that your passwords are secure and difficult to crack, you will opt for a slow hashing algorithm (e.g., Argon2 or Bcrypt) that will make the hacker’s job very time consuming.
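The speed gap is easy to observe. Since Argon2 and Bcrypt are not part of Python’s standard library, this sketch uses PBKDF2 as a stand-in for a deliberately slow algorithm and times it against a single SHA-256 call; the input and the iteration count are assumptions, and the exact timings will vary by machine.

```python
import hashlib
import os
import time

password = b"example password"  # hypothetical input
salt = os.urandom(16)

# One pass of a fast, general-purpose hash.
start = time.perf_counter()
hashlib.sha256(salt + password).digest()
fast_seconds = time.perf_counter() - start

# A deliberately slow password hash: the same primitive applied many times.
start = time.perf_counter()
hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)
slow_seconds = time.perf_counter() - start

print(f"single SHA-256:      {fast_seconds:.6f} s")
print(f"PBKDF2, 200k rounds: {slow_seconds:.6f} s")
```

A delay of a fraction of a second is barely noticeable at login, but it multiplies into an enormous cost for someone trying billions of guesses.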
Final Thoughts: What Is the Best Hashing Algorithm?
I hope this article has given you a better idea about the best hashing algorithm to choose depending on your needs.
The digital world is changing very fast and hackers are always finding new ways to get what they want. An algorithm that is considered secure and top of the range today may already be cracked and unsafe tomorrow, as happened to MD5 and SHA-1.
There are ways, though, to make the life of attackers as difficult as possible, and hashing plays a vital role in that.
By the way, if you are still using MD5 or SHA-1 hashing algorithms, well… don’t risk it — make sure you upgrade them!