Hashing is a fundamental technique in programming that converts data into a fixed-size string of characters. Unlike encryption, hashing is a one-way process: you can't reverse it to get the original data back.
This makes hashing perfect for storing passwords, verifying file integrity, and creating unique identifiers. In this tutorial, you'll learn how to use Python's built-in hashlib module to implement secure hashing in your applications.
By the end of this tutorial, you'll understand:
How to create basic hashes with different algorithms
Why simple hashing isn't enough for passwords
How to add salt to prevent rainbow table attacks
How to use key derivation functions for password storage
You can find the code on GitHub.
Prerequisites
To follow this tutorial, you should have:
Basic Python: Variables, data types, functions, and control structures
Understanding of strings and bytes: How to encode strings and work with byte data
No external libraries are required, as hashlib and os are both part of Python's standard library.
Table of Contents
Basic Hashing with Python’s hashlib
Let's start with the fundamentals. The hashlib module provides access to several hashing algorithms like MD5, SHA-1, SHA-256, and more.
Here's how to create a simple SHA-256 hash:
import hashlib
# Create a simple hash
message = "Hello, World!"
hash_object = hashlib.sha256(message.encode())
hex_digest = hash_object.hexdigest()
print(f"Original: {message}")
print(f"SHA-256 Hash: {hex_digest}")
Output:
Original: Hello, World!
SHA-256 Hash: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f
Here, we import the hashlib module, encode our string to bytes using .encode() as hashlib requires bytes, not strings.
Then we create a hash object using hashlib.sha256() and get the hexadecimal representation with .hexdigest().
The resulting hash is always 64 characters long regardless of input size. Meaning you have an output string that is 256 bits long. As each hexadecimal character requires 4 bits, the output has 256/4 = 64 hexadecimal characters. Even changing one character produces a completely different hash.
Let's verify that:
import hashlib
# Small change, big difference
message1 = "Hello, World!"
message2 = "Hello, World?" # Only changed ! to ?
hash1 = hashlib.sha256(message1.encode()).hexdigest()
hash2 = hashlib.sha256(message2.encode()).hexdigest()
print(f"Message 1: {message1}")
print(f"Hash 1: {hash1}")
print(f"\nMessage 2: {message2}")
print(f"Hash 2: {hash2}")
print(f"\nAre they the same? {hash1 == hash2}")
Output:
Message 1: Hello, World!
Hash 1: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f
Message 2: Hello, World?
Hash 2: f16c3bb0532537acd5b2e418f2b1235b29181e35cffee7cc29d84de4a1d62e4d
Are they the same? False
This property is called the avalanche effect where a tiny change creates a completely different output.
Why Simple Hashing Isn't Enough for Passwords
You might think you can just hash passwords and store them in your database. But there's a problem: attackers use rainbow tables, which are precomputed databases of hashes for common passwords.
Here's what happens:
import hashlib
# Simple password hashing (DON'T USE THIS!)
password = "password123"
hashed = hashlib.sha256(password.encode()).hexdigest()
print(f"Password: {password}")
print(f"Hash: {hashed}")
Output:
Password: password123
Hash: ef92b778bafe771e89245b89ecbc08a44a4e166c06659911881f383d4473e94f
If two users have the same password, they'll have identical hashes. An attacker who cracks one hash knows the password for all users with that hash.
So how do we handle this? Let’s learn in the next section.
Adding Salt to Your Hashes
The solution is salting: adding random data to each password before hashing. This way, even identical passwords produce different hashes.
Here's how to implement salted hashing:
import hashlib
import os
def hash_password_with_salt(password):
# Generate a random salt (16 bytes = 128 bits)
salt = os.urandom(16)
# Combine password and salt, then hash
hash_object = hashlib.sha256(salt + password.encode())
password_hash = hash_object.hexdigest()
# Return both salt and hash (you need the salt to verify later)
return salt.hex(), password_hash
# Hash the same password twice
password = "password123"
salt1, hash1 = hash_password_with_salt(password)
salt2, hash2 = hash_password_with_salt(password)
print(f"Password: {password}\n")
print(f"First attempt:")
print(f" Salt: {salt1}")
print(f" Hash: {hash1}\n")
print(f"Second attempt:")
print(f" Salt: {salt2}")
print(f" Hash: {hash2}\n")
print(f"Same password, different hashes? {hash1 != hash2}")
Output:
Password: password123
First attempt:
Salt: fc24b2d2245ff65b80c5bced38744171
Hash: 5ce634c05941d25871e7ee334b5c24c75f64c4f6d557db66909fcaa793d869f9
Second attempt:
Salt: bc8a1f79b07e56b51285557211f88bb0
Hash: 043599d90b2aa0556265869cead35724c7d9d9d37129d897c6b68bade9e737e6
Same password, different hashes? True
How this works:
os.urandom(16)generates 16 random bytes, which is our saltWe concatenate the salt and password bytes before hashing
We return both the salt (as
hex) and the hashYou must store both the salt and hash in your database
When a user logs in, you retrieve their salt, hash the entered password with that salt, and compare the result to the stored hash.
Verifying Salted Passwords
Now let's create a function to verify passwords against salted hashes:
import hashlib
import os
def hash_password(password, salt=None):
"""Hash a password with a salt. Generate new salt if not provided."""
if salt is None:
salt = os.urandom(16)
else:
# Convert hex string back to bytes if needed
if isinstance(salt, str):
salt = bytes.fromhex(salt)
password_hash = hashlib.sha256(salt + password.encode()).hexdigest()
return salt.hex(), password_hash
def verify_password(password, stored_salt, stored_hash):
"""Verify a password against a stored salt and hash."""
# Hash the provided password with the stored salt
_, new_hash = hash_password(password, stored_salt)
# Compare the hashes
return new_hash == stored_hash
Here’s how you can use the above:
print("=== User Registration ===")
user_password = "mySecurePassword!"
salt, password_hash = hash_password(user_password)
print(f"Password: {user_password}")
print(f"Salt: {salt}")
print(f"Hash: {password_hash}")
# Simulate user login attempts
print("\n=== Login Attempts ===")
correct_attempt = "mySecurePassword!"
wrong_attempt = "wrongPassword"
print(f"Attempt 1: '{correct_attempt}'")
print(f" Valid? {verify_password(correct_attempt, salt, password_hash)}")
print(f"\nAttempt 2: '{wrong_attempt}'")
print(f" Valid? {verify_password(wrong_attempt, salt, password_hash)}")
Output:
=== User Registration ===
Password: mySecurePassword!
Salt: 381779b5262deea84183e4b9454b98b1
Hash: 9756e1f0bc4c1aa4a72f35b0be8d3c8f430d31613371cf7de3c615bc475de98f
=== Login Attempts ===
Attempt 1: 'mySecurePassword!'
Valid? True
Attempt 2: 'wrongPassword'
Valid? False
This implementation shows a complete registration and login flow.
Using Key Derivation Functions
While salted SHA-256 is better than plain hashing, modern applications should use key derivation functions (KDFs) specifically designed for password hashing. These include PBKDF2 (Password-Based Key Derivation Function 2), bcrypt, scrypt, and Argon2. You can check the links to learn more about these key derivation functions.
These algorithms are intentionally slow and require more computational resources, making brute-force attacks much harder. Let's implement PBKDF2, which is built into Python:
import hashlib
import os
def hash_password_pbkdf2(password, salt=None, iterations=600000):
"""Hash password using PBKDF2 with SHA-256."""
if salt is None:
salt = os.urandom(32) # 32 bytes = 256 bits
elif isinstance(salt, str):
salt = bytes.fromhex(salt)
# PBKDF2 with 600,000 iterations (OWASP recommendation for 2024)
password_hash = hashlib.pbkdf2_hmac(
'sha256', # Hash algorithm
password.encode(), # Password as bytes
salt, # Salt as bytes
iterations, # Number of iterations
dklen=32 # Desired key length (32 bytes = 256 bits)
)
return salt.hex(), password_hash.hex(), iterations
def verify_password_pbkdf2(password, stored_salt, stored_hash, iterations):
"""Verify password against PBKDF2 hash."""
_, new_hash, _ = hash_password_pbkdf2(password, stored_salt, iterations)
return new_hash == stored_hash
# Hash a password
print("=== PBKDF2 Password Hashing ===")
password = "SuperSecure123!"
salt, hash_value, iterations = hash_password_pbkdf2(password)
print(f"Password: {password}")
print(f"Salt: {salt}")
print(f"Hash: {hash_value}")
print(f"Iterations: {iterations:,}")
This outputs:
=== PBKDF2 Password Hashing ===
Password: SuperSecure123!
Salt: b388aecd774f6a7ddd95405091548bb50102c99beb1a10326a4c54070da4a3a5
Hash: c681450f41d0cec9ea2aad1108efe2a430b9c3d9fc3af621071be10ac9b3615a
Iterations: 600,000
Now let’s verify the password and also compare the speeds of SHA-256 vs. PBKDF2:
print("\n=== Verification ===")
is_valid = verify_password_pbkdf2(password, salt, hash_value, iterations)
print(f"Password valid? {is_valid}")
# Show time comparison
import time
print("\n=== Speed Comparison ===")
test_password = "test123"
# Simple SHA-256
start = time.time()
for _ in range(100):
hashlib.sha256(test_password.encode()).hexdigest()
sha256_time = time.time() - start
# PBKDF2
start = time.time()
for _ in range(100):
hash_password_pbkdf2(test_password)
pbkdf2_time = time.time() - start
print(f"1000 SHA-256 hashes: {sha256_time:.3f} seconds")
print(f"1000 PBKDF2 hashes: {pbkdf2_time:.3f} seconds")
print(f"PBKDF2 is {pbkdf2_time/sha256_time:.1f}x slower")
Output:
=== Verification ===
Password valid? True
=== Speed Comparison ===
100 SHA-256 hashes: 0.000 seconds
100 PBKDF2 hashes: 53.631 seconds
PBKDF2 is 240068.1x slower
How PBKDF2 works:
Takes your password and salt
Applies the hash function (SHA-256) repeatedly – 600,000 times in this example
Each iteration makes the computation slower and harder to brute-force
You store the salt, hash, AND iteration count (so you can verify later)
The iteration count can be increased over time as computers get faster. Modern recommendations (2024) suggest 600,000 iterations for PBKDF2-SHA256.
Conclusion
You've learned how to implement secure password hashing in Python using the hashlib module. Here are the key takeaways:
Basic hashing with SHA-256 is useful for data integrity, not passwords
Salting prevents rainbow table attacks by making each hash unique
PBKDF2 adds computational cost through iterations, slowing down attackers
Always store the salt, hash, and iteration count together
Use key derivation functions (PBKDF2, bcrypt, Argon2) for passwords
The code examples in this tutorial provide a solid foundation for implementing authentication in your projects. But remember, security is an ongoing process. Stay updated on best practices and regularly review your security implementations.
Happy (secure) coding!