Hash Functions

#️⃣ SHA-256 Secure Hash Algorithm

Security Level 128-bit collision
Performance ⭐⭐⭐⭐ Very Good
Quantum Resistant ⚠️ Partial (about 128-bit preimage security against Grover's algorithm)
Standardization NIST FIPS 180-4
Output Size 32 bytes (256 bits)
Block Size 64 bytes (512 bits)

📖 Overview

SHA-256 (Secure Hash Algorithm 256-bit) is a cryptographic hash function that produces a 256-bit (32-byte) hash value. It belongs to the SHA-2 family, which was designed by the NSA and published by NIST. It is one of the most widely used hash functions in cryptography and underpins many security protocols and systems, including TLS/SSL, Bitcoin, and digital signatures.

✨ Key Features

📏

Fixed Output

Always produces exactly 256-bit (32-byte) hash values

🔒

One-Way Function

Computationally infeasible to reverse or find preimages

🌊

Avalanche Effect

Small input changes drastically change the output hash

💥

Collision Resistant

Extremely difficult to find two inputs with the same hash

🔄

Deterministic

Same input always produces the same output hash

📋

NIST Standard

Standardized in FIPS 180-4 and widely adopted

Hardware Support

Accelerated by SHA extensions on modern processors

🌐

Universal Adoption

Supported by virtually all cryptographic libraries
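
The fixed-output, deterministic, and avalanche properties listed above can be observed directly. The sketch below uses Python's standard `hashlib` as a stand-in so that it runs without `metamui_crypto` (an assumption made for portability); `sha256_digest` and `bit_difference` are hypothetical helper names.

```python
import hashlib

def sha256_digest(data: bytes) -> bytes:
    """SHA-256 via the standard library (stand-in for SHA256.hash)."""
    return hashlib.sha256(data).digest()

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"Hello, World!"
msg2 = b"Hello, World?"   # differs from msg1 in the last byte only

d1 = sha256_digest(msg1)
d2 = sha256_digest(msg2)

print(len(d1))                     # fixed output: 32 bytes for any input
print(d1 == sha256_digest(msg1))   # deterministic: same input, same hash
print(bit_difference(d1, d2))      # avalanche: typically close to 128 of 256 bits
```

For a well-behaved hash, about half of the output bits flip for any input change, so the last number should land near 128.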

🎯 Common Use Cases

🔐 Cryptographic Applications

  • Digital Signatures: Message digest for RSA and ECDSA (EdDSA hashes internally with SHA-512 or SHAKE256)
  • HMAC: Keyed-hash message authentication codes
  • Key Derivation: PBKDF2, HKDF, and other KDF functions
  • Certificate Fingerprints: X.509 certificate identification

🌐 System Applications

  • Blockchain: Bitcoin proof-of-work and transaction hashing
  • File Integrity: Checksums and integrity verification
  • Password Storage: Only as a building block of a dedicated scheme such as PBKDF2-HMAC-SHA256; plain or merely salted SHA-256 is not suitable
  • Data Deduplication: Content-based addressing

Algorithm Parameters

| Parameter | Value |
|---|---|
| Output Size | 256 bits (32 bytes) |
| Block Size | 512 bits (64 bytes) |
| State Size | 256 bits (8 × 32-bit words) |
| Rounds | 64 |
| Security Level | 128-bit collision, 256-bit preimage |

Usage Examples

Basic Hashing

from metamui_crypto import SHA256

# Hash a message
message = b"Hello, World!"
hash_value = SHA256.hash(message)
print(f"SHA-256: {hash_value.hex()}")
# Output: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f

# Hash a string
text = "The quick brown fox jumps over the lazy dog"
hash_value = SHA256.hash(text.encode('utf-8'))
print(f"SHA-256: {hash_value.hex()}")

Incremental Hashing

from metamui_crypto import SHA256

# Create hasher instance
hasher = SHA256.new()

# Update with data chunks
hasher.update(b"Hello, ")
hasher.update(b"World!")

# Get final hash
hash_value = hasher.finalize()
print(f"SHA-256: {hash_value.hex()}")

File Hashing

from metamui_crypto import SHA256

def hash_file(filepath, chunk_size=8192):
    """Hash a file using SHA-256"""
    hasher = SHA256.new()
    
    with open(filepath, 'rb') as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    
    return hasher.finalize()

# Hash a large file
file_hash = hash_file('large_file.bin')
print(f"File SHA-256: {file_hash.hex()}")

HMAC-SHA256

from metamui_crypto import SHA256, HMAC

# Create HMAC-SHA256
key = b"secret_key"
message = b"Message to authenticate"

# Method 1: Using HMAC class
# (named mac_ctx to avoid shadowing Python's standard hmac module)
mac_ctx = HMAC(key, SHA256)
mac_ctx.update(message)
mac = mac_ctx.finalize()
print(f"HMAC-SHA256: {mac.hex()}")

# Method 2: One-shot function
mac = HMAC.compute(key, message, SHA256)
print(f"HMAC-SHA256: {mac.hex()}")

Password Hashing

from metamui_crypto import SHA256
import os

# WARNING: Don't use plain SHA-256 for passwords!
# This is for demonstration only

def hash_password_wrong(password):
    """INSECURE: Don't do this!"""
    return SHA256.hash(password.encode())

def hash_password_slightly_better(password, salt):
    """Still not recommended - use Argon2 instead"""
    return SHA256.hash(salt + password.encode())

# Correct approach - use a proper password hashing function
from metamui_crypto import Argon2
def hash_password_correct(password):
    """Secure password hashing"""
    salt = os.urandom(16)
    return Argon2.hash(password, salt)

Implementation Details

SHA-256 Algorithm Steps

  1. Message Padding
    • Append a single ‘1’ bit to the message
    • Append ‘0’ bits until the length is ≡ 448 (mod 512)
    • Append the original message length in bits as a 64-bit big-endian integer
  2. Initialize Hash Values
    h0 = 0x6a09e667
    h1 = 0xbb67ae85
    h2 = 0x3c6ef372
    h3 = 0xa54ff53a
    h4 = 0x510e527f
    h5 = 0x9b05688c
    h6 = 0x1f83d9ab
    h7 = 0x5be0cd19
    
  3. Process Message in 512-bit Blocks
    • Expand 16 32-bit words to 64 words
    • Perform 64 rounds of compression
    • Update hash values
  4. Produce Final Hash
    • Concatenate h0 || h1 || h2 || h3 || h4 || h5 || h6 || h7
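
The padding step above can be sketched in a few lines. This is an illustrative helper working on whole bytes (the name `sha256_pad` is hypothetical and not part of `metamui_crypto`), so the ‘1’ bit followed by seven ‘0’ bits becomes the byte 0x80.

```python
def sha256_pad(message: bytes) -> bytes:
    """Return message plus SHA-256 padding (result length is a multiple of 64 bytes)."""
    bit_length = len(message) * 8
    # Step 1: a single '1' bit followed by seven '0' bits (byte 0x80)
    padded = message + b"\x80"
    # Step 2: zero bytes until the length is 56 mod 64 bytes (448 mod 512 bits)
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Step 3: original length in bits as a 64-bit big-endian integer
    padded += bit_length.to_bytes(8, "big")
    return padded

padded = sha256_pad(b"abc")
print(len(padded))         # 64
print(padded[:4].hex())    # 61626380
print(padded[-8:].hex())   # 0000000000000018 (24 bits)
```

A 55-byte message still fits in one block, while a 56-byte message needs a second block because the 0x80 byte and the 8-byte length no longer fit.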

Core Operations

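# Logical functions used in SHA-256 (all inputs are 32-bit words)
MASK32 = 0xFFFFFFFF

def ROTR(x, n):
    """Rotate a 32-bit word right by n bits"""
    return ((x >> n) | (x << (32 - n))) & MASK32

def Ch(x, y, z):
    # Choose: for each bit, take y where x is 1 and z where x is 0
    return (x & y) ^ (~x & z)

def Maj(x, y, z):
    # Majority: each bit is the value held by at least two of x, y, z
    return (x & y) ^ (x & z) ^ (y & z)

def Σ0(x):
    return ROTR(x, 2) ^ ROTR(x, 13) ^ ROTR(x, 22)

def Σ1(x):
    return ROTR(x, 6) ^ ROTR(x, 11) ^ ROTR(x, 25)

def σ0(x):
    return ROTR(x, 7) ^ ROTR(x, 18) ^ (x >> 3)

def σ1(x):
    return ROTR(x, 17) ^ ROTR(x, 19) ^ (x >> 10)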

Message Schedule

# Expand the 16 words of one 64-byte chunk of the padded message (`block`) to 64 words
W = [0] * 64
for i in range(16):
    W[i] = int.from_bytes(block[i*4:(i+1)*4], 'big')

for i in range(16, 64):
    W[i] = (σ1(W[i-2]) + W[i-7] + σ0(W[i-15]) + W[i-16]) & 0xFFFFFFFF
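
The padding, message schedule, and 64 compression rounds described above can be combined into a complete hash. The following is a minimal pure-Python sketch for study, not the `metamui_crypto` implementation; helper names such as `icbrt`, `first_primes`, and `sha256_pure` are hypothetical. Instead of hardcoding tables, it derives the initial hash values and round constants from their FIPS 180-4 definitions (the first 32 fractional bits of the square roots of the first 8 primes and of the cube roots of the first 64 primes) and compares the result with Python's `hashlib`.

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF

def icbrt(n):
    """Integer cube root: the largest r with r**3 <= n (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

PRIMES = first_primes(64)
# First 32 fractional bits of sqrt(p) for the first 8 primes (h0..h7)
H_INIT = [isqrt(p << 64) & MASK for p in PRIMES[:8]]
# First 32 fractional bits of cbrt(p) for the first 64 primes (round constants)
K = [icbrt(p << 96) & MASK for p in PRIMES]

def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def sha256_pure(message):
    """Hash `message` (bytes) and return the 32-byte digest."""
    # Padding: 0x80, zero bytes up to 56 mod 64, then the bit length
    bit_len = len(message) * 8
    data = message + b"\x80" + b"\x00" * ((55 - len(message)) % 64)
    data += bit_len.to_bytes(8, "big")
    state = list(H_INIT)
    for off in range(0, len(data), 64):
        block = data[off:off + 64]
        # Message schedule: 16 block words expanded to 64
        w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
        for i in range(16, 64):
            s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
            s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
            w.append((s1 + w[i - 7] + s0 + w[i - 16]) & MASK)
        # 64 rounds of compression
        a, b, c, d, e, f, g, h = state
        for i in range(64):
            big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
            big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
        # Add the working variables back into the running hash values
        state = [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]
    return b"".join(x.to_bytes(4, "big") for x in state)

print(sha256_pure(b"abc").hex())
print(sha256_pure(b"abc") == hashlib.sha256(b"abc").digest())
```

A sketch like this is far slower than a native implementation and is only meant to make the data flow visible.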

Security Considerations

Collision Resistance

  • Birthday Attack: ~2^128 operations to find collision
  • No Known Collisions: Unlike SHA-1, no collisions found
  • Theoretical Security: 128-bit security against collisions
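
The 2^128 figure follows from the birthday bound: among roughly 2^(n/2) random n-bit values, a repeated value becomes likely. This cannot be demonstrated on the full 256-bit output, but it can on a truncated one. The sketch below is an illustration using Python's `hashlib` (`find_truncated_collision` is a hypothetical helper); it searches for two inputs whose SHA-256 digests share the same first 24 bits, which takes on the order of 2^12 attempts rather than 2^24.

```python
import hashlib
from itertools import count

def truncated(data: bytes, nbytes: int) -> bytes:
    """First nbytes of the SHA-256 digest of data."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_truncated_collision(nbytes: int = 3):
    """Return (msg_a, msg_b, attempts) where both messages share a truncated hash."""
    seen = {}  # truncated digest -> message that produced it
    for i in count():
        msg = str(i).encode()
        tag = truncated(msg, nbytes)
        if tag in seen:
            return seen[tag], msg, i + 1
        seen[tag] = msg

msg_a, msg_b, attempts = find_truncated_collision()
print(msg_a, msg_b, attempts)
print(truncated(msg_a, 3).hex(), truncated(msg_b, 3).hex())
```

Each extra byte of output multiplies the expected work by about 16, which is why the same search against all 32 bytes is out of reach.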

Preimage Resistance

  • First Preimage: 2^256 operations to find input for given hash
  • Second Preimage: 2^256 operations to find different input with same hash

Length Extension Attacks

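SHA-256 is vulnerable to length extension attacks: given SHA256(secret || data) and the length of secret || data, an attacker can compute SHA256(secret || data || padding || suffix) for any suffix without knowing the secret: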

# Vulnerable pattern - DON'T DO THIS
def create_token(secret, data):
    return SHA256.hash(secret + data)

# Attacker can create valid tokens without knowing secret!
# Use HMAC instead:
def create_token_secure(secret, data):
    return HMAC.compute(secret, data, SHA256)
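
To make the attack concrete, the sketch below forges a valid hash for an extended message using only the original token and the length of the secret (which an attacker can usually find by trying a few values). It needs a SHA-256 whose internal state can be set, so it uses a small pure-Python implementation written for this illustration; all names (`compress`, `padding`, `sha256`, the sample secret and data) are hypothetical, and the round constants are derived from their FIPS 180-4 definitions rather than hardcoded.

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF

def icbrt(n):
    """Integer cube root: the largest r with r**3 <= n (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

PRIMES = first_primes(64)
H_INIT = [isqrt(p << 64) & MASK for p in PRIMES[:8]]   # initial hash values
K = [icbrt(p << 96) & MASK for p in PRIMES]            # round constants

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    """Apply the SHA-256 compression function to one 64-byte block."""
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((s1 + w[i - 7] + s0 + w[i - 16]) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def padding(total_len):
    """SHA-256 padding for a message of total_len bytes."""
    return b"\x80" + b"\x00" * ((55 - total_len) % 64) + (total_len * 8).to_bytes(8, "big")

def sha256(data, state=None, prior_len=0):
    """Hash data, optionally resuming from state after prior_len hashed bytes."""
    state = list(state) if state else list(H_INIT)
    data += padding(prior_len + len(data))
    for off in range(0, len(data), 64):
        state = compress(state, data[off:off + 64])
    return b"".join(x.to_bytes(4, "big") for x in state)

# Server side: the vulnerable token = SHA256(secret || data)
secret, data = b"server-secret", b"user=alice"
token = sha256(secret + data)

# Attacker side: knows token, data, and len(secret), but not secret itself
glue = padding(len(secret) + len(data))          # padding the server's hash used
suffix = b"&admin=true"
resumed = [int.from_bytes(token[i:i + 4], "big") for i in range(0, 32, 4)]
forged = sha256(suffix, resumed, len(secret) + len(data) + len(glue))

# The forged value is the genuine hash of secret || data || glue || suffix
print(forged == hashlib.sha256(secret + data + glue + suffix).digest())
```

The attack works because the digest is exactly the internal state after the last block. HMAC hashes the inner digest a second time under the key, so the state needed to resume is never exposed.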

Timing Attacks

When comparing hashes, use constant-time comparison:

import hmac

def verify_hash(computed, expected):
    """Constant-time hash comparison"""
    return hmac.compare_digest(computed, expected)

# Bad: Vulnerable to timing attacks
# return computed == expected

Performance Characteristics

Speed Benchmarks

| Platform | Speed (MB/s) | Hardware Acceleration |
|---|---|---|
| x86-64 (Software) | 250-300 | No |
| x86-64 (SHA Extensions) | 2000-3000 | Yes |
| ARM64 (Software) | 200-250 | No |
| ARM64 (Crypto Extensions) | 1500-2000 | Yes |
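
Throughput figures like these depend heavily on the CPU, the library, and how it was built, so it is worth measuring on the target machine. The sketch below is a rough micro-benchmark using Python's `hashlib` as a stand-in (an assumption; `measure_throughput` is a hypothetical helper), and the same loop could wrap any incremental hasher.

```python
import hashlib
import time

def measure_throughput(num_chunks: int = 16, chunk_size: int = 1 << 20) -> float:
    """Return SHA-256 throughput in MiB/s for num_chunks chunks of zero bytes."""
    chunk = bytes(chunk_size)
    hasher = hashlib.sha256()
    start = time.perf_counter()
    for _ in range(num_chunks):
        hasher.update(chunk)
    hasher.digest()
    elapsed = time.perf_counter() - start
    return num_chunks * chunk_size / (1 << 20) / elapsed

print(f"SHA-256: {measure_throughput():.0f} MiB/s")
```

A single short run is noisy; repeating the measurement and taking the best result gives a more stable number.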

Optimization Techniques

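# Batch hashing: convenience wrapper that hashes messages one after another
def hash_many(messages):
    """Hash multiple messages"""
    return [SHA256.hash(msg) for msg in messages]

# Parallel hashing for large datasets.
# Threads only speed this up if SHA256.hash releases Python's GIL while hashing;
# otherwise consider ProcessPoolExecutor.
from concurrent.futures import ThreadPoolExecutor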

def parallel_hash(messages, workers=4):
    """Hash messages in parallel"""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(SHA256.hash, messages))

Common Use Cases

1. File Integrity Verification

import json
from pathlib import Path

def create_checksum_file(directory):
    """Create SHA-256 checksums for all files"""
    checksums = {}
    
    for file_path in Path(directory).rglob('*'):
        if file_path.is_file():
            hash_value = hash_file(file_path)
            checksums[str(file_path)] = hash_value.hex()
    
    with open('checksums.json', 'w') as f:
        json.dump(checksums, f, indent=2)

def verify_checksums(checksum_file):
    """Verify file integrity using checksums"""
    with open(checksum_file, 'r') as f:
        checksums = json.load(f)
    
    for file_path, expected_hash in checksums.items():
        if not Path(file_path).exists():
            # Report missing files instead of silently skipping them
            print(f"MISSING: {file_path}")
            continue
        actual_hash = hash_file(file_path).hex()
        if actual_hash != expected_hash:
            print(f"MISMATCH: {file_path}")
        else:
            print(f"OK: {file_path}")

2. Digital Signatures

from metamui_crypto import SHA256, Ed25519

def sign_document(document, private_key):
    """Sign document using SHA-256 hash"""
    # Hash the document first
    doc_hash = SHA256.hash(document)
    
    # Sign the hash
    signature = Ed25519.sign(doc_hash, private_key)
    
    return signature

def verify_document(document, signature, public_key):
    """Verify document signature"""
    # Hash the document
    doc_hash = SHA256.hash(document)
    
    # Verify signature on hash
    return Ed25519.verify(signature, doc_hash, public_key)

3. Blockchain and Proof of Work

import struct
import time

from metamui_crypto import SHA256

def proof_of_work(data, difficulty=4):
    """Simple proof of work using SHA-256"""
    target = '0' * difficulty
    nonce = 0
    
    start_time = time.time()
    
    while True:
        # Create block with nonce
        block = data + struct.pack('<Q', nonce)
        
        # Hash the block
        hash_value = SHA256.hash(block)
        hash_hex = hash_value.hex()
        
        # Check if hash meets difficulty
        if hash_hex.startswith(target):
            elapsed = time.time() - start_time
            return {
                'nonce': nonce,
                'hash': hash_hex,
                'time': elapsed,
                'attempts': nonce + 1
            }
        
        nonce += 1

4. Key Derivation

from metamui_crypto import SHA256

def derive_key_simple(password, salt, iterations=100000):
    """Simple key derivation (use PBKDF2 or Argon2 in practice)"""
    derived = salt + password.encode()
    
    for _ in range(iterations):
        derived = SHA256.hash(derived)
    
    return derived

# Better approach using PBKDF2
from metamui_crypto import PBKDF2

def derive_key_proper(password, salt, iterations=100000):
    """Proper key derivation"""
    return PBKDF2.derive(
        password=password.encode(),
        salt=salt,
        iterations=iterations,
        key_length=32,
        hash_function=SHA256
    )
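
For context on what PBKDF2 adds over the simple loop, the sketch below computes one 32-byte output block of PBKDF2-HMAC-SHA256 by hand and compares it with the standard library's `hashlib.pbkdf2_hmac`. It uses Python's `hmac` and `hashlib` rather than `metamui_crypto` (an assumption made so it runs anywhere); `pbkdf2_sha256_block` and the sample password and salt are hypothetical.

```python
import hashlib
import hmac

def pbkdf2_sha256_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First 32-byte block of PBKDF2-HMAC-SHA256."""
    # U_1 = HMAC(password, salt || INT(1)), where INT(1) is a 4-byte block index
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    result = int.from_bytes(u, "big")
    # U_i = HMAC(password, U_{i-1}); the block is the XOR of all U_i
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(32, "big")

password, salt = b"correct horse", b"0123456789abcdef"
manual = pbkdf2_sha256_block(password, salt, 1000)
reference = hashlib.pbkdf2_hmac("sha256", password, salt, 1000, dklen=32)
print(manual == reference)
```

Unlike the plain loop, every iteration is keyed with the password through HMAC and all intermediate values are XORed into the result, so the output depends on the whole chain rather than only its last link.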

Comparison with Other Hash Functions

| Algorithm | Output Size | Speed | Security | Use Case |
|---|---|---|---|---|
| SHA-256 | 256 bits | Medium | High | General purpose |
| SHA-512 | 512 bits | Fast (64-bit) | Very High | High security |
| SHA-1 | 160 bits | Fast | Broken | Legacy only |
| MD5 | 128 bits | Very Fast | Broken | Non-cryptographic |
| Blake2b | Variable | Faster | High | Modern alternative |
| Blake3 | Variable | Fastest | High | Performance critical |
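
The output sizes in the table can be checked against Python's standard `hashlib`, used here only as a convenient reference (an assumption; it is not the `metamui_crypto` API, and Blake3 is not part of the standard library, so it is omitted).

```python
import hashlib

names = ("sha256", "sha512", "sha1", "md5", "blake2b")
# digest_size is reported in bytes; convert to bits for comparison with the table
sizes = {name: hashlib.new(name).digest_size * 8 for name in names}

for name, bits in sizes.items():
    print(f"{name:8s} {bits} bits")

# Blake2b has a variable output length; 64 bytes is only the default
short = hashlib.blake2b(b"abc", digest_size=32)
print(short.digest_size * 8)   # 256
```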

Migration Guide

From MD5 or SHA-1

# Old code using MD5 (INSECURE)
import hashlib
def old_hash(data):
    return hashlib.md5(data).hexdigest()

# Migrate to SHA-256
from metamui_crypto import SHA256
def new_hash(data):
    return SHA256.hash(data).hex()

# Migration with backward compatibility
def migrate_hash(data, algorithm='sha256'):
    if algorithm == 'md5':
        # Warning: MD5 is broken!
        import hashlib
        return hashlib.md5(data).hexdigest()
    elif algorithm == 'sha1':
        # Warning: SHA-1 is broken!
        import hashlib
        return hashlib.sha1(data).hexdigest()
    else:
        return SHA256.hash(data).hex()

Upgrading Hash Storage

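import hmac
import os

from metamui_crypto import SHA256, Argon2

def upgrade_password_hashes(user_db):
    """Flag legacy SHA-256 password hashes for upgrade to Argon2"""
    for user in user_db:
        if user['hash_algorithm'] == 'sha256':
            # Mark for upgrade on next login
            user['needs_rehash'] = True

# On login:
def verify_and_upgrade(password, user):
    """Verify a legacy SHA-256 hash and upgrade it to Argon2 on success"""
    if user['hash_algorithm'] == 'sha256':
        # Verify old hash (constant-time comparison)
        old_hash = SHA256.hash(password.encode()).hex()
        if hmac.compare_digest(old_hash, user['password_hash']):
            # Upgrade to Argon2 with a fresh random salt
            salt = os.urandom(16)
            user['password_hash'] = Argon2.hash(password, salt)
            user['hash_algorithm'] = 'argon2'
            user['needs_rehash'] = False
            return True
    # Not a legacy SHA-256 hash, or verification failed
    return False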

Best Practices

  1. Don’t Use for Passwords: Use Argon2 or bcrypt instead
  2. Use HMAC for MACs: Don’t use hash(secret + data)
  3. Verify Implementations: Test against known vectors
  4. Consider Alternatives: Blake3 for speed, SHA-512 for security
  5. Handle Binary Safely: Use hex or base64 for text representation
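
As an illustration of the last point, a digest is raw binary and should be encoded before it is stored or transmitted as text. The sketch below uses Python's standard `hashlib` and `base64` (an assumption; the same encodings apply to the bytes returned by any SHA-256 implementation).

```python
import base64
import hashlib

digest = hashlib.sha256(b"abc").digest()              # 32 raw bytes

hex_text = digest.hex()                               # 64 hexadecimal characters
b64_text = base64.b64encode(digest).decode("ascii")   # 44 characters including padding

# Both encodings round-trip back to the original bytes
print(bytes.fromhex(hex_text) == digest)
print(base64.b64decode(b64_text) == digest)
print(len(hex_text), len(b64_text))                   # 64 44
```

Hex is easier to read and compare by eye, while base64 is more compact.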

Test Vectors

from metamui_crypto import SHA256

# NIST test vectors
test_vectors = [
    {
        "message": b"abc",
        "hash": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
    },
    {
        "message": b"",
        "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    },
    {
        "message": b"abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq",
        "hash": "248d6a61d20638b8e5c026930c3e6039a33ce45964ff2167f6ecedd419db06c1"
    }
]

# Verify implementation
for vector in test_vectors:
    result = SHA256.hash(vector["message"]).hex()
    assert result == vector["hash"], f"Test failed for: {vector['message']}"
    print(f"✓ Test passed: {vector['message'][:20]}...")

Common Pitfalls

1. Using for Password Storage

# WRONG - Unsalted and fast: vulnerable to rainbow tables and brute force
password_hash = SHA256.hash(password.encode()).hex()

# CORRECT - Use password hashing function
import os
from metamui_crypto import Argon2
salt = os.urandom(16)
password_hash = Argon2.hash(password, salt)

2. Length Extension Vulnerability

# WRONG - Vulnerable to length extension
mac = SHA256.hash(secret + message)

# CORRECT - Use HMAC
mac = HMAC.compute(secret, message, SHA256)
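
For context on why the HMAC form is safe, the sketch below builds HMAC-SHA256 by hand from two nested hashes, following RFC 2104, and checks it against Python's standard `hmac` module. It uses `hashlib` rather than `metamui_crypto` (an assumption made so it runs anywhere); `hmac_sha256` and the sample key and message are hypothetical.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 built from two nested SHA-256 calls."""
    # Keys longer than one block are hashed first; shorter keys are zero-padded
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")
    inner_key = bytes(b ^ 0x36 for b in key)   # key XOR ipad
    outer_key = bytes(b ^ 0x5C for b in key)   # key XOR opad
    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()

key, message = b"secret_key", b"Message to authenticate"
print(hmac_sha256(key, message) == hmac.new(key, message, hashlib.sha256).digest())
```

The outer hash covers only a fixed-length inner digest, so extending the inner message does not yield a valid tag.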

3. Comparing Hashes Insecurely

# WRONG - Timing attack
if computed_hash == expected_hash:
    return True

# CORRECT - Constant time comparison
import hmac
if hmac.compare_digest(computed_hash, expected_hash):
    return True

Resources