BLAKE3

High-Speed Parallel Cryptographic Hash Function


Overview

BLAKE3 is a cryptographic hash function designed by Jack O’Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O’Hearn. It is the successor to BLAKE2 and provides significantly higher throughput by exploiting a Merkle tree structure that enables parallel computation at every level: SIMD instructions within a single core, multi-threaded hashing across cores, and GPU batch processing.

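BLAKE3 supports three modes of operation: standard hashing (hash), keyed hashing for message authentication (keyed_hash), and key derivation (derive_key).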

The core compression function processes 64-byte blocks using a 16-word state with 7 rounds (compared to 10 in BLAKE2s). The tree structure splits input into 1024-byte chunks, compresses each chunk independently, then merges the chunk results pairwise in a binary tree. Because chunks do not depend on each other, and neither do sibling subtrees, every stage of this design can be parallelized.
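The chunking and tree split described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the left-subtree rule from the published BLAKE3 specification (the left side takes the largest power-of-two number of chunks that still leaves at least one byte for the right side). The function names are illustrative and are not taken from the MetaMUI API.

```rust
/// Chunk size in bytes, as stated in the text above.
const CHUNK_LEN: usize = 1024;

/// Number of chunks an input of `len` bytes occupies.
/// An empty input still occupies one (empty) chunk.
fn chunk_count(len: usize) -> usize {
    if len == 0 {
        1
    } else {
        (len + CHUNK_LEN - 1) / CHUNK_LEN
    }
}

/// Bytes assigned to the left subtree when an input longer than one chunk
/// is split in two. The left side gets the largest power-of-two number of
/// chunks that still leaves at least one byte for the right side.
fn left_len(len: usize) -> usize {
    assert!(len > CHUNK_LEN, "a single chunk is never split");
    // Chunks that are guaranteed full while leaving >= 1 byte on the right.
    let full_chunks = (len - 1) / CHUNK_LEN;
    // Largest power of two <= full_chunks (full_chunks >= 1 here).
    let pow2 = 1usize << (usize::BITS - 1 - full_chunks.leading_zeros());
    pow2 * CHUNK_LEN
}
```

For example, a 5120-byte input occupies five chunks; the left subtree covers the first 4096 bytes (four chunks) and the right subtree is the single remaining chunk. Each subtree is then split recursively by the same rule.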


Specifications

Property                 Value
Output size              256 bits default (arbitrary via XOF)
Block size               64 bytes
Chunk size               1024 bytes (16 blocks)
Rounds                   7 per compression
State words              16 x 32-bit
Key size (keyed mode)    256 bits
Internal security        128 bits (birthday bound for 256-bit output)
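The "Rounds" and "State words" entries can be made concrete with a sketch of the round structure. The code below follows the mixing function and message permutation from the published BLAKE3 specification; it is an illustration only, not MetaMUI's implementation. State initialization, counters, flags, and the final feed-forward XOR are deliberately omitted, and the function names are assumptions made for this example.

```rust
/// Message word permutation applied between rounds (BLAKE3 specification).
const MSG_PERMUTATION: [usize; 16] = [2, 6, 3, 10, 7, 0, 4, 13, 1, 11, 12, 5, 9, 14, 15, 8];

/// Quarter-round mixing function: updates four state words using two message words.
fn g(s: &mut [u32; 16], a: usize, b: usize, c: usize, d: usize, mx: u32, my: u32) {
    s[a] = s[a].wrapping_add(s[b]).wrapping_add(mx);
    s[d] = (s[d] ^ s[a]).rotate_right(16);
    s[c] = s[c].wrapping_add(s[d]);
    s[b] = (s[b] ^ s[c]).rotate_right(12);
    s[a] = s[a].wrapping_add(s[b]).wrapping_add(my);
    s[d] = (s[d] ^ s[a]).rotate_right(8);
    s[c] = s[c].wrapping_add(s[d]);
    s[b] = (s[b] ^ s[c]).rotate_right(7);
}

/// One round: four independent column mixes, then four independent diagonal mixes.
fn round(s: &mut [u32; 16], m: &[u32; 16]) {
    g(s, 0, 4, 8, 12, m[0], m[1]);
    g(s, 1, 5, 9, 13, m[2], m[3]);
    g(s, 2, 6, 10, 14, m[4], m[5]);
    g(s, 3, 7, 11, 15, m[6], m[7]);
    g(s, 0, 5, 10, 15, m[8], m[9]);
    g(s, 1, 6, 11, 12, m[10], m[11]);
    g(s, 2, 7, 8, 13, m[12], m[13]);
    g(s, 3, 4, 9, 14, m[14], m[15]);
}

/// Seven rounds over the 16-word state, permuting the 16 message words
/// (one 64-byte block) between rounds.
fn seven_rounds(state: &mut [u32; 16], block_words: &[u32; 16]) {
    let mut m = *block_words;
    for r in 0..7 {
        round(state, &m);
        if r < 6 {
            let mut permuted = [0u32; 16];
            for i in 0..16 {
                permuted[i] = m[MSG_PERMUTATION[i]];
            }
            m = permuted;
        }
    }
}
```

The four column mixes in a round touch disjoint state words, as do the four diagonal mixes, which is what allows SIMD implementations to run them side by side.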

Modes:

Mode                             Input                            Description
hash(data)                       Arbitrary bytes                  Standard cryptographic hash
keyed_hash(key, data)            256-bit key + arbitrary bytes    MAC construction
derive_key(context, material)    Context string + key material    KDF via context separation
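Internally, the three modes are kept apart by domain-separation flag bits that are fed into every compression. The sketch below lists the flag values from the published BLAKE3 specification; the `Mode` enum and the helper functions are hypothetical names introduced for illustration and are not part of any documented MetaMUI API. Note that derive_key runs in two phases: the context string is hashed first, and its output keys the hashing of the key material.

```rust
// Domain-separation flag bits (values from the BLAKE3 specification).
const CHUNK_START: u32 = 1 << 0;
const CHUNK_END: u32 = 1 << 1;
const PARENT: u32 = 1 << 2;
const ROOT: u32 = 1 << 3;
const KEYED_HASH: u32 = 1 << 4;
const DERIVE_KEY_CONTEXT: u32 = 1 << 5;
const DERIVE_KEY_MATERIAL: u32 = 1 << 6;

/// Hypothetical enum naming the modes; derive_key is split into its two phases.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Hash,
    KeyedHash,
    DeriveKeyContext,
    DeriveKeyMaterial,
}

/// Flag bits that a mode contributes to every compression it performs.
fn mode_flags(mode: Mode) -> u32 {
    match mode {
        Mode::Hash => 0,
        Mode::KeyedHash => KEYED_HASH,
        Mode::DeriveKeyContext => DERIVE_KEY_CONTEXT,
        Mode::DeriveKeyMaterial => DERIVE_KEY_MATERIAL,
    }
}

/// Flags for an input that fits in a single block: that block starts the
/// chunk, ends the chunk, and is also the root of the tree.
fn single_block_flags(mode: Mode) -> u32 {
    mode_flags(mode) | CHUNK_START | CHUNK_END | ROOT
}

/// Flags for an interior (parent) node of the tree.
fn parent_flags(mode: Mode, is_root: bool) -> u32 {
    mode_flags(mode) | PARENT | if is_root { ROOT } else { 0 }
}
```

Because the mode bits differ, the same input bytes produce unrelated outputs under hash, keyed_hash, and derive_key, even when the key happens to equal the default initialization vector.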

Security


Hardware Acceleration

BLAKE3 has the broadest hardware acceleration coverage in the MetaMUI suite, spanning CPU SIMD, GPU compute, and WebAssembly SIMD.

CPU SIMD

Acceleration    Parallelism          Description
AVX-512         16-block batch       Compresses 16 blocks simultaneously using 512-bit registers
AVX-2           8-block batch        Compresses 8 blocks simultaneously using 256-bit registers
NEON            Block compression    SIMD-accelerated compression function on ARM
WASM SIMD128    Block compression    128-bit SIMD in WebAssembly environments
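Choosing among these backends is typically done once at startup. The following is a minimal sketch of such a dispatch using the Rust standard library's runtime feature-detection macros; the function name and the returned labels are hypothetical, and how MetaMUI actually selects a backend is not described in this document. WebAssembly has no runtime detection, so SIMD128 is selected at compile time here.

```rust
/// Returns a label for the widest SIMD backend usable on the current machine.
/// Hypothetical helper for illustration only.
fn detect_backend() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // AVX-512 implementations generally need both the foundation (F)
        // and vector-length (VL) extensions.
        if std::arch::is_x86_feature_detected!("avx512f")
            && std::arch::is_x86_feature_detected!("avx512vl")
        {
            return "avx512";
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return "neon";
        }
    }
    #[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
    {
        return "wasm-simd128";
    }
    // Fallback when no accelerated path applies.
    "portable"
}
```

Detection happens at run time on x86 and ARM, so a single binary can ship all code paths and still run on older CPUs through the portable fallback.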

GPU Compute

Acceleration    Target           Operations
Apple Metal     macOS/iOS GPU    compress_blocks_simd (batch block compression); process_chunks_tile (parallel chunk processing); tree_merge (Merkle tree merge)
CUDA            NVIDIA GPU       Batch hashing with multiple CUDA streams for throughput

Parallelism Model

BLAKE3’s tree structure enables parallelism at three levels:

  1. Intra-block: SIMD instructions parallelize the quarter-round operations within a single compression
  2. Inter-chunk: Independent 1024-byte chunks can be compressed on separate threads or GPU work items
  3. Tree merge: Parent nodes at the same level of the Merkle tree are independent of each other, so each level can be computed in parallel once the level below it is complete

This makes BLAKE3 particularly well-suited for hashing large inputs (files, streams) and for batch hashing many small inputs (e.g., verifying a set of transaction hashes).
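The inter-chunk level of this model can be sketched with scoped threads from the Rust standard library. The per-chunk function below is a stand-in (64-bit FNV-1a), chosen only so the example is self-contained; it is not the BLAKE3 compression function and is not cryptographic. The function names and the worker-count parameter are assumptions made for illustration.

```rust
use std::thread;

const CHUNK_LEN: usize = 1024;

/// Placeholder per-chunk digest (64-bit FNV-1a). NOT BLAKE3, NOT cryptographic.
fn placeholder_chunk_digest(chunk: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in chunk {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Baseline: process the 1024-byte chunks one after another.
fn digests_sequential(input: &[u8]) -> Vec<u64> {
    input.chunks(CHUNK_LEN).map(placeholder_chunk_digest).collect()
}

/// Process chunks on up to `workers` threads. Chunk order is preserved
/// because results are joined in the order the groups were spawned.
fn digests_parallel(input: &[u8], workers: usize) -> Vec<u64> {
    let chunks: Vec<&[u8]> = input.chunks(CHUNK_LEN).collect();
    if chunks.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    let per_worker = (chunks.len() + workers - 1) / workers;
    thread::scope(|s| {
        let handles: Vec<_> = chunks
            .chunks(per_worker)
            .map(|group| {
                s.spawn(move || {
                    group
                        .iter()
                        .map(|&c| placeholder_chunk_digest(c))
                        .collect::<Vec<u64>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<u64>>()
    })
}
```

The parallel and sequential versions return identical results because no chunk depends on any other. In a real implementation the per-chunk outputs would then feed the tree merge rather than being returned as a flat list.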


Platform Support

BLAKE3 is implemented across all 10 platforms with SIMD acceleration where available:

Platform        Language      SIMD Support            Implementation Path
Native          C             AVX-512, AVX-2          metamui-crypto-c/
Systems         Rust          AVX-512, AVX-2, NEON    metamui-crypto-rust/
Backend         Go            Portable + assembly     metamui-crypto-go/
Data Science    Python        Via C bindings          metamui-crypto-python/
JVM             Java          Portable                metamui-crypto-java/
JVM/Android     Kotlin        Portable                metamui-crypto-kotlin/
.NET            C#            Portable                metamui-crypto-csharp/
Apple           Swift         NEON, Metal             metamui-crypto-swift/
Web             TypeScript    WASM SIMD128            metamui-crypto-typescript/
Browser/Edge    WASM          SIMD128                 metamui-crypto-wasm/

API Example

// Standard hash
let hash = blake3::hash(b"input data");

// Keyed hash (MAC). The key must be exactly 32 bytes; this value is a placeholder.
let key: [u8; 32] = [0x42; 32];
let mac = blake3::keyed_hash(&key, b"authenticated data");

// Key derivation. The context string should be hardcoded and unique to the application.
let key_material: &[u8] = b"example key material";
let derived_key = blake3::derive_key("metamui-crypto 2024 session key", key_material);

// Incremental hashing (streaming)
let mut hasher = blake3::Hasher::new();
hasher.update(b"first chunk");
hasher.update(b"second chunk");
let hash = hasher.finalize();

// Extended output (XOF): read 64 bytes instead of the default 32
let mut output = [0u8; 64];
hasher.finalize_xof().fill(&mut output);

Test Vectors


References

  1. BLAKE3 Specification — O’Connor, J., Aumasson, J.-P., Neves, S., Wilcox-O’Hearn, Z. BLAKE3: One function, fast everywhere. https://github.com/BLAKE3-team/BLAKE3-specs/blob/master/blake3.pdf
  2. BLAKE3 Reference Implementation. https://github.com/BLAKE3-team/BLAKE3
  3. blake3.io — Official website. https://blake3.io/