Falcon-512/1024

Fast Fourier Lattice-based Compact Signatures over NTRU


Overview

Falcon is a post-quantum digital signature algorithm selected by NIST for standardization. It produces the smallest signatures among NIST PQC signature candidates while maintaining strong security guarantees. Falcon is based on the theoretical framework of Gentry, Peikert, and Vaikuntanathan (GPV) applied to NTRU lattices, using fast Fourier sampling (ffsampling) over the reals to produce short lattice vectors as signatures.

Falcon is the showcase algorithm in the MetaMUI suite, with the most extensive SIMD optimizations, cross-language verification, and hardware acceleration support.

How it works:

  1. KeyGen — Generate an NTRU key pair (f, g, F, G) satisfying fG - gF = q mod X^n+1, then derive the public key h = g/f mod q.
  2. Sign — Hash the message to a point in Z_q[X]/(X^n+1), then use ffsampling with the secret key’s LDL tree to find a short vector (s1, s2) such that s1 + s2*h = hash(msg) mod q.
  3. Verify — Recompute s1 = c - s2*h mod q from the signature’s s2 component, then check that ||(s1, s2)||^2 is below the acceptance bound.

Specifications

Parameter Set NIST Level Public Key (bytes) Secret Key (bytes) Signature (bytes) Degree n
Falcon-512 1 897 1281 ~666 (compressed) 512
Falcon-1024 5 1793 2305 ~1280 (compressed) 1024

NIST encoding headers:

Component Header Byte Falcon-512 Falcon-1024
Public Key 0x00 \| logn 0x09 0x0A
Secret Key 0x50 \| logn 0x59 0x5A
Signature 0x30 \| logn 0x39 0x3A

Signatures use compressed encoding: the s2 polynomial coefficients are Huffman-coded, yielding variable-length signatures that average ~666 bytes for Falcon-512 and ~1280 bytes for Falcon-1024. The signer retries with fresh randomness if the compressed signature exceeds the maximum size.


Security


Hardware Acceleration

Falcon has the most extensive hardware acceleration in the MetaMUI suite, covering NTT, FFT, and batch operations across multiple instruction sets and GPU compute.

SIMD Dispatch Order

The runtime selects the best available backend: Metal GPU > AVX-512 > AVX-2 > NEON > Portable

Instruction Set Details

Acceleration Parallelism Operations Implementation
AVX-512 32-way butterfly NTT forward/inverse, FFT metamui-crypto-rust/metamui-falcon512/src/simd/avx512_ntt.rs, avx512_fft.rs
AVX-2 16-way butterfly NTT forward/inverse, FFT metamui-crypto-rust/metamui-falcon512/src/simd/avx2_ntt.rs, avx2_fft.rs
NEON 8-way butterfly NTT forward/inverse, FFT metamui-crypto-rust/metamui-falcon512/src/simd/neon_ntt.rs, neon_fft.rs
Apple Metal GPU parallel NTT forward/inverse, pointwise multiply, batch verify metamui-crypto-rust/metamui-falcon512/src/metal/
Portable Scalar All operations (fallback) metamui-crypto-rust/metamui-falcon512/src/simd/portable.rs, portable_fft.rs

Fused Operations

Fused multiply-add and multiply-subtract operations reduce memory traffic and improve throughput for the polynomial arithmetic that dominates signing:

These are implemented in portable, NEON, AVX-2, and AVX-512 variants.

ExpandedKey Caching

The ExpandedKey structure pre-computes the LDL tree decomposition from the secret key once, then reuses it across all signing retries. This avoids redundant tree computation on each attempt, which is significant because Falcon’s sign-with-retry loop may execute multiple iterations before producing a signature within the norm bound.

Apple Metal GPU

Metal compute shaders provide GPU-accelerated batch operations:

Shader source: metamui-crypto-rust/metamui-falcon512/src/metal/shaders/

Constant-Time Audit

All SIMD paths (AVX-512, AVX-2, NEON, portable) use branchless masking throughout and have been audited for constant-time behavior. No secret-dependent branches or memory access patterns exist in the signing or verification paths.


Platform Support

Falcon is implemented across all 10 platforms with cross-language interoperability verified:

Platform Language Implementation Path Cross-Lang Verified
Native C metamui-crypto-c/ Yes (gold standard)
Systems Rust metamui-crypto-rust/metamui-falcon512/ Yes
Backend Go metamui-crypto-go/falcon512/, falcon1024/ Yes
Data Science Python metamui-crypto-python/metamui_crypto/falcon512/, falcon1024/ Yes
JVM Java metamui-crypto-java/ Yes
JVM/Android Kotlin metamui-crypto-kotlin/ Yes
.NET C# metamui-crypto-csharp/ Yes
Apple Swift metamui-crypto-swift/ Yes
Web TypeScript metamui-crypto-typescript/packages/falcon/ Format check only
Browser/Edge WASM metamui-crypto-wasm/ Format check only

Cross-language verification chain: C (generates vectors) > Go > Rust > Python > Java > Kotlin > C# > Swift — all produce and verify signatures interoperably using shared test vectors.

WASM note: The TypeScript/WASM implementation uses a non-NIST encoding format (896-byte public key, 666-byte fixed-length signature). Cryptographic correctness has been verified via the Rust backend.


API Example

use metamui_falcon512::{Falcon512, KeyPair};

// Key generation
let keypair = Falcon512::keygen(&mut rng);

// Signing
let message = b"Document to be signed";
let signature = Falcon512::sign(&keypair.secret_key, message);

// Verification
let is_valid = Falcon512::verify(&keypair.public_key, message, &signature);
assert!(is_valid);

// Using ExpandedKey for repeated signing (amortizes LDL tree cost)
let expanded = Falcon512::expand_key(&keypair.secret_key);
let sig1 = Falcon512::sign_with_expanded(&expanded, b"First message");
let sig2 = Falcon512::sign_with_expanded(&expanded, b"Second message");

Test Vectors

Falcon uses both NIST-format test vectors and cross-language interoperability vectors:


References

  1. Falcon Specification — Fouque, P.-A., Hoffstein, J., Kirchner, P., et al. Falcon: Fast-Fourier Lattice-based Compact Signatures over NTRU. https://falcon-sign.info/
  2. NIST PQC Round 3 — Falcon was selected for standardization in the NIST Post-Quantum Cryptography project. https://csrc.nist.gov/projects/post-quantum-cryptography
  3. GPV Framework — Gentry, C., Peikert, C., Vaikuntanathan, V. Trapdoors for Hard Lattices and New Cryptographic Constructions. STOC 2008.