ADR-006: PBKDF2-HMAC-SHA256 for API Key Hashing
Status
Accepted
Date
2024-03-15
Context
MAID's admin API (packages/maid-engine/src/maid_engine/api/) uses API keys for
authentication. API keys are generated by the APIKeyManager in
packages/maid-engine/src/maid_engine/api/auth.py and must be stored securely.
If the key storage (file or document store) is compromised, an attacker should not
be able to recover the original API keys.
The initial implementation specification called for plain SHA-256 hashing of API keys. While API keys have higher entropy than user passwords (they are randomly generated), plain SHA-256 has no computational cost factor, meaning a compromised hash can be brute-forced at GPU speeds (billions of hashes per second).
Decision
Use PBKDF2-HMAC-SHA256 with per-key random salts for API key hashing. The
implementation in APIKeyManager (in packages/maid-engine/src/maid_engine/api/auth.py)
works as follows:
- Key generation: A raw API key is generated with secrets.token_urlsafe(). A 32-byte random salt is generated with secrets.token_bytes().
- Hashing: The key is hashed using hashlib.pbkdf2_hmac("sha256", ...) with the salt and a configurable iteration count.
- Storage: The APIKey dataclass stores key_hash (hex-encoded PBKDF2 output), key_salt (hex-encoded random salt), and pbkdf2_iterations (the iteration count used at creation time).
- Validation: validate_key() retrieves the stored salt and iteration count for the candidate key, recomputes the PBKDF2 hash, and uses constant-time comparison via hmac.compare_digest().
- Configuration: Iteration count (MAID_SECURITY__PBKDF2_ITERATIONS, default 600,000) and salt length (MAID_SECURITY__API_KEY_SALT_LENGTH, default 32 bytes) are configurable via SecuritySettings in packages/maid-engine/src/maid_engine/config/settings.py.
- Forward compatibility: The iteration count is stored per-key, so keys created with different iteration counts (e.g., after a settings change) can still be validated. Legacy keys without key_salt fall back to plain SHA-256 for backward compatibility.
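The generation, hashing, and validation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in auth.py: APIKeyRecord, hash_key, and create_key are hypothetical names, the 32-byte argument to secrets.token_urlsafe() is an assumed key size, and the cleartext prefix handling is omitted.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Tuple

DEFAULT_ITERATIONS = 600_000  # default iteration count stated in this ADR
DEFAULT_SALT_LENGTH = 32      # default salt length in bytes stated in this ADR


@dataclass
class APIKeyRecord:
    """Hypothetical stand-in for the APIKey dataclass (field names from this ADR)."""
    key_hash: str           # hex-encoded PBKDF2 output
    key_salt: str           # hex-encoded random salt
    pbkdf2_iterations: int  # iteration count used at creation time


def hash_key(raw_key: str, salt: bytes, iterations: int) -> str:
    """Derive a hex-encoded PBKDF2-HMAC-SHA256 digest for a raw API key."""
    derived = hashlib.pbkdf2_hmac("sha256", raw_key.encode("utf-8"), salt, iterations)
    return derived.hex()


def create_key(iterations: int = DEFAULT_ITERATIONS,
               salt_length: int = DEFAULT_SALT_LENGTH) -> Tuple[str, APIKeyRecord]:
    """Generate a raw key plus the record that would be persisted."""
    raw_key = secrets.token_urlsafe(32)  # 32 random bytes is an assumed size
    salt = secrets.token_bytes(salt_length)
    record = APIKeyRecord(hash_key(raw_key, salt, iterations), salt.hex(), iterations)
    return raw_key, record


def validate_key(candidate: str, record: APIKeyRecord) -> bool:
    """Recompute the hash with the stored salt and iterations, compare in constant time."""
    salt = bytes.fromhex(record.key_salt)
    recomputed = hash_key(candidate, salt, record.pbkdf2_iterations)
    return hmac.compare_digest(recomputed, record.key_hash)
```

Because validate_key reads the iteration count from the record rather than from current settings, a key created under an older setting still validates after the default changes.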
Lookup performance is maintained through a prefix index: API keys include a cleartext prefix (16 characters) used for O(1) lookup before the expensive PBKDF2 verification step.
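A prefix index of this kind might look like the sketch below. PrefixIndex and its methods are hypothetical names, and the stored record is shown as a plain dict for brevity; only the 16-character prefix length comes from this ADR.

```python
from typing import Dict, Optional

PREFIX_LENGTH = 16  # cleartext prefix length stated in this ADR


class PrefixIndex:
    """Hypothetical in-memory map from cleartext key prefix to stored key record."""

    def __init__(self) -> None:
        self._by_prefix: Dict[str, dict] = {}

    def add(self, raw_key: str, record: dict) -> None:
        # Index the record under the first PREFIX_LENGTH characters of the key.
        self._by_prefix[raw_key[:PREFIX_LENGTH]] = record

    def find(self, candidate: str) -> Optional[dict]:
        # Dictionary lookup: no hashing work happens here.
        return self._by_prefix.get(candidate[:PREFIX_LENGTH])
```

A caller would run the PBKDF2 verification only when find() returns a record, so a candidate with an unknown prefix is rejected without any key derivation.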
Consequences
Positive
- Brute-force resistance: At 600,000 PBKDF2 iterations, even a fast GPU can only compute approximately 10,000-50,000 hashes per second (compared to billions for plain SHA-256). This makes offline brute-force attacks impractical even for shorter key spaces.
- Per-key salts: Each key has a unique random salt, preventing rainbow table attacks and ensuring identical keys produce different hashes.
- Industry standard: PBKDF2 is recommended by OWASP and NIST (SP 800-132) for key derivation. The 600,000 iteration default meets OWASP's 2023 minimum recommendation.
- Standard library: hashlib.pbkdf2_hmac is available in Python's standard library with no additional dependencies.
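To put the brute-force figures above in perspective, the following back-of-the-envelope sketch estimates how long an attacker would need to search half of a key space at a given guess rate. The 1e9 figure is an assumed stand-in for "billions" of plain SHA-256 hashes per second, 5e4 is the upper end of the PBKDF2 range quoted above, and the key-space sizes are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def expected_years(keyspace_bits: int, guesses_per_second: float) -> float:
    """Expected time, in years, to find a key after searching half the key space."""
    expected_guesses = 2 ** (keyspace_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


SHA256_RATE = 1e9  # assumed: one billion plain SHA-256 guesses per second
PBKDF2_RATE = 5e4  # upper end of the 10,000-50,000 range quoted above

for bits in (40, 64):  # illustrative key-space sizes, not MAID's actual key size
    print(bits, expected_years(bits, SHA256_RATE), expected_years(bits, PBKDF2_RATE))
```

Under these assumed rates the PBKDF2 path is 20,000 times slower to attack for any key-space size, since the ratio depends only on the two guess rates.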
Negative
- Validation latency: Each key validation requires computing a full PBKDF2 derivation (approximately 50-200ms depending on hardware and iteration count). This is mitigated by the prefix index, which skips the PBKDF2 computation entirely for candidates whose prefix matches no stored key.
- Not the strongest option: Argon2 (memory-hard) provides better resistance against GPU/ASIC attacks than PBKDF2. However, Argon2 requires the argon2-cffi package, which has C extension dependencies.
- Migration complexity: The backward compatibility code for legacy SHA-256 keys adds conditional logic to the validation path. Keys without key_salt use the weaker plain SHA-256 path.
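The conditional logic for legacy keys might resemble the sketch below. compute_hash and matches are hypothetical helper names, and the legacy format is assumed to be a hex-encoded SHA-256 digest of the UTF-8 key; the actual stored format is not specified in this ADR.

```python
import hashlib
import hmac
from typing import Optional


def compute_hash(candidate: str, key_salt: Optional[str], iterations: int) -> str:
    """Hash a candidate key using the scheme implied by the stored record."""
    data = candidate.encode("utf-8")
    if not key_salt:
        # Legacy path: unsalted plain SHA-256 (assumed hex-encoded).
        return hashlib.sha256(data).hexdigest()
    # Current path: salted PBKDF2-HMAC-SHA256 with the per-key iteration count.
    return hashlib.pbkdf2_hmac("sha256", data, bytes.fromhex(key_salt), iterations).hex()


def matches(candidate: str, stored_hash: str, key_salt: Optional[str],
            iterations: int = 600_000) -> bool:
    """Constant-time comparison of the recomputed hash against the stored hash."""
    return hmac.compare_digest(compute_hash(candidate, key_salt, iterations), stored_hash)
```

The branch on key_salt is the extra complexity noted above: both paths share the same comparison, but records without a salt get none of the PBKDF2 cost factor.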
Alternatives Considered
Plain SHA-256
The original specification's approach. Rejected because SHA-256 has no cost factor. A leaked database of SHA-256 hashed API keys could be brute-forced at billions of attempts per second on commodity GPUs.
bcrypt
A well-established password hashing algorithm with built-in salt and cost factor.
Rejected because bcrypt has a 72-byte input limit (API keys can be longer), and
bcrypt is a C extension dependency that complicates installation on some platforms.
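As a rough illustration of the input-length concern, the sketch below checks generated keys against bcrypt's 72-byte limit. The byte counts passed to secrets.token_urlsafe() are assumed sizes for illustration, since this ADR does not state the key length MAID uses, and exceeds_bcrypt_limit is a hypothetical helper.

```python
import secrets

BCRYPT_MAX_INPUT_BYTES = 72  # bcrypt's input limit cited above


def exceeds_bcrypt_limit(raw_key: str) -> bool:
    """Return True if the encoded key is longer than bcrypt's input limit."""
    return len(raw_key.encode("utf-8")) > BCRYPT_MAX_INPUT_BYTES


short_key = secrets.token_urlsafe(32)  # assumed size: 32 random bytes
long_key = secrets.token_urlsafe(64)   # assumed size: 64 random bytes

print(len(short_key), exceeds_bcrypt_limit(short_key))
print(len(long_key), exceeds_bcrypt_limit(long_key))
```

Since token_urlsafe() returns Base64 text, the encoded key is about a third longer than the number of random bytes requested, so longer keys or keys with an added prefix can cross the limit.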
Argon2
The Password Hashing Competition winner, designed to be both CPU-hard and
memory-hard. Considered the strongest option. Rejected because the argon2-cffi
package requires C compilation, and MAID aims to be installable with
pip install on pure-Python environments. If this requirement changes, migrating
to Argon2 would be straightforward since the per-key iteration count storage
pattern already supports algorithm migration.