Decrypting Tokens
How to decrypt previously encrypted tokens for debugging, verification, or re-encryption.
When to Decrypt
Decryption is useful for:
- Debugging: Verifying attribute normalization produced expected token signatures
- Verification: Confirming tokens match between datasets
- Re-encryption: Decrypting tokens to re-encrypt with a different key
- Cross-language validation: Ensuring Java and Python produce identical tokens
Note: Decryption produces HMAC-SHA256 hashed tokens (base64 encoded), not the original attribute values. Token generation is one-way.
CLI Decrypt Mode
Use the -d or --decrypt flag with the same encryption key used for token generation.
Java
java -jar opentoken-cli/target/opentoken-cli-*.jar \
-d \
-i ../../resources/output.csv \
-t csv \
-o ../../resources/decrypted.csv \
-e "Secret-Encryption-Key-Goes-Here."
Python
python -m opentoken_cli.main \
-d \
-i ../../../resources/output.csv \
-t csv \
-o ../../../resources/decrypted.csv \
-e "Secret-Encryption-Key-Goes-Here."
Docker
docker run --rm -v $(pwd)/resources:/app/resources \
opentoken:latest \
-d \
-i /app/resources/output.csv \
-t csv \
-o /app/resources/decrypted.csv \
-e "Secret-Encryption-Key-Goes-Here."
Decrypted Output Format
Decrypted tokens are HMAC-SHA256 hashes (base64 encoded)—equivalent to --hash-only output:
RecordId,RuleId,Token
ID001,T1,abc123def456... # Base64-encoded HMAC hash
ID001,T2,fed456abc123... # Same format as hash-only mode
...
This output can be used to:
- Compare with hash-only tokens from another run
- Verify token consistency across datasets
- Debug normalization issues
Standalone Decryptor Tool
A Python decryptor tool is available in tools/decryptor/:
cd tools/decryptor
pip install pycryptodome
python decryptor.py \
-e "Secret-Encryption-Key-Goes-Here." \
-i ../../resources/output.csv \
-o ../../resources/decrypted.csv
Requirements
- Python 3.10+
pycryptodomelibrary- CSV with
RuleId,Token,RecordIdcolumns
Cross-Language Decryption
Tokens encrypted by Java can be decrypted by Python and vice versa:
# Encrypt with Java
java -jar opentoken-cli-*.jar \
-i data.csv -t csv -o tokens.csv \
-h "HashingKey" -e "EncryptionKey32Characters!!!!!"
# Decrypt with Python
python -m opentoken_cli.main \
-d \
-i tokens.csv -t csv -o decrypted.csv \
-e "EncryptionKey32Characters!!!!!"
Requirements for cross-language compatibility:
- Same encryption key (exactly 32 characters/bytes)
- Same token file format
- Both implementations use AES-256-GCM with identical parameters
Security Considerations
Key Handling
- Never commit encryption keys to version control
- Use environment variables or secret stores:
export OPENTOKEN_ENCRYPTION_KEY="YourKey32Characters!!!!!!!!!!!!" java -jar opentoken-cli-*.jar -d -e "$OPENTOKEN_ENCRYPTION_KEY" ... - Rotate keys periodically and re-encrypt tokens as needed
Access Control
- Limit decryption access: Only authorized personnel should have encryption keys
- Audit decryption events: Log when and why tokens are decrypted
- Secure decrypted output: Decrypted tokens are still sensitive (HMAC hashes)
What Decryption Does NOT Reveal
Decryption produces HMAC hashes, not original data:
Original: John Doe, 1980-01-15, Male
↓
Token Signature: DOE|JOHN|MALE|1980-01-15
↓
SHA-256 Hash: abc123...
↓
HMAC-SHA256: def456... ← Decryption reveals this
↓
AES-256-GCM: xyz789... ← Encrypted token
You cannot reverse the original attribute values from decrypted tokens.
Troubleshooting
| Problem | Solution |
|---|---|
| “Decryption error” | Verify encryption key matches the key used for encryption |
| Key length error | Encryption key must be exactly 32 characters |
| Blank tokens in output | Blank tokens in input (from invalid records) remain blank |
| Tokens don’t match across languages | Run interoperability test: tools/interoperability/java_python_interoperability_test.py |
Next Steps
- Hash-only mode: Hash-Only Mode (no encryption needed)
- Batch processing: Running Batch Jobs
- Security guidance: Security