Decrypting Tokens

How to decrypt previously encrypted tokens for debugging, verification, or re-encryption.


When to Decrypt

Decryption is useful for:

  • Debugging: Verifying attribute normalization produced expected token signatures
  • Verification: Confirming tokens match between datasets
  • Re-encryption: Decrypting tokens to re-encrypt with a different key
  • Cross-language validation: Ensuring Java and Python produce identical tokens

Note: Decryption produces HMAC-SHA256 hashed tokens (base64 encoded), not the original attribute values. Token generation is one-way.


CLI Decrypt Mode

Use the -d or --decrypt flag with the same encryption key used for token generation.

Java

java -jar opentoken-cli/target/opentoken-cli-*.jar \
  -d \
  -i ../../resources/output.csv \
  -t csv \
  -o ../../resources/decrypted.csv \
  -e "Secret-Encryption-Key-Goes-Here."

Python

python -m opentoken_cli.main \
  -d \
  -i ../../../resources/output.csv \
  -t csv \
  -o ../../../resources/decrypted.csv \
  -e "Secret-Encryption-Key-Goes-Here."

Docker

docker run --rm -v $(pwd)/resources:/app/resources \
  opentoken:latest \
  -d \
  -i /app/resources/output.csv \
  -t csv \
  -o /app/resources/decrypted.csv \
  -e "Secret-Encryption-Key-Goes-Here."

Decrypted Output Format

Decrypted tokens are HMAC-SHA256 hashes (base64 encoded)—equivalent to --hash-only output:

RecordId,RuleId,Token
ID001,T1,abc123def456...  # Base64-encoded HMAC hash
ID001,T2,fed456abc123...  # Same format as hash-only mode
...

This output can be used to:

  • Compare with hash-only tokens from another run
  • Verify token consistency across datasets
  • Debug normalization issues

Standalone Decryptor Tool

A Python decryptor tool is available in tools/decryptor/:

cd tools/decryptor
pip install pycryptodome

python decryptor.py \
  -e "Secret-Encryption-Key-Goes-Here." \
  -i ../../resources/output.csv \
  -o ../../resources/decrypted.csv

Requirements

  • Python 3.10+
  • pycryptodome library
  • CSV with RuleId, Token, RecordId columns

Cross-Language Decryption

Tokens encrypted by Java can be decrypted by Python and vice versa:

# Encrypt with Java
java -jar opentoken-cli-*.jar \
  -i data.csv -t csv -o tokens.csv \
  -h "HashingKey" -e "EncryptionKey32Characters!!!!!"

# Decrypt with Python
python -m opentoken_cli.main \
  -d \
  -i tokens.csv -t csv -o decrypted.csv \
  -e "EncryptionKey32Characters!!!!!"

Requirements for cross-language compatibility:

  • Same encryption key (exactly 32 characters/bytes)
  • Same token file format
  • Both implementations use AES-256-GCM with identical parameters

Security Considerations

Key Handling

  • Never commit encryption keys to version control
  • Use environment variables or secret stores:
    export OPENTOKEN_ENCRYPTION_KEY="YourKey32Characters!!!!!!!!!!!!"
    java -jar opentoken-cli-*.jar -d -e "$OPENTOKEN_ENCRYPTION_KEY" ...
    
  • Rotate keys periodically and re-encrypt tokens as needed

Access Control

  • Limit decryption access: Only authorized personnel should have encryption keys
  • Audit decryption events: Log when and why tokens are decrypted
  • Secure decrypted output: Decrypted tokens are still sensitive (HMAC hashes)

What Decryption Does NOT Reveal

Decryption produces HMAC hashes, not original data:

Original: John Doe, 1980-01-15, Male
    ↓
Token Signature: DOE|JOHN|MALE|1980-01-15
    ↓
SHA-256 Hash: abc123...
    ↓
HMAC-SHA256: def456...  ← Decryption reveals this
    ↓
AES-256-GCM: xyz789...  ← Encrypted token

You cannot reverse the original attribute values from decrypted tokens.


Troubleshooting

Problem Solution
“Decryption error” Verify encryption key matches the key used for encryption
Key length error Encryption key must be exactly 32 characters
Blank tokens in output Blank tokens in input (from invalid records) remain blank
Tokens don’t match across languages Run interoperability test: tools/interoperability/java_python_interoperability_test.py

Next Steps