CLI Quickstart
For a high-level overview and other entry points, see Quickstarts.
Run the OpenToken CLI end-to-end to generate tokens from a sample dataset in minutes.
Prerequisites
Choose one of:
- Docker (recommended) - No other dependencies needed
- Java 21+ and Maven 3.8+
- Python 3.10+
Quick Start with Docker
The fastest way to get started. No Java or Python installation required.
Linux/Mac
cd /path/to/OpenLinkToken
./run-opentoken.sh \
-i ./resources/sample.csv \
-o ./resources/output.csv \
-t csv \
-h "HashingKey" \
-e "Secret-Encryption-Key-Goes-Here."
Windows PowerShell
cd C:\path\to\OpenLinkToken
.\run-opentoken.ps1 `
-i .\resources\sample.csv `
-o .\resources\output.csv `
-FileType csv `
-h "HashingKey" `
-e "Secret-Encryption-Key-Goes-Here."
CLI Arguments
| Argument | Short | Description | Required |
|---|---|---|---|
--input |
-i |
Input file path (CSV or Parquet) | Yes |
--output |
-o |
Output file path | Yes |
--type |
-t |
File type: csv or parquet |
Yes |
--hashingsecret |
-h |
Secret key for HMAC hashing | Yes |
--encryptionkey |
-e |
32-character key for AES encryption | No* |
--hash-only |
Skip encryption, output hashed tokens only | No |
*Required unless --hash-only is specified.
Example: CSV Input
Input file (sample.csv):
RecordId,FirstName,LastName,BirthDate,Sex,PostalCode,SSN
patient_001,John,Doe,1980-01-15,Male,98004,123-45-6789
patient_002,Jane,Smith,1975-03-22,Female,90210,987-65-4321
Command:
java -jar opentoken-cli-*.jar \
-i sample.csv \
-t csv \
-o tokens.csv \
-h "MyHashingSecret" \
-e "MyEncryptionKey-32Characters!"
Output (tokens.csv):
RecordId,RuleId,Token
patient_001,T1,Gn7t1Zj16E5Qy+z9iINtcz...
patient_001,T2,pUxPgYL9+cMxkA+8928Pi...
patient_001,T3,rwjfwIo5OcJUItTx8KCo...
patient_001,T4,9o7HIYZkhizczFzJL1HFy...
patient_001,T5,QpBpGBqaMhagfcHGZhVa...
patient_002,T1,...
Example: Parquet Input
java -jar opentoken-cli-*.jar \
-i input.parquet \
-t parquet \
-o tokens.parquet \
-h "MyHashingSecret" \
-e "MyEncryptionKey-32Characters!"
Hash-Only Mode
Generate tokens without encryption (faster, but tokens cannot be decrypted):
java -jar opentoken-cli-*.jar \
-i sample.csv \
-t csv \
-o tokens.csv \
-h "MyHashingSecret" \
--hash-only
Security Note (Hash-Only)
--hash-only output is intended for internal use and should not be shared externally. Hash-only tokens are deterministic and can still be linkable across datasets.
The primary use case for hash-only mode is to build an internal overlap-analysis dataset that you join against encrypted tokens received from an external partner (after decrypting their tokens to the hash-only equivalent). If you need to exchange tokens across organizations, use encrypted mode and follow Sharing Tokenized Data.
Understanding the Output
Token File
Each input record produces 5 tokens (T1–T5):
| Column | Description |
|---|---|
RecordId |
Original record identifier |
RuleId |
Token rule (T1, T2, T3, T4, or T5) |
Token |
Base64-encoded encrypted/hashed token |
Metadata File
A .metadata.json file is created alongside the output:
{
"Platform": "Java",
"JavaVersion": "21.0.0",
"OpenTokenVersion": "1.7.0",
"TotalRows": 2,
"TotalRowsWithInvalidAttributes": 0,
"InvalidAttributesByType": {},
"BlankTokensByRule": {},
"HashingSecretHash": "e0b4e60b...",
"EncryptionSecretHash": "a1b2c3d4..."
}
Troubleshooting
“Encryption key not provided”
Either provide -e "YourKey" or use --hash-only flag.
“Invalid BirthDate”
Ensure dates are in YYYY-MM-DD format and between 1910-01-01 and today.
“File not found”
Check that input file path is correct and file exists.
“Invalid SSN”
SSN must be 9 digits. Area code cannot be 000, 666, or 900-999.
Next Steps
- Java Quickstart - Build from source
- Python Quickstart - Use Python CLI
- Configuration - Advanced options
- Token Rules - Understand T1-T5