Reference: API Overview
OpenToken provides three interfaces for generating privacy-preserving tokens: a Java library, a Python library, and a command-line interface (CLI). All three produce identical tokens for the same input, enabling cross-language and cross-platform workflows.
Choosing the Right Interface
| Interface | Best for | Example use case |
|---|---|---|
| Java API | JVM-based pipelines, enterprise integrations, high-throughput batch jobs | Embedding token generation in a Spring or Spark (Scala/Java) application |
| Python API | Python data workflows, PySpark, Databricks notebooks, rapid prototyping | Tokenizing DataFrames in a Jupyter notebook or Databricks cluster |
| CLI | One-off batch processing, scripted pipelines, CI/CD jobs, Docker containers | Processing CSV/Parquet files from a shell script or scheduled job |
Java Library API
The Java API integrates directly into JVM applications.
Key classes:
- TokenDefinition: Loads the built-in T1–T5 rule definitions
- TokenGenerator: Validates/normalizes attribute values and generates tokens
- SHA256Tokenizer: Applies the SHA-256 digest step before transformations
- HashTokenTransformer / EncryptTokenTransformer: Optional post-processing (HMAC and/or AES-GCM)
Quick example:
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.truveta.opentoken.attributes.Attribute;
import com.truveta.opentoken.attributes.person.BirthDateAttribute;
import com.truveta.opentoken.attributes.person.FirstNameAttribute;
import com.truveta.opentoken.attributes.person.LastNameAttribute;
import com.truveta.opentoken.attributes.person.PostalCodeAttribute;
import com.truveta.opentoken.attributes.person.SexAttribute;
import com.truveta.opentoken.attributes.person.SocialSecurityNumberAttribute;
import com.truveta.opentoken.tokens.TokenDefinition;
import com.truveta.opentoken.tokens.TokenGenerator;
import com.truveta.opentoken.tokens.TokenGeneratorResult;
import com.truveta.opentoken.tokens.tokenizer.SHA256Tokenizer;
import com.truveta.opentoken.tokentransformer.EncryptTokenTransformer;
import com.truveta.opentoken.tokentransformer.HashTokenTransformer;
import com.truveta.opentoken.tokentransformer.TokenTransformer;

// hashingSecret and encryptionKey are not defined in this snippet; supply your own secrets.
// Optional post-processing: HMAC hashing and AES-GCM encryption.
List<TokenTransformer> transformers = List.of(
    new HashTokenTransformer(hashingSecret),
    new EncryptTokenTransformer(encryptionKey)
);

// Combine the built-in T1–T5 rule definitions with the SHA-256 tokenizer.
TokenGenerator tokenGenerator = new TokenGenerator(
    new TokenDefinition(),
    new SHA256Tokenizer(transformers)
);

// Map each attribute class to its raw value; the generator validates and normalizes the values.
Map<Class<? extends Attribute>, String> personAttributes = new HashMap<>();
personAttributes.put(FirstNameAttribute.class, "Elena");
personAttributes.put(LastNameAttribute.class, "Vasquez");
personAttributes.put(BirthDateAttribute.class, "1992-07-14");
personAttributes.put(SexAttribute.class, "Female");
personAttributes.put(PostalCodeAttribute.class, "30301");
personAttributes.put(SocialSecurityNumberAttribute.class, "452-38-7291");

// Generate all tokens and print one line per rule ID.
TokenGeneratorResult result = tokenGenerator.getAllTokens(personAttributes);
result.getTokens().forEach((ruleId, token) -> System.out.println(ruleId + ": " + token));
Full reference: Java API Reference
Python Library API
The Python API mirrors the Java API for cross-language parity.
Key classes:
- TokenDefinition: Loads the built-in T1–T5 rule definitions
- TokenGenerator: Validates/normalizes attribute values and generates tokens
- SHA256Tokenizer: Applies the SHA-256 digest step before transformations
- HashTokenTransformer / EncryptTokenTransformer: Optional post-processing (HMAC and/or AES-GCM)
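To make the ordering of these steps concrete, here is a minimal standard-library sketch of a SHA-256 digest followed by a keyed HMAC-SHA256. It is illustrative only: the function name `sketch_token`, the signature string, and the hex output encoding are assumptions, so it will not reproduce real OpenToken tokens. The AES-GCM step is left out because Python's standard library has no AES implementation.

```python
import hashlib
import hmac


def sketch_token(signature: str, hashing_secret: str) -> str:
    """Illustrative only: SHA-256 digest, then HMAC-SHA256 keyed with the secret."""
    # Step 1: digest the (already normalized) attribute signature.
    digest = hashlib.sha256(signature.encode("utf-8")).digest()
    # Step 2: apply a keyed hash so the result depends on the shared secret.
    keyed = hmac.new(hashing_secret.encode("utf-8"), digest, hashlib.sha256)
    return keyed.hexdigest()


# Hypothetical signature string; OpenToken builds its own from the T1-T5 rules.
example_signature = "VASQUEZ|ELENA|FEMALE|1992-07-14"

token_a = sketch_token(example_signature, "HashingSecret")
token_b = sketch_token(example_signature, "HashingSecret")
token_c = sketch_token(example_signature, "DifferentSecret")

print(token_a == token_b)  # compares two runs with the same input and secret
print(token_a == token_c)  # compares runs that differ only in the secret
```

The point of the sketch is the dependency on the secret: two parties can only produce matching values if they share both the input and the hashing secret.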
Quick example:
from opentoken.attributes.person.birth_date_attribute import BirthDateAttribute
from opentoken.attributes.person.first_name_attribute import FirstNameAttribute
from opentoken.attributes.person.last_name_attribute import LastNameAttribute
from opentoken.attributes.person.postal_code_attribute import PostalCodeAttribute
from opentoken.attributes.person.sex_attribute import SexAttribute
from opentoken.attributes.person.social_security_number_attribute import SocialSecurityNumberAttribute
from opentoken.tokens.token_definition import TokenDefinition
from opentoken.tokens.token_generator import TokenGenerator
from opentoken.tokens.tokenizer.sha256_tokenizer import SHA256Tokenizer
from opentoken.tokentransformer.encrypt_token_transformer import EncryptTokenTransformer
from opentoken.tokentransformer.hash_token_transformer import HashTokenTransformer

# hashing_secret and encryption_key are not defined in this snippet; supply your own secrets.
token_definition = TokenDefinition()
tokenizer = SHA256Tokenizer([
    HashTokenTransformer(hashing_secret),
    EncryptTokenTransformer(encryption_key),
])
token_generator = TokenGenerator(token_definition, tokenizer)

# Map each attribute class to its raw value; the generator validates and normalizes the values.
person_attributes = {
    FirstNameAttribute: "Elena",
    LastNameAttribute: "Vasquez",
    BirthDateAttribute: "1992-07-14",
    SexAttribute: "Female",
    PostalCodeAttribute: "30301",
    SocialSecurityNumberAttribute: "452-38-7291",
}

# Generate all tokens and print one line per rule ID.
result = token_generator.get_all_tokens(person_attributes)
for rule_id, token in result.tokens.items():
    print(f"{rule_id}: {token}")
Full reference: Python API Reference
Command-Line Interface (CLI)
The CLI lets you process CSV or Parquet files without writing code.
Basic usage:
java -jar opentoken-cli-*.jar \
  -i input.csv -t csv -o output.csv \
  -h "HashingSecret" -e "EncryptionKey32Chars!!!!!!!!!!!!"
Or with Python:
python -m opentoken_cli.main \
  -i input.csv -t csv -o output.csv \
  -h "HashingSecret" -e "EncryptionKey32Chars!!!!!!!!!!!!"
Key options:
| Flag | Purpose |
|---|---|
| -i / --input | Input file path |
| -o / --output | Output file path |
| -t / --type | File type (csv or parquet) |
| -h / --hashingsecret | HMAC-SHA256 secret |
| -e / --encryptionkey | AES-256 key (32 characters) |
| --hash-only | Skip encryption |
Full reference: CLI Reference
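When the CLI is called from a script, it can help to assemble and validate the arguments before launching it. The sketch below uses only the flags listed in the table above. The helper names `build_opentoken_command` and `run_opentoken` are hypothetical, and it assumes that `--hash-only` is passed instead of `-e` when no encryption key is supplied.

```python
import subprocess
from typing import List, Optional


def build_opentoken_command(input_path: str, output_path: str, file_type: str,
                            hashing_secret: str,
                            encryption_key: Optional[str] = None) -> List[str]:
    """Build the argument list for the Python CLI entry point."""
    if file_type not in ("csv", "parquet"):
        raise ValueError("file_type must be 'csv' or 'parquet'")
    command = [
        "python", "-m", "opentoken_cli.main",
        "-i", input_path, "-t", file_type, "-o", output_path,
        "-h", hashing_secret,
    ]
    if encryption_key is None:
        # Assumption: without a key, request hash-only output.
        command.append("--hash-only")
    else:
        if len(encryption_key) != 32:
            raise ValueError("encryption_key must be exactly 32 characters")
        command.extend(["-e", encryption_key])
    return command


def run_opentoken(command: List[str]) -> None:
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)


command = build_opentoken_command("input.csv", "output.csv", "csv", "HashingSecret")
print(command)
```

Passing the arguments as a list avoids shell quoting problems with secrets that contain special characters.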
Metadata Output
Every token generation run produces a .metadata.json file alongside the token output. This file contains:
- Processing statistics (total rows, invalid records)
- SHA-256 hashes of secrets (for verification, not the secrets themselves)
- Timestamp and platform information
Full reference: Metadata Format
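One way to use the recorded secret hashes is to confirm that two parties hold the same secret without exchanging it. The sketch below assumes the recorded value is a hex-encoded SHA-256 digest of the UTF-8 secret. The field name `ExampleHashingSecretHash` and the helper `secret_matches` are hypothetical, so consult the Metadata Format reference for the actual schema.

```python
import hashlib
import hmac
import json


def secret_matches(secret: str, recorded_hash: str) -> bool:
    """Compare the SHA-256 hex digest of a secret with a recorded hash."""
    computed = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(computed, recorded_hash.strip().lower())


# Stand-in for the contents of a .metadata.json file (hypothetical field name).
metadata_text = json.dumps(
    {"ExampleHashingSecretHash": hashlib.sha256(b"HashingSecret").hexdigest()}
)
metadata = json.loads(metadata_text)

recorded = metadata["ExampleHashingSecretHash"]
print(secret_matches("HashingSecret", recorded))
print(secret_matches("WrongSecret", recorded))
```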
Custom Token Registration
OpenToken supports defining custom token rules beyond T1–T5. Custom rules can include additional attributes (e.g., MRN) or different attribute combinations.
Full reference: Token Registration
Additional Reference Pages
- Java API Reference — Complete Java class and method documentation
- Python API Reference — Complete Python class and method documentation
- CLI Reference — All CLI flags, modes, and examples
- Metadata Format — Metadata file schema and fields
- Token Registration — Adding custom token rules
Related Documentation
- Quickstarts — Get started in 5 minutes
- Concepts: Token Rules — How T1–T5 are composed
- Concepts: Normalization — Attribute standardization
- Configuration — Environment variables and input formats
- Security — Cryptographic details and key management