Reference: API Overview

OpenToken provides three interfaces for generating privacy-preserving tokens: a Java library, a Python library, and a command-line interface (CLI). All three produce identical tokens for the same input, enabling cross-language and cross-platform workflows.


Choosing the Right Interface

Interface Best for Example use case
Java API JVM-based pipelines, enterprise integrations, high-throughput batch jobs Embedding token generation in a Spring or Spark (Scala/Java) application
Python API Python data workflows, PySpark, Databricks notebooks, rapid prototyping Tokenizing DataFrames in a Jupyter notebook or Databricks cluster
CLI One-off batch processing, scripted pipelines, CI/CD jobs, Docker containers Processing CSV/Parquet files from a shell script or scheduled job

Java Library API

The Java API integrates directly into JVM applications.

Key classes:

  • TokenDefinition — Loads the built-in T1–T5 rule definitions
  • TokenGenerator — Validates/normalizes attribute values and generates tokens
  • SHA256Tokenizer — Applies the SHA-256 digest step before transformations
  • HashTokenTransformer / EncryptTokenTransformer — Optional post-processing (HMAC and/or AES-GCM)

Quick example:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.truveta.opentoken.attributes.Attribute;
import com.truveta.opentoken.attributes.person.BirthDateAttribute;
import com.truveta.opentoken.attributes.person.FirstNameAttribute;
import com.truveta.opentoken.attributes.person.LastNameAttribute;
import com.truveta.opentoken.attributes.person.PostalCodeAttribute;
import com.truveta.opentoken.attributes.person.SexAttribute;
import com.truveta.opentoken.attributes.person.SocialSecurityNumberAttribute;
import com.truveta.opentoken.tokens.TokenDefinition;
import com.truveta.opentoken.tokens.TokenGenerator;
import com.truveta.opentoken.tokens.TokenGeneratorResult;
import com.truveta.opentoken.tokens.tokenizer.SHA256Tokenizer;
import com.truveta.opentoken.tokentransformer.EncryptTokenTransformer;
import com.truveta.opentoken.tokentransformer.HashTokenTransformer;
import com.truveta.opentoken.tokentransformer.TokenTransformer;

List<TokenTransformer> transformers = List.of(
  new HashTokenTransformer(hashingSecret),
  new EncryptTokenTransformer(encryptionKey)
);

TokenGenerator tokenGenerator = new TokenGenerator(
  new TokenDefinition(),
  new SHA256Tokenizer(transformers)
);

Map<Class<? extends Attribute>, String> personAttributes = new HashMap<>();
personAttributes.put(FirstNameAttribute.class, "Elena");
personAttributes.put(LastNameAttribute.class, "Vasquez");
personAttributes.put(BirthDateAttribute.class, "1992-07-14");
personAttributes.put(SexAttribute.class, "Female");
personAttributes.put(PostalCodeAttribute.class, "30301");
personAttributes.put(SocialSecurityNumberAttribute.class, "452-38-7291");

TokenGeneratorResult result = tokenGenerator.getAllTokens(personAttributes);
result.getTokens().forEach((ruleId, token) -> System.out.println(ruleId + ": " + token));

Full reference: Java API Reference


Python Library API

The Python API mirrors the Java API for cross-language parity.

Key classes:

  • TokenDefinition — Loads the built-in T1–T5 rule definitions
  • TokenGenerator — Validates/normalizes attribute values and generates tokens
  • SHA256Tokenizer — Applies the SHA-256 digest step before transformations
  • HashTokenTransformer / EncryptTokenTransformer — Optional post-processing (HMAC and/or AES-GCM)

Quick example:

from opentoken.attributes.person.birth_date_attribute import BirthDateAttribute
from opentoken.attributes.person.first_name_attribute import FirstNameAttribute
from opentoken.attributes.person.last_name_attribute import LastNameAttribute
from opentoken.attributes.person.postal_code_attribute import PostalCodeAttribute
from opentoken.attributes.person.sex_attribute import SexAttribute
from opentoken.attributes.person.social_security_number_attribute import SocialSecurityNumberAttribute
from opentoken.tokens.token_definition import TokenDefinition
from opentoken.tokens.token_generator import TokenGenerator
from opentoken.tokens.tokenizer.sha256_tokenizer import SHA256Tokenizer
from opentoken.tokentransformer.encrypt_token_transformer import EncryptTokenTransformer
from opentoken.tokentransformer.hash_token_transformer import HashTokenTransformer

token_definition = TokenDefinition()
tokenizer = SHA256Tokenizer([
  HashTokenTransformer(hashing_secret),
  EncryptTokenTransformer(encryption_key),
])
token_generator = TokenGenerator(token_definition, tokenizer)

person_attributes = {
  FirstNameAttribute: "Elena",
  LastNameAttribute: "Vasquez",
  BirthDateAttribute: "1992-07-14",
  SexAttribute: "Female",
  PostalCodeAttribute: "30301",
  SocialSecurityNumberAttribute: "452-38-7291",
}

result = token_generator.get_all_tokens(person_attributes)
for rule_id, token in result.tokens.items():
  print(f"{rule_id}: {token}")

Full reference: Python API Reference


Command-Line Interface (CLI)

The CLI processes CSV or Parquet files without writing code.

Basic usage:

java -jar opentoken-cli-*.jar \
  -i input.csv -t csv -o output.csv \
  -h "HashingSecret" -e "EncryptionKey32Chars!!!!!!!!!!"

Or with Python:

python -m opentoken_cli.main \
  -i input.csv -t csv -o output.csv \
  -h "HashingSecret" -e "EncryptionKey32Chars!!!!!!!!!!"

Key options:

Flag Purpose
-i / --input Input file path
-o / --output Output file path
-t / --type File type (csv or parquet)
-h / --hashingsecret HMAC-SHA256 secret
-e / --encryptionkey AES-256 key (32 chars)
--hash-only Skip encryption

Full reference: CLI Reference


Metadata Output

Every token generation run produces a .metadata.json file alongside the token output. This file contains:

  • Processing statistics (total rows, invalid records)
  • SHA-256 hashes of secrets (for verification, not the secrets themselves)
  • Timestamp and platform information

Full reference: Metadata Format


Custom Token Registration

OpenToken supports defining custom token rules beyond T1–T5. Custom rules can include additional attributes (e.g., MRN) or different attribute combinations.

Full reference: Token Registration


Additional Reference Pages