Python Quickstart
For a high-level overview and other entry points, see Quickstarts.
Install the Python packages and run the OpenToken CLI with a virtual environment.
Prerequisites
- Python 3.10+
- pip (usually included with Python)
Verify your installation:
python --version # Should show 3.10 or higher
pip --version
Setup Virtual Environment
Important: The virtual environment should be created at the repository root.
cd /path/to/OpenLinkToken
# Create virtual environment at repo root
python -m venv .venv
# Activate (Linux/Mac)
source .venv/bin/activate
# Activate (Windows)
.\.venv\Scripts\activate
Install Dependencies
# Install core library
cd lib/python/opentoken
pip install -r requirements.txt -e .
# Install CLI
cd ../opentoken-cli
pip install -r requirements.txt -e .
Run Token Generation
Basic Encrypted Tokens
python -m opentoken_cli.main \
-i ../../../resources/sample.csv \
-t csv \
-o ../../../resources/output.csv \
-h "YourHashingSecret" \
-e "YourEncryptionKey-32Chars-Here!"
Hash-Only Mode (No Encryption)
python -m opentoken_cli.main \
-i ../../../resources/sample.csv \
-t csv \
-o ../../../resources/output.csv \
-h "YourHashingSecret" \
--hash-only
Parquet Format
python -m opentoken_cli.main \
-i input.parquet \
-t parquet \
-o output.parquet \
-h "YourHashingSecret" \
-e "YourEncryptionKey-32Chars-Here!"
Verify Output
# View token output
head ../../../resources/output.csv
# View metadata
cat ../../../resources/output.metadata.json
Using the Python API Programmatically
from opentoken.attributes.person.birth_date_attribute import BirthDateAttribute
from opentoken.attributes.person.first_name_attribute import FirstNameAttribute
from opentoken.attributes.person.last_name_attribute import LastNameAttribute
from opentoken.attributes.person.postal_code_attribute import PostalCodeAttribute
from opentoken.attributes.person.sex_attribute import SexAttribute
from opentoken.attributes.person.social_security_number_attribute import SocialSecurityNumberAttribute
from opentoken.tokens.token_definition import TokenDefinition
from opentoken.tokens.token_generator import TokenGenerator
from opentoken.tokens.tokenizer.sha256_tokenizer import SHA256Tokenizer
from opentoken.tokentransformer.encrypt_token_transformer import EncryptTokenTransformer
from opentoken.tokentransformer.hash_token_transformer import HashTokenTransformer
record_id = "patient_123"
person_attributes = {
FirstNameAttribute: "John",
LastNameAttribute: "Doe",
BirthDateAttribute: "1980-01-15",
SexAttribute: "Male",
PostalCodeAttribute: "98004",
SocialSecurityNumberAttribute: "123-45-6789",
}
token_definition = TokenDefinition()
tokenizer = SHA256Tokenizer([
HashTokenTransformer("HashingSecret"),
EncryptTokenTransformer("Secret-Encryption-Key-Goes-Here."),
])
generator = TokenGenerator(token_definition, tokenizer)
result = generator.get_all_tokens(person_attributes)
if result.invalid_attributes:
print(f"Invalid attributes: {sorted(result.invalid_attributes)}")
for rule_id, token in result.tokens.items():
print(f"{record_id},{rule_id},{token}")
Cross-Language Parity
OpenToken guarantees that Java and Python produce identical tokens for the same input. This is verified by interoperability tests:
cd tools/interoperability
pip install -r requirements.txt
python java_python_interoperability_test.py
The test:
- Generates tokens using Java CLI
- Generates tokens using Python CLI
- Compares all tokens byte-by-byte
- Fails if any mismatch is found
PySpark Integration
For distributed processing on Spark or Databricks:
cd lib/python/opentoken-pyspark
pip install -r requirements.txt -e .
See Spark or Databricks for usage.
Troubleshooting
“ModuleNotFoundError: No module named ‘opentoken’”
Make sure you installed with -e . (editable mode) from the correct directory.
“Python version not supported”
OpenToken requires Python 3.10+. Check with python --version.
Virtual Environment Not Activated
If commands fail, ensure venv is active:
source /path/to/OpenLinkToken/.venv/bin/activate
Import Errors After Updates
Reinstall the packages:
pip install -e . --force-reinstall
Development Setup
For contributing to OpenToken:
# Install development dependencies
pip install -r dev-requirements.txt
# Run tests
pytest
# Run with coverage
pytest --cov=opentoken --cov-report=html
Next Steps
- Java Quickstart - Cross-language reference
- CLI Reference - All command options
- Python API Reference - Programmatic usage
- Spark Integration - Distributed processing