
Custom Model Training

Train Phony to generate data that matches YOUR domain. Works with any language, including Turkish, Japanese, or your own custom terminology.


How It Works

Phony learns statistical patterns from your data rather than memorizing it. The trained model generates new data that looks like yours but never reproduces original values.

Your Data → Learn Patterns → Generate Similar (Not Same)

• Learns character/word distributions
• Preserves statistical properties
• Never reproduces original data
• Works with ANY language
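
The `ngram` training option shown further down suggests an n-gram approach. As a rough illustration of the idea in the list above, here is a minimal character-level n-gram sketch. It is not Phony's actual implementation: the function names `trainNgram`, `generateOne`, and `generateNew` are hypothetical, and the code assumes PHP 7.4+ with the mbstring extension and training values that do not contain the `^` or `$` marker characters.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: build a table mapping each (n-1)-character context
// to the list of characters observed right after it in the training data.
function trainNgram(array $samples, int $n = 2): array
{
    $table = [];
    foreach ($samples as $sample) {
        // Pad with start (^) and end ($) markers so boundaries are learned too.
        // mb_str_split keeps multibyte characters such as "ş" intact.
        $chars = array_merge(array_fill(0, $n - 1, '^'), mb_str_split($sample), ['$']);
        for ($i = $n - 1, $len = count($chars); $i < $len; $i++) {
            $context = implode('', array_slice($chars, $i - $n + 1, $n - 1));
            $table[$context][] = $chars[$i];
        }
    }
    return $table;
}

// Walk the table from the start context, sampling one character at a time.
function generateOne(array $table, int $n = 2, int $maxLen = 30): string
{
    $context = array_fill(0, $n - 1, '^');
    $out = [];
    while (count($out) < $maxLen) {
        $choices = $table[implode('', $context)] ?? null;
        if ($choices === null) {
            break; // unseen context: stop early
        }
        $next = $choices[array_rand($choices)];
        if ($next === '$') {
            break; // end marker sampled
        }
        $out[] = $next;
        // Slide the context window: append, then drop the oldest character.
        $context[] = $next;
        array_shift($context);
    }
    return implode('', $out);
}

// Generate up to $count distinct values, discarding any that match an original.
function generateNew(array $samples, int $count, int $n = 2, int $maxAttempts = 10000): array
{
    $table = trainNgram($samples, $n);
    $originals = array_flip($samples);
    $result = [];
    for ($i = 0; $i < $maxAttempts && count($result) < $count; $i++) {
        $candidate = generateOne($table, $n);
        if ($candidate !== '' && !isset($originals[$candidate])) {
            $result[$candidate] = true;
        }
    }
    return array_map('strval', array_keys($result));
}

$names = ['Ayşe', 'Aylin', 'Ayten', 'Selin', 'Selma', 'Sevgi', 'Emine', 'Emre'];
print_r(generateNew($names, 5));
```

In this sketch the guarantee that originals are not reproduced comes from the explicit filter in `generateNew`, not from the n-gram table itself: a small model trained on short strings can easily regenerate a training value by chance, so some exclusion step of this kind is needed.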

Training Sources

Open Source (Local)

  • Text file (.txt, one item per line)
  • CSV/TSV (any column)
  • JSON/JSONL (any field path)
bash
# CLI
phony train --source names.txt --output names.phony

# PHP
Phony::train($data, ['ngram' => 2]);
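
Since `Phony::train` takes a `$data` argument, the three local source types above could be turned into a flat list of training values along these lines. This is only a sketch under assumptions: the helper names `fromTextFile`, `fromCsvColumn`, and `fromJsonLines` are hypothetical, the CSV is assumed to have a header row, and the dotted field-path syntax (`user.name`) is an illustrative convention rather than Phony's documented format.

```php
<?php
declare(strict_types=1);

// Text file: one item per line, blank lines skipped, whitespace trimmed.
function fromTextFile(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return array_values(array_filter(array_map('trim', $lines), fn($l) => $l !== ''));
}

// CSV/TSV: pick one column by header name (pass "\t" as delimiter for TSV).
function fromCsvColumn(string $path, string $column, string $delimiter = ','): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $header = fgetcsv($handle, 0, $delimiter, '"', '\\');
    $index = is_array($header) ? array_search($column, $header, true) : false;
    if ($index === false) {
        fclose($handle);
        throw new InvalidArgumentException("Column $column not found");
    }
    $values = [];
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if (isset($row[$index]) && $row[$index] !== '') {
            $values[] = $row[$index];
        }
    }
    fclose($handle);
    return $values;
}

// JSONL: follow a dotted field path into each line's decoded object.
function fromJsonLines(string $path, string $fieldPath): array
{
    $values = [];
    foreach (fromTextFile($path) as $line) {
        $node = json_decode($line, true);
        foreach (explode('.', $fieldPath) as $key) {
            if (!is_array($node) || !array_key_exists($key, $node)) {
                $node = null; // path missing on this line
                break;
            }
            $node = $node[$key];
        }
        if (is_string($node) && $node !== '') {
            $values[] = $node;
        }
    }
    return $values;
}
```

Each helper returns a plain array of strings, which could then be passed as `$data` in the `Phony::train($data, ['ngram' => 2])` call shown above.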

Cloud (Additional Sources)

  • Database column (MySQL, PostgreSQL, SQLite)
  • API response (fetch & extract)
  • S3/GCS bucket (bulk data)
  • Clipboard/Paste (quick training in Web UI)

Cloud Workflow

  1. Connect database or upload file
  2. Select column/field to train from
  3. Configure n-gram size, excludeOriginals, etc.
  4. Preview generated samples in UI
  5. Train & Save model (with version)
  6. Use in sync jobs or download for local use

Model Management

  • List all models with stats
  • Generate samples to test model quality
  • Download .phony file for local/OSS use
  • Share within your team
  • Version history with diff
  • Rollback to previous version
  • Scheduled re-training from live data
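
One way to "test model quality" on generated samples is to compare them against the original values. The sketch below is a hypothetical local check, not a Phony feature: the function `qualityReport` and its metric names are assumptions chosen for illustration. It reports how many samples leak an original value, how many are distinct, and whether average lengths are in the same range.

```php
<?php
declare(strict_types=1);

// Hypothetical quality check comparing generated samples with the originals.
function qualityReport(array $originals, array $generated): array
{
    $known = array_flip($originals);
    // Samples that exactly match a training value.
    $leaked = array_filter($generated, fn($s) => isset($known[$s]));
    // Mean length in characters (mb_strlen counts multibyte characters once).
    $meanLen = fn(array $xs) => $xs ? array_sum(array_map('mb_strlen', $xs)) / count($xs) : 0.0;
    $total = max(count($generated), 1); // avoid division by zero

    return [
        'leak_rate'          => count($leaked) / $total,
        'unique_rate'        => count(array_unique($generated)) / $total,
        'mean_len_original'  => $meanLen($originals),
        'mean_len_generated' => $meanLen($generated),
    ];
}

print_r(qualityReport(
    ['Ayşe', 'Emre', 'Selin'],
    ['Aysel', 'Emre', 'Selim', 'Selim']
));
```

A leak rate above zero would contradict the "never reproduces original data" property, and a low unique rate usually means the training set is too small for the chosen n-gram size.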

Usage Examples

bash
# Train locally (CLI)
phony train --source turkish_names.txt -o turkish.phony

# Load and use (PHP)
Phony::loadModel('turkish.phony');
Phony::model('turkish')->generate(1000);

# In Cloud sync jobs (reference a saved model by name and version)
first_name: phony:model:turkish_names_v1

Tier Limits

Feature | FREE | STARTER | TEAM | BUSINESS
Cloud Models | 3 | Unlimited | Unlimited | Unlimited
Database Column Training
S3/GCS Training
Model Versioning: Last 5, Unlimited, With diff
Team Sharing
Scheduled Re-training

Phony Cloud Platform Specification