Running TRINITY

Running a simulation

A TRINITY run is fully specified by one plain-text parameter file, and there is a single command — from the repository root:

python run.py param/simple_cluster.param

The path may be absolute or relative to the repository root. run.py scans the file and dispatches automatically: if the file contains list ([...]) or tuple(...) syntax it runs a parameter sweep across a parallel worker pool, otherwise it runs a single simulation. There is no separate command or flag for sweeps. On an HPC cluster you can instead generate a SLURM job array with --emit-jobs (see Running on a cluster (SLURM) below).

Output is written to the directory named by the path2output parameter; the default sentinel def_dir resolves to outputs/<model_name>/ for a single run, or one outputs/<run_name>/ subfolder per combination for a sweep. See Outputs below for the file layout.

Parameter-file formats

A parameter file lists one keyword value entry per line (see Parameter Specifications for the full keyword reference). The value syntax alone decides whether the file is a single run or a sweep:

param/sweep_hybrid_example.param

 # Plain key/value — fixed across every run
 dens_profile    densPL
 nISM            0.1
 path2output     outputs/demo

 # tuple(...) — only these explicit (mCloud, sfe) pairs are run
 tuple(mCloud, sfe)    [1e5, 0.01] [1e7, 0.10]

 # [list] — swept Cartesian-style across each tuple pair
 nCore    [1e3, 1e4]

The file above mixes all three value forms. How many simulations a file generates depends only on which forms it uses:

Value syntax	Mode	Runs generated
no `[ ]`, no `tuple()`	single	1
`key [a, b, c]`	Cartesian	every combination (e.g. `mCloud` × `sfe` = 3 × 2 = 6)
`tuple(x, y) [..] [..]`	tuple	only the listed points, no expansion
tuple and list together	hybrid	tuple points × list combinations

The hybrid example therefore runs 2 tuple pairs × 2 nCore values = 4 simulations. Single-purpose worked examples ship as param/sweep_example.param (Cartesian), param/sweep_tuple_example.param (tuple), and param/sweep_hybrid_example.param (hybrid).

Command-line flags

All flags are optional. Most take effect only in sweep mode; for a single run, --dry-run prints the raw parameter file and exits without running, while --workers and --yes are ignored:

Flag	Description
`--dry-run`, `-n`	Preview all combinations (with any GMC warnings) without running.
`--workers N`, `-w`	Parallel workers for the in-process sweep pool — or the array concurrency cap with `--emit-jobs`. Default inside a SLURM job: the full allocation (`SLURM_CPUS_PER_TASK`, else `SLURM_CPUS_ON_NODE` / CPU affinity); on a laptop: `max(1, CPU count // 2 - 1)`. Must be `>= 1`; refused if it exceeds the cores available to this process.
`--yes`, `-y`	Skip the interactive confirmation prompt.
`--verbose`, `-v`	DEBUG-level logs and the full base-parameter list.
`--emit-jobs DIR`	Generate a SLURM job-array bundle in `DIR` (one task per combination) instead of running locally; requires a sweep file. Mutually exclusive with `--collect-report` (see below).
`--collect-report DIR`	Aggregate a finished `--emit-jobs` bundle into `sweep_report.txt` / `.json`; needs no parameter file.

Before launching, run.py runs a GMC-parameter plausibility check (cloud mass vs. core/ISM density, cloud radius, …) on every combination; invalid ones are listed up front so you can abort rather than waste compute. Press Ctrl+C — or send SIGTERM, e.g. from SLURM scancel — to cancel cleanly: in-flight workers are stopped and a report of completed / failed / cancelled runs is written to the output directory.

Running on a cluster (SLURM)

On a laptop or a single multi-core node, a sweep runs across an in-process worker pool sized by --workers. To scale across nodes on an HPC cluster (e.g. bwForCluster Helix / bwUniCluster), generate a SLURM job array instead — one array task per combination, so the scheduler packs them across nodes and restarts failures independently:

python run.py param/sweep_example.param --emit-jobs jobs/
# edit jobs/submit_sweep.sbatch: --account, --partition, --time, --mem
sbatch jobs/submit_sweep.sbatch
python run.py --collect-report jobs/      # after the array finishes

Running the in-process pool on a login node is discouraged; run.py prints a warning when SLURM is detected without an active job.

--emit-jobs DIR writes a self-contained, submittable bundle:

jobs/
├── params/<run_name>.param   # one per combination, absolute path2output
├── runs.tsv                  # param_path <TAB> output_dir; line N = array task N
├── manifest.json             # index: names, params, output dirs
├── submit_sweep.sbatch       # #SBATCH --array=1-N[%K]; one sim per task
└── logs/                     # %A_%a.out per task

Each array task runs python run.py <combo>.param with one CPU and math-library threads pinned to one (OMP_NUM_THREADS=1 …, MPLBACKEND=Agg); parallelism comes from running many tasks, not from threading one. Passing --workers K at emit time caps concurrency as --array=1-N%K.

When the array finishes, --collect-report DIR reads each task’s .exit_code / .duration sentinels and writes the same sweep_report.txt / .json as a local sweep, then prints a ready sbatch --array=<failed ids> jobs/submit_sweep.sbatch to rerun only the failures.

Outputs land in the same path2output/<run_name>/ layout as a local sweep (see Outputs below). Bundled inputs (SPS, cooling tables, lib/default/) resolve relative to the package, so the clone location does not matter; only path2output follows the launch directory — set it to an absolute path on a work/scratch filesystem for cluster runs.

Outputs

File layout

A single run writes these files into path2output:

path2output/
├── dictionary.jsonl            # simulation state, one JSON object per snapshot
├── metadata.json               # run constants + termination + final-state blocks
├── metadata_humanreadable.txt  # pretty-printed show_run summary
└── trinity.log                 # log file (written when log_file = True)

A sweep writes those same files into one subdirectory per combination, adds a fully-resolved .param sidecar to each, and writes two top-level reports:

outputs/my_sweep/
├── 1e5_sfe001_n1e3/
│   ├── 1e5_sfe001_n1e3.param   # full resolved params for this run
│   ├── dictionary.jsonl
│   ├── metadata.json
│   ├── metadata_humanreadable.txt
│   └── trinity.log
├── 1e5_sfe001_n1e4/
│   └── ...
├── sweep_report.txt            # human-readable sweep summary
└── sweep_report.json           # machine-readable sweep summary

Auto-generated run names

Each sweep combination is named automatically:

{mCloud}_sfe{sfe*100:03d}_n{nCore}[_density-profile][_PHII][_other-swept-keys]

The optional suffixes appear only for parameters set explicitly in the sweep file — a key you leave out gets no suffix (nothing is compared against default.param):

_PL{alpha} for dens_profile = densPL (e.g. _PL0, _PL-2), or _BE{Omega} for densBE (e.g. _BE14).
_yesPHII / _noPHII when include_PHII is set — useful when sweeping the flag to compare runs with and without HII pressure.
Any other swept parameter without a curated slot above gets a generic _{key}{value} suffix so distinct combinations never collapse onto the same folder. snake_case keys become camelCase and decimal points in floats become p (minus signs are kept, as in _PL-2). Examples: sweeping ZCloud = [0.5, 1.0] yields _ZCloud0p5 / _ZCloud1p0; coll_counter = [True, False] yields _collCounterTrue / _collCounterFalse. Multiple generic suffixes are emitted in sorted-key order for stability.

Folder-name safety rules applied to generic values:
- Hard-rejected with an immediate ValueError (no sweep runs): values containing /, \, .., or any control character. This means filepath-typed parameters cannot be swept — set them once in your base param file. The check protects against silently nesting directories or escaping the sweep root.
- Sanitised to -: any character outside [A-Za-z0-9.+-] (spaces, brackets, shell wildcards, unicode, =, : …). The sweep still runs but with a safe folder name.
- Length-capped at 200 characters for the full run name; the sweep aborts with a clear error if you cross it (reserve room for sibling filenames within the 255-byte filesystem cap).

For example, 1e7_sfe010_n1e4_noPHII is mCloud=1e7, sfe=0.10, nCore=1e4 with include_PHII = False.

The folder name is only a unique human-readable handle: every sweep run also writes its full resolved parameter set to a per-run .param file (plus the sweep-wide sweep_report.json), so a run with no suffix for some key still has that key recorded — it just took the default.param value. Master plot scripts that compare across sweeps should read parameters from those sidecars rather than parse them out of the folder name.

Output data model

dictionary.jsonl

Each simulation writes its full state to dictionary.jsonl as a stream of newline-delimited JSON objects, one per snapshot. Writes are append-only and crash-safe (the run flushes buffered snapshots on a clean exit, Ctrl+C, or SIGTERM), so the file remains readable after a crash — the last line may be partial but every prior line is a complete snapshot. Snapshots are saved before each ODE step, so all values in one snapshot share a single t_now.

Snapshot keys group into a handful of categories:

Category	Example keys
Administrative	`model_name`, `current_phase`, `SimulationEndReason`
Cloud setup	`mCloud`, `mCluster`, `rCloud`, `nEdge`
Dynamical state	`t_now`, `R2`, `v2`, `Eb`, `T0`, `R1`, `Pb`
Feedback (SPS)	`Lmech_W`, `Lmech_SN`, `Qi`, `Lbol`, `pdot_total`
Pressures	`P_drive`, `P_HII`, `P_ram`
Forces	`F_grav`, `F_ram`, `F_ram_wind`, `F_ram_SN`, `F_HII`, `F_rad`
Bubble / shell profiles	`log_bubble_T_arr` + `bubble_T_arr_r_arr`, `bubble_v_arr` + `bubble_v_arr_r_arr`, `log_shell_n_arr` + `shell_r_arr`

A handful of long 1-D profile arrays are downsampled before serialisation to keep snapshots manageable. Each simplified array is paired with its own abscissa (*_r_arr) and, where the values span many decades, stored in \(\log_{10}\) space (log_*). The target point budget is set by simplify_npoints (see Parameter Specifications). To recover a profile, linearly interpolate the paired abscissa against the (possibly log-space) values.

The recommended way to read the file is the reader API in Output Reader API, which hides the JSONL layout, the per-key unit metadata, and the legacy .json format behind a small set of classes.

metadata.json

Run-constant parameters and end-of-run summaries live in a sibling metadata.json (current schema version 4) rather than being repeated in every snapshot. The reader rehydrates the run constants into each snapshot on load, so consumers never have to read this file directly — but it is small and human-inspectable. Its top-level blocks are:

Block	Contents
run constants	Every input parameter / set-once derived value that does not change after phase 0 (`mCloud`, `sfe`, `dens_profile`, physical constants, …). Membership is derived from the ParamSpec registry (see Parameter Specifications), so it always matches the schema.
`termination`	`{exit_code, outcome, detail, timestamp, model_name}` — how the run ended.
`final_state`	Every scalar / bool / string on the state dict at run end, in internal units (pc, Myr, pc⁻³). Long arrays are excluded — their final values are the last line of `dictionary.jsonl`.
`termination_debug`	Last-two-snapshot diff, a NaN/Inf inventory, and physics sanity checks, for post-mortem debugging.

All writes go through an atomic helper, so an interrupted write can never leave a corrupt file.

show_run

For a quick human-readable view of a finished run — run context, termination reason, and final state — without writing any plotting code:

python -m trinity._output.show_run path2output/

It reads metadata.json and pretty-prints a curated subset; pass --json for the full dump or --quiet for a scriptable exit code (useful in batch loops over a sweep tree).

Logging

The parameter reference in Parameter Specifications lists the four logging parameters (log_level, log_console, log_file, log_colors) and their defaults. This section covers the conceptual ladder of log levels.

Each level includes itself and all more severe levels: CRITICAL > ERROR > WARNING > INFO > DEBUG. Setting log_level = INFO emits INFO, WARNING, ERROR, and CRITICAL messages.

Level	Typical messages	When to use
`DEBUG`	Variable values, loop iterations, intermediate calculations, function entry/exit.	Development; debugging specific issues.
`INFO`	Phase transitions, major events (bubble burst, cloud edge reached), initialisation and completion markers.	Normal simulation runs (default).
`WARNING`	Values clamped to limits, fallback behaviour, unusual but non-critical conditions.	Production runs where only potential problems matter.
`ERROR`	Calculation failures, recoverable errors.	Silent runs where only actual errors matter.
`CRITICAL`	Unrecoverable failures, fatal errors.	When only simulation-stopping errors should print.

Console logging is off by default (log_console = False); with it enabled at log_level = INFO, output looks like:

2026-01-08 15:30:00 | INFO     | trinity.main | === TRINITY Simulation Starting ===
2026-01-08 15:30:00 | INFO     | trinity.main | Model: test_simulation
2026-01-08 15:30:01 | INFO     | trinity.sps.read_sps | SPS data processed
2026-01-08 15:30:03 | INFO     | trinity.phase1_energy | Entering energy-driven phase
2026-01-08 15:30:45 | INFO     | trinity.main | === TRINITY Simulation Complete ===

Troubleshooting

Most parameter errors are typos against the schema; the authoritative list of valid keywords and defaults is the ParamSpec registry (trinity/_input/registry.py), from which trinity/_input/default.param is generated and mirrored in Parameter Specifications. For issues and feature requests, see https://github.com/JiaWeiTeh/trinity/issues.