Directory and Outputs
FRUST writes ordinary parquet files for tables and optional calculation directories for backend files. The exact layout depends on whether you run a local pipeline, independent cluster jobs, or a dependent stage chain.
```mermaid
flowchart TD
    A["runs/example"] --> B["Independent pipeline outputs<br/>run_mols_tag.parquet"]
    A --> C["Dependent chain output directory<br/>one folder per ligand or TS tag"]
    C --> D["init.parquet"]
    D --> E["init.hess.parquet"]
    E --> F["init.hess.optts.parquet"]
    F --> G["init.hess.optts.freq.parquet"]
    G --> H["init.hess.optts.freq.solv.parquet"]
    C --> I["Saved calculation folders<br/>ORCA, xTB, g-xTB files"]
    A --> J["submitit logs<br/>when using frust.cluster"]
```
Independent Pipeline Outputs
For `submit_jobs(...)`, FRUST builds parquet names from the pipeline name and job tag:

```
<out_dir>/<pipeline>_<tag>.parquet
```

For example:

```
runs/mols_example/run_mols_example.parquet
runs/ts_per_rpos_example/run_ts_per_rpos_anisole_rpos_2.parquet
```
The tag is sanitized so it is safe for file names and scheduler job names.
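As a rough illustration of how such a name is assembled, here is a minimal sketch. The `sanitize_tag` helper is hypothetical; FRUST's actual sanitization rules may differ.

```python
import re
from pathlib import Path

def sanitize_tag(tag: str) -> str:
    """Hypothetical sketch: replace characters unsafe for file or
    scheduler job names with underscores (FRUST's real rules may differ)."""
    return re.sub(r"[^A-Za-z0-9_.-]+", "_", tag).strip("_")

out_dir = Path("runs/ts_per_rpos_example")
pipeline = "run_ts_per_rpos"
tag = sanitize_tag("anisole rpos=2")  # -> "anisole_rpos_2"
parquet_path = out_dir / f"{pipeline}_{tag}.parquet"
print(parquet_path)  # runs/ts_per_rpos_example/run_ts_per_rpos_anisole_rpos_2.parquet
```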
Dependent Chain Outputs
For `submit_chain(...)`, each generated input gets its own save directory:

```
runs/ts_chain_example/<tag>/
```
Inside that directory, the staged TS chain evolves the parquet filename:

```
init.parquet
init.hess.parquet
init.hess.optts.parquet
init.hess.optts.freq.parquet
init.hess.optts.freq.solv.parquet
```
Read the deepest parquet first
The deepest suffix usually contains the most complete dataframe. If `run_cleanup` was used, earlier parquet files may have been removed.
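Because each stage appends one suffix, the deepest surviving file can be picked programmatically by sorting on the number of dots in the name. A minimal sketch; the tag directory name is a placeholder:

```python
import pandas as pd
from pathlib import Path

tag_dir = Path("runs/ts_chain_example/my_tag")  # hypothetical tag directory

# The chain appends one suffix per stage (init -> init.hess -> ...),
# so the name with the most dots is the most complete table.
candidates = sorted(tag_dir.glob("init*.parquet"),
                    key=lambda p: p.name.count("."))
if candidates:
    df = pd.read_parquet(candidates[-1])  # deepest surviving parquet
```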
Saved Calculation Files
When `save_step=True` or `save_output_dir=True`, FRUST keeps backend files that are useful for debugging. Depending on the engine and options, saved folders can include ORCA input/output files, xTB logs, optimized XYZ files, Hessians, charges, and other backend artifacts.
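Enabling this might look like the sketch below. The import path and the assumption that these options are keyword arguments to `submit_jobs` are inferred from the surrounding docs; check the FRUST API for the exact spelling.

```python
# Hypothetical sketch: import path and keyword placement are assumptions
# inferred from the options named above.
from frust.cluster import submit_jobs  # assumed import location

submit_jobs(
    ...,                   # your usual pipeline/job arguments go here
    save_step=True,        # keep per-step backend files for debugging
    save_output_dir=True,  # keep the full calculation directory
)
```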
When to keep saved files
Use saved calculation directories when:
- a row has `*-NT=False` and the `*-error` column is not enough to diagnose the failure;
- an ORCA job converged to the wrong stationary point;
- you need to inspect an xTB or g-xTB optimized geometry outside FRUST (see the sketch after this list);
- you want to archive the exact backend inputs used for a final result.
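Saved calculation folders are ordinary directories, so a plain filesystem search is enough to pull out a geometry. A minimal sketch, assuming the folders sit under the tag directory and geometries are stored as `.xyz` files (exact file names depend on the backend):

```python
from pathlib import Path

run_dir = Path("runs/ts_chain_example/my_tag")  # hypothetical tag directory

# Recursively list every saved geometry file under the run directory.
for xyz in sorted(run_dir.rglob("*.xyz")):
    print(xyz)  # open these in your viewer of choice
```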
Merging Parquet Files
Large submitit runs may produce many parquet files. Use the packaged command to merge them:
```
merge_parquet --input-dir runs/example --output merged.parquet --recursive
```
Then inspect the merged table with pandas:

```python
import pandas as pd

df = pd.read_parquet("merged.parquet")
```
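If you prefer to stay in Python, a similar merge can be approximated with pandas directly. A minimal sketch, assuming all files share a compatible schema; the packaged `merge_parquet` command remains the supported route:

```python
import pandas as pd
from pathlib import Path

# Collect every parquet under the run directory, mirroring --recursive.
parts = [pd.read_parquet(p) for p in Path("runs/example").rglob("*.parquet")]
merged = pd.concat(parts, ignore_index=True)
merged.to_parquet("merged.parquet")
```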