CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
Pull request overview
This PR introduces initial ARC ↔ TCKDB integration by adding a post-run upload sweep (driven by output/output.yml), TCKDB payload/sidecar writing + idempotency, and several supporting schema/provenance enhancements across ARC output, parsers, and job execution.
Changes:
- Add a TCKDB upload sweep (species/reaction modes), CLI re-runner, payload sidecars, and idempotency-key utilities.
- Extend ARC output/provenance to support richer thermo points, TS path-search artifacts (NEB/GSM), IRC direction tracking, and constraints.
- Improve operational robustness (Arkane stderr handling, SSH pooling, remote file handling, TS adapter filtering) and update documentation.
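As an illustration of the idempotency-key idea mentioned in the changes above, here is a minimal sketch. It assumes a key is derived by hashing a canonical JSON form of the payload. The function name `sketch_idempotency_key` and the `kind` prefix are hypothetical and are not taken from `arc/tckdb/idempotency.py`, whose actual scheme may differ.

```python
import hashlib
import json


def sketch_idempotency_key(payload: dict, kind: str = "payload") -> str:
    """Hypothetical helper: derive a deterministic key from a JSON-serializable payload."""
    # Canonical form: sorted keys and fixed separators, so dict ordering does not matter.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{kind}:{canonical}".encode("utf-8")).hexdigest()
    # Truncating the digest is an arbitrary choice made for readability in this sketch.
    return f"{kind}-{digest[:32]}"


# The same content in a different key order yields the same key (stability).
a = sketch_idempotency_key({"label": "CH4", "level": "wb97xd/def2tzvp"})
b = sketch_idempotency_key({"level": "wb97xd/def2tzvp", "label": "CH4"})
# Different content yields a different key (distinctness).
c = sketch_idempotency_key({"label": "C2H6", "level": "wb97xd/def2tzvp"})
```

A content-derived key of this kind lets a re-run of the sweep resend an identical payload without creating a duplicate record, provided the receiving service deduplicates on the key.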
Reviewed changes
Copilot reviewed 65 out of 69 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| docs/tckdb_upload_boundary.md | Defines ARC→TCKDB semantic boundary and guardrail expectations. |
| docs/output_yml_schema.md | Updates output.yml schema docs (thermo_points, GSM log). |
| docs/gaussian.md | Adds Gaussian run/troubleshooting notes (new doc). |
| arc/testing/test_JobAdapter/calcs/Species/spc1/opt_a472/submit.sh | Adds PBS submit fixture for JobAdapter tests. |
| arc/testing/test_JobAdapter/calcs/Species/spc1/opt_a472/input.gjf | Adds Gaussian input fixture for JobAdapter tests. |
| arc/testing/test_JobAdapter/calcs/Species/spc1/opt_a1313/submit.sh | Adds PBS submit fixture for JobAdapter tests. |
| arc/testing/test_JobAdapter/calcs/Species/spc1/opt_a1313/input.gjf | Adds Gaussian input fixture for JobAdapter tests. |
| arc/testing/test_JobAdapter/calcs/Species/spc1_and_2_others/conf_opt_a472/submit.sh | Adds PBS submit fixture for multi-species JobAdapter tests. |
| arc/testing/test_JobAdapter/calcs/Species/spc1_and_2_others/conf_opt_a472/input.gjf | Adds Gaussian input fixture for multi-species JobAdapter tests. |
| arc/testing/test_JobAdapter/calcs/Species/spc1_and_2_others/conf_opt_a1313/submit.sh | Adds PBS submit fixture for multi-species JobAdapter tests. |
| arc/testing/test_JobAdapter/calcs/Species/spc1_and_2_others/conf_opt_a1313/input.gjf | Adds Gaussian input fixture for multi-species JobAdapter tests. |
| arc/testing/test_JobAdapter_ServerTimeLimit/calcs/Species/spc1/opt_101/err.txt | Adds PBS walltime-limit stderr fixture. |
| arc/testing/test_JobAdapter_scan/calcs/Species/methanol_and_5_others/scan_a472/input.gjf | Adds Gaussian scan input fixture. |
| arc/testing/test_JobAdapter_scan/calcs/Species/methanol_and_5_others/scan_a1313/input.gjf | Adds Gaussian scan input fixture. |
| arc/tckdb/sweep.py | Implements end-of-run sweep dispatch and artifact uploading. |
| arc/tckdb/payload_writer.py | Writes payload JSON + sidecar metadata; replay gating helper. |
| arc/tckdb/payload_writer_test.py | Adds unit tests for payload/sidecar writing and replay gate. |
| arc/tckdb/idempotency.py | Adds idempotency-key builders for payloads and artifacts. |
| arc/tckdb/idempotency_test.py | Adds unit tests for idempotency-key stability/distinctness. |
| arc/tckdb/constraints.py | Adds constraint dataclass + serializer for TCKDB payloads. |
| arc/tckdb/config.py | Adds TCKDB config parsing, API-key resolution, artifact-kind validation. |
| arc/tckdb/cli.py | Adds standalone CLI to rerun sweep against existing project output.yml. |
| arc/tckdb/cli_test.py | Adds tests for CLI parsing and dispatch behavior. |
| arc/tckdb/__init__.py | Exposes adapter/config public API. |
| arc/statmech/arkane.py | Unifies tunneling method constant; fixes Arkane success signal to output.py existence; maps thermo_points. |
| arc/statmech/arkane_test.py | Adds tests for Arkane stderr classification + output.py success gating. |
| arc/species/species.py | Ensures monoatomic species get a trivial final_xyz; adds rotor scan software field; renames cp_data→thermo_points in ThermoData. |
| arc/species/species_test.py | Adds tests for monoatomic final_xyz behavior; updates xyz fixtures expectations. |
| arc/settings/submit.py | Clarifies example submit templates; adds ORCA auxiliary file copies; renames example keys and adds back-compat aliases. |
| arc/settings/submit_test.py | Updates tests to validate example keys/aliases without restricting user overrides. |
| arc/settings/settings.py | Updates ORCA NEB default level; expands candidate RMG install paths. |
| arc/scripts/save_arkane_thermo.py | Emits richer thermo_points (Cp/H/S/G) instead of cp_data. |
| arc/scripts/get_species_corrections.py | Adds helper script to compute AEC/BAC totals/components via Arkane in RMG env. |
| arc/scripts_test.py | Updates tests for thermo_points extraction. |
| arc/scheduler.py | Adds TS-guess path routing (NEB vs GSM), IRC direction tracking, TS adapter filtering, remote cleanup hook, and scan software tracking. |
| arc/scheduler_test.py | Adds tests for TS guess paths routing and GSM path key seeding. |
| arc/reaction/reaction_test.py | Adds regression test for opt-done gate with monoatomic reactant. |
| arc/parser/parser.py | Adds rich IRC parser wrapper, scan absolute energies wrapper, GSM trajectory filename handling. |
| arc/parser/parser_test.py | Adds tests for rich IRC parsing behavior. |
| arc/parser/constraints_test.py | Adds tests for Gaussian/ORCA constraint parsing. |
| arc/parser/adapters/orca.py | Adds ORCA input constraint parser. |
| arc/parser/adapter.py | Adds default hook for absolute scan energies in Hartree. |
| arc/job/trsh.py | Adds “DispUnconverged” detection and no_tight troubleshooting path. |
| arc/job/trsh_test.py | Adds tests for displacement-only unconverged detection. |
| arc/job/ssh.py | Adds delayed-existence retry for downloads; improves job-id parsing; SSH connect key handling; adds remove_dir. |
| arc/job/ssh_pool.py | Adds process-lifetime SSH connection pool. |
| arc/job/pipe/pipe_coordinator.py | Prevents pipe mode for engines resolving to remote servers. |
| arc/job/pipe/pipe_coordinator_test.py | Adds test ensuring pipe is disabled when engine resolves to remote server. |
| arc/job/adapters/ts/xtbgsm_test.py | Adds tests for GSM provenance capture and ograd preservation behavior. |
| arc/job/adapters/ts/xtb_gsm.py | Records GSM stringfile path as TSGuess log provenance; tracks node-output dir. |
| arc/job/adapters/ts/orca_neb.py | Makes NEB input refer to local reactant/product xyz filenames (not absolute). |
| arc/job/adapters/scripts/xtb_gsm/ograd | Preserves per-node energy/gradient/xtbout files for provenance. |
| arc/job/adapters/gaussian.py | Applies no_tight troubleshooting flag to drop opt=tight. |
| arc/job/adapter.py | Adds SSH reuse (pool/shared) across upload/submit/download; submit-script validation; remote cleanup; HTCondor job.log gating. |
| ARC.py | Wires TCKDB sweep into main execution and closes SSH pool on exit. |
| ARC_test.py | Adds tests intended to cover ARC.py/TCKDB sweep wiring. |
| .gitignore | Adds ignores for agent folders and ARC.egg-info. |
Comments suppressed due to low confidence (1)
docs/gaussian.md:33
- In the Gaussian input template, `MULTIPLICTY` is misspelled; this should be `MULTIPLICITY` to avoid confusion for users copying the template.
# We treat both "ts_label set but TS missing/non-converged" and
# "no ts_label at all" as the partial case. The latter is rare
# (ARC normally emits a ts_label even on failure), but a
# reaction record with reactants/products and no TS is exactly
# the partial-shape we want to allow under the flag.
is_partial = bool(ts_label) and not ts_converged
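The quoted comment describes two partial cases, while the quoted expression evaluates to `False` when `ts_label` is empty. The sketch below contrasts the two readings. The names `ts_label` and `ts_converged` come from the snippet; both function names and the second expression are hypothetical and are not presented as the PR's fix.

```python
def is_partial_as_written(ts_label, ts_converged):
    # The expression exactly as quoted in the diff.
    return bool(ts_label) and not ts_converged


def is_partial_as_described(ts_label, ts_converged):
    # Hypothetical variant matching the comment: a missing ts_label
    # also counts as partial, as does a non-converged TS.
    return (not ts_label) or (not ts_converged)


# Three representative inputs: converged TS, non-converged TS, no TS label.
cases = [("TS0", True), ("TS0", False), (None, False)]
written = [is_partial_as_written(label, conv) for label, conv in cases]
described = [is_partial_as_described(label, conv) for label, conv in cases]
```

The two versions agree on the first two cases and differ only when there is no `ts_label`, which is the case the comment calls rare.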
## Troubleshooting

There are many situations where we can get Gaussian job errors, and Gaussian reports the error code in the output file.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #890 +/- ##
==========================================
+ Coverage 60.63% 62.70% +2.07%
==========================================
Files 103 112 +9
Lines 31186 34596 +3410
Branches 8128 8843 +715
==========================================
+ Hits 18910 21694 +2784
- Misses 9926 10308 +382
- Partials 2350 2594 +244
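The figures in the coverage diff can be cross-checked with a few lines of arithmetic. The sketch below assumes coverage is hits divided by lines and that the displayed percentages are truncated to two decimals. The truncation is an inference from these numbers, not a statement about how Codecov computes them; `coverage_pct` is a hypothetical helper.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    # Truncate (rather than round) to two decimals; this reproduces the displayed values here.
    return math.floor(hits / lines * 10000) / 100


base = coverage_pct(18910, 31186)  # main
head = coverage_pct(21694, 34596)  # #890
delta = round(head - base, 2)

# Hits, misses, and partials should add up to the total line count on each side.
base_total = 18910 + 9926 + 2350
head_total = 21694 + 10308 + 2594
```

With ordinary rounding the two percentages would come out as 60.64 and 62.71, which is why truncation is assumed in this sketch.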
Initial work on integrating the TCKDB adapter into ARC