feat(train): show finish time in ETA logs #5328

Open
OutisLi wants to merge 3 commits into deepmodeling:master from OutisLi:pr/time

Conversation

@OutisLi OutisLi (Collaborator) commented Mar 19, 2026

Make long-running training progress easier to read by keeping the relative ETA and appending a concise absolute finish time across the pt, pd, tf, and pt_expt backends.

Summary by CodeRabbit

  • New Features
    • Training logs now include both remaining ETA and an estimated local finish time (YYYY-MM-DD HH:MM).
    • Timezone-aware local timestamps are shown across training frameworks for clearer cross-region monitoring and more consistent periodic timing output.

Made-with: Cursor
Copilot AI review requested due to automatic review settings March 19, 2026 12:09
@dosubot dosubot bot added the enhancement label Mar 19, 2026
@coderabbitai coderabbitai bot (Contributor) commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6b555d09-3cad-4084-be9e-f48a1a99a4f1

📥 Commits

Reviewing files that changed from the base of the PR and between 605339a and bb6a9f1.

📒 Files selected for processing (1)
  • deepmd/tf/train/trainer.py
✅ Files skipped from review due to trivial changes (1)
  • deepmd/tf/train/trainer.py

📝 Walkthrough

Walkthrough

Adds timezone-aware finish-time formatting to training logs. A helper formats a local finish timestamp from an ETA; format_training_message accepts an optional current_time to include “at YYYY-MM-DD HH:MM”. Multiple training modules pass UTC timestamps converted to local time into the logger.

Changes

  • Core logging utility (deepmd/loggers/training.py): Added _format_estimated_finish_time(eta_seconds, current_time=None) and extended format_training_message(...) to accept current_time; when eta is provided, messages include both the duration and an "at YYYY-MM-DD HH:MM" suffix; updated docstrings.
  • PD training (deepmd/pd/train/training.py): Imported datetime and updated log_loss_valid to pass a timezone-aware current_time (UTC timestamp → local tz) into format_training_message().
  • PT training (deepmd/pt/train/training.py): Imported datetime and updated the periodic timing log to pass a timezone-aware current_time into format_training_message().
  • PT experimental training (deepmd/pt_expt/train/training.py): Imported datetime, added last_log_time to compute interval wall time, replaced direct wall/ETA log formatting with format_training_message(); when enabled, computes eta and passes a timezone-aware current_time.
  • TF training (deepmd/tf/train/trainer.py): Imported datetime and updated on-the-fly validation logging to compute eta and pass a timezone-aware current_time into format_training_message().

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Trainer
    participant Logger as "loggers.training\nformat_training_message"
    participant Helper as "_format_estimated_finish_time"
    participant Clock as "System Clock / tz conversion"
    Trainer->>Logger: call format_training_message(batch, wall_time, eta, current_time)
    Logger->>Helper: if eta provided -> compute finish time (eta_seconds, current_time)
    Helper->>Clock: derive/convert current_time to local tz (if needed)
    Clock-->>Helper: local finish timestamp (YYYY-MM-DD HH:MM)
    Helper-->>Logger: formatted finish-time string
    Logger-->>Trainer: formatted training message (includes "ETA X (at YYYY-MM-DD HH:MM)")
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • njzjz
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 33.33%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title 'feat(train): show finish time in ETA logs' accurately describes the primary change: adding estimated finish time information to training ETA log messages across multiple backends.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/train/training.py`:
- Around lines 801-805: The ETA calculation uses the absolute display_step_id with wall_elapsed measured only from the current run, making the ETA for resumed training too optimistic. In the block that computes eta (currently using display_step_id, wall_elapsed, and self.num_steps), compute elapsed_steps = display_step_id - self.start_step (or max(1, that) to avoid division by zero), then avg_time_per_step = wall_elapsed / elapsed_steps and remaining_steps = self.num_steps - display_step_id, and set eta = int(remaining_steps * avg_time_per_step) so the ETA reflects time per step since the run started.
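The fix described in the comment can be sketched as a standalone function. The names start_step, num_steps, and display_step_id are taken from the comment itself; in the real code this logic sits inside the trainer class:

```python
def compute_eta(display_step_id, start_step, num_steps, wall_elapsed):
    """Sketch of the suggested resume-aware ETA computation."""
    # Average the step time over steps run in *this* session only, so a
    # resumed run does not divide its wall time by steps from earlier runs.
    elapsed_steps = max(1, display_step_id - start_step)  # avoid div-by-zero
    avg_time_per_step = wall_elapsed / elapsed_steps
    remaining_steps = num_steps - display_step_id
    return int(remaining_steps * avg_time_per_step)
```

For example, a run resumed at step 100 that reaches step 150 after 100 s of wall time averages 2 s/step, giving an ETA of 100 s for the remaining 50 steps, rather than the overly optimistic 33 s that dividing by all 150 steps would yield.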

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 525e84e4-fdf5-44ae-8ab6-ba943790037e

📥 Commits

Reviewing files that changed from the base of the PR and between b2c8511 and 15eb655.

📒 Files selected for processing (5)
  • deepmd/loggers/training.py
  • deepmd/pd/train/training.py
  • deepmd/pt/train/training.py
  • deepmd/pt_expt/train/training.py
  • deepmd/tf/train/trainer.py

Copilot AI (Contributor) left a comment

Pull request overview

This PR improves training progress logging by keeping the relative ETA while appending an absolute (local) estimated finish time across multiple backends (tf, pt, pd, pt_expt).

Changes:

  • Extend format_training_message to append an estimated local finish timestamp when eta is provided.
  • Pass current_time (localized) into ETA logging in TF/PT/PD trainers.
  • Switch pt_expt display logging to use the shared format_training_message formatter.
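The "localized current_time" mentioned above can be produced with the standard library alone; a minimal sketch of the UTC-to-local conversion, independent of deepmd's actual call sites:

```python
from datetime import datetime, timezone

# Take a timezone-aware UTC timestamp and convert it to the system's local
# timezone, so the appended "at YYYY-MM-DD HH:MM" reads in local time.
now_utc = datetime.now(timezone.utc)
now_local = now_utc.astimezone()  # same instant, local tz representation
```

Starting from an aware UTC timestamp (rather than a naive local one) keeps the conversion unambiguous across regions, which is what makes the finish time useful for cross-region monitoring.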

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Summary per file
  • deepmd/loggers/training.py: Adds a finish-time formatting helper and extends training summary message formatting/docs.
  • deepmd/tf/train/trainer.py: Computes and logs ETA plus localized current time for finish-time estimation.
  • deepmd/pt/train/training.py: Adds localized current time to ETA log formatting.
  • deepmd/pd/train/training.py: Adds localized current time to ETA log formatting.
  • deepmd/pt_expt/train/training.py: Uses the shared training message formatter and adds ETA/current time logging.


@OutisLi OutisLi requested a review from njzjz March 19, 2026 12:22
@codecov codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 91.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.33%. Comparing base (1ce26cf) to head (bb6a9f1).
⚠️ Report is 1 commit behind head on master.

Files with missing lines:
  • deepmd/loggers/training.py: patch 88.88%, 1 line missing ⚠️
  • deepmd/pt_expt/train/training.py: patch 90.00%, 1 line missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5328   +/-   ##
=======================================
  Coverage   82.33%   82.33%           
=======================================
  Files         777      777           
  Lines       77813    77834   +21     
  Branches     3675     3675           
=======================================
+ Hits        64065    64086   +21     
- Misses      12573    12574    +1     
+ Partials     1175     1174    -1     

☔ View full report in Codecov by Sentry.

