ABACUS MD simulation with LCAO runs out of memory after ~37 steps #7209

@ziyuegao-yoyo

Description

Details

I am running an LCAO MD simulation of a silicate melt (SiO₂) with ~324 atoms on a single node with 128 CPU cores and 256 GB of memory. The ABACUS version is abacus-develop.

The simulation starts normally, but after about one hour (~37 MD steps) it crashes due to memory exhaustion.

Because the job runs fine at first and fails only after several MD steps, memory usage appears to grow during the simulation, rather than the job hitting a static memory limit.

I would like to ask:

  • Could this behavior be related to a compilation issue (e.g., MPI / ELPA / ScaLAPACK / memory handling)?
  • Is there a known memory accumulation or memory leak issue in MD simulations (especially with LCAO)?
  • What is the recommended way to resolve it in this case?
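
To back up the suspicion that memory grows per MD step rather than being a fixed shortfall, the resident memory of the job could be sampled at a few steps and fitted with a straight line. The following is only a minimal sketch and not part of ABACUS: the function names and the sample numbers are hypothetical, and `read_rss_kb` assumes a Linux `/proc` filesystem. With 128 MPI ranks, the per-rank values would need to be summed before fitting.

```python
# Hypothetical helper for checking whether memory grows with MD step.
# Not part of ABACUS; all names here are illustrative.

def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("VmRSS line not found")


def read_rss_kb(pid):
    """Read the resident set size of one process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vmrss_kb(handle.read())


def fit_growth(samples):
    """Least-squares line through (md_step, rss_mb) pairs.

    Returns (slope_mb_per_step, intercept_mb).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(step for step, _ in samples) / n
    mean_y = sum(rss for _, rss in samples) / n
    sxx = sum((step - mean_x) ** 2 for step, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one MD step")
    sxy = sum((step - mean_x) * (rss - mean_y) for step, rss in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def steps_until_limit(samples, limit_mb):
    """Estimate the MD step at which the fitted line reaches limit_mb.

    Returns None when the fitted memory usage is flat or decreasing.
    """
    slope, intercept = fit_growth(samples)
    if slope <= 0:
        return None
    return (limit_mb - intercept) / slope


if __name__ == "__main__":
    # Made-up total RSS values in MB, for illustration only.
    demo = [(0, 1000.0), (10, 2000.0), (20, 3000.0)]
    print(fit_growth(demo))
    print(steps_until_limit(demo, 256 * 1024))
```

A clearly positive slope would point to accumulation during the run, while a flat line that is already close to the node limit would point to a static memory requirement instead.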

SiO2.zip

Task list for Issue attackers (only for developers)

  • Reproduce the performance issue on a similar system or environment.
  • Identify the specific section of the code causing the performance issue.
  • Investigate the issue and determine the root cause.
  • Research best practices and potential solutions for the identified performance issue.
  • Implement the chosen solution to address the performance issue.
  • Test the implemented solution to ensure it improves performance without introducing new issues.
  • Optimize the solution if necessary, considering trade-offs between performance and other factors (e.g., code complexity, readability, maintainability).
  • Review and incorporate any relevant feedback from users or developers.
  • Merge the improved solution into the main codebase and notify the issue reporter.

Metadata

Assignees

No one assigned

    Labels

    Memory (Memory issues), Performance (Issues related to fail running ABACUS)
