Skip to content

Only throw on the root module when dumping benchmark-mxr files#4851

Open
ahsan-ca wants to merge 4 commits into
developfrom
fix-benchmark-mxr
Open

Only throw on the root module when dumping benchmark-mxr files#4851
ahsan-ca wants to merge 4 commits into
developfrom
fix-benchmark-mxr

Conversation

@ahsan-ca
Copy link
Copy Markdown
Contributor

@ahsan-ca ahsan-ca commented May 6, 2026

Motivation

Only throw on the root module so that submodules (which are processed
first by the pass manager and may legitimately have no precompile ops
or no multi-solution candidates) don't abort compilation before the
root module has had a chance to dump its benchmark MXR files.

Technical Details

Changelog Category

Add a CHANGELOG.md entry for any option other than Not Applicable

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@ahsan-ca ahsan-ca requested a review from pfultz2 May 6, 2026 22:21
@ahsan-ca ahsan-ca self-assigned this May 6, 2026
@ahsan-ca ahsan-ca requested a review from causten as a code owner May 6, 2026 22:21
Copilot AI review requested due to automatic review settings May 6, 2026 22:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the GPU compile_ops benchmarking-MXR dump path to avoid throwing an exception unless MXR binaries were actually produced, improving usability of MIGRAPHX_GPU_DUMP_BENCHMARK_MXR in cases where no dump occurs.

Changes:

  • Track whether benchmark MXR binaries were emitted during compilation.
  • Gate the post-dump MIGRAPHX_THROW on that “binaries dumped” condition instead of the dump setting alone.
Comments suppressed due to low confidence (1)

src/targets/gpu/compile_ops.cpp:567

  • The PR title/description mention adding a flag to control whether an exception is thrown after dumping MXR files, but this change only adjusts the throw condition based on a local has_binaries boolean (no new user-facing option/env var). Either update the PR metadata to match the actual behavior change, or add the intended user-controlled flag and use it to gate this MIGRAPHX_THROW.
        if(has_binaries)
        {
            MIGRAPHX_THROW(
                "Benchmark MXR files dumped to " + mxr_path +
                ". Run the MXR files to create a problem cache, then recompile with the cache.");

Comment thread src/targets/gpu/compile_ops.cpp Outdated
@pfultz2
Copy link
Copy Markdown
Collaborator

pfultz2 commented May 7, 2026

Why do we need this flag?

@ahsan-ca
Copy link
Copy Markdown
Contributor Author

ahsan-ca commented May 7, 2026

Why do we need this flag?

I noticed that the mxr files were not being dumped. I debugged and it turns out we hit the exception before the mxr files are dumped.
I asked Cursor on why we hit the exception before any of the mxr files are dumped:

dump_mxr is true whenever MIGRAPHX_GPU_DUMP_BENCHMARK_MXR is set, regardless of whether
anything was actually saved or whether there were any precompile_ops to process.
Combined with the reverse iteration in pass_manager, what really happened in your prior runs was:

1. compile_ops::apply is called on the first submodule (deepest, processed first).
2. That submodule has zero gpu::precompile_op instructions, so cps is empty and nothing is saved.
3. The throw at the end fires anyway because dump_mxr is true.
4. pass_manager catches the exception, dumps the in-progress program to /tmp/migraphx/gpu::compile_ops*.mxr and
re-throws.
5. Compilation aborts before compile_ops ever reaches the main module — which is the only module that contains
the multi-solution MLIR precompile_ops you actually wanted dumped.

That's why mxr-files-dump/ was empty and the throw still fired: the throw came from a submodule pass invocation
that had nothing to dump.

@pfultz2
Copy link
Copy Markdown
Collaborator

pfultz2 commented May 7, 2026

Ok I see the issue. This can throw on a submodule which will stop before the main module is run. This doesnt fix that issue(only for the rare case that submodules dont have kernels). We need to just throw on the root module. We need to take module_pass_manager in the compile ops pass and then add a flag to compile for is_root:

void compile_ops::apply(module_pass_manager& mpm) const
{
    bool is_root = &mpm.get_module() == mpm.get_root_module();
    compile_manager cm;
    ...
    cm.compile(mpm.get_module(), is_root);
    ...
}

Then update the condition to check dump_mxr and is_root.

@ahsan-ca ahsan-ca force-pushed the fix-benchmark-mxr branch from 1aa3c4b to d9ee29d Compare May 7, 2026 17:01
@ahsan-ca ahsan-ca requested a review from Copilot May 7, 2026 17:01
@ahsan-ca
Copy link
Copy Markdown
Contributor Author

ahsan-ca commented May 7, 2026

Ok I see the issue. This can throw on a submodule which will stop before the main module is run. This doesnt fix that issue(only for the rare case that submodules dont have kernels). We need to just throw on the root module. We need to take module_pass_manager in the compile ops pass and then add a flag to compile for is_root:

void compile_ops::apply(module_pass_manager& mpm) const
{
    bool is_root = &mpm.get_module() == mpm.get_root_module();
    compile_manager cm;
    ...
    cm.compile(mpm.get_module(), is_root);
    ...
}

Then update the condition to check dump_mxr and is_root.

Thanks! Made the change.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/targets/gpu/compile_ops.cpp:571

  • When MIGRAPHX_GPU_DUMP_BENCHMARK_MXR is set, this always throws on the root module even if no benchmark MXR files were actually written (e.g., no compile_plan had results.size()>1). That makes the error text inaccurate and can abort compilation unnecessarily; consider tracking whether any MXR file was dumped and only throwing (and wording the message) when at least one file was created.
        static const auto mxr_path = string_value_of(MIGRAPHX_GPU_DUMP_BENCHMARK_MXR{});
        bool dump_mxr              = not mxr_path.empty();

        if(dump_mxr)
        {
            fs::create_directories(fs::path(mxr_path));
        }

        for(const auto& cp : cps)
        {
            if(cp.results.empty())
                continue;
            if(dump_mxr and cp.results.size() > 1)
            {
                cp.save_binaries(fs::path(mxr_path));
            }
            else
            {
                cp.replace(m);
            }
        }

        // Only throw on the root module so that submodules (which are processed
        // first by the pass manager and may legitimately have no precompile ops
        // or no multi-solution candidates) don't abort compilation before the
        // root module has had a chance to dump its benchmark MXR files.
        if(dump_mxr and is_root)
        {
            MIGRAPHX_THROW(
                "Benchmark MXR files dumped to " + mxr_path +
                ". Run the MXR files to create a problem cache, then recompile with the cache.");
        }

Comment thread src/targets/gpu/include/migraphx/gpu/compile_ops.hpp
Comment thread src/targets/gpu/compile_ops.cpp
@ahsan-ca ahsan-ca changed the title Flag to control throwing an exception after mxr files have been dumped Only throw on the root module when using MIGRAPHX_GPU_DUMP_BENCHMARK_MXR May 7, 2026
@ahsan-ca ahsan-ca changed the title Only throw on the root module when using MIGRAPHX_GPU_DUMP_BENCHMARK_MXR Only throw on the root module when dumping benchmark-mxr files May 7, 2026
@ahsan-ca ahsan-ca requested a review from Copilot May 7, 2026 17:26
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/targets/gpu/compile_ops.cpp:571

  • The unconditional throw in dump-mxr mode is still reachable even when no MXR files were actually written (e.g., no precompile ops in the module, or every plan resolves to a single solution so save_binaries() is never called). That makes the message "Benchmark MXR files dumped" misleading and can unnecessarily abort compilation. Consider tracking whether at least one binary was saved and only throwing (or adjusting the message) when something was actually dumped.
        // Only throw on the root module so that submodules (which are processed
        // first by the pass manager and may legitimately have no precompile ops
        // or no multi-solution candidates) don't abort compilation before the
        // root module has had a chance to dump its benchmark MXR files.
        if(dump_mxr and is_root)
        {
            MIGRAPHX_THROW(
                "Benchmark MXR files dumped to " + mxr_path +
                ". Run the MXR files to create a problem cache, then recompile with the cache.");
        }

Comment thread src/targets/gpu/compile_ops.cpp Outdated
@ahsan-ca
Copy link
Copy Markdown
Contributor Author

@pfultz2 PR is ready for review. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants