
[Relax] Support constant folding for call_tir with tuple outputs #18736

Draft
guan404ming wants to merge 1 commit into apache:main from guan404ming:fold-constant-tuple-outputs

Conversation

@guan404ming (Member)

Why

Constant folding skipped call_tir nodes with tuple (multi-tensor) outputs, leaving foldable operations unoptimized.

How

  • Add ConstEvaluateCallTIRTuple to handle call_tir with TupleStructInfo output by allocating and packing multiple output tensors (see the sketch after this list)
  • Route VisitCallTIR through tuple vs single-tensor paths based on sinfo_args type
  • Add test for folding a split-like prim_func with two output tensors
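
For orientation, here is a minimal sketch of the tuple path, assembled from the fragments quoted in the review further down. MatchConstShape's exact signature, BuildFunc, and the allocation/invocation plumbing are assumptions for illustration, not the PR's literal code:

    // Hedged sketch, not the literal implementation: evaluate a call_tir whose
    // output struct info is a tuple of tensors and fold it into a Tuple of
    // relax::Constant nodes.
    std::optional<Expr> ConstEvaluateCallTIRTuple(const GlobalVar& tir_func,
                                                  const Array<runtime::Tensor>& arr_args,
                                                  const TupleStructInfoNode* tuple_sinfo) {
      size_t num_outputs = tuple_sinfo->fields.size();

      // Allocate one output tensor per tuple field; give up if any field is not
      // a tensor with a known dtype and a constant shape.
      std::vector<runtime::Tensor> ret_tensors;
      ret_tensors.reserve(num_outputs);
      for (size_t i = 0; i < num_outputs; ++i) {
        auto tensor_sinfo = tuple_sinfo->fields[i].as<TensorStructInfoNode>();
        if (!tensor_sinfo || tensor_sinfo->IsUnknownDtype()) return std::nullopt;
        auto shape = MatchConstShape(tuple_sinfo->fields[i]);  // assumed signature
        if (!shape) return std::nullopt;
        ret_tensors.push_back(
            runtime::Tensor::Empty(*shape, tensor_sinfo->dtype, Device{kDLCPU, 0}));
      }

      // Destination-passing convention: inputs first, then one slot per output.
      std::vector<AnyView> packed_args;
      packed_args.reserve(arr_args.size() + num_outputs);
      for (const auto& arg : arr_args) packed_args.push_back(arg);
      for (const auto& out : ret_tensors) packed_args.push_back(out);

      ffi::Function func = BuildFunc(tir_func);  // assumed: JIT-build the PrimFunc
      ffi::Any rv;
      func.CallPacked(ffi::PackedArgs(packed_args.data(), packed_args.size()), &rv);

      // Wrap each filled output tensor as a Constant and return them as a Tuple.
      Array<Expr> fields;
      for (const auto& t : ret_tensors) fields.push_back(Constant(t));
      return Tuple(fields);
    }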

@gemini-code-assist (Contributor)

Summary of Changes

Hello @guan404ming, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Relax constant folding pass by enabling it to optimize call_tir operations that return multiple tensors as a tuple. This addresses a limitation where such operations were previously unoptimized, allowing for more aggressive constant propagation and potentially improving the performance of compiled models.

Highlights

  • Expanded Constant Folding: Constant folding now supports call_tir nodes that produce multiple output tensors (tuples), which were previously skipped.
  • New Evaluation Function: Introduced ConstEvaluateCallTIRTuple to specifically handle the evaluation and packing of multiple output tensors for call_tir calls.
  • Improved VisitCallTIR Logic: The VisitCallTIR method has been updated to correctly route call_tir nodes to either the single-tensor or tuple-output evaluation paths based on their sinfo_args type, as illustrated after this list.
  • New Test Case: A dedicated test test_fold_tuple_output was added to validate the constant folding of a split-like prim_func with two output tensors.
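
As a rough illustration of that routing (a sketch only; signatures are simplified and helper names besides VisitCallTIR follow the description above):

    // Sketch of the dispatch on sinfo_args[0]; signatures simplified.
    Expr VisitCallTIR(const Call& call) {
      // call_tir records its output layout in sinfo_args[0].
      StructInfo out_sinfo = call->sinfo_args[0];
      if (const auto* tuple_sinfo = out_sinfo.as<TupleStructInfoNode>()) {
        // Multiple output tensors: new tuple path (argument extraction elided).
        if (auto folded = ConstEvaluateCallTIRTuple(call, tuple_sinfo)) {
          return folded.value();
        }
      } else if (out_sinfo.as<TensorStructInfoNode>()) {
        // Single output tensor: keep the existing path.
        if (auto folded = ConstEvaluateCallTIR(call)) {
          return folded.value();
        }
      }
      return call;  // leave the call unfolded when it cannot be evaluated
    }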


Changelog
  • src/relax/transform/fold_constant.cc
    • Removed a restrictive comment in MatchConstShape regarding single output support.
    • Clarified the purpose of ConstEvaluateCallTIR to handle single tensor outputs.
    • Implemented ConstEvaluateCallTIRTuple to manage constant evaluation for call_tir with tuple outputs, including shape matching, tensor allocation, and argument packing.
    • Refactored VisitCallTIR to conditionally call ConstEvaluateCallTIRTuple for tuple outputs or ConstEvaluateCallTIR for single tensor outputs.
    • Removed a TODO comment related to tuple output support.
  • tests/python/relax/test_transform_fold_constant.py
    • Added test_fold_tuple_output, which defines a prim_func that splits a tensor into two and verifies that FoldConstant correctly folds the call_tir into a Tuple of Constant expressions.
Activity
  • The author, guan404ming, has submitted this pull request to introduce new functionality and a corresponding test. No further review comments or activity are available in the provided context.


@gemini-code-assist (bot) left a comment


Code Review

This pull request extends constant folding to support call_tir with tuple outputs, which was previously a limitation. The changes involve adding a new function ConstEvaluateCallTIRTuple to handle multiple tensor outputs and updating VisitCallTIR to route to the correct evaluation path. A corresponding test case is added to verify the new functionality.

My review focuses on the C++ implementation. I've suggested a couple of improvements for code clarity and correctness: one to use Downcast for safer type casting and another to refactor argument packing logic to be more idiomatic. Overall, the changes are well-implemented and the new test is thorough.

Comment on lines 193 to 194
auto tensor_sinfo = tuple_sinfo->fields[i].as<TensorStructInfoNode>();
if (!tensor_sinfo || tensor_sinfo->IsUnknownDtype()) return std::nullopt;


Severity: medium

Since MatchConstShape is called on line 191 and it succeeds, we know that tuple_sinfo->fields[i] is a TensorStructInfo. Therefore, you can use Downcast<TensorStructInfo> here instead of as<TensorStructInfoNode>() to make the assumption explicit and avoid a redundant null check. Downcast will perform a checked cast.

Suggested change:
-   auto tensor_sinfo = tuple_sinfo->fields[i].as<TensorStructInfoNode>();
-   if (!tensor_sinfo || tensor_sinfo->IsUnknownDtype()) return std::nullopt;
+   auto tensor_sinfo = Downcast<TensorStructInfo>(tuple_sinfo->fields[i]);
+   if (tensor_sinfo->IsUnknownDtype()) return std::nullopt;

Comment on lines 199 to 208
    std::vector<AnyView> packed_args(arr_args.size() + num_outputs);
    std::vector<runtime::Tensor> temp_args(arr_args.begin(), arr_args.end());

    size_t arg_offset = 0;
    for (; arg_offset < arr_args.size(); ++arg_offset) {
      packed_args[arg_offset] = temp_args[arg_offset];
    }
    for (size_t i = 0; i < num_outputs; ++i) {
      packed_args[arg_offset++] = ret_tensors[i];
    }


Severity: medium

The argument packing logic can be simplified for better readability and to be more idiomatic C++. Instead of using a C-style loop with an index, you can use range-based for loops to populate packed_args.

    std::vector<runtime::Tensor> temp_args(arr_args.begin(), arr_args.end());
    std::vector<AnyView> packed_args;
    packed_args.reserve(temp_args.size() + num_outputs);
    for (const auto& arg : temp_args) {
      packed_args.push_back(arg);
    }
    for (const auto& out_tensor : ret_tensors) {
      packed_args.push_back(out_tensor);
    }

@guan404ming force-pushed the fold-constant-tuple-outputs branch from 5c9384d to ebb28f3 (February 9, 2026, 03:55)
Signed-off-by: Guan-Ming Chiu <guanmingchiu@gmail.com>
@guan404ming force-pushed the fold-constant-tuple-outputs branch from ebb28f3 to d4f106e (February 9, 2026, 04:13)