Skip to content

feat(stageleft): fix q! span attribution via hidden macro export#74

Open
shadaj wants to merge 1 commit into
mainfrom
sandbox-5d382e2d-58bd-43a1-84d8-d42e347b80b5
Open

feat(stageleft): fix q! span attribution via hidden macro export#74
shadaj wants to merge 1 commit into
mainfrom
sandbox-5d382e2d-58bd-43a1-84d8-d42e347b80b5

Conversation

@shadaj
Copy link
Copy Markdown
Member

@shadaj shadaj commented May 20, 2026

Summary

When code inside a q!() panics at runtime, the backtrace now correctly points to the q!() definition site rather than the splice site.

Background: How stageleft works today (before this PR)

When you write:

#[stageleft::entry]
fn my_entry(_ctx: BorrowBounds<'_>) -> impl Quoted<'_, i32> {
    q!(crate::compute(x + 1))
}

The q!() macro captures the expression as a string and stores it in a closure. At splice time (when the entry macro is invoked), to_tokens() parses that string, rewrites relative paths (crate::final_crate::__staged::), and emits the expression as a TokenStream. The generated code looks like:

{
    use final_crate::__staged::*;
    let x__free = captured_value;
    final_crate::__staged::compute(x__free + 1)
}

The problem: all tokens in this output have Span::call_site() spans, so panics, backtraces, and compiler diagnostics point to the splice site — not the original q!() source location.

How it works after this PR

q!() now emits a hidden #[macro_export] macro_rules! alongside the closure. The macro body contains the original expression tokens with their source spans preserved. At splice time, instead of emitting the expression directly, we emit an invocation of this macro. Because macro_rules! preserves definition-site spans in its expansion, the compiled code's debug info points back to the q!() source — so backtraces show the correct file and line.

Full expansion of q!()

Given:

// src/lib.rs, line 5
q!(crate::compute(x + 1))

In non-Rust-Analyzer mode, q!() expands to:

{
    // Hidden macro with preserved source spans in the body
    #[allow(unexpected_cfgs, non_local_definitions, clippy::crate_in_macro_def)]
    #[cfg_attr(not(stageleft_macro), macro_export)]
    #[doc(hidden)]
    macro_rules! __stageleft_quote_src_lib_rs_5_7 {
        ([__sl_p0 = $($__sl_p0:ident)::*, x__free = $x__free:expr,]
         [__sl_p0 :: compute(x__free + 1)]) => {
            {
                #[allow(non_snake_case)]
                {
                    let x__free = $x__free;
                    $($__sl_p0)::*::compute(x__free + 1)
                }
            }
        };
    }

    move |__stageleft_ctx: &_, __output: &mut QuotedOutput, __props: &mut _| {
        // fn() -> T pointers for type inference
        let x__free = FreeVariableWithContext::uninitialized(&x, __stageleft_ctx);

        if true {
            // Capture free variables and store metadata
            __output.captures.push(Capture { ident: "x__free", tokens: ... });
            __output.module_path = module_path!();
            __output.crate_name = option_env!("STAGELEFT_FINAL_CRATE_NAME").unwrap_or(env!("CARGO_PKG_NAME"));
            __output.pkg_name = env!("CARGO_PKG_NAME");
            __output.tokens = "__sl_p0 :: compute (x__free + 1)";
            __output.macro_name = Some("__stageleft_quote_src_lib_rs_5_7");
            __output.relative_paths = &["crate"];
            *__props = Some(...);
            panic!("stageleft: q!() closure completed");
        }

        // Unreachable: type inference block.
        // Invokes the just-defined macro with original path prefixes + uninit values.
        #[allow(unreachable_code, unused_qualifications)]
        {
            let x__free = x__free();
            __stageleft_quote_src_lib_rs_5_7!(
                [__sl_p0 = crate, x__free = x__free,]
                [__sl_p0 :: compute(x__free + 1)]
            )
        }
    }
}

In Rust Analyzer mode, the expansion is simpler (no macro generation, no path rewriting):

{
    move |__stageleft_ctx: &_, __output: &mut QuotedOutput, __props: &mut _| {
        let x__free = FreeVariableWithContext::uninitialized(&x, __stageleft_ctx);

        if true {
            __output.captures.push(...);
            // tokens field is empty in RA mode
            __output.tokens = "";
            __output.macro_name = None;
            ...
            panic!("stageleft: q!() closure completed");
        }

        // Unreachable: uses original paths directly (they resolve locally)
        #[allow(unreachable_code, unused_qualifications)]
        {
            let x__free = x__free();
            crate::compute(x__free + 1)  // original expression, unmodified
        }
    }
}

At splice time

When the entry macro is invoked (e.g., my_entry!()), to_tokens() runs the closure via catch_unwind, extracts the metadata, and emits:

{
    use final_crate::__staged::__deps::*;
    use final_crate::__staged::*;
    use crate::*;  // brings #[macro_export] macros into scope
    __stageleft_quote_src_lib_rs_5_7!(
        [__sl_p0 = final_crate::__staged, x__free = captured_value,]
        [__sl_p0 :: compute(x__free + 1)]
    )
}

Fallback path

When the macro isn't accessible (e.g., q!() was in a #[cfg(test)] module or an example binary), the splice emits the expression directly with path placeholders resolved at runtime:

{
    use final_crate::__staged::__deps::*;
    use final_crate::__staged::*;
    {
        let x__free = captured_value;
        final_crate::__staged::compute(x__free + 1)
    }
}

This loses span attribution but is functionally correct.

Key design decisions

  • Macro naming: The macro name is derived from the source file path (relative to crate root) + line + column, ensuring uniqueness and portability across machines.

  • Path handling: Relative paths (crate::, self::, super::) in the q!() expression refer to items relative to where the q!() is written. But when the expression is spliced into a different crate (or a different module within the same crate), these paths would resolve incorrectly. Stageleft solves this by rewriting them to point through the __staged module, which re-exports all items as pub.

    In this PR, the rewriting is split into two parts:

    1. At q!() expansion time: Relative path prefixes are replaced with placeholder idents (__sl_p0, __sl_p1, etc.). For example, crate::module::Foo becomes __sl_p0::module::Foo. This happens in the same visitor pass that detects free variables.
    2. At splice time: The placeholders are resolved to concrete paths (e.g., __sl_p0final_crate::__staged) and passed as macro arguments using $($__sl_pN:ident)::* repetition, which allows macro_rules! to concatenate the resolved prefix with the remaining path segments.

    This two-phase approach is necessary because the hidden macro is defined at q!() time (when the final crate name isn't known), but invoked at splice time (when it is). The $($ident)::* pattern is used instead of :path because :path fragments cannot be concatenated with ::suffix in macro_rules!.

  • Free variables: Passed as :expr parameters with name = value syntax. The macro body binds them with let name = $name;.

  • Literal body match: The second [...] argument contains the expression tokens (post-path-rewrite). This serves as a coherence check — if downstream tooling mutates the AST, the macro pattern won't match.

  • Fallback for inaccessible macros: When the q!() is in a #[cfg(test)] module or outside the crate (examples, trybuild), the macro may not be accessible. In these cases, the splice emits the expression directly with path resolution done at runtime.

  • Rust Analyzer mode: All macro generation is skipped for performance; paths are left as-is for local resolution.

  • Single-pass visitor: The FreeVariableVisitor now handles both free variable detection and path prefix rewriting in one traversal (when not in RA mode).

Other changes

  • stageleft_tool: gen_macro emits STAGELEFT_FINAL_CRATE_MANIFEST_DIR for portable path resolution. New gen_staged(bool) unifies deps/test-module/lib-pub generation with a single source parse.
  • stageleft_tool: Registers #[cfg(test)] module paths via a ctor, enabling the fallback detection at splice time.
  • CI: Removed CARGO_PROFILE_*_STRIP: debuginfo so backtrace tests can verify line numbers.
  • proc-macro2 bumped to 1.0.103 for Span::file() API.

@shadaj shadaj force-pushed the sandbox-5d382e2d-58bd-43a1-84d8-d42e347b80b5 branch 16 times, most recently from 5c4aa46 to c249897 Compare May 20, 2026 21:16
Comment thread stageleft/src/lib.rs
Comment on lines +655 to +659
/// Invokes a quoted closure, capturing its output via catch_unwind.
fn invoke_quoted<T, Ctx, Props>(
f: impl FnOnce(&Ctx, &mut QuotedOutput, &mut Option<Props>) -> T,
ctx: &Ctx,
) -> (QuotedOutput, Props) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any extra runtime cost in using catch_unwind?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, absolutely (especially with the global mutex). But a panic is the only way to get type inference to go through without executing the code being quoted. The earlier implementation (well before this PR, this diff is just a refactor) used MaybeUninit but we found out later that this was completely undefined behavior and we were getting lucky that it worked 🙃

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, thanks for the context!

@shadaj shadaj force-pushed the sandbox-5d382e2d-58bd-43a1-84d8-d42e347b80b5 branch 3 times, most recently from 6a349ca to ce0444d Compare May 20, 2026 23:10
.count()
};
let prefix_segments: Vec<_> = i.segments.iter().take(prefix_len).collect();
let key = quote::quote!(#(#prefix_segments)::*).to_string();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What on earth is this doing?

#[derive(Default)]
pub struct FreeVariableVisitor {
pub free_variables: BTreeSet<syn::Ident>,
pub relative_paths: BTreeMap<String, usize>,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The quoted crate, self and super identifiers are being mapped to what here? The order in which they appeared in the quoted code?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, and that index will then be used for the __sl_p{i} macro inputs that are used to feed the resolved paths at splice-time.

Comment thread stageleft_test/src/lib.rs Outdated
@shadaj shadaj force-pushed the sandbox-5d382e2d-58bd-43a1-84d8-d42e347b80b5 branch 3 times, most recently from 40d0d90 to 307aafc Compare May 20, 2026 23:36
@luckyworkama
Copy link
Copy Markdown
Contributor

It would be nice if you could update the PR description to document roughly how it works today as well as how it works after this change, I think that would be a nice addition.

@shadaj
Copy link
Copy Markdown
Member Author

shadaj commented May 20, 2026

It would be nice if you could update the PR description to document roughly how it works today as well as how it works after this change, I think that would be a nice addition.

Added a summary that does a decent job of capturing the PR. (disclaimer: AI generated but I did a couple rounds of polish to make sure it actually reflects my creative intent)

@shadaj shadaj marked this pull request as ready for review May 21, 2026 00:52
@MingweiSamuel
Copy link
Copy Markdown
Member

Are there no sort of tricks with Span::mixed_site() and resolved_at/located_at to make this work without the extra codegen?

@shadaj
Copy link
Copy Markdown
Member Author

shadaj commented May 28, 2026

Are there no sort of tricks with Span::mixed_site() and resolved_at/located_at to make this work without the extra codegen?

Nope, basically the issue is that we would want the generated (trybuild) code to effectively have a foreign span. This is a special case where the foreign span originates in Rust code, so we can pull this off on stable with a declarative macro.

@MingweiSamuel MingweiSamuel force-pushed the sandbox-5d382e2d-58bd-43a1-84d8-d42e347b80b5 branch from 307aafc to 28c9c04 Compare May 28, 2026 18:36
When code inside a q!() panics at runtime, the backtrace now correctly
points to the q!() definition site rather than the splice site.

How it works:
- q!() emits a hidden #[macro_export] macro_rules! whose body is the
  original expression with preserved source spans.
- Relative paths (crate::, self::, super::) become :path macro parameters,
  free variables become :expr parameters.
- At splice time, to_tokens() emits a macro invocation with resolved paths
  and captured values as arguments.
- Macro name is a SHA-256 hash of canonicalized file path + line + column +
  expression content for uniqueness across compilation contexts.

Also adds splice_untyped_ctx_original / splice_untyped_ctx_props_original
methods that bypass the macro when original AST output is preferred.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
PR: #74
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@MingweiSamuel MingweiSamuel force-pushed the sandbox-5d382e2d-58bd-43a1-84d8-d42e347b80b5 branch from 28c9c04 to 843dba9 Compare May 28, 2026 18:41
@MingweiSamuel MingweiSamuel requested a review from Copilot May 29, 2026 20:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 3 comments.

Comment thread stageleft/src/lib.rs
Comment on lines +620 to +638
impl VisitMut for MetavarRewriter<'_> {
fn visit_path_mut(&mut self, path: &mut syn::Path) {
if let Some(first) = path.segments.first()
&& let Some(idx_str) =
first.ident.to_string().strip_prefix("__sl_p")
&& let Ok(idx) = idx_str.parse::<usize>()
&& let Some(resolved) = self.path_args.get(idx)
{
let resolved_path: syn::Path =
syn::parse2(resolved.clone()).unwrap();
let remaining: Vec<_> =
path.segments.iter().skip(1).cloned().collect();
*path = resolved_path;
for seg in remaining {
path.segments.push(seg);
}
}
syn::visit_mut::visit_path_mut(self, path);
}
Comment thread stageleft_tool/src/lib.rs
Comment on lines +617 to +627
// Generate lib_pub.rs if requested
if gen_pub {
let flow_lib_pub = gen_staged_mod(lib_path, parse_quote!(crate), None, false);

fs::write(
Path::new(&out_dir).join("lib_pub.rs"),
prettyplease::unparse(&flow_lib_pub),
)
.unwrap();
println!("cargo::rerun-if-changed=src");
}
Comment on lines +18 to +25
std::panic::set_hook(Box::new(move |_| {
let bt = std::backtrace::Backtrace::force_capture();
*bt_clone.lock().unwrap() = Some(bt.to_string());
}));
let result = std::panic::catch_unwind(|| {
panicking_entry!();
});
let _ = std::panic::take_hook();
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants