Skip to content

Conversation

@sunshowers
Copy link
Contributor

@sunshowers sunshowers commented Jul 15, 2020

Depends on #16.

Add an option to InlinerBuilder, defaulting to false, which causes
syn-inline-mod to annotate inlined modules with the paths they come from.

This is useful when mapping spans back to the files they come from.

(There's a fair bit of added complexity here. I'd totally understand if you
don't want to accept this, but this is essential to the work I'm doing so I'd
unfortunately have to fork the crate in that case.)

Closes #15.

@TedDriggs
Copy link
Owner

Getting caught up on all my GH notifications, but wanted to acknowledge seeing the PR. I'll get a review up tomorrow morning.

@sunshowers
Copy link
Contributor Author

thank you! :)

Copy link
Owner

@TedDriggs TedDriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good; I'm now really curious to know what project you're working on that needs it.

There is one more change that I think is warranted but I couldn't leave a comment on because it's outside the edited ranges: You deserve to be listed as an author of the crate :) If you'd like to be included just put yourself in on the next commit.

let mut path = None;
let position = attrs.iter().position(|attr| {
if attr.path.is_ident(SYN_INLINE_MOD_PATH) {
// Ignore the attribute if it isn't of the correct form.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your worry here that some other proc-macro might happen to squat on our attribute name? Finding an attribute of the right name and wrong format feels to me like something must have gone horribly wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah or the user puts the annotation in themselves or something. The options are to panic or ignore -- what do you think about them? I guess panicking is probably better/safer since it'd be totally unexpected to hit this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I realized this may end up dealing with external data so we'd probably rather not panic. I think this should be a pretty rare case though, as you pointed out something's really gone wrong (but we don't want to create an opportunity for an accidental DoS).

Copy link
Owner

@TedDriggs TedDriggs Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DoS seems bad, but if you're dealing with untrusted input couldn't someone put that attribute into the original source code to try and make you follow a path into a directory you shouldn't?

I'm wondering if embedding these in-band is fundamentally unsafe.

I think this can be done safely, but we should explicitly disallow inlining if the input contains the magic attribute in the inner or outer attributes. That way we avoid attacks that attempt to feed us incorrect paths; those sorts of attack could be used to try and read arbitrary files from wherever this code was being run.

If we need to support the same code being inlined multiple times, then I see two options:

  1. We only fail inlining if the attribute is present and we've been asked to add path annotations. One upside of this approach is that we don't do work scanning attributes unless we're anticipating using the inserted annotations, while a downside of this approach is that someone could still sneak a bad path through because there's no connection between find_mod_path and InliningResult
  2. Rather than being a constant, the attribute path is a property of InlininerBuilder and later of InliningResult. That seems pretty ugly, however, especially since that would mean find_mod_path should take (r: &InliningResult, attributes: &[Attribute]) to avoid callers from hardcoding what they thought the attribute name was in their own code.

@sunshowers
Copy link
Contributor Author

sunshowers commented Jul 18, 2020

I'm now really curious to know what project you're working on that needs it.

A bit of a long-winded explanation, sorry!

  • I'm writing tooling to make it really easy to write coverage-guided fuzz targets, integrated with property-based tests.
  • To make it work I need to have a way to collect all functions annotated with the #[propfuzz] macro in a crate.
  • My initial idea was to use the inventory crate, but there are issues with it that make it currently unsuitable.
  • @dtolnay and I discussed this and we came up with the alternate plan to scan all the lib.rs files, searching through them for names of fuzz targets.
  • We already have tooling that can give us all the lib.rs files in a workspace, so the rest we need is a way to traverse those files.
  • syn-inline-mod is perfect for this.
  • However we might encounter errors along the way (whether standard Rust issues or domain-specific ones), and we need to report them in a style that users understand.
  • For that I'm writing a small crate that can take a syn::File, scan through it, collect errors along the way, then report them in a nice manner with the little squigglies underneath and all just like rustc (using codespan-reporting).
  • This means that given a particular item, we need to figure out the filename its span comes from.
  • That's what led me to this.

Add an option to `InlinerBuilder`, defaulting to `false`, which causes
`syn-inline-mod` to annotate inlined modules with the paths they come from.

This is useful when mapping spans back to the files they come from.
Copy link
Owner

@TedDriggs TedDriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making those changes. This is looking even better, but your comment about untrusted input made me look at this through more paranoid eyes.

let mut path = None;
let position = attrs.iter().position(|attr| {
if attr.path.is_ident(SYN_INLINE_MOD_PATH) {
// Ignore the attribute if it isn't of the correct form.
Copy link
Owner

@TedDriggs TedDriggs Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DoS seems bad, but if you're dealing with untrusted input couldn't someone put that attribute into the original source code to try and make you follow a path into a directory you shouldn't?

I'm wondering if embedding these in-band is fundamentally unsafe.

I think this can be done safely, but we should explicitly disallow inlining if the input contains the magic attribute in the inner or outer attributes. That way we avoid attacks that attempt to feed us incorrect paths; those sorts of attack could be used to try and read arbitrary files from wherever this code was being run.

If we need to support the same code being inlined multiple times, then I see two options:

  1. We only fail inlining if the attribute is present and we've been asked to add path annotations. One upside of this approach is that we don't do work scanning attributes unless we're anticipating using the inserted annotations, while a downside of this approach is that someone could still sneak a bad path through because there's no connection between find_mod_path and InliningResult
  2. Rather than being a constant, the attribute path is a property of InlininerBuilder and later of InliningResult. That seems pretty ugly, however, especially since that would mean find_mod_path should take (r: &InliningResult, attributes: &[Attribute]) to avoid callers from hardcoding what they thought the attribute name was in their own code.


match visitor.visit() {
Ok(syn::File { attrs, items, .. }) => {
if self.annotate_paths {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the comment above, I wonder if we should check here that no attribute in the input is using our path, returning an error if it is?

///
/// If this is `true`, `find_mod_path` may be used to figure out the path a module is defined
/// in.
pub fn paths_annotated(&self) -> bool {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you feel about this being called has_path_annotations? In my opinion that makes it clearer this is describing the current state of something.

@TedDriggs
Copy link
Owner

Thanks for the great explanation of what you're up to - that's helpful for me when thinking about the long-term evolution of this crate.


/// A module path in a list of attributes, found by `find_mod_path`.
#[derive(Clone, Debug)]
pub struct InlineModPath<'ast> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be called InlineModAttributes and have path: Option<PathBuf> as a field? Then the find_mod_path function would be changed to split_mod_attributes(attrs: &[Attribute]) -> InlineModAttributes {}.

@sunshowers
Copy link
Contributor Author

Just wanted to say that this is in my queue of things to do, but I have personal life stuff going on at the moment which make it unlikely I can revisit this before mid-August. Thanks!

@TedDriggs
Copy link
Owner

@sunshowers sorry to hear that. If I have time I'll see if I can move it forward - would you like me to block merging on your review of any subsequent changes I make?

@sunshowers
Copy link
Contributor Author

Thanks -- it would be lovely if you could give me a day or so to respond, but please don't block on me if I don't get to it by then!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Preserve source filenames during inlining

2 participants