Make it possible to lazily deserialize DefNode in Loader.java#3983
Make it possible to lazily deserialize DefNode in Loader.java#3983
Conversation
* TRUFFLERUBY_METRICS_REPS=5 jt metrics time --experimental-options -e0 For parsing-core: before: 0.097 0.099 0.092 0.096 after: 0.061 0.063 0.066 0.059 * Remove extra trailing spaces by using `<%-#`.
|
Not really, unfortunately. PM_NODE_FLAG_NEWLINE maps to whether or not a node is a candidate to emit a LINE event in tracepoint. Whether or not it actually does maps to whichever node is reached first on a given line inside the compiler. To recreate it, you'd have to recreate the exact order to CRuby compiler walks the AST, which would be pretty fragile as it can change depending on optimizations. Honestly, this is probably something that should be codified, because it's super weird to me to have it depend on the compiler. But that's the state it's in. I'm happy to serialize PM_NODE_FLAG_NEWLINE (we already serialize flags, I think it would just be another bit). But that would only help with candidate selection, not actually marking the nodes. That said, I'm fine with the changes in this PR. |
|
@eregon @kddnewton I advocated in having serialization provide this because the extra walk on a cold runtime seemed like a downside over size. Whether newlines is a flag (like legacy parser) or in a table I don't really have any opinion. |
|
Thank you for the summary. I wonder how hard it would be to do all the newline marking directly in the compilers.
I guess there is one way to know for sure, by trying to do it in TruffleRuby or JRuby.
From a quick look it's already serialized, but currently omitted for nodes which don't have other flags to save on serialized size. It would be trivial to change but since it's only candidates I think it wouldn't be a big help (would still need an extra traversal or handling while compiling to compute the real line events). Would probably make it easier for compilers though, since that's how it works on CRuby. |
|
@enebo Since you have had and probably still have the logic to mark real newline events from candidate newline flags in JRuby, do you think that's something easy enough to just replicate in every compiler? Would you have a link maybe? (I can also look myself another time, right now it's too late). |
|
I should say I like this overall. Using inheritance to opt into lazy also feels like a nice way to do this. |
Honestly, my memory is failing me past remembering the discussion. From what I do remember the actual marking was just stmt in the original grammar with some weirdness with |
|
I should add I have not looked at any line adjustment in compile.c so there perhaps is more weirdness but hopefully that is not changing the original tree's location of newline on nodes. |
|
I do know there's some weirdness where a call node is considered as being on the same line as its message, not its receiver. |
|
@kddnewton Do you think it makes sense to try and normalize this into prism since it would eliminate line numbering bugs across implementations? I am not 100% sure on wrinkles but it seems like it would be helpful (saying this as someone who has to fix a line numbering bug in JRuby for the next point release using the legacy parser). |
|
I mean I support that work, but I doubt it would fly with the other folks in CRuby. Also, it is a bit repetitive, because it means that for every tree we would need to walk it once to set the line nodes and then once more to compile it. Right now it just happens once. |
<%-#.There is still one problem though, which is that
Node#newLineFlagis not set insideDefNodewhen using lazy DefNodes.MarkNewlinesVisitoras a Loader field and use it increateDefNodeFromSavedPosition(), but MarkNewlinesVisitor holds onto the Source object, which holds on the source byte[]. There seems to be no usage of Source#bytes though (except in findLine() which is replaceable), so then it'd just keep theint[] lineOffsetswhich seems fine. The specific nodes being flagged might differ with multipledefon the same line, not sure if that matters.MarkNewlinesVisitorand let compilers deal with this entirely on their own.PM_NODE_FLAG_NEWLINEbut that's not equivalent toNode#newLineFlagit's just a potential newline and it would increase serialized size.cc @kddnewton @enebo @headius