[o11y] Additional hardening against missing outcome event, STW optimizations #5527
base: main
d8a2a9c to
3d6898e
Compare
| auto& tailStreamWriter = KJ_UNWRAP_OR_RETURN(maybeTailStreamWriter); | ||
| // This is where we'll actually encode the span. This function should never be invoked if STW is | ||
| // inactive as span tracing is only used in STW. | ||
| auto& tailStreamWriter = KJ_ASSERT_NONNULL(maybeTailStreamWriter); |
This is intended as additional hardening: using KJ_UNWRAP_OR_RETURN could cause us to miss cases where span tracing is set up even though it is not being used. I'm highly confident that we're already avoiding this, though.
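To illustrate the trade-off discussed in this thread, here is a minimal sketch in plain C++ rather than KJ. `FakeWriter`, `reportSpanLenient`, and `reportSpanStrict` are hypothetical names, and the strict failure is modeled as an exception, which is only an assumption about how an assertion failure would surface. The lenient variant mirrors the unwrap-or-return behavior, the strict variant mirrors the assert-non-null behavior.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a tail stream writer; it just records spans.
struct FakeWriter {
  std::vector<std::string> spans;
};

// Lenient variant: a missing writer is silently ignored, so a caller that
// set up span tracing without an active writer goes unnoticed.
void reportSpanLenient(FakeWriter* maybeWriter, const std::string& span) {
  if (maybeWriter == nullptr) return;
  maybeWriter->spans.push_back(span);
}

// Strict variant: a missing writer is treated as a programming error and
// reported loudly instead of being skipped.
void reportSpanStrict(FakeWriter* maybeWriter, const std::string& span) {
  if (maybeWriter == nullptr) {
    throw std::logic_error("span reported without an active writer");
  }
  maybeWriter->spans.push_back(span);
}
```

With the lenient variant a misconfigured caller produces no signal at all; with the strict variant the same caller fails at the point of misuse.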
| KJ_IF_SOME(writer, maybeTailStreamWriter) { | ||
| auto& spanContext = KJ_UNWRAP_OR_RETURN(topLevelInvocationSpanContext); | ||
| auto& spanContext = KJ_ASSERT_NONNULL(topLevelInvocationSpanContext); |
This serves as additional hardening too. I'm very confident that we always provide the Onset event when doing tracing.
Downstream PR is still WIP, but this is ready for review.
3d6898e to
5dc1403
Compare
The generated output of |
5dc1403 to
b329282
Compare
mar-cf
left a comment
There are changes here that shouldn't be bundled into the same commit.
- hasBufferedTailWorkers flag (currently buffered). Used for an optimization to skip buffering when no BTWs exist, and to confirm efficiently creating tracers with valid consumers.
- Onset detection + stricter assertions. How do the stricter assertions manifest? If we detect a tracing misconfiguration, we can report that but still function in a degraded mode.
It's not from this change, but it's not clear that topLevelInvocationSpanContext is being used for two purposes, one of which is as an implicit flag for whether the onset was reported.
src/workerd/io/tracer.c++
Outdated
| // WorkerInterfaces and then ends up not using it due to an error/incorrect parameters; such | ||
| // error checking should be done beforehand to avoid unused allocations). Report such cases. | ||
| LOG_ERROR_PERIODICALLY( | ||
| "destructed WorkerTracer with STW without reporting Onset event", kj::getStackTrace()); |
this is now "destructed WorkerTracer without reporting Onset event"
good catch, will update this
src/workerd/io/tracer.h
Outdated
| kj::Maybe<kj::Own<tracing::TailStreamWriter>> maybeTailStreamWriter); | ||
| explicit WorkerTracer(PipelineLogLevel pipelineLogLevel, ExecutionModel executionModel); | ||
| kj::Maybe<kj::Own<tracing::TailStreamWriter>> maybeTailStreamWriter, | ||
| bool buffered); |
Nit: not a fan of bool arguments. Can this be a strong-bool or an enum?
Please do this.
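For reference, one way the requested change could look, sketched with a plain scoped enum. `BufferedTailWorkers` and `TracerSketch` are hypothetical names for illustration only, not the types used in this PR, and the real constructor takes more parameters than shown.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical two-valued enum replacing a bare `bool buffered` parameter.
enum class BufferedTailWorkers { No, Yes };

// A bare bool can no longer be passed by accident at the call site.
static_assert(!std::is_convertible_v<bool, BufferedTailWorkers>);

// Simplified stand-in for the tracer class; only the relevant parameter is kept.
class TracerSketch {
 public:
  explicit TracerSketch(BufferedTailWorkers buffered): buffered(buffered) {}

  bool hasBufferedTailWorkers() const {
    return buffered == BufferedTailWorkers::Yes;
  }

 private:
  BufferedTailWorkers buffered;
};
```

A call site then reads `TracerSketch tracer(BufferedTailWorkers::Yes);`, which documents the argument's meaning without requiring a look at the declaration.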
Here's the intent of this PR and why I think it contains the right amount of changes:
As for changes that are not needed here, I think we can get rid of hasBuffered() and the buffered class variable: at this stage, it is only used in the constructor. I'll update the PR accordingly.
b329282 to
baef58c
Compare
CodSpeed Performance Report
Merging #5527 will not alter performance
My point was this PR looks to be two mostly independent changes in one. See the comment for how I would split this up.
It should be possible to do this without adding it, using a query that checks parents. I'm not strongly opposed to adding it, but could you consider that alternative (if you haven't already) and explain why this approach is better?
2d6e151 to
bda6a55
Compare
…zations
- Introduce a new WorkerTracer parameter indicating whether any BTWs are present. In a follow-up, this will be used to optimize memory management, but for now it helps us assert that if we have a tracer with logLevel none, we have BTWs (otherwise the tracer would be redundant, indicating waste).
- Log an error if a WorkerTracer is destructed without getting the Outcome event even when logLevel == none
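The two checks described in the commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `TracerSketch`, `LogLevel`, `HasBufferedTailWorkers`, and the in-memory `errorLog` are hypothetical, and the redundancy check is modeled as an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, reduced versions of the concepts named in the commit message.
enum class LogLevel { None, Full };
enum class HasBufferedTailWorkers { No, Yes };

class TracerSketch {
 public:
  TracerSketch(LogLevel level, HasBufferedTailWorkers btws,
               std::vector<std::string>& errorLog)
      : errorLog(errorLog) {
    // A tracer with log level none and no buffered tail workers has no
    // consumer, so constructing one indicates wasted work.
    if (level == LogLevel::None && btws == HasBufferedTailWorkers::No) {
      throw std::logic_error("redundant tracer: logLevel none and no BTWs");
    }
  }

  void reportOutcome() { outcomeReported = true; }

  ~TracerSketch() {
    // Runs regardless of log level: a tracer that never saw its Outcome
    // event is reported instead of disappearing silently.
    if (!outcomeReported) {
      errorLog.push_back("destructed tracer without Outcome event");
    }
  }

 private:
  std::vector<std::string>& errorLog;
  bool outcomeReported = false;
};
```

Recording into a caller-supplied log keeps the sketch testable; the actual change reports through the project's periodic error logging instead.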
bda6a55 to
60ea4d1
Compare