Conversation
|
| <span id="e2ee-${identity}" class="e2ee-on"></span> | ||
| </div> | ||
| </div> | ||
| <div id="user-ts-${identity}" class="user-ts-overlay"> |
Check warning
Code scanning / CodeQL
Unsafe HTML constructed from library input Medium
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 6 days ago
At a high level, the problem is that renderParticipant constructs a chunk of HTML using innerHTML and directly interpolates participant.identity into the markup. To avoid DOM XSS, any untrusted data must not be concatenated into HTML strings that are assigned to innerHTML; instead, we should construct elements using document.createElement, set attributes via setAttribute/properties, and assign untrusted values only to non-HTML contexts (like textContent or value). This avoids the need to manually escape HTML.
The best fix without changing existing behavior is to refactor the HTML construction in renderParticipant so that it no longer uses a template literal with innerHTML. We can keep the same structure and IDs, but create each element programmatically. For the conditional volume control vs progress bar, we can still use DOM APIs to append the right controls. No changes are needed in LocalParticipant, RemoteParticipant, Participant, or Room because they only emit events / carry data; CodeQL’s taint path ends at the unsafe sink in demo.ts, so securing that sink addresses all variants.
Concretely:
-
In
examples/demo/demo.ts, replace the block:div.className = 'participant'; div.innerHTML = ` <video id="video-${identity}"></video> ... `; container.appendChild(div);
with explicit DOM construction:
div.className = 'participant'; const videoEl = document.createElement('video'); videoEl.id = `video-${identity}`; div.appendChild(videoEl); // etc. for audio, info bar, name, codec/size/bitrate spans, right side icons, // user-ts overlay div, and the conditional volume or progress control.
-
Preserve all existing IDs and class names so that later code (
querySelector,updateVideoSize, CSS) continues to work unchanged. -
Ensure that
identityis only ever used to form attribute values viaid = ...orsetAttribute, and never viainnerHTMLinterpolation.
No new imports or helper functions are strictly necessary; we can use standard DOM APIs already available in the browser environment.
…-sdk-js into dc/feature/user_timestamp
size-limit report 📦
|
1egoman
left a comment
There was a problem hiding this comment.
It might be possible I'm reviewing this prematurely - Shijing had asked I take a look at this. That being said, IMO there's still quite a bit of work to be done here and I'm not sure I'm fully following how this all works right now.
| // Always attempt to strip LKTS packet trailer before any e2ee | ||
| // processing. On the send side, the trailer is appended *after* encryption, | ||
| // so it must be removed *before* decryption. | ||
| if (isVideoFrame(encodedFrame) && encodedFrame.data.byteLength > 0) { | ||
| try { | ||
| const packetTrailerResult = stripPacketTrailerFromEncodedFrame( | ||
| encodedFrame as RTCEncodedVideoFrame, | ||
| ); | ||
| if (packetTrailerResult !== undefined && this.trackId && this.participantIdentity) { | ||
| const msg: PacketTrailerMessage = { | ||
| kind: 'packetTrailer', | ||
| data: { | ||
| trackId: this.trackId, | ||
| participantIdentity: this.participantIdentity, | ||
| timestampUs: packetTrailerResult.timestampUs, | ||
| frameId: packetTrailerResult.frameId, | ||
| rtpTimestamp: packetTrailerResult.rtpTimestamp, | ||
| }, | ||
| }; | ||
| postMessage(msg); | ||
| } | ||
| } catch { | ||
| // Best-effort: never break media pipeline if timestamp parsing fails. | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
question: Is the FrameCryptor (which I believe runs as part of the e2ee worker, which may not be loaded if e2ee is not enabled) the best place for this logic to live?
Is there a reason it has to run in a web worker context (ie, maybe it is super computationally expensive / etc)? Is this feature supposed to work if e2ee is disabled?
| private handleUserTimestamp( | ||
| trackId: string, | ||
| participantIdentity: string, | ||
| timestampUs: number, | ||
| frameId?: number, | ||
| rtpTimestamp?: number, | ||
| ) { | ||
| if (!this.room) { | ||
| return; |
There was a problem hiding this comment.
question: Similar to my last comment - I don't know a ton about the user timestamp feature, but my suspicion is that putting handleUserTimestamp inside the E2eeManager is mixing concerns unnecesarily.
| /** | ||
| * Lightweight worker for stripping LKTS packet trailers from inbound | ||
| * encoded video frames via RTCRtpScriptTransform. | ||
| * | ||
| * When a valid trailer is found, the metadata is posted back to the main | ||
| * thread so the SDK can store the RTP-to-frame-metadata mapping on the | ||
| * corresponding RemoteVideoTrack. | ||
| */ | ||
| import { |
There was a problem hiding this comment.
question: I'm a bit confused now - so this change is introducing a second webworker exclusively for packet trailer stripping, but somehow this worker ties into the existing E2eeManager / e2ee worker? Assuming this isn't in an incomplete state mid migration from one approach to another, I might need some more background context to better understand this.
As a heads up - introducing another webworker is a fairly significant change. I think it might be viable because this packet trailer stuff is fully additive, but it's something we've used sparingly, so I just want to make sure there's a good reason to use it here.
More related background context I wrote in another context where somebody was proposing introducing a new webworker, maybe you would find it useful:
Starting a webworker requires that you pass a url to another script that you want to run to the
new Worker("https://path.to/worker/script.js")(docs) constructor when creating the new worker, but because livekit is a library we don't have control over the end user's build process / have no way to force them to generate another file in their end application, let alone get a predictable path to where it might be hosted.The one place that is using a webworker right now in the web sdk is end to end encryption, and the way it gets around this problem (and the way we'd have to do this for compression) is by bubbling up the web worker construction to the end user, forcing them to build / host a separate "worker script" containing the e2ee worker code, call
new Worker("...")+ pass the url to this "worker script", and then pass the fully constructed Worker object into the main Livekit SDK. For example, this is how this is done in the client-sdk-js demo app. This is super cumbersome / not an easy thing to instruct a user to do.
| * | ||
| * @internal | ||
| */ | ||
| setUserTimestamp(timestampUs: number, rtpTimestamp?: number, frameId?: number) { |
There was a problem hiding this comment.
nitpick: As a heads up, in data tracks, the userTimestamp field is a bigint. It might make sense to use that here too.
| /** | ||
| * Look up the frame metadata associated with a given RTP timestamp. | ||
| * The entry is consumed (removed) after lookup, matching the Rust SDK | ||
| * behaviour. |
There was a problem hiding this comment.
nitpick: I suspect this was probably LLM generated with a prompt like "port the rust implementation over to the web sdk" - it would be good to strip out some of these egregious comparisons to the rust sdk before this would end up getting merged.
question: Also in this context, is actually the "consuming the entry like rust" type behavior a good idea? In rust there are benefits to avoid extra cloning, but it has the serious tradeoff of making requests non idempotent. Given the web doesn't have these same data ownership concerns it is a fairly serious rough edge for from what I can tell no real benefit, assuming there is some other way the entries in the userTimestampMap are cleared so it doesn't grow in an unbounded fashion.
| } | ||
|
|
||
| function createPacketTrailerWorker(): Worker { | ||
| return new Worker(new URL('./livekit-client.packet-trailer.worker.js', import.meta.url)); |
There was a problem hiding this comment.
issue: See my comment here - this type of thing isn't viable and won't work for all downstream consumers of the package transparently.
No description provided.