Skip to content

Add support for registry change watchers#45

Open
daguej wants to merge 5 commits into
simonbuchan:masterfrom
daguej:watchers
Open

Add support for registry change watchers#45
daguej wants to merge 5 commits into
simonbuchan:masterfrom
daguej:watchers

Conversation

@daguej
Copy link
Copy Markdown

@daguej daguej commented May 1, 2026

This implements support for efficient notifications of changes to a registry subtree via RegNotifyChangeKeyValue.

const watcher = reg.watch(reg.HKCU, "Some\\Path");
watcher.on('change', () => console.log('changed'));
// later, to stop watching and clean up:
watcher.close();

@simonbuchan
Copy link
Copy Markdown
Owner

simonbuchan commented May 2, 2026

Hey it's been a while since I last looked at this repo properly! Lets see if I remember how to do this...

First off; I do like the idea of this feature, and it's well implemented for what it's doing, but it's currently a little too different to the other APIs so I want to be a bit more stringent to make sure it's not going to be confusing or restrictive.

Specifically, I think starting a background thread is a bit out of what my idea of this package should be doing, but it's not reasonable to handle ideally without direct threading support in N-API...; the bigger issue is that it just makes it pretty much impossible to directly expose the Windows API in a way that makes it directly 1 to 1 in a way you can just point at MSDN docs.

That said, I think you can get most of the way there by using the stock Windows thread pool functions, which at least shares the threads with any other users of those threadpool APIs, and means you don't have to juggle dispatching tasks etc.

I think this should just look something like:

// obviously, also need error handling + cleanup
struct Waiter {
   HKEY hkey;

   TP_WAIT wait;
   HEVENT event;

   Waiter(HKEY hkey, TP_WAIT_CALLBACK callback, void* data)
     : hkey{hkey}
   {
     wait = CreateThreadpoolWait(callback, data, nullptr);
     CreateEvent(&event, ...);
  }

  Next() {
    // THREAD_AGNOSTIC is apparently required for using threadpools
    RegNotifyChangeKeyValue(hkey, TRUE, ... | REG_NOTIFY_THREAD_AGNOSTIC, event, TRUE);
    SetThreadpoolWait(&wait, event, nullptr);
  }
}

But that can be a tweak later if anyone cares enough to make the change, without affecting the current API.

I'm not a big fan of the class + event emitter JS API, it doesn't really match the existing API style (and I just don't like EventEmitter - but that's a whole different thing).

Perhaps something like this, but I haven't really put much thought into it yet:

// naming types for clarity
type ChangeListener = () => void;
type WatchHandle = // opaque N-API "external" value, can maybe use a private Symbol to emulate in TS now?
export function createWatch(..., changeListener: Listener): WatchHandle {
  assert(isWindows);
  // ...
  return native.createWatch(...);
}

// Will also be collected by GC, but an early check
export function closeHandle(handle: WatchHandle): void;

I know there was some older talk about a "v2 API" for this with an actual class Key export that feels more "javascript-ey", where the current API would match better; but I doubt I'll ever get around to this now I don't have any use case for this package at work.

It seems a bit odd you don't expose the RegNotifyChangeKeyValue args subkeys and flags? The other APIs attempt to match the underlaying method exactly unless there's good reason they can't; perhaps there's a reason that doesn't make as much sense here though.

In any case, I think a reasonable later extension to this would be to have native pre-filtering of the change implemented on the C++ side, to reduce the performance impact of noisy keys, so at least ensuring you have a watch API that allows further options to be added would be good.

@daguej
Copy link
Copy Markdown
Author

daguej commented May 2, 2026

Thanks for the quick response.

Unfortunately, due to the nature of RegNotifyChangeKeyValue, there's just no way we can avoid some kind of threading problem in a watcher, which means we cannot expose the API 1:1 in js-land. Not that (imo) we should really want to anyway; the win32 API for this is ugly.

Fundamentally, some thread has to block on a win32 API call while waiting for changes (whether by calling RegNotifyChangeKeyValue(…, …, …, NULL, FALSE), which itself blocks until a change occurs, or by creating a kernel event object and calling WaitFor…, as done here). This naturally can't happen on the JS thread, since it would bring node to a halt, so a background thread of some kind it must be.

N-API does have threading support built in, but it's designed around a different use case: unit of work gets queued from the main (JS) thread, unit is dispatched to a libuv worker thread pool where your native code runs, and on completion the unit's result is sent back to the main thread to be dispatched back in to JS.

The major problem here (and why I did not use N-API's threading facilities) is that libuv only spins up 4 worker threads for the entire node process, which includes node's own APIs that use the thread pool internally, and this is only configurable via environment variable (UV_THREADPOOL_SIZE), making it a real pain for a library like this to adjust.

So a watcher that has to block a libuv/N-API thread indefinitely would wreak havoc. Just one watcher would degrade performance, and 4 watchers would completely starve an app's worker threads. This would break a whole bunch of stuff (most of fs, dns, crypto, and zlib to name a few).

So that's out.

I don't think a Windows thread pool is the right tool here either. It does not eliminate the need to juggle task dispatch (native worker threads MUST NOT call into JS directly), and actually adds additional runtime overhead.

As it stands, the notification flow is:

flowchart LR
  A[Registry change signals Event] --> B["`*WaitForMultipleObjects* returns, dispatches callback from background thread via *tsfn.BlockingCall()*`"] -. thread context switch .-> C["`*ThreadSafeFunction* runs native *callback* on JS thread`"] --> D[Dispatch event to JS]
Loading

Using the Windows thread pool adds additional overhead:

flowchart LR
  A[Registry change signals Event] --> B["`Windows thread pool's internal watcher thread dispatches to native callback`"] -. thread context switch .-> C["`Native callback runs on thread pool thread, dispatches to JS thread via *tsfn.BlockingCall()*`"] -. thread context switch .-> D["`*ThreadSafeFunction* runs native *callback* on JS thread`"]  --> E[Dispatch event to JS]
Loading

Now there's 3 threads involved. Internally, the Windows thread pool allocates a thread that waits for the event(s) to signal, which then dispatches to another worker thread from the pool to run our native code, which then must dispatch back to the JS main thread. I'm not a fan of this.

Ideally, we could create one thread that calls WaitForMultipleObjects for all watchers created by the user, so there's only one additional thread and one thread context change necessary when triggering, but that would mean juggling all the active HEVENTs in C++ (re-waiting when a new watcher is created) and then routing signaled events to the right place in JS. Doable, but will likely get complex and prone to errors. (Or missed notifications? Threading is fun!)

The upside of the current implementation is that it is relatively simple if a bit wasteful of memory.

Beyond the tech details, my app's use case (and I would bet similarly for most users of this library) only requires watching one or two specific keys, so all of the above threading optimization just isn't relevant, and to be frank, I'm not inclined to spend the time doing it.

As far as the JS API goes, I'm obviously a fan of the way I've done it. It's a much more idiomatic nodejs style, and directly mirrors the API from fs.watch(…), which returns an EventEmitter with a change event and a close() method.

The alternative you proposed would certainly make sense to win32 C developer, but I don't love trying to shoehorn that pattern into JS, especially where node's stdlib already has an API precedent for this exact kind of thing.

Finally, I must admit I didn't expose bWatchSubtree and dwNotifyFilter out of laziness. The subtree option is definitely useful. The filter flags probably too, although we shouldn't let the JS user control REG_NOTIFY_THREAD_AGNOSTIC, since that's an implementation detail of the native code handling the notifications.

I'll add those as options.

@simonbuchan
Copy link
Copy Markdown
Owner

Right, I was suggesting that the threadpool approach as a way to offload the responsibility of managing background threads; the tradeoff is more complexity for better efficiency as N goes up; and using the native APIs means that behavior isn't specific to this package. But yes, the impact doesn't matter much for the use-case of a single callback, hence:

But that can be a tweak later if anyone cares enough to make the change, without affecting the current API.


As an aside, while googling around the right code for this, I did see someone mentioning that thread pool waits don't use a waiter thread with WaitMultiple* any more:

Up until Windows 8 the Thread Pool API wait operation too used multiple threads. This is inefficient so it was refactored to use internal NtAssociateWaitCompletionPacket call.

But I'm not sure what the basis of that claim is. As that page says, running your own IOCP is the most efficient, but that's far more out of scope for this!


While I agree that a waiter class that uses EventEmitter for events is quite idiomatic for node (though in the case of EventEmitter much more so for older libraries), it's not idiomatic for this node package. As I said, this package logically should be a Key class that has a bunch of methods, but it's currently a flat list of functions that either attempt to as directly as possible mirror the native Windows API, e.g.:

renameKey

Wraps RegRenameKey

Changes the name of the specified registry key.

(Note the quote of the first sentence of the MSDN docs), or it is a wrapper around those methods to handle the data formats that you would otherwise need to deal with.

In particular, note that createKey and openKey both return an (opaque) HKEY type that you should call closeKey on. Change listeners being a class while everything else isn't is just weird.

@daguej
Copy link
Copy Markdown
Author

daguej commented May 2, 2026

From the horse's mouth: Old versions of Windows used one waiter thread per 63 events watched; starting in Windows 8, one thread handles watching all events via IOCPs.

…but it's still an extra thread that dispatches to a thread pool thread that runs your native callback, so still 3 threads involved.


Perhaps further inspiration should be taken from the fs module, which provides both the C-style fd = open(path) to get a fd, read(fd), close(fd) and the more idiomatic stream = createReadStream(), stream.on('data', …), stream.close().

The former fits in with this module's existing design, and the latter provides a cleaner API if you want it.

@simonbuchan
Copy link
Copy Markdown
Owner

simonbuchan commented May 2, 2026

Ah, annoying that it apparently can't just directly wait on the IOCP directly in the workers though perhaps the cause is this GetQueuedConpletionStatus note:

This function associates a thread with the specified completion port. A thread can be associated with at most one completion port.

I can't imagine wanting to wait on an IOCP in a threadpool callback being a likely existing pattern, though, so I'd expect a flag to pinky promise you don't need that...

I think it's a pretty good option still if we did want to optimize for multiple waiters without taking on complexity, the double wake-up is annoying but not fatal in comparison to the expected existing overhead of node interop and the frequency of events.

The theoretical person who cared enough to write the code can decide that later though.


Perhaps further inspiration should be taken from the fs module

Read streams are a different API to the underlying raw methods, though: they're push not pull; it integrates with the other stream APIs, probably a bunch of other internal stuff: so it's more like the current raw vs formatted APIs where it's actually adding functionality and not just sugar.

Perhaps closer would be fs/promises, which for some reason didn't just make all the methods promise returning rather than callbacks but moved any function taking an fd to be methods on FileHandle instead. But that's kind of adding two different kinds of sugar at once, which confuses things...

As I mentioned I'm not theoretically against providing a wrapping class API for all the existing functions, I simply would see it as a migration path off the current one for a v2 release that removes the current API, and I don't really see the point if I'm not going to be bothered to do that part. With that having a class only for a waiter would feel less weird, but I don't really want to block this on an unrelated change like that...

I would still prefer there to be a non-EventEmitter API at least just for personal taste reasons. I don't exactly know what I want there yet, though: there's no direct mapping to the underlying API this can expose. Let me think about it when it's not midnight here and I'll get back to you, though feel free to make suggestions.


One last thing I wanted to get around to is testing the error cases which are always gnarly for callback stuff; but I'm not sure how to reproduce those reliably for registry watches? Probably the watched key getting deleted would do it...

Feel free to tackle that, too, but I'll be poking at that myself to figure out where and when anyways so I'm not going to block on you doing it.

@simonbuchan
Copy link
Copy Markdown
Owner

simonbuchan commented May 4, 2026

Well, while actually getting this set back up to test, I got annoyed this currently (seemingly?) requires an old version of the MSVC build tools to build, which gave the me the excuse to create a v2 (because of the updated deps increasing the minimum node version).

I have put up an early branch (v2) with a class Key based API (and all the APIs with replacements now deprecated, to ease upgrading), but I'll probably be a few more days on that over the next week or so (depending on my motivation) to update tests and docs, and it currently has Symbol.dispose in there so you can use using key = ..., which was only added "for real" in node 24 which I think is a bit too recent, so that will need a polyfill and tests, etc.


With that, I'm fine with a more conventional class for watch, possibly including being an EventEmitter (e.g. it does currently use and expose the node Buffer type), though I would like to give it a bit more cooking to figure out the API.

Regardless, I did finally notice you're opening a new key without letting the caller specify the flags (and I think it doesn't let you watch a root key like HKLM? nah, an empty subkey string works, but it's pretty lame still, and will try to close a predefined key) - I think that's because you require NOTIFY but that's fine for the caller to specify, and is included in READ, e.g.:

// No write access to HKLM
using key = reg.Key.HKLM.open(r`Software\My Company\My App`, reg.Access.READ);
loadConfig(key);
key.someWatchApi(() => {
  loadConfig(key);
});

The other reason to worry about this is handling the key getting closed out from under the watcher, but I think that can be handled nicely without too much impact on the existing native side, e.g. adding a #watcher: unknown = null field to Key

@simonbuchan
Copy link
Copy Markdown
Owner

Hey, sorry for being radio silent, was sick the whole time! I've started plucking away at the docs and tests for v2, but there's a decent amount still to figure out - especially about just how much to support from v1 to ease migration. The HKEY type in particular is a pain: it's currently a complete lie but I don't want to break any JS users that are using the documented .native property.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants