@drivera73
Contributor

@drivera73 drivera73 commented Dec 24, 2025

In some cases, when running primary zones with BIND as the DNS server, the journal files may not get properly synchronized into their primary zone databases, possibly because BIND isn't shut down cleanly via the OS's service scripts. As a result, on the next boot, BIND 9 may refuse to start because those journal files are corrupted.

These scripts try to maximize the number of cases in which those journal files are cleaned out properly. The right way to do the cleanout is to run either service named stop or rndc sync -clean. Either of these commands will instruct BIND to sync the journal files to their zone databases and clean them out properly.

However, if that fails, then there are contingencies in place to forcibly remove those files - if they didn't get synchronized cleanly, then they're garbage and should be removed. If they did, then they disappear and there will be nothing to forcibly delete.
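As a rough illustration of the "clean sync first, forced removal as a fallback" idea described above, here is a minimal shell sketch. It is not the code from this PR: the ZONE_DIR path and the function names (sync_journals, purge_journals, cleanup_journals) are hypothetical, and it assumes the journals use BIND's default .jnl suffix next to the zone files.

```sh
#!/bin/sh
# Hypothetical location of the primary zone files; adjust to the real layout.
ZONE_DIR="${ZONE_DIR:-/usr/local/etc/namedb/primary}"

# Ask named to fold the journals into the zone files and remove them.
# Returns non-zero if rndc is missing or named cannot be reached.
sync_journals() {
    command -v rndc >/dev/null 2>&1 || return 1
    rndc sync -clean >/dev/null 2>&1
}

# Forcibly delete any journal files left behind in directory $1.
# A missing directory is treated as "nothing to do".
purge_journals() {
    [ -d "$1" ] || return 0
    find "$1" -type f -name '*.jnl' -exec rm -f {} +
}

# Clean path first, forced removal as the contingency.
cleanup_journals() {
    sync_journals || true
    purge_journals "$1"
}
```

A caller would then run something like cleanup_journals "$ZONE_DIR". If the sync succeeded, the journals are already gone and the forced removal finds nothing to delete.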

Either way, the intent is to ensure that BIND has no issues starting up when running a primary zone.

This is related to issue #5068, also reported by me (via a different account ... sorry :) ).

Member

@fichtner fichtner left a comment

Wouldn’t it be better to use a setup.sh script executed at each start? Not sure how the files are used across restarts.

@drivera73
Contributor Author

> Wouldn’t it be better to use a setup.sh script executed at each start? Not sure how the files are used across restarts.

That's indeed part of the solution. This PR contains a patch for the existing setup.sh script, plus the new early and stop syshook scripts.

@fichtner
Member

Ok, sorry the path didn’t expand on mobile view very well.

I’d think we should try to minimize impact. Early could be reasonable in some cases maybe, but stop isn’t really useful as a trigger.

Cheers,
Franco

@drivera73
Contributor Author

drivera73 commented Dec 24, 2025

Fair.

I added the stop hook for consistency, and to maximize the chances of the issue being handled as cleanly as possible (i.e. with either rndc sync -clean or service named stop). The logic is this: by the time the stop hook is called, there are only two possible scenarios:

  • The named service is still running
    • The named service must be stopped, but first call rndc sync -clean to flush out any journal files
    • Once named is stopped, continue as in the other scenario ...
  • The named service has already been stopped, or was never started
    • The journal files are already cleaned up as a result
    • The journal files are not cleaned up, but since named is no longer running, they're effectively garbage

Under either scenario, at the end of the stop hook, the journal files can safely be deleted if present.
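The two scenarios could be sketched in shell roughly as follows. This is only an illustration of the decision flow, not the hook shipped in this PR: stop_hook, named_running and the ZONE_DIR path are hypothetical names, and using service named status as the "is it running" check is an assumption.

```sh
#!/bin/sh
# Hypothetical location of the primary zone files; adjust to the real layout.
ZONE_DIR="${ZONE_DIR:-/usr/local/etc/namedb/primary}"

# True when the named service reports itself as running (assumed check).
named_running() {
    service named status >/dev/null 2>&1
}

# Stop-hook flow for the zone directory given as $1.
stop_hook() {
    dir="$1"
    if named_running; then
        # Scenario 1: give named a chance to commit the journals,
        # then stop it. Failures are tolerated on purpose.
        rndc sync -clean >/dev/null 2>&1 || true
        service named stop >/dev/null 2>&1 || true
    fi
    # Scenario 2, and the tail end of scenario 1: named is down, so any
    # journal still on disk is stale and can be deleted.
    find "$dir" -type f -name '*.jnl' -exec rm -f {} + 2>/dev/null
    return 0
}
```

The hook body would then amount to stop_hook "$ZONE_DIR". Both branches end in the same deletion step, which is why the journals can be removed unconditionally once the hook reaches that point.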

If we remove the stop hook, then those journal files would be cleaned up blindly on bootup by the early hook, which may not be as clean as rndc sync -clean, since all the early hook can do is delete them directly. The goal of affording rndc the opportunity to do some cleanup is in case there's valuable information in those journals that we can still commit at the last minute. It may not happen very often, but might as well give it a shot ... who knows whose butt we'll be saving!

In reality, the early hook is just a contingency to maximize the chances of BIND starting up, since the expectation is that the setup.sh patch and the stop hook should be sufficient ... but since only the paranoid survive... :)

Cheers!

3 participants