Skip to content

8385310: [CRaC] CRaC should fail gracefully on a second checkpoint attempt#317

Open
antonvoznia wants to merge 12 commits into
openjdk:cracfrom
antonvoznia:handle-second-checkpoint-fail
Open

8385310: [CRaC] CRaC should fail gracefully on a second checkpoint attempt#317
antonvoznia wants to merge 12 commits into
openjdk:cracfrom
antonvoznia:handle-second-checkpoint-fail

Conversation

@antonvoznia
Copy link
Copy Markdown
Contributor

@antonvoznia antonvoznia commented May 22, 2026

Some CRaC engines don't support the second checkpoint after a restore, or cannot do it right now.
The changes are focused to "improve" the failure mechanism.
This PR introduces 3 checkpointable status.

    never_after_restore - engine doesn't support the second
    ready_late          - engine cannot commit the checkpoint right now, but could do it later
    ready               - engine can commit the checkpoint

The statuses are managed by engine API library.

Testing locally:

  • criuengine
  • simengine (actually cannot fail the checkpoint)
  • custom engine


Progress

  • Change must not contain extraneous whitespace

Issue

  • JDK-8385310: [CRaC] CRaC should fail gracefully on a second checkpoint attempt (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/crac.git pull/317/head:pull/317
$ git checkout pull/317

Update a local copy of the PR:
$ git checkout pull/317
$ git pull https://git.openjdk.org/crac.git pull/317/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 317

View PR using the GUI difftool:
$ git pr show -t 317

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/crac/pull/317.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link
Copy Markdown

bridgekeeper Bot commented May 22, 2026

👋 Welcome back antonvoznia! A progress list of the required criteria for merging this PR into crac will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link
Copy Markdown

openjdk Bot commented May 22, 2026

@antonvoznia This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8385310: [CRaC] CRaC should fail gracefully on a second checkpoint attempt

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 485 new commits pushed to the crac branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@TimPushkin) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk Bot added ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 22, 2026
@mlbridge
Copy link
Copy Markdown

mlbridge Bot commented May 22, 2026

Webrevs

@antonvoznia
Copy link
Copy Markdown
Contributor Author

Could I ask you for review @rvansa @TimPushkin ?
Thanks.

Copy link
Copy Markdown
Collaborator

@TimPushkin TimPushkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected this to be an optional (but maybe recommended for some engines) user's responsibility to do this checking instead of us fully blocking checkpoint attempts based on it. It's possible that between the moment we do the check and the moment we reach calling the engine to checkpoint the status changes. It seems a good fit for CRaCMXBean.

Also some tests should be added.

Comment thread src/hotspot/share/include/crlib/crlib_checkpointable_data.h Outdated
Comment thread src/hotspot/share/include/crlib/crlib_checkpointable_data.h Outdated
Comment thread src/hotspot/share/include/crlib/crlib_checkpointable_data.h Outdated
Comment thread src/hotspot/share/include/crlib/crlib_checkpointable_data.h Outdated
Comment thread src/hotspot/share/runtime/crac.hpp
Comment thread src/java.base/share/classes/jdk/internal/crac/mirror/Core.java Outdated
Comment thread src/java.base/share/native/libsimengine/simengine.cpp Outdated
Comment thread src/java.base/share/classes/jdk/internal/crac/mirror/Core.java Outdated
Comment thread src/java.base/share/classes/jdk/internal/crac/mirror/Core.java
Comment on lines +277 to +295
int status_code = getCheckpointableStatus0();
switch (CheckpointableStatus.fromCode(status_code)) {
case NEVER_AFTER_RESTORE -> {
CheckpointException ex = new CheckpointException();
ex.addSuppressed(new Exception("Current engine doesn't support second checkpoint after restore."));
throw ex;
}
case READY_LATER -> {
CheckpointException ex = new CheckpointException();
ex.addSuppressed(new Exception("CRaC cannot commit checkpoint right now, try later."));
throw ex;
}
case READY -> {
// fall through to checkpoint logic below
}
default -> {
System.err.printf("Engine returned unknown checkpointable status code: %d. Proceeding with checkpoint.%n", status_code);
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we wanted to make this a public interface? I'm not sure if we should completely block checkpoint based on the current status since it can change until we actually reach calling the engine: resource processing below can take a long time (resources are free to block the thread).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we wanted to make this a public interface?

I'm not sure I caught the point.
If you are talking about CRaCMXBean, I didn't find a way to use it in the internal src/java.base/share/classes/jdk/internal/crac/mirror/Core.java.
These changes could go into CRaCImpl.java but then we have a problem with JCMD call.

'm not sure if we should completely block checkpoint based on the current status since it can change until we actually reach calling the engine

I could move this verification into synchronized block
synchronized (checkpointRestoreLock) {
But anyway, I would like to call it before the beforeCheckpoint() is handled.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could move this verification into synchronized block synchronized (checkpointRestoreLock) {

The race is between the Java code and the engine so synchronizing only on Java side would not help.

I still think we should leave calling this up to the user, while checkpointRestore should just attempt to checkpoint and fail if that is not possible.

If you are talking about CRaCMXBean, I didn't find a way to use it in the internal src/java.base/share/classes/jdk/internal/crac/mirror/Core.java.

CRaCMXBean could use Core if we want to call this in Core. Otherwise it should indeed be in CRaCImpl.

Copy link
Copy Markdown
Collaborator

@TimPushkin TimPushkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments are mostly minor. Major things that I think need to be addressed before integration:

  1. Add this to CRaCMXBean. I'd remove it from Core.checkpointRestore1() completely because the API is prone to races (status can change between the check and the actual checkpoint) so it should be up to the user whether to use it on per-engine basis.
  2. Add a simple test, e.g. that we cannot checkpoint after direct-mapped restore.

Comment on lines +1000 to 1001
&checkpoint_availability_extension.header,
&image_constraints_extension.header,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image constraints are always used on restore (which is expected to happen more often than checkpoint) so would be better to place the new extension beneath it. Same in simengine.

extern "C" {
#endif

#define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data"
#define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpoint availability"

#endif

#define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data"
#define CRLIB_EXTENSION_CHECKPOINTABLE(api) \
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY to follow naming traditions of other extensions


#define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data"
#define CRLIB_EXTENSION_CHECKPOINTABLE(api) \
CRLIB_EXTENSION(api, crlib_checkpoint_availability_t, CRLIB_EXTENSION_CHECKPOINT_CRLIB_CHECKPOINT_AVAILABILITY_NAME)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A smoke test for the feature would reveal typos like this

Suggested change
CRLIB_EXTENSION(api, crlib_checkpoint_availability_t, CRLIB_EXTENSION_CHECKPOINT_CRLIB_CHECKPOINT_AVAILABILITY_NAME)
CRLIB_EXTENSION(api, crlib_checkpoint_availability_t, CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME)

Comment on lines +23 to +24
#ifndef CRLIB_CHECKPOINT_AVAILABILITY_H
#define CRLIB_CHECKPOINT_AVAILABILITY_H
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me "checkpoint is available" means that there is an existing checkpoint image from which we can restore, while the extension is for checking it is currently possible/permitted to create a checkpoint image.

} // extern "C
#endif

#endif // CRLIB_CHECKPOINT_AVAILABILITY_H
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#endif // CRLIB_CHECKPOINT_AVAILABILITY_H
#endif // CRLIB_CHECKPOINT_AVAILABILITY_H

Comment on lines +84 to +87
if (crac::_engine->prepare_checkpoint_availability_api() == CracEngine::ApiStatus::OK) {
crlib_checkpointable_status_t status = crac::_engine->get_checkpointable_status();
if (status == CRLIB_CHECKPOINTABLE_NEVER) {
if (crac::_generation > 1) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crac:: prefix should not be needed in this method

Suggested change
if (crac::_engine->prepare_checkpoint_availability_api() == CracEngine::ApiStatus::OK) {
crlib_checkpointable_status_t status = crac::_engine->get_checkpointable_status();
if (status == CRLIB_CHECKPOINTABLE_NEVER) {
if (crac::_generation > 1) {
if (_engine->prepare_checkpoint_availability_api() == CracEngine::ApiStatus::OK) {
crlib_checkpointable_status_t status = _engine->get_checkpointable_status();
if (status == CRLIB_CHECKPOINTABLE_NEVER) {
if (_generation > 1) {

Comment on lines +90 to +92
// Engine doesn't support checkpoint after restore,
// but it's a first checkpoint. So return READY.
return CRLIB_CHECKPOINTABLE_READY;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not override what the engine reports, it is very unintuitive

@@ -63,6 +57,7 @@ class crac: AllStatic {

static jlong restore_start_time();
static jlong uptime_since_restore();
static jint checkpointable_status();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: since we have crlib_checkpointable_status_t we should return that and convert to jint in the JNI layer

Comment on lines +504 to +507
// With direct_map, restored memory pages are mmap'd directly from the
// checkpoint image file rather than copied into anonymous memory. A second
// checkpoint would overwrite that same image, corrupting the live mappings
// of the running process. CRIU has no safe way to re-checkpoint in this mode.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify: you can change CRaCCheckpointTo on restore and thus then checkpoint to another place saving the old image — some time ago I tested that this works and can be restored from, not sure if it still does though. But there seems to be no practical reason to do this, unless the second image is only the difference from the first one making the second checkpoint faster — BTW, are you sure this is not the case?

Comment on lines +500 to +502
if (!conf->direct_map()) {
return CRLIB_CHECKPOINTABLE_READY;
}
Copy link
Copy Markdown
Collaborator

@TimPushkin TimPushkin May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think of it, I do not think this would actually work. On restore there are two instances of the engine: (1) the restoring one and (2) the one being restored. Options are not propagated from (1) to (2) this way. direct_map is set on restore i.e. only in (1), while this check is run before checkpoint i.e. in (2). So I believe this would work only if you set direct_map on the initial start up (which is incorrect and should give you a warning on checkpoint).

I think you would need to inspect the memory of the process to see if it comes from a mmaped file to make this work.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't do any introspection, that's not in line with what CRIU does. The restoring engine should rather transfer data to checkpointed engine (we do that for env and system properties, though the implementation passes only the PID number and the 'data' is actually transferred at JVM level)

Copy link
Copy Markdown
Collaborator

@TimPushkin TimPushkin May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI currently the restoring JVM does not record CRaCEngineOptions in restore-data at all, it and some other restore-settable options that affect only the restoring JVM/engine are filtered out here:

// Some restore-settable flags should not be updated in the restored JVM.
static bool should_record_for_restore(const JVMFlag& flag) {
precond(flag.is_restore_settable());
if (strncmp(flag.name(), "CRaCEngine", ARRAY_SIZE("CRaCEngine") - 1) == 0) {
assert(strcmp(flag.name(), "CRaCEngine") == 0 || strcmp(flag.name(), "CRaCEngineOptions") == 0,
"unexpected CRaCEngine* flag: %s", flag.name());
return false;
}
if (strcmp(flag.name(), "IgnoreUnrecognizedVMOptions") == 0) {
return false;
}
return true;
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making the engine parse CRaCEngineOptions from restore-data is also not universal since the JVM can set additional options. A separate engine-internal mechanism would need to be implemented to make it pass info to its counterpart.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the code, it seems I could use user_data to save the value of current argument of direct_map on each restore() call.
Extract the value once the get_checkpointable_status() is called.
WDYT?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I am personally still not fully convinced if we need to forbid checkpoint after direct-map of CRIU at all: #317 (comment). Especially since it is not trivial.

Copy link
Copy Markdown
Contributor Author

@antonvoznia antonvoznia May 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also user_data is for passing from checkpoint to restore IIRC

can we extend the purpose of the mechanism? To avoid creating similar things a code generation for no reason

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe user_data has been unused for a while because we moved to more task-specific APIs, initially it was for CPU features AFAIR. So it should probably be deprecated and dropped rather than extended.

Anyway, relaying direct_map is an implementation detail of checkpointable_status in criuengine and should not touch user-facing APIs. It can use a similar implementation under the hood, just should not be user-visible.

Copy link
Copy Markdown
Collaborator

@TimPushkin TimPushkin May 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refreshed my knowledge on user-data. It is used to add arbitrary data to checkpoint image, in criuengine case, by adding files to the image. This is suitable for checkpoint->restore data passing but not the other way around because a particular restore should not mutate an image possibly used for many restores.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as @TimPushkin suggested at the first message, does it make sense to look at process memory mapping in /proc/158992/map and find if there the same path as image_location has?
By this, we can verify

  1. if the process mapped directly
  2. if the new CheckpointTo path differs from RestoreFrom.

Do you guys think the approach makes sense or not?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready Pull request is ready to be integrated rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

3 participants