8385310: [CRaC] CRaC should fail gracefully on a second checkpoint attempt#317
8385310: [CRaC] CRaC should fail gracefully on a second checkpoint attempt#317antonvoznia wants to merge 12 commits into
Conversation
|
👋 Welcome back antonvoznia! A progress list of the required criteria for merging this PR into |
|
@antonvoznia This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 485 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@TimPushkin) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
Webrevs
|
|
Could I ask you for review @rvansa @TimPushkin ? |
There was a problem hiding this comment.
I expected this to be an optional (but maybe recommended for some engines) user's responsibility to do this checking instead of us fully blocking checkpoint attempts based on it. It's possible that between the moment we do the check and the moment we reach calling the engine to checkpoint the status changes. It seems a good fit for CRaCMXBean.
Also some tests should be added.
| int status_code = getCheckpointableStatus0(); | ||
| switch (CheckpointableStatus.fromCode(status_code)) { | ||
| case NEVER_AFTER_RESTORE -> { | ||
| CheckpointException ex = new CheckpointException(); | ||
| ex.addSuppressed(new Exception("Current engine doesn't support second checkpoint after restore.")); | ||
| throw ex; | ||
| } | ||
| case READY_LATER -> { | ||
| CheckpointException ex = new CheckpointException(); | ||
| ex.addSuppressed(new Exception("CRaC cannot commit checkpoint right now, try later.")); | ||
| throw ex; | ||
| } | ||
| case READY -> { | ||
| // fall through to checkpoint logic below | ||
| } | ||
| default -> { | ||
| System.err.printf("Engine returned unknown checkpointable status code: %d. Proceeding with checkpoint.%n", status_code); | ||
| } | ||
| } |
There was a problem hiding this comment.
I thought we wanted to make this a public interface? I'm not sure if we should completely block checkpoint based on the current status since it can change until we actually reach calling the engine: resource processing below can take a long time (resources are free to block the thread).
There was a problem hiding this comment.
I thought we wanted to make this a public interface?
I'm not sure I caught the point.
If you are talking about CRaCMXBean, I didn't find a way to use it in the internal src/java.base/share/classes/jdk/internal/crac/mirror/Core.java.
These changes could go into CRaCImpl.java but then we have a problem with JCMD call.
'm not sure if we should completely block checkpoint based on the current status since it can change until we actually reach calling the engine
I could move this verification into synchronized block
synchronized (checkpointRestoreLock) {
But anyway, I would like to call it before the beforeCheckpoint() is handled.
There was a problem hiding this comment.
I could move this verification into synchronized block
synchronized (checkpointRestoreLock) {
The race is between the Java code and the engine so synchronizing only on Java side would not help.
I still think we should leave calling this up to the user, while checkpointRestore should just attempt to checkpoint and fail if that is not possible.
If you are talking about CRaCMXBean, I didn't find a way to use it in the internal
src/java.base/share/classes/jdk/internal/crac/mirror/Core.java.
CRaCMXBean could use Core if we want to call this in Core. Otherwise it should indeed be in CRaCImpl.
Co-authored-by: Timofei Pushkin <pushkin.td@gmail.com>
TimPushkin
left a comment
There was a problem hiding this comment.
The comments are mostly minor. Major things that I think need to be addressed before integration:
- Add this to
CRaCMXBean. I'd remove it fromCore.checkpointRestore1()completely because the API is prone to races (status can change between the check and the actual checkpoint) so it should be up to the user whether to use it on per-engine basis. - Add a simple test, e.g. that we cannot checkpoint after direct-mapped restore.
| &checkpoint_availability_extension.header, | ||
| &image_constraints_extension.header, |
There was a problem hiding this comment.
Image constraints are always used on restore (which is expected to happen more often than checkpoint) so would be better to place the new extension beneath it. Same in simengine.
| extern "C" { | ||
| #endif | ||
|
|
||
| #define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data" |
There was a problem hiding this comment.
| #define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data" | |
| #define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpoint availability" |
| #endif | ||
|
|
||
| #define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data" | ||
| #define CRLIB_EXTENSION_CHECKPOINTABLE(api) \ |
There was a problem hiding this comment.
Should be CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY to follow naming traditions of other extensions
|
|
||
| #define CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME "checkpointable data" | ||
| #define CRLIB_EXTENSION_CHECKPOINTABLE(api) \ | ||
| CRLIB_EXTENSION(api, crlib_checkpoint_availability_t, CRLIB_EXTENSION_CHECKPOINT_CRLIB_CHECKPOINT_AVAILABILITY_NAME) |
There was a problem hiding this comment.
A smoke test for the feature would reveal typos like this
| CRLIB_EXTENSION(api, crlib_checkpoint_availability_t, CRLIB_EXTENSION_CHECKPOINT_CRLIB_CHECKPOINT_AVAILABILITY_NAME) | |
| CRLIB_EXTENSION(api, crlib_checkpoint_availability_t, CRLIB_EXTENSION_CHECKPOINT_AVAILABILITY_NAME) |
| #ifndef CRLIB_CHECKPOINT_AVAILABILITY_H | ||
| #define CRLIB_CHECKPOINT_AVAILABILITY_H |
There was a problem hiding this comment.
For me "checkpoint is available" means that there is an existing checkpoint image from which we can restore, while the extension is for checking it is currently possible/permitted to create a checkpoint image.
| } // extern "C | ||
| #endif | ||
|
|
||
| #endif // CRLIB_CHECKPOINT_AVAILABILITY_H |
There was a problem hiding this comment.
| #endif // CRLIB_CHECKPOINT_AVAILABILITY_H | |
| #endif // CRLIB_CHECKPOINT_AVAILABILITY_H | |
| if (crac::_engine->prepare_checkpoint_availability_api() == CracEngine::ApiStatus::OK) { | ||
| crlib_checkpointable_status_t status = crac::_engine->get_checkpointable_status(); | ||
| if (status == CRLIB_CHECKPOINTABLE_NEVER) { | ||
| if (crac::_generation > 1) { |
There was a problem hiding this comment.
crac:: prefix should not be needed in this method
| if (crac::_engine->prepare_checkpoint_availability_api() == CracEngine::ApiStatus::OK) { | |
| crlib_checkpointable_status_t status = crac::_engine->get_checkpointable_status(); | |
| if (status == CRLIB_CHECKPOINTABLE_NEVER) { | |
| if (crac::_generation > 1) { | |
| if (_engine->prepare_checkpoint_availability_api() == CracEngine::ApiStatus::OK) { | |
| crlib_checkpointable_status_t status = _engine->get_checkpointable_status(); | |
| if (status == CRLIB_CHECKPOINTABLE_NEVER) { | |
| if (_generation > 1) { |
| // Engine doesn't support checkpoint after restore, | ||
| // but it's a first checkpoint. So return READY. | ||
| return CRLIB_CHECKPOINTABLE_READY; |
There was a problem hiding this comment.
We should not override what the engine reports, it is very unintuitive
| @@ -63,6 +57,7 @@ class crac: AllStatic { | |||
|
|
|||
| static jlong restore_start_time(); | |||
| static jlong uptime_since_restore(); | |||
| static jint checkpointable_status(); | |||
There was a problem hiding this comment.
Nitpick: since we have crlib_checkpointable_status_t we should return that and convert to jint in the JNI layer
| // With direct_map, restored memory pages are mmap'd directly from the | ||
| // checkpoint image file rather than copied into anonymous memory. A second | ||
| // checkpoint would overwrite that same image, corrupting the live mappings | ||
| // of the running process. CRIU has no safe way to re-checkpoint in this mode. |
There was a problem hiding this comment.
Just to clarify: you can change CRaCCheckpointTo on restore and thus then checkpoint to another place saving the old image — some time ago I tested that this works and can be restored from, not sure if it still does though. But there seems to be no practical reason to do this, unless the second image is only the difference from the first one making the second checkpoint faster — BTW, are you sure this is not the case?
| if (!conf->direct_map()) { | ||
| return CRLIB_CHECKPOINTABLE_READY; | ||
| } |
There was a problem hiding this comment.
Now that I think of it, I do not think this would actually work. On restore there are two instances of the engine: (1) the restoring one and (2) the one being restored. Options are not propagated from (1) to (2) this way. direct_map is set on restore i.e. only in (1), while this check is run before checkpoint i.e. in (2). So I believe this would work only if you set direct_map on the initial start up (which is incorrect and should give you a warning on checkpoint).
I think you would need to inspect the memory of the process to see if it comes from a mmaped file to make this work.
There was a problem hiding this comment.
I wouldn't do any introspection, that's not in line with what CRIU does. The restoring engine should rather transfer data to checkpointed engine (we do that for env and system properties, though the implementation passes only the PID number and the 'data' is actually transferred at JVM level)
There was a problem hiding this comment.
FYI currently the restoring JVM does not record CRaCEngineOptions in restore-data at all, it and some other restore-settable options that affect only the restoring JVM/engine are filtered out here:
crac/src/hotspot/share/runtime/arguments.cpp
Lines 2698 to 2710 in aabad2d
There was a problem hiding this comment.
Making the engine parse CRaCEngineOptions from restore-data is also not universal since the JVM can set additional options. A separate engine-internal mechanism would need to be implemented to make it pass info to its counterpart.
There was a problem hiding this comment.
I looked at the code, it seems I could use user_data to save the value of current argument of direct_map on each restore() call.
Extract the value once the get_checkpointable_status() is called.
WDYT?
There was a problem hiding this comment.
And I am personally still not fully convinced if we need to forbid checkpoint after direct-map of CRIU at all: #317 (comment). Especially since it is not trivial.
There was a problem hiding this comment.
Also user_data is for passing from checkpoint to restore IIRC
can we extend the purpose of the mechanism? To avoid creating similar things a code generation for no reason
There was a problem hiding this comment.
I believe user_data has been unused for a while because we moved to more task-specific APIs, initially it was for CPU features AFAIR. So it should probably be deprecated and dropped rather than extended.
Anyway, relaying direct_map is an implementation detail of checkpointable_status in criuengine and should not touch user-facing APIs. It can use a similar implementation under the hood, just should not be user-visible.
There was a problem hiding this comment.
Refreshed my knowledge on user-data. It is used to add arbitrary data to checkpoint image, in criuengine case, by adding files to the image. This is suitable for checkpoint->restore data passing but not the other way around because a particular restore should not mutate an image possibly used for many restores.
There was a problem hiding this comment.
as @TimPushkin suggested at the first message, does it make sense to look at process memory mapping in /proc/158992/map and find if there the same path as image_location has?
By this, we can verify
- if the process mapped directly
- if the new CheckpointTo path differs from RestoreFrom.
Do you guys think the approach makes sense or not?
Some CRaC engines don't support the second checkpoint after a restore, or cannot do it right now.
The changes are focused to "improve" the failure mechanism.
This PR introduces 3 checkpointable status.
The statuses are managed by engine API library.
Testing locally:
Progress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/crac.git pull/317/head:pull/317$ git checkout pull/317Update a local copy of the PR:
$ git checkout pull/317$ git pull https://git.openjdk.org/crac.git pull/317/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 317View PR using the GUI difftool:
$ git pr show -t 317Using diff file
Download this PR as a diff file:
https://git.openjdk.org/crac/pull/317.diff
Using Webrev
Link to Webrev Comment