Skip to content

Minor improvements to manage migrations#422

Open
tuplle wants to merge 3 commits intorelease/6.4.2from
6.4.2_shutdown_from_migration
Open

Minor improvements to manage migrations#422
tuplle wants to merge 3 commits intorelease/6.4.2from
6.4.2_shutdown_from_migration

Conversation

@tuplle
Copy link
Member

@tuplle tuplle commented Mar 9, 2026

Description

Improve management of migrations. After some migration on a customer projects restart of the app is required, so an automatic shutdown logic was added. An option to skip migration from properties was added to disable problematic migration at the deployment configuration level.

Dependencies

No new dependencies were introduced.

Blocking Pull requests

There are no dependencies on other PR .

How Has Been This Tested?

Manually testom on dummy migrations.

Test Configuration

Name Tested on
OS Linux
Runtime Eclipse Temurin JDK 11
Dependency Manager Maven 3
Framework version
Run parameters
Other configuration

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes have been checked, personally or remotely, with @...
  • I have commented my code, particularly in hard-to-understand areas
  • I have resolved all conflicts with the target branch of the PR
  • I have updated and synced my code with the target branch
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing tests pass locally with my changes:
    • Lint test
    • Unit tests
    • Integration tests
  • I have checked my contribution with code analysis tools:
  • I have made corresponding changes to the documentation:
    • Developer documentation
    • User Guides
    • Migration Guides

Summary by CodeRabbit

  • New Features

    • Migration settings: skip specific migrations, optional cache eviction, and configurable shutdown after migration.
    • Centralized application shutdown mechanism.
    • Service API: task resolution now returns task details (list) and a user-existence check was added.
  • Refactor

    • Improved user/task lookup and null/empty handling for clearer, more efficient resolution logic.

tuplle added 2 commits March 9, 2026 14:34
…utdown.

- Add `existsById` method and improve task resolution
- Introduced a new `existsById` method in `IUserService` to check user existence.
- Modified `resolveUserRef` in `TaskService` to return a list of resolved tasks instead of void.
- Ensured consistency in validations by replacing `size()` checks with `isEmpty()`.
- Introduced `MigrationProperties` class to define configurable properties for migration, including skip list, cache eviction, and shutdown control.
- Enhanced `MigrationOrderedCommandLineRunner` to respect `MigrationProperties` settings.
- Improved logging for migration operations with additional conditions.
@tuplle tuplle requested review from Retoocs and machacjozef March 9, 2026 14:01
@tuplle tuplle self-assigned this Mar 9, 2026
@tuplle tuplle added the improvement A change that improves on an existing feature label Mar 9, 2026
@coderabbitai
Copy link

coderabbitai bot commented Mar 9, 2026

Walkthrough

Adds configurable migration behavior: skip list, conditional cache eviction, and optional post-migration shutdown via a new ApplicationShutdownProvider; introduces MigrationProperties; adds IUserService.existsById; changes ITaskService.resolveUserRef to return List and updates TaskService/WorkflowService accordingly.

Changes

Cohort / File(s) Summary
Migration shutdown mechanism
src/main/groovy/.../MigrationOrderedCommandLineRunner.groovy, src/main/java/.../ApplicationShutdownProvider.java, src/main/java/.../properties/MigrationProperties.java
Add MigrationProperties (skip, evictCaches, shutdownAfterMigration), ApplicationShutdownProvider component, and migration runner changes: skip logic, conditional cache eviction, configurable shutdownAfterFinish flag and protected toggles; runner invokes shutdown provider when configured.
User service API
src/main/java/.../auth/service/AbstractUserService.java, src/main/java/.../auth/service/interfaces/IUserService.java
Add boolean existsById(String id) to IUserService and implement in AbstractUserService (delegates to repository.existsById).
Task resolution API & impl
src/main/java/.../workflow/service/interfaces/ITaskService.java, src/main/java/.../workflow/service/TaskService.java
Change resolveUserRef(Case) return type from void to List<Task>; TaskService now queries tasks by IDs and returns resolved tasks; remove transactional import from interface.
Workflow service adjustments
src/main/java/.../workflow/service/WorkflowService.java
Replace null/size checks with isEmpty() and validate users via existsById(...) instead of null-checking resolveById results.
Minor formatting
src/main/groovy/.../startup/RunnerController.groovy
Add trailing newline; no logic changes.

Sequence Diagram(s)

sequenceDiagram
  participant Runner as MigrationRunner
  participant Props as MigrationProperties
  participant Cache as CacheManager
  participant Shutdown as ApplicationShutdownProvider

  Runner->>Props: load skip list, evictCaches, shutdownAfterMigration
  Runner->>Runner: perform migrations (skip entries in Props.skip)
  alt Props.evictCaches == true
    Runner->>Cache: evict relevant caches
  end
  alt Runner.shutdownAfterFinish || Props.shutdownAfterMigration
    Runner->>Shutdown: notify shutdown (class, exitCode)
    Shutdown->>Shutdown: SpringApplication.exit(...)
  else
    Runner->>Runner: continue without shutting down
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main objectives of the pull request, which focus on managing migrations through shutdown logic and skip functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot added breaking change Fix or feature that would cause existing functionality doesn't work as expected Medium labels Mar 9, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java (1)

796-802: 🧹 Nitpick | 🔵 Trivial

Consider using existsById for consistency with WorkflowService.

WorkflowService.getExistingUsers (line 244) now uses userService.existsById(id), but this method still uses userService.resolveById(id, false) != null. The existsById approach is more efficient as it avoids loading the full user entity.

♻️ Proposed refactor for consistency
 private List<String> getExistingUsers(UserListFieldValue userListValue) {
     if (userListValue == null)
         return null;
     return userListValue.getUserValues().stream().map(UserFieldValue::getId)
-            .filter(id -> userService.resolveById(id, false) != null)
+            .filter(id -> userService.existsById(id))
             .collect(Collectors.toList());
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java`
around lines 796 - 802, The getExistingUsers method currently loads full user
entities via userService.resolveById(id, false) to filter valid IDs; change it
to use the more efficient existence check userService.existsById(id) instead
(keep the null check for userListValue and the stream mapping of
UserFieldValue::getId), so the stream becomes .filter(id ->
userService.existsById(id)) and still collects to a List<String>; update any
imports if necessary and ensure behavior remains identical except for the more
efficient existence check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/main/groovy/com/netgrif/application/engine/migration/MigrationOrderedCommandLineRunner.groovy`:
- Around line 57-63: The methods enableShutdownAfterFinish and
disableShutdownAfterFinish in MigrationOrderedCommandLineRunner currently lack
an explicit return type; change their signatures to declare an explicit void
return type (matching other methods like run() and migrate()) so they become
void enableShutdownAfterFinish() and void disableShutdownAfterFinish(), leaving
the bodies unchanged to preserve behavior.
- Around line 51-54: The sleep(100) call in MigrationOrderedCommandLineRunner
(inside the shutdown conditional checking shutdownAfterFinish or
migrationProperties.isShutdownAfterMigration()) is a magic number and needs
explanation; either replace it with a clearly named constant (e.g.,
LOG_FLUSH_WAIT_MS) or add a brief inline comment explaining why the delay is
required (to allow log buffers/async flushes to complete before calling
shutdownProvider.shutdown(this.class)), so update the code around sleep(100)
accordingly and keep the behavior unchanged.

In
`@src/main/java/com/netgrif/application/engine/configuration/properties/MigrationProperties.java`:
- Around line 11-14: Remove the `@Configuration` annotation from the
MigrationProperties class and keep only `@ConfigurationProperties`(prefix =
"nae.migration") (and `@Data` if desired); then register the properties bean via
either a central configuration class annotated with
`@EnableConfigurationProperties`(MigrationProperties.class) or enable scanning
with `@ConfigurationPropertiesScan` on your application/config so
MigrationProperties is discovered without mixing property binding and
bean-definition roles.

In
`@src/main/java/com/netgrif/application/engine/workflow/service/interfaces/ITaskService.java`:
- Line 108: The current batch resolveUserRef(Case) lacks a transaction boundary
so partial saves can leave DB inconsistent; annotate the outer
resolveUserRef(Case useCase) implementation with `@Transactional` (or add
`@Transactional` at the service layer method that orchestrates the batch) so all
taskRepository.save(task) calls commit or rollback as a unit, and ensure
WorkflowService.resolveUserRef(Case) either runs within the same transaction
(annotated) or delegates full persistence to the transactional
TaskService.resolveUserRef to include the subsequent save(useCase) in the same
atomic operation.

---

Outside diff comments:
In
`@src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java`:
- Around line 796-802: The getExistingUsers method currently loads full user
entities via userService.resolveById(id, false) to filter valid IDs; change it
to use the more efficient existence check userService.existsById(id) instead
(keep the null check for userListValue and the stream mapping of
UserFieldValue::getId), so the stream becomes .filter(id ->
userService.existsById(id)) and still collects to a List<String>; update any
imports if necessary and ensure behavior remains identical except for the more
efficient existence check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: dde53eb5-b0d8-491a-8f7b-be9efc002693

📥 Commits

Reviewing files that changed from the base of the PR and between 345ab2b and 938d94c.

📒 Files selected for processing (9)
  • src/main/groovy/com/netgrif/application/engine/migration/MigrationOrderedCommandLineRunner.groovy
  • src/main/groovy/com/netgrif/application/engine/startup/RunnerController.groovy
  • src/main/java/com/netgrif/application/engine/auth/service/AbstractUserService.java
  • src/main/java/com/netgrif/application/engine/auth/service/interfaces/IUserService.java
  • src/main/java/com/netgrif/application/engine/configuration/ApplicationShutdownProvider.java
  • src/main/java/com/netgrif/application/engine/configuration/properties/MigrationProperties.java
  • src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java
  • src/main/java/com/netgrif/application/engine/workflow/service/WorkflowService.java
  • src/main/java/com/netgrif/application/engine/workflow/service/interfaces/ITaskService.java

DelegateTaskEventOutcome delegateTask(LoggedUser loggedUser, String delegatedId, String taskId, Map<String, String> params) throws TransitionNotExecutableException;

void resolveUserRef(Case useCase);
List<Task> resolveUserRef(Case useCase);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if `@Transactional` is present on resolveUserRef implementations
rg -nP '@Transactional|resolveUserRef\s*\(' --type=java -A 3 -B 3 | head -100

Repository: netgrif/application-engine

Length of output: 11712


🏁 Script executed:

sed -n '781,810p' src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java

Repository: netgrif/application-engine

Length of output: 1512


🏁 Script executed:

sed -n '1,50p' src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java | grep -A 5 "public class TaskService"

Repository: netgrif/application-engine

Length of output: 52


🏁 Script executed:

rg -n "public class TaskService" src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java -A 10

Repository: netgrif/application-engine

Length of output: 338


🏁 Script executed:

rg -n "public class WorkflowService" src/main/java/com/netgrif/application/engine/workflow/service/WorkflowService.java -A 10

Repository: netgrif/application-engine

Length of output: 380


🏁 Script executed:

sed -n '215,230p' src/main/java/com/netgrif/application/engine/workflow/service/WorkflowService.java

Repository: netgrif/application-engine

Length of output: 705


🏁 Script executed:

rg -n "public Case save\(" src/main/java/com/netgrif/application/engine/workflow/service/WorkflowService.java -B 2 -A 5

Repository: netgrif/application-engine

Length of output: 319


🏁 Script executed:

sed -n '115,135p' src/main/java/com/netgrif/application/engine/workflow/service/WorkflowService.java

Repository: netgrif/application-engine

Length of output: 736


🏁 Script executed:

sed -n '210,230p' src/main/java/com/netgrif/application/engine/workflow/service/WorkflowService.java

Repository: netgrif/application-engine

Length of output: 847


🏁 Script executed:

rg -n "@Transactional" src/main/java/com/netgrif/application/engine/workflow/service/WorkflowService.java -B 1 -A 8 | head -50

Repository: netgrif/application-engine

Length of output: 52


🏁 Script executed:

rg -n "@Transactional" src/main/java/com/netgrif/application/engine/workflow/service/TaskService.java -B 1 -A 1 | head -100

Repository: netgrif/application-engine

Length of output: 52


🏁 Script executed:

git log --oneline -20 src/main/java/com/netgrif/application/engine/workflow/service/interfaces/ITaskService.java

Repository: netgrif/application-engine

Length of output: 137


🏁 Script executed:

git show 938d94c1f:src/main/java/com/netgrif/application/engine/workflow/service/interfaces/ITaskService.java | grep -A 2 "resolveUserRef"

Repository: netgrif/application-engine

Length of output: 215


Clarify transaction management in resolveUserRef call chain.

The return type change from void to List<Task> is appropriate as it enables callers to use the resolved tasks.

However, the transaction boundary concern is valid. The outer method resolveUserRef(Case useCase) at line 775-777 streams through multiple tasks, calling resolveUserRef(Task, Case) for each one. Each task is saved individually via taskRepository.save(task) (line 792). Without @Transactional on the outer method, if any save fails mid-iteration, the database may be left inconsistent with some tasks updated and others not.

Additionally, WorkflowService.resolveUserRef(Case) calls taskService.resolveUserRef(useCase) followed by save(useCase) as separate operations, also without transaction boundaries.

Consider adding @Transactional to the outer resolveUserRef(Case) method in TaskService if atomicity across the batch of task saves is required.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/main/java/com/netgrif/application/engine/workflow/service/interfaces/ITaskService.java`
at line 108, The current batch resolveUserRef(Case) lacks a transaction boundary
so partial saves can leave DB inconsistent; annotate the outer
resolveUserRef(Case useCase) implementation with `@Transactional` (or add
`@Transactional` at the service layer method that orchestrates the batch) so all
taskRepository.save(task) calls commit or rollback as a unit, and ensure
WorkflowService.resolveUserRef(Case) either runs within the same transaction
(annotated) or delegates full persistence to the transactional
TaskService.resolveUserRef to include the subsequent save(useCase) in the same
atomic operation.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/main/groovy/com/netgrif/application/engine/migration/MigrationOrderedCommandLineRunner.groovy`:
- Around line 52-55: The arbitrary sleep(333) in
MigrationOrderedCommandLineRunner before calling
shutdownProvider.shutdown(this.class) is unsafe; replace it by explicitly
waiting for the actual async work to finish: flush/refresh any Elasticsearch
client operations and gracefully shutdown/await termination of the executor(s)
used by your app (e.g., the executor/ThreadPoolTaskExecutor or ExecutorService
used for async tasks and the executor used by the elastic executor), with a
bounded timeout and proper error/logging, and only call
ApplicationShutdownProvider.shutdown() after those awaitTermination/termination
checks complete or time out; locate and change the code in
MigrationOrderedCommandLineRunner that currently calls sleep(333) and then
shutdownProvider.shutdown(this.class) so it instead performs the
flush/wait/awaitTermination logic against the real async/ES components before
invoking ApplicationShutdownProvider.shutdown().
- Around line 51-55: The shutdown logic in
MigrationOrderedCommandLineRunner.run() fires per migration and must be moved so
it runs once after the full migration sequence; remove the shutdown block that
checks shutdownAfterFinish or migrationProperties.isShutdownAfterMigration()
(the sleep(333) and shutdownProvider.shutdown(this.class) call) from run(), and
instead implement a single final hook that executes once after all migration
beans have run (for example a dedicated bean or an `@EventListener` triggered
after migration completion) which checks
migrationProperties.isShutdownAfterMigration(), performs the short sleep, and
then calls shutdownProvider.shutdown(...); keep the symbols
migrationProperties.isShutdownAfterMigration(), sleep(333) and
shutdownProvider.shutdown(...) but ensure they are only invoked from that
single-post-migration hook rather than from run().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ea3eb8b3-f46c-487a-b7c2-94b185e64eea

📥 Commits

Reviewing files that changed from the base of the PR and between 938d94c and 815e2f6.

📒 Files selected for processing (1)
  • src/main/groovy/com/netgrif/application/engine/migration/MigrationOrderedCommandLineRunner.groovy

Comment on lines +51 to +55
if (shutdownAfterFinish || migrationProperties.isShutdownAfterMigration()) {
// sleep is for elastic executor and other things to flush their work after migration.
// the number was chosen arbitrary by feeling 😅
sleep(333)
shutdownProvider.shutdown(this.class)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

shutdownAfterMigration is firing too early.

run() executes once per migration bean, so Line 51 will terminate the JVM after the first applied migration when migrationProperties.shutdownAfterMigration is enabled. That contradicts the documented semantics of MigrationProperties.shutdownAfterMigration in src/main/java/com/netgrif/application/engine/configuration/properties/MigrationProperties.java:31-37 (“once the migration process is completed”) and can leave later ordered migrations unapplied.

🛠️ Minimal fix in this class
-        if (shutdownAfterFinish || migrationProperties.isShutdownAfterMigration()) {
+        if (shutdownAfterFinish) {
             // sleep is for elastic executor and other things to flush their work after migration.
             // the number was chosen arbitrary by feeling 😅
             sleep(333)
             shutdownProvider.shutdown(this.class)
         }

Handle migrationProperties.shutdownAfterMigration once, from a single hook that runs after the full migration sequence finishes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/main/groovy/com/netgrif/application/engine/migration/MigrationOrderedCommandLineRunner.groovy`
around lines 51 - 55, The shutdown logic in
MigrationOrderedCommandLineRunner.run() fires per migration and must be moved so
it runs once after the full migration sequence; remove the shutdown block that
checks shutdownAfterFinish or migrationProperties.isShutdownAfterMigration()
(the sleep(333) and shutdownProvider.shutdown(this.class) call) from run(), and
instead implement a single final hook that executes once after all migration
beans have run (for example a dedicated bean or an `@EventListener` triggered
after migration completion) which checks
migrationProperties.isShutdownAfterMigration(), performs the short sleep, and
then calls shutdownProvider.shutdown(...); keep the symbols
migrationProperties.isShutdownAfterMigration(), sleep(333) and
shutdownProvider.shutdown(...) but ensure they are only invoked from that
single-post-migration hook rather than from run().

Comment on lines +52 to +55
// sleep is for elastic executor and other things to flush their work after migration.
// the number was chosen arbitrary by feeling 😅
sleep(333)
shutdownProvider.shutdown(this.class)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Arbitrary sleep is not a safe shutdown barrier.

The comment says the 333 ms delay was chosen arbitrarily, but ApplicationShutdownProvider.shutdown() ends in System.exit() (src/main/java/com/netgrif/application/engine/configuration/ApplicationShutdownProvider.java:16-21). That means slower environments can still lose in-flight Elasticsearch/executor work, while faster ones just pay the delay. Please wait on the actual async work/executor termination instead of guessing a timeout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/main/groovy/com/netgrif/application/engine/migration/MigrationOrderedCommandLineRunner.groovy`
around lines 52 - 55, The arbitrary sleep(333) in
MigrationOrderedCommandLineRunner before calling
shutdownProvider.shutdown(this.class) is unsafe; replace it by explicitly
waiting for the actual async work to finish: flush/refresh any Elasticsearch
client operations and gracefully shutdown/await termination of the executor(s)
used by your app (e.g., the executor/ThreadPoolTaskExecutor or ExecutorService
used for async tasks and the executor used by the elastic executor), with a
bounded timeout and proper error/logging, and only call
ApplicationShutdownProvider.shutdown() after those awaitTermination/termination
checks complete or time out; locate and change the code in
MigrationOrderedCommandLineRunner that currently calls sleep(333) and then
shutdownProvider.shutdown(this.class) so it instead performs the
flush/wait/awaitTermination logic against the real async/ES components before
invoking ApplicationShutdownProvider.shutdown().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking change Fix or feature that would cause existing functionality doesn't work as expected improvement A change that improves on an existing feature Medium

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant