Conversation
openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/RunLogBuffer.java
...ice/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/SearchIndexExecutor.java
openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java
Pull request overview
This PR introduces infrastructure to capture and persist per-application-run logs (notably for reindex/SearchIndex runs) and improves SearchIndex stats accuracy/refresh behavior, backed by new unit tests.
Changes:
- Add a Logback `Appender` plus a buffered file writer to capture app-run logs into `logs/app-runs/{appName}/{timestamp}-{serverId}.log`.
- Wire the log-capture lifecycle into Quartz job execution via `OmAppJobListener` (start/stop capture, MDC setup/cleanup, thread-prefix capture for reindex workers).
- Improve SearchIndex stats consistency by adjusting totals when processed counts exceed initial totals and periodically syncing sink stats; add tests for these cases.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/apps/scheduler/OmAppJobListener.java | Starts/stops app-run log capture around Quartz job execution and manages MDC/thread-prefix capture. |
| openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/AppRunLogAppender.java | New Logback appender that routes matching events to per-run buffers/files and performs retention cleanup. |
| openmetadata-service/src/main/java/org/openmetadata/service/apps/logging/RunLogBuffer.java | New buffered writer with periodic flushing and max-line cap for app-run log persistence. |
| openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/SearchIndexExecutor.java | Periodic sink stat sync and total-records adjustments to keep stats internally consistent. |
| openmetadata-service/src/test/java/org/openmetadata/service/apps/logging/AppRunLogAppenderTest.java | New tests covering capture behavior, cleanup, listing timestamps, and concurrency safety of app-run logging. |
| openmetadata-service/src/test/java/org/openmetadata/service/apps/logging/RunLogBufferTest.java | New tests for buffer append/flush/close behavior and max-line handling. |
| openmetadata-service/src/test/java/org/openmetadata/service/apps/bundles/searchIndex/SearchIndexStatsTest.java | New tests validating adjusted totals and reader/job totals consistency. |
```java
for (ThreadPrefixBinding binding : threadPrefixBindings) {
  if (threadName.startsWith(binding.prefix)) {
    binding.buffer.append(formatLine(event));
    return;
  }
}
```
The thread-prefix capture path returns on the first matching ThreadPrefixBinding. If two captures are active with the same prefix (e.g., overlapping SearchIndex runs both using "reindex-"), all worker-thread logs will be routed to whichever binding was added first, and the other run will miss logs. Consider making prefixes unique per run or changing the logic to disambiguate (e.g., include run/job id in the prefix binding) rather than first-match wins.
Suggested change:
```java
ThreadPrefixBinding bestMatch = null;
int bestPrefixLength = -1;
for (ThreadPrefixBinding binding : threadPrefixBindings) {
  if (threadName.startsWith(binding.prefix)) {
    int prefixLength = binding.prefix.length();
    if (prefixLength > bestPrefixLength) {
      bestMatch = binding;
      bestPrefixLength = prefixLength;
    }
  }
}
if (bestMatch != null) {
  bestMatch.buffer.append(formatLine(event));
}
```
```java
static void resetForTest() {
  registered = false;
  threadPrefixBindings.clear();
}
```
resetForTest() only clears registered and threadPrefixBindings but leaves other static state mutated by tests (e.g., logDirectory, maxRunsPerApp, maxLinesPerRun) and does not remove the appender from the root logger if it was registered. This can leak state across test classes within the same JVM; consider resetting static defaults and detaching/stopping the root appender in resetForTest().
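A fuller reset might look like the following self-contained sketch. The field names and defaults (`logDirectory`, `maxRunsPerApp`, `"logs/app-runs"`, `10`) are illustrative assumptions modeled on this comment, not the appender's actual members, and the real method would additionally detach/stop the root-logger appender:

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained model of the appender's static state; field names and default
// values are assumptions for illustration, not the real class's members.
class ResetDemo {
  static final String DEFAULT_LOG_DIRECTORY = "logs/app-runs";
  static final int DEFAULT_MAX_RUNS_PER_APP = 10;

  static boolean registered = false;
  static final List<String> threadPrefixBindings = new ArrayList<>();
  static String logDirectory = DEFAULT_LOG_DIRECTORY;
  static int maxRunsPerApp = DEFAULT_MAX_RUNS_PER_APP;

  static void resetForTest() {
    registered = false;
    threadPrefixBindings.clear();
    // Restore every static knob tests may have mutated, not just the flags:
    logDirectory = DEFAULT_LOG_DIRECTORY;
    maxRunsPerApp = DEFAULT_MAX_RUNS_PER_APP;
    // The real implementation would also detach/stop the root-logger appender here.
  }
}
```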
```java
String appRunId = String.valueOf(runRecord.getTimestamp());
String serverId = ServerIdentityResolver.getInstance().getServerId();
MDC.put(AppRunLogAppender.MDC_APP_RUN_ID, appRunId);
MDC.put(AppRunLogAppender.MDC_APP_NAME, appName);
MDC.put(AppRunLogAppender.MDC_SERVER_ID, serverId);
MDC.put(AppRunLogAppender.MDC_APP_ID, jobApp.getId().toString());
String[] threadPrefixes = getThreadPrefixesForApp(appName);
AppRunLogAppender.startCapture(
    appRunId, jobApp.getId().toString(), appName, serverId, threadPrefixes);
dataMap.put(APP_RUN_LOG_ID, appRunId);
```
appRunLogId is stored in jobExecutionContext.getJobDetail().getJobDataMap(), and later read back from the same JobDetail map in cleanupLogCapture(). If Quartz ever allows overlapping executions of the same scheduled job (same JobDetail), the value can be overwritten by a newer run and the older run will stop the wrong capture (leaking the old buffer and truncating the new one). Store this per-execution (e.g., jobExecutionContext.put(...) / get(...) or jobExecutionContext.getMergedJobDataMap()), not on the shared JobDetail map.
```java
@AfterEach
void tearDown() {
  AppRunLogAppender.getActiveBuffers().clear();
  AppRunLogAppender.resetForTest();
}
```
setUp() mutates static state via AppRunLogAppender.setLogDirectoryForTest(...), but tearDown() does not restore it (and resetForTest() currently doesn’t reset logDirectory). This can leak configuration into other test classes in the same JVM. Restore the defaults in tearDown() (or enhance resetForTest() to reset logDirectory/limits and detach any root-logger appender registered by startCapture).
Suggested change:
```java
AppRunLogAppender.resetForTest();
AppRunLogAppender.setLogDirectoryForTest(null);
```
```java
  String batchText = String.join("\n", batch);
  writeToFile(batchText);
  AppRunLogMetrics.recordFlush(appName, batch.size());
}
```
flush() calls AppRunLogMetrics.recordFlush(...), but there is no AppRunLogMetrics class in the codebase (repo-wide search only finds these references). This will fail compilation; either add the missing metrics implementation or remove/replace these calls with an existing metrics facility.
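If the metrics class is meant to exist, a minimal counter-based sketch such as the one below would satisfy the call sites referenced throughout this review. The method signatures are inferred from those call sites (an assumption), and the real project would more likely wire these counters into its existing metrics facility rather than a static map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical minimal AppRunLogMetrics; signatures are inferred from the
// call sites quoted in this review, not from any real class.
final class AppRunLogMetrics {
  private static final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

  private AppRunLogMetrics() {}

  private static void add(String key, long delta) {
    counters.computeIfAbsent(key, k -> new LongAdder()).add(delta);
  }

  static void recordFlush(String appName, int lines) {
    add(appName + ".flushedLines", lines);
  }

  static void recordRunStarted(String appName, String serverId) {
    add(appName + ".runsStarted", 1);
  }

  static void recordRunCompleted(String appName, String serverId, long durationMs) {
    add(appName + ".runsCompleted", 1);
  }

  static void recordLinesCapture(String appName, long lines) {
    add(appName + ".linesCaptured", lines);
  }

  static void recordCleanup(String appName, int deletedRuns) {
    add(appName + ".runsCleaned", deletedRuns);
  }

  static long get(String key) {
    LongAdder adder = counters.get(key);
    return adder == null ? 0 : adder.sum();
  }
}
```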
```java
void startFlusher() {
  try {
    Files.createDirectories(logFile.getParent());
    writer =
        Files.newBufferedWriter(
            logFile,
            StandardCharsets.UTF_8,
            StandardOpenOption.CREATE,
            StandardOpenOption.APPEND);
  } catch (IOException e) {
    LOG.error("Failed to open log file {}: {}", logFile, e.getMessage());
    return;
  }
```
If startFlusher() fails to open the log file (IOException), it logs and returns, but the buffer remains usable and will keep accumulating pending lines (up to maxLines) without ever flushing. Consider returning a success/failure signal (or throwing) so the caller can stop capture/remove the buffer, or mark the buffer as closed/disabled on failure.
```java
public static RunLogBuffer startCapture(
    String appRunId, String appId, String appName, String serverId, String... threadPrefixes) {
  ensureRegistered();
  cleanupOldRuns(appName);
  Path logFile = resolveLogFile(appName, Long.parseLong(appRunId), serverId);
  RunLogBuffer buffer =
      new RunLogBuffer(
          appId, appName, serverId, Long.parseLong(appRunId), maxLinesPerRun, logFile);
  activeBuffers.put(bufferKey(appName, appRunId), buffer);

  for (String prefix : threadPrefixes) {
    threadPrefixBindings.add(new ThreadPrefixBinding(prefix, buffer));
  }

  buffer.startFlusher();
  AppRunLogMetrics.recordRunStarted(appName, serverId);
  return buffer;
}

public static void stopCapture(String appName, String appRunId) {
  RunLogBuffer buffer = activeBuffers.remove(bufferKey(appName, appRunId));
  if (buffer != null) {
    threadPrefixBindings.removeIf(b -> b.buffer == buffer);
    long durationMs = System.currentTimeMillis() - buffer.getRunTimestamp();
    AppRunLogMetrics.recordRunCompleted(appName, buffer.getServerId(), durationMs);
    AppRunLogMetrics.recordLinesCapture(appName, buffer.getTotalLineCount());
    buffer.close();
  }
}

public static RunLogBuffer getBuffer(String appName, String runTimestamp) {
  return activeBuffers.get(bufferKey(appName, runTimestamp));
}

static void cleanupOldRuns(String appName) {
  List<Long> timestamps = listRunTimestamps(appName);
  if (timestamps.size() <= maxRunsPerApp) {
    return;
  }
  List<Long> toDelete = timestamps.subList(maxRunsPerApp, timestamps.size());
  Path appDir = resolveAppDir(appName);
  for (long ts : toDelete) {
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(appDir, ts + "-*.log")) {
      for (Path entry : stream) {
        Files.deleteIfExists(entry);
      }
    } catch (IOException e) {
      // best-effort cleanup
    }
  }
  AppRunLogMetrics.recordCleanup(appName, toDelete.size());
}
```
startCapture/stopCapture/cleanupOldRuns invoke AppRunLogMetrics.*, but there is no AppRunLogMetrics class in the repository (only these references). This makes the module uncompilable; add the missing metrics class or remove/replace these calls.
```java
buffer.startFlusher();
AppRunLogMetrics.recordRunStarted(appName, serverId);
return buffer;
```
AppRunLogAppender references AppRunLogMetrics (e.g., recordRunStarted) but there is no AppRunLogMetrics class in the codebase, so this will not compile. Either add the missing metrics implementation (and wiring) or remove these calls / replace with the existing metrics facility used elsewhere in the service.
```java
 * <p>Configured via the {@code logging:} section in {@code openmetadata.yaml}. Uses two-tier
 * matching: MDC for the scheduler thread, thread name prefixes for worker threads.
```
The class-level Javadoc says this appender is configured via openmetadata.yaml, but the implementation self-registers programmatically via ensureRegistered() and does not read Dropwizard logging config. Please update the Javadoc to reflect the actual configuration/registration mechanism (and how operators can disable/enable it).
Suggested change:
```java
 * <p>This appender is registered programmatically and is not wired through Dropwizard's
 * {@code logging:} section in {@code openmetadata.yaml}. Call {@link #ensureRegistered()} once
 * during application startup to attach it to the appropriate {@link LoggerContext}. Operators
 * who do not want per-run app logs can avoid invoking {@code ensureRegistered()} (or remove this
 * appender from the {@link LoggerContext} at runtime).
```
```java
void flush() {
  List<String> batch = drainPending();
  if (batch.isEmpty()) {
    return;
  }
  String batchText = String.join("\n", batch);
  writeToFile(batchText);
}
```
flush() drains pending even when the file writer is not available (e.g., if startFlusher() failed to open the log file or was never called). Since writeToFile() returns early when writer == null, this causes buffered log lines to be dropped silently. Consider short-circuiting flush() when writer is null (or lazily opening/retrying the writer) so pending lines are preserved until they can be written.
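One way to avoid the silent drop is to short-circuit before draining, sketched below with a `StringBuilder` standing in for the real `BufferedWriter`. This is a hypothetical model of the behavior, not the actual `RunLogBuffer` API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: the writer is null until the log file opens; pending
// lines are preserved, not drained, while no writer is available.
class FlushGuardDemo {
  private final List<String> pending = new ArrayList<>();
  private StringBuilder writer; // stands in for the BufferedWriter

  void append(String line) {
    pending.add(line);
  }

  void flush() {
    if (writer == null) {
      return; // keep pending lines until a writer exists, instead of dropping them
    }
    if (pending.isEmpty()) {
      return;
    }
    List<String> batch = new ArrayList<>(pending);
    pending.clear();
    writer.append(String.join("\n", batch)).append('\n');
  }

  void openWriter() {
    writer = new StringBuilder();
  }

  int pendingCount() {
    return pending.size();
  }

  String written() {
    return writer == null ? "" : writer.toString();
  }
}
```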
```java
if (readerStats != null && totalRecords > readerStats.getTotalRecords()) {
  readerStats.setTotalRecords(totalRecords);
```
readerStats.getTotalRecords() can be null (e.g., if reader stats were created elsewhere without initializing totalRecords). The comparison totalRecords > readerStats.getTotalRecords() will then throw an NPE. Please null-guard (treat null as 0) before comparing/updating the reader total.
Suggested change:
```java
if (readerStats != null) {
  Integer readerTotalRecords = readerStats.getTotalRecords();
  int safeReaderTotalRecords = readerTotalRecords != null ? readerTotalRecords : 0;
  if (totalRecords > safeReaderTotalRecords) {
    readerStats.setTotalRecords(totalRecords);
  }
}
```
This reverts commit c160754.
```java
static void resetForTest() {
  registered = false;
  threadPrefixBindings.clear();
}
```
resetForTest() clears flags but doesn't remove the registered APP_RUN_LOG appender from the Logback root logger. Since startCapture() calls ensureRegistered() (using LoggerFactory's global context), unit tests can leave a global appender attached for subsequent tests, causing cross-test interference. Consider having resetForTest() also detach/stop the root appender (if present) and restore any mutated static settings (e.g., logDirectory/maxRunsPerApp).
```java
void flush() {
  List<String> batch = drainPending();
  if (batch.isEmpty()) {
    return;
  }
```
flush() can be invoked by the scheduled flusher thread while close() also calls flush() from the caller thread, but writer access isn't synchronized. This can lead to concurrent writes/flushes on the same BufferedWriter (and races with closeWriter()), potentially corrupting output or throwing IO errors. Consider guarding flush/writeToFile/closeWriter with a lock, or ensuring the final flush/close happens on the flusher thread after cancelling further executions.
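One lock-based arrangement is sketched below, with a `StringBuilder` standing in for the `BufferedWriter`. This is a hypothetical model of the locking discipline, not the real `RunLogBuffer` code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: one intrinsic lock serializes flush() and close(), so
// the scheduled flusher thread and the closing thread cannot race on the writer.
class SynchronizedFlushDemo {
  private final List<String> pending = new ArrayList<>();
  private final StringBuilder writer = new StringBuilder(); // stands in for BufferedWriter
  private boolean closed = false;

  synchronized void append(String line) {
    if (!closed) {
      pending.add(line);
    }
  }

  synchronized void flush() {
    if (closed || pending.isEmpty()) {
      return;
    }
    List<String> batch = new ArrayList<>(pending);
    pending.clear();
    writer.append(String.join("\n", batch)).append('\n');
  }

  synchronized void close() {
    flush(); // final flush runs under the same lock, so it cannot race a writer close
    closed = true;
  }

  synchronized String written() {
    return writer.toString();
  }
}
```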
```java
for (String jobIdStr : runningJobIds) {
  LOG.info("Stopping distributed job {} via coordinator fallback", jobIdStr);
  coordinator.requestStop(java.util.UUID.fromString(jobIdStr));
}
```
⚠️ Edge Case: Exception in loop aborts stopping remaining distributed jobs
In tryStopOutsideQuartz(), the loop over runningJobIds calls UUID.fromString(jobIdStr) and coordinator.requestStop() without per-iteration exception handling. If any single job ID is malformed or requestStop throws, the remaining jobs in the list will not be stopped.
While the outer tryStopViaApp catch block prevents an unhandled exception from propagating further, it means a single failure silently prevents cleanup of all subsequent running jobs.
Suggested fix:
```java
for (String jobIdStr : runningJobIds) {
  try {
    LOG.info("Stopping distributed job {} via coordinator fallback", jobIdStr);
    coordinator.requestStop(java.util.UUID.fromString(jobIdStr));
  } catch (Exception e) {
    LOG.warn("Failed to stop distributed job {}", jobIdStr, e);
  }
}
```
openmetadata-service/src/main/java/org/openmetadata/service/apps/scheduler/AppScheduler.java
```java
static String formatLine(ILoggingEvent event) {
  return String.format(
      "{\"timestamp\":%d,\"level\":\"%s\",\"thread\":\"%s\",\"logger\":\"%s\",\"message\":\"%s\"}",
      event.getTimeStamp(),
      event.getLevel(),
      escapeJson(event.getThreadName()),
      escapeJson(event.getLoggerName()),
      escapeJson(event.getFormattedMessage()));
}

private static String escapeJson(String value) {
  if (value == null) {
    return "";
  }
  return value
      .replace("\\", "\\\\")
      .replace("\"", "\\\"")
      .replace("\n", "\\n")
      .replace("\r", "\\r")
      .replace("\t", "\\t");
}
```
formatLine()/escapeJson() builds JSON manually but only escapes a small subset of characters. This can produce invalid JSON for other control characters (e.g., backspace/formfeed or other 0x00–0x1F chars) and is easy to drift from the server’s canonical JSON log format. Consider using the existing JsonUtils/ObjectMapper (or a Logback JSON encoder/layout) to serialize a structured object/map so escaping and formatting remain correct and consistent.
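For comparison, a spec-complete escaper must handle every control character below 0x20 (RFC 8259 requires this). The following self-contained sketch is illustrative only and is not the server's canonical formatter:

```java
// Illustrative escaper covering all JSON-mandatory escapes, including the
// 0x00-0x1F control characters the reviewed implementation misses.
class JsonEscapeDemo {
  static String escapeJson(String value) {
    if (value == null) {
      return "";
    }
    StringBuilder sb = new StringBuilder(value.length() + 8);
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      switch (c) {
        case '\\': sb.append("\\\\"); break;
        case '"':  sb.append("\\\""); break;
        case '\n': sb.append("\\n"); break;
        case '\r': sb.append("\\r"); break;
        case '\t': sb.append("\\t"); break;
        case '\b': sb.append("\\b"); break;
        case '\f': sb.append("\\f"); break;
        default:
          if (c < 0x20) {
            sb.append(String.format("\\u%04x", (int) c)); // remaining control chars
          } else {
            sb.append(c);
          }
      }
    }
    return sb.toString();
  }
}
```

In practice, delegating to an existing `ObjectMapper` (as the comment suggests) avoids maintaining this table by hand.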
```java
AppRunRecord latestRun = appRepository.getLatestAppRuns(app);
if (latestRun != null && latestRun.getStatus() == AppRunRecord.Status.RUNNING) {
```
💡 Edge Case: Null check on getLatestAppRuns is dead code — method throws
AppRepository.getLatestAppRuns(App) throws an exception (via orElseThrow) when no run record exists, rather than returning null. The latestRun != null guard on line 108 is therefore dead code — the absence case is instead handled by the catch (Exception e) block, which logs a warning.
This works correctly in practice since the catch block is resilient, but the null check is misleading to future readers who may assume the method can return null.
Suggested fix:
Either:
1. Use the Optional-returning variant (`getLatestExtensionByIdOptional`) instead so the null check is meaningful, or
2. Remove the null check and add a comment noting the exception is intentionally caught below.
```java
// Mirror of EXCLUDED_FIELDS — can't reference it directly
// because SearchIndex has a static initializer that requires Entity to be bootstrapped.

class SearchIndexTest {

  private static final Set<String> EXCLUDED_FIELDS =
      Set.of(
          "changeDescription",
          "incrementalChangeDescription",
          "upstreamLineage.pipeline.changeDescription",
          "upstreamLineage.pipeline.incrementalChangeDescription",
          "connection",
          "changeSummary");
```
This test hardcodes a local copy of the production excluded-field list. That duplication is likely to drift from the real list over time, making the test less reliable. Consider moving the excluded-field set to a shared location that can be referenced safely in tests (e.g., a constant in SearchIndexUtils), or exposing it from SearchIndex without triggering heavy static initialization.
```java
private static final Set<String> EXCLUDED_FIELDS =
    Set.of(
        "changeDescription",
        "incrementalChangeDescription",
        "upstreamLineage.pipeline.changeDescription",
        "upstreamLineage.pipeline.incrementalChangeDescription",
        "connection",
```
EXCLUDED_FIELDS includes dotted-path entries like upstreamLineage.pipeline.changeDescription, but the current SearchIndexUtils.removeFieldByPath(...) implementation only navigates the first path element and then removes the last key from that same map, so deeper paths are not actually removed. Consider adding a test here that asserts dotted-path removal works and updating removeFieldByPath to traverse the full path before removing the final key (otherwise these exclusions won't take effect).
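A traversal that walks the full dotted path before removing the final key could look like the sketch below. The method name mirrors this comment, but the signature and null-handling are assumptions rather than the actual `SearchIndexUtils` API:

```java
import java.util.Map;

// Hypothetical sketch of dotted-path removal; the real SearchIndexUtils
// signature may differ.
class RemoveFieldDemo {
  @SuppressWarnings("unchecked")
  static void removeFieldByPath(Map<String, Object> doc, String path) {
    String[] parts = path.split("\\.");
    Map<String, Object> current = doc;
    // Walk every intermediate segment instead of stopping after the first one.
    for (int i = 0; i < parts.length - 1; i++) {
      Object next = current.get(parts[i]);
      if (!(next instanceof Map)) {
        return; // path does not fully exist; nothing to remove
      }
      current = (Map<String, Object>) next;
    }
    current.remove(parts[parts.length - 1]);
  }
}
```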



Summary by Gitar

- `AppRunLogAppender` and `RunLogBuffer` classes capture per-run app logs to JSON files with thread-name prefix matching and MDC-based filtering
- `OmAppJobListener` integrated to start/stop log capture with cleanup on job veto or completion
- `SearchIndexApp.uninstall()` now purges all related state tables and force-stops active distributed jobs
- `SearchIndexExecutor` adds periodic sink stats synchronization and adjusts entity totals when success+failed exceed initial counts
- `DistributedIndexingStrategy` and `DistributedSearchIndexExecutor` add proper shutdown handling and interrupt ordering for lock/heartbeat threads
- `ReindexingOrchestrator` persists the run record before the WebSocket broadcast to ensure the UI sees terminal status
- Added `deleteAll()` methods to `SearchIndexJobDAO`, `SearchIndexPartitionDAO`, `SearchIndexServerStatsDAO`, and `AppExtensionTimeSeriesDao`