@capistrant
Contributor

Description

Bug Description

Tombstones are not created when ParallelIndexSupervisorTask runs in REPLACE ingestion mode with range partitioning and receives no partition boundary reports because every row was filtered out of the underlying source data. As a result, existing segments that should have been overshadowed remain available.

Reproduction Steps

Run compaction with range partitioning and a compaction transform filter that filters every row out of the segments being compacted.

Expected behavior: tombstones are created for the interval and the old segments are overshadowed.

Actual behavior: the task completes with a warning that there were no valid rows for range partitioning. The old segments are left in place, which will lead to an infinite loop of compaction.
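
To make the expected and actual behavior concrete, here is a minimal sketch of the decision the supervisor task faces once the partition boundary phase returns. It is a toy model, not Druid code: the class EmptyRangePartitionSketch, its Mode and Action enums, and the nextAction method are all hypothetical names invented for illustration, and the real IngestionMode enum and task internals may differ.

```java
/** Toy model only; none of these types are Druid classes. */
final class EmptyRangePartitionSketch {

  /** Hypothetical stand-in for the task's ingestion mode. */
  enum Mode { APPEND, REPLACE }

  /** Hypothetical outcomes after the partition boundary phase finishes. */
  enum Action { BUILD_SEGMENTS, PUBLISH_TOMBSTONES_ONLY, WARN_AND_SKIP }

  /**
   * Picks the next step from the ingestion mode and the number of
   * partition boundary reports returned by the subtasks.
   */
  static Action nextAction(Mode mode, int boundaryReportCount) {
    if (boundaryReportCount > 0) {
      // Normal path: there is data, so segments are built and published.
      return Action.BUILD_SEGMENTS;
    }
    // Every row was filtered out, so there are no boundaries to partition on.
    if (mode == Mode.REPLACE) {
      // Expected behavior: still publish, so tombstones cover the interval.
      return Action.PUBLISH_TOMBSTONES_ONLY;
    }
    // Outside REPLACE mode there is nothing to overshadow: warn and finish.
    return Action.WARN_AND_SKIP;
  }

  public static void main(String[] args) {
    System.out.println(nextAction(Mode.REPLACE, 0));
    System.out.println(nextAction(Mode.APPEND, 0));
    System.out.println(nextAction(Mode.REPLACE, 3));
  }
}
```

In this model, the bug corresponds to the REPLACE case with zero reports falling through to WARN_AND_SKIP instead of PUBLISH_TOMBSTONES_ONLY.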

Release note


Key changed/added classes in this PR
  • ParallelIndexSupervisorTask

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@capistrant
Contributor Author

I thought the hashed partitioning phase would face the same issue, but it does not. The test I added exercises the code path that would have been buggy in that case, but tombstones still end up being created.

    if (getIngestionMode() == IngestionMode.REPLACE) {
      // In REPLACE mode, publish segments (and tombstones, when called for) even when no new data was produced
      publishSegments(toolbox, Collections.emptyMap());
      TaskStatus taskStatus = TaskStatus.success(getId(), msg);

Check notice: Code scanning / CodeQL
Deprecated method or constructor invocation (Note): Invoking TaskStatus.success should be avoided because it has been deprecated.

      return taskStatus;
    } else {
      LOG.warn(msg);
      return TaskStatus.success(getId(), msg);

Check notice: Code scanning / CodeQL
Deprecated method or constructor invocation (Note): Invoking TaskStatus.success should be avoided because it has been deprecated.

@capistrant
Contributor Author

I forgot that coverage from embedded tests doesn't count toward JaCoCo. I will have to look at what existing test structure there is in the package.
