Add IT for DELETE TIMESERIES replica consistency under IoTConsensusV2 #17332

Open
Pengzna wants to merge 1 commit into apache:master from Pengzna:IoTV2/deletion-IT

Pengzna (Collaborator) commented Mar 21, 2026

Summary

  • Add testDeleteTimeSeriesReplicaConsistency() integration test to verify that DELETE TIMESERIES operations are properly replicated across all DataNode replicas in a 3C3D IoTConsensusV2 cluster
  • The test reproduces the scenario from the historical deletion replication bug (fixed in [IoTV2]: Pick deletion event for historical resend #17329): deletion events missing replicateIndex were silently dropped by IoTConsensusV2Processor, causing schema inconsistency across replicas
  • Unify INSERTION constants across all test methods to use 3 columns (speed, temperature, power), replacing the prior 2-column variants
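
As a rough illustration of the unified 3-column insertions (not the PR's actual constants), the sketch below builds an INSERT statement from one shared measurement list, so every test method writes the same columns. The class `InsertionSql`, its method, the device path, and the values are assumptions made for this example.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper for illustration only; not code from this PR. */
final class InsertionSql {
  // Single source of truth for the measurements used by every test method.
  static final List<String> MEASUREMENTS = List.of("speed", "temperature", "power");

  /** Builds one INSERT statement with exactly one value per shared measurement. */
  static String insert(String device, long timestamp, double... values) {
    if (values.length != MEASUREMENTS.size()) {
      throw new IllegalArgumentException(
          "expected " + MEASUREMENTS.size() + " values but got " + values.length);
    }
    StringJoiner columns = new StringJoiner(", ", "(timestamp, ", ")");
    MEASUREMENTS.forEach(columns::add);
    StringJoiner row = new StringJoiner(", ", "(", ")");
    row.add(Long.toString(timestamp));
    for (double value : values) {
      row.add(Double.toString(value));
    }
    return "INSERT INTO " + device + columns + " VALUES " + row;
  }

  public static void main(String[] args) {
    System.out.println(insert("root.sg.d1", 100L, 10.0, 20.0, 30.0));
  }
}
```

Deriving every statement from one list means a 2-column variant cannot reappear by accident: a call with the wrong number of values fails immediately.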

Test Scenario

  1. Insert data with 3 measurements (speed, temperature, power) and flush
  2. Insert more data without flush (WAL-only entries, simulating in-flight data)
  3. DELETE TIMESERIES root.sg.d1.speed
  4. Flush to persist the deletion
  5. Wait for replication to complete (syncLag == 0) on all data region leaders
  6. Verify schema consistency on each DataNode independently via SHOW TIMESERIES
  7. Stop each DataNode one by one (triggers consensus pipe reconstruction + historical replay), verify all surviving nodes still show consistent schema — the deleted timeseries must be absent everywhere
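
The consistency condition checked in steps 6 and 7 can be sketched in plain Java, without a cluster. The helper below is hypothetical (`ReplicaSchemaCheck` and the node labels `dn1` to `dn3` are not from the PR); it assumes the paths returned by SHOW TIMESERIES on each DataNode have already been collected into one set per node.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not code from this PR or from IoTDB. */
final class ReplicaSchemaCheck {

  /**
   * Returns a description of the first inconsistency found, or null when every node
   * reports the same set of timeseries and none of them still lists deletedPath.
   */
  static String findInconsistency(Map<String, Set<String>> seriesByNode, String deletedPath) {
    String referenceNode = null;
    Set<String> reference = null;
    for (Map.Entry<String, Set<String>> entry : seriesByNode.entrySet()) {
      Set<String> series = new TreeSet<>(entry.getValue());
      if (series.contains(deletedPath)) {
        return entry.getKey() + " still lists deleted timeseries " + deletedPath;
      }
      if (reference == null) {
        // The first node becomes the reference that all later nodes must match.
        referenceNode = entry.getKey();
        reference = series;
      } else if (!reference.equals(series)) {
        return entry.getKey() + " differs from " + referenceNode + ": "
            + series + " vs " + reference;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Map<String, Set<String>> seriesByNode = new LinkedHashMap<>();
    seriesByNode.put("dn1", Set.of("root.sg.d1.temperature", "root.sg.d1.power"));
    seriesByNode.put("dn2", Set.of("root.sg.d1.temperature", "root.sg.d1.power"));
    // dn3 models a replica that dropped the deletion event and still has the series.
    seriesByNode.put(
        "dn3", Set.of("root.sg.d1.speed", "root.sg.d1.temperature", "root.sg.d1.power"));
    System.out.println(findInconsistency(seriesByNode, "root.sg.d1.speed"));
  }
}
```

Checking for the deleted path separately from set equality matters: if every replica missed the deletion, the sets would all match and an equality-only check would pass.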

Test Plan

  • testDeleteTimeSeriesReplicaConsistency added in both stream and batch mode subclasses
  • Run IoTDBIoTConsensusV2Stream3C3DBasicIT.testDeleteTimeSeriesReplicaConsistency
  • Run IoTDBIoTConsensusV2Batch3C3DBasicIT.testDeleteTimeSeriesReplicaConsistency
  • Verify existing tests test3C3DWriteFlushAndQuery and testReplicaConsistencyAfterLeaderStop still pass with unified 3-column insertions

🤖 Generated with Claude Code

Add testDeleteTimeSeriesReplicaConsistency() to verify that DELETE
TIMESERIES operations are properly replicated across all DataNode
replicas in a 3C3D IoTConsensusV2 cluster. This test reproduces the
scenario from the historical deletion replication bug where deletion
events lacking replicateIndex were silently dropped by the Processor.

The test inserts data with 3 measurements, leaves some data unflushed,
deletes one timeseries, then verifies schema consistency on every
DataNode — including after stopping and restarting each node in turn
to trigger consensus pipe reconstruction and historical replay.

Also unifies INSERTION constants to use 3 columns (speed, temperature,
power) across all test methods, removing the prior 2-column variants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 21, 2026 06:03

Copilot AI (Contributor) left a comment

Pull request overview

Adds an integration test to ensure DELETE TIMESERIES is consistently replicated across replicas in a 3C3D IoTConsensusV2 cluster (stream + batch), targeting a historical deletion-replication inconsistency. Also standardizes test insertions to use 3 measurements (speed, temperature, power) and updates existing verification accordingly.

Changes:

  • Add testDeleteTimeSeriesReplicaConsistency() coverage in both stream and batch IoTConsensusV2 3C3D ITs via the shared base implementation.
  • Implement end-to-end replica schema verification after DELETE TIMESERIES (including node stop/restart cycles) in the 3C3D base test class.
  • Unify insert/query constants and expected verification results to use 3 measurements instead of prior 2-measurement variants.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| integration-test/src/test/java/org/apache/iotdb/db/it/iotconsensusv2/stream/IoTDBIoTConsensusV2Stream3C3DBasicIT.java | Exposes the new delete-timeseries replica consistency test in stream mode. |
| integration-test/src/test/java/org/apache/iotdb/db/it/iotconsensusv2/batch/IoTDBIoTConsensusV2Batch3C3DBasicIT.java | Exposes the new delete-timeseries replica consistency test in batch mode. |
| integration-test/src/test/java/org/apache/iotdb/db/it/iotconsensusv2/IoTDBIoTConsensusV23C3DBasicITBase.java | Adds the shared delete-timeseries replication test logic and updates inserts/verification for 3 measurements. |

String stoppedDesc = "DataNode " + stoppedNode.getIp() + ":" + stoppedNode.getPort();
LOGGER.info("Stopping {}", stoppedDesc);
stoppedNode.stopForcibly();
Assert.assertFalse(stoppedDesc + " should be stopped", stoppedNode.isAlive());

Copilot AI Mar 21, 2026

After calling stopForcibly(), the test immediately asserts !isAlive(). AbstractNodeWrapper.stopForcibly() only waits up to 10s and ignores the return value, so the process may still be alive and this assertion can be flaky. Prefer awaiting the node to actually stop (e.g., Awaitility.await().until(() -> !stoppedNode.isAlive())) before proceeding.

Suggested change
- Assert.assertFalse(stoppedDesc + " should be stopped", stoppedNode.isAlive());
+ Awaitility.await()
+     .atMost(60, TimeUnit.SECONDS)
+     .untilAsserted(
+         () ->
+             Assert.assertFalse(
+                 stoppedDesc + " should be stopped", stoppedNode.isAlive()));
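
For readers without Awaitility at hand, the idea behind this suggestion (poll until the condition holds or a deadline passes, rather than asserting once) can be sketched with the JDK alone. `PollingWait` is a hypothetical stand-in, not Awaitility's implementation and not code from this PR; the timeout and poll interval values are arbitrary.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a polling wait; for illustration only. */
final class PollingWait {

  /**
   * Polls condition until it returns true or timeoutMillis elapses.
   * Returns true if the condition held in time, false on timeout or interruption.
   */
  static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!condition.getAsBoolean()) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false;
      }
      long remainingMillis = TimeUnit.NANOSECONDS.toMillis(remainingNanos);
      try {
        // Never sleep past the deadline, and always sleep at least 1 ms.
        Thread.sleep(Math.max(1, Math.min(pollMillis, remainingMillis)));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Simulates a process that only reports "stopped" about 200 ms after the stop request.
    long stopAt = System.currentTimeMillis() + 200;
    boolean stopped = waitUntil(() -> System.currentTimeMillis() >= stopAt, 5_000, 50);
    System.out.println("stopped within timeout: " + stopped);
  }
}
```

In the test itself, the condition would be the negation of `stoppedNode.isAlive()`, and a `false` result would fail the test with the same message as the original assertion.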

// Step 7: Stop each DataNode one by one and verify remaining nodes still consistent
LOGGER.info(
"Step 7: Stopping each DataNode in turn and verifying remaining nodes show consistent schema...");

Copilot AI Mar 21, 2026

This log message line is likely to exceed the project's 100-character Checkstyle limit once indentation is included. Please wrap/split the string (or use multiple LOGGER.info calls) to keep each line within the limit.

Suggested change
-     "Step 7: Stopping each DataNode in turn and verifying remaining nodes show consistent schema...");
+     "Step 7: Stopping each DataNode in turn and verifying remaining nodes "
+         + "show consistent schema...");

2 participants