Skip to content

RATIS-2395. Add Client Span for AyncImpl#1385

Open
taklwu wants to merge 2 commits intoapache:masterfrom
taklwu:RATIS-2395
Open

RATIS-2395. Add Client Span for AyncImpl#1385
taklwu wants to merge 2 commits intoapache:masterfrom
taklwu:RATIS-2395

Conversation

@taklwu
Copy link
Contributor

@taklwu taklwu commented Mar 23, 2026

What changes were proposed in this pull request?

After RATIS-2393 / #1341, this change aims to provide the first end-to-end tracing triggered from the any async implementation when sending a rpc request.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-2395

How was this patch tested?

unit tests and extend existing tests in ratis-test module with feature enabled.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taklwu , thanks for working on this! Please see the comments inlined and also https://issues.apache.org/jira/secure/attachment/13081378/1385_review.patch


private static final Logger LOG = LoggerFactory.getLogger(TraceUtils.class);

private static final RaftProperties PROPERTIES = new RaftProperties();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot new RaftProperties() here since it will get all the default values.

Ratis is a library and does not manage any conf files. The RaftProperties is passed from the application when building RaftClient/RaftServer


private static Span createClientOperationSpan(RaftClientRequest.Type type, RaftPeerId server,
String spanName) {
String name = (spanName == null || spanName.isEmpty()) ? DEFAULT_OPERATION : spanName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the spanName is missing, should it be a bug? We probably should check null but not setting DEFAULT_OPERATION.

private static Span createClientOperationSpan(RaftClientRequest.Type type, RaftPeerId server,
String spanName) {
String name = (spanName == null || spanName.isEmpty()) ? DEFAULT_OPERATION : spanName;
String peerId = server == null ? UNKNOWN_PEER : String.valueOf(server);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When server == null, the client request will be sent to the leader. So, let's use

  private static final String LEADER = "LEADER";

Comment on lines +64 to +66
public static void setTracingEnabled(boolean enabled) {
TraceConfigKeys.setEnabled(PROPERTIES, enabled);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change TRACER to AtomicReference. Then, it is enabled if it is not null.

  private static final AtomicReference<Tracer> TRACER = new AtomicReference<>();
  
  public static void setEnabled(RaftProperties properties) {
    setEnabled(TraceConfigKeys.enabled(properties));
  }

  public static void setEnabled(boolean enabled) {
    if (enabled) {
      TRACER.updateAndGet(previous -> previous != null ? previous
          : GlobalOpenTelemetry.getTracer("org.apache.ratis", VersionInfo.getSoftwareInfoVersion()));
    } else {
      TRACER.set(null);
    }
  }

  static boolean isEnabled() {
    return TRACER.get() != null;
  }

public static <T, THROWABLE extends Throwable> CompletableFuture<T> traceAsyncImplSend(
CheckedSupplier<CompletableFuture<T>, THROWABLE> action,
RaftClientRequest.Type type, RaftPeerId server) throws THROWABLE {
return isTraceEnabled() ? traceAsyncClientSend(action, type, server, "AsyncImpl::send") : action.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Let's check enabled in traceAsyncMethod(..).
@@ -62,6 +78,10 @@ public final class TraceUtils {
    */
   static <T, THROWABLE extends Throwable> CompletableFuture<T> traceAsyncMethod(
       CheckedSupplier<CompletableFuture<T>, THROWABLE> action, Supplier<Span> spanSupplier) throws THROWABLE {
+    if (!isEnabled()) {
+      return action.get();
+    }

Comment on lines +117 to +121
public static <T, THROWABLE extends Throwable> CompletableFuture<T> traceAsyncClientSend(
CheckedSupplier<CompletableFuture<T>, THROWABLE> action,
RaftClientRequest.Type type, RaftPeerId server, String spanName) throws THROWABLE {
return traceAsyncMethod(action, () -> createClientOperationSpan(type, server, spanName));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We may combine traceAsyncClientSend and traceAsyncImplSend.
  • BTW, let move the client related code to a new TraceClient class. We should also move the server related code to a new TraceServer class. Then, TraceUtils will contain only the common methods.
  public static <T, THROWABLE extends Throwable> CompletableFuture<T> asyncSend(
      CheckedSupplier<CompletableFuture<T>, THROWABLE> action,
      RaftClientRequest.Type type, RaftPeerId server) throws THROWABLE {
    return TraceUtils.traceAsyncMethod(action, () -> createClientOperationSpan(type, server, "Async::send"));
  }

Comment on lines +32 to +34
public static final AttributeKey<String> PEER_ID = AttributeKey.stringKey("rpc.peer.id");
public static final AttributeKey<String> OPERATION_NAME = AttributeKey.stringKey("ratis.operation.name");
public static final AttributeKey<String> OPERATION_TYPE = AttributeKey.stringKey("ratis.operation.type");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep using "raft" prefix.

  public static final AttributeKey<String> PEER_ID = AttributeKey.stringKey("raft.peer.id");
  public static final AttributeKey<String> OPERATION_NAME = AttributeKey.stringKey("raft.operation.name");
  public static final AttributeKey<String> OPERATION_TYPE = AttributeKey.stringKey("raft.operation.type");

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants