@featzhang (Member) commented Feb 12, 2026

What is the purpose of the change

FLINK-39079

This pull request adds a Top N metrics aggregation panel to the Job Overview page of the Flink Web UI to help users quickly identify performance bottlenecks. Currently, users have to navigate through multiple pages to check the metrics of each TaskManager, operator, and subtask. The new panel displays Top N CPU Consumers, Top N Backpressure Operators, and Top N GC Intensive Tasks, giving immediate visibility into the most resource-intensive components of a job.

Brief change log

  • Added TopNMetricsHeaders.java - REST API endpoint definition for /jobs/:jobid/metrics/top-n
  • Added TopNMetricsParameters.java - API parameter support for GET requests
  • Added TopNMetricsResponseBody.java - Response body structure with three metric arrays
  • Added TopNMetricsHandler.java - Handler that aggregates and returns top N metrics from all subtasks
  • Registered TopNMetricsHandler in WebMonitorEndpoint.java
  • Added top-n-metrics.ts - TypeScript interfaces for Top N metrics
  • Added TopNMetricsService.ts - Service to fetch Top N metrics from backend
  • Added TopNMetricsComponent.ts - Angular component displaying top N metrics with color-coded severity
  • Integrated TopNMetricsComponent into Job Overview page
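
The aggregation described in the change log (collect metrics from all subtasks, then return the top N for each of the three categories) can be illustrated with a minimal sketch. It is written in TypeScript for brevity and is not the Java code in `TopNMetricsHandler.java`; the type names, field names, and units below are assumptions made for illustration only.

```typescript
// Hypothetical per-subtask sample. Field names and units are assumptions,
// not the actual fields of TopNMetricsResponseBody.
interface SubtaskSample {
  vertexName: string;
  subtaskIndex: number;
  cpuLoad: number;           // assumed fraction in [0, 1]
  backPressureRatio: number; // assumed fraction in [0, 1]
  gcTimePercent: number;     // assumed percentage in [0, 100]
}

// Hypothetical response shape with three metric arrays.
interface TopNResult {
  topCpuConsumers: SubtaskSample[];
  topBackpressureOperators: SubtaskSample[];
  topGcIntensiveTasks: SubtaskSample[];
}

// Return the n samples with the largest key value, largest first.
// The input array is copied, so the caller's data is not reordered.
function topN(
  samples: readonly SubtaskSample[],
  n: number,
  key: (s: SubtaskSample) => number
): SubtaskSample[] {
  if (n <= 0) {
    return [];
  }
  return samples
    .slice()
    .sort((a, b) => key(b) - key(a))
    .slice(0, n);
}

// Build the three rankings from one pass of collected samples.
function aggregateTopN(samples: readonly SubtaskSample[], n: number): TopNResult {
  return {
    topCpuConsumers: topN(samples, n, s => s.cpuLoad),
    topBackpressureOperators: topN(samples, n, s => s.backPressureRatio),
    topGcIntensiveTasks: topN(samples, n, s => s.gcTimePercent),
  };
}
```

Sorting a copy is the simplest approach and is adequate for the subtask counts of a typical job; a bounded heap would avoid the full sort if the number of subtasks were very large.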

Verifying this change

This change was verified manually by running a standalone cluster, submitting jobs that exhibit backpressure, and confirming that the Top N panel displays the expected metrics.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

…gnostics

This feature adds a Top N metrics panel to the Job Overview page that displays:
- Top N CPU consumers across all subtasks
- Top N backpressure operators with the highest backpressure ratios
- Top N GC intensive tasks with the highest GC time percentages

This helps users quickly identify performance bottlenecks without needing to
manually browse through multiple pages to check individual operator metrics.

Backend Changes:
- Add TopNMetricsHeaders.java - REST API endpoint definition for /jobs/:jobid/metrics/top-n
- Add TopNMetricsParameters.java - API parameter support for GET and POST requests
- Add TopNMetricsResponseBody.java - Response body structure with three metric arrays
- Add TopNMetricsHandler.java - Handler that aggregates and returns top N metrics
- Register TopNMetricsHandler in WebMonitorEndpoint.java

Frontend Changes:
- Add top-n-metrics.ts interface - TypeScript interfaces for Top N metrics
- Add TopNMetricsService - Service to fetch Top N metrics from backend
- Add TopNMetricsComponent - Angular component displaying top N metrics
- Integrate TopNMetricsComponent into Job Overview page
- Automatically load Top N metrics when job data changes
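
As an illustration of the frontend side, the sketch below builds the request URL for the `/jobs/:jobid/metrics/top-n` endpoint and maps a metric value to a severity level for color coding. It is not the code in `TopNMetricsService` or `TopNMetricsComponent`: the function names, the `Severity` levels, and the default thresholds are assumptions, and query parameter names are left to the caller because the PR description does not state them.

```typescript
// Build the URL of the Top N endpoint for one job.
// `query` is passed through as-is; its parameter names are chosen by the
// caller and are not taken from the PR.
function topNMetricsUrl(
  baseUrl: string,
  jobId: string,
  query: { [name: string]: string | number } = {}
): string {
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const path = `${root}/jobs/${encodeURIComponent(jobId)}/metrics/top-n`;
  const pairs = Object.keys(query).map(
    name => `${encodeURIComponent(name)}=${encodeURIComponent(String(query[name]))}`
  );
  return pairs.length === 0 ? path : `${path}?${pairs.join('&')}`;
}

// Hypothetical severity levels used to pick a display color.
type Severity = 'ok' | 'warning' | 'critical';

// Classify a ratio in [0, 1]. The default thresholds are illustrative
// assumptions, not values from the PR.
function severityOf(ratio: number, warnAt: number = 0.5, critAt: number = 0.8): Severity {
  if (ratio >= critAt) {
    return 'critical';
  }
  if (ratio >= warnAt) {
    return 'warning';
  }
  return 'ok';
}
```

A component could call `topNMetricsUrl` whenever the job data changes and pass each returned ratio through `severityOf` to choose a color for the row.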

@featzhang force-pushed the feature/top-n-metrics-dashboard branch from 344768a to 4a7d92f on February 12, 2026 10:35

@flinkbot (Collaborator) commented Feb 12, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@featzhang changed the title from "[FLINK-XXXXX][Web UI] Add Top N Metrics Dashboard for performance diagnostics" to "[FLINK-39097][Web UI] Add Top N Metrics Dashboard for performance diagnostics" on Feb 12, 2026
@featzhang changed the title from "[FLINK-39097][Web UI] Add Top N Metrics Dashboard for performance diagnostics" to "[FLINK-39079][Web UI] Add Top N Metrics Dashboard for performance diagnostics" on Feb 12, 2026