Skip to content

[Python] Honor disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics experiments#38749

Open
Anuragp22 wants to merge 4 commits into
apache:masterfrom
Anuragp22:py-disable-user-metrics-38746
Open

[Python] Honor disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics experiments#38749
Anuragp22 wants to merge 4 commits into
apache:masterfrom
Anuragp22:py-disable-user-metrics-38746

Conversation

@Anuragp22
Copy link
Copy Markdown

Adds Python SDK support for the disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics pipeline experiments, mirroring the Java SDK Metrics.MetricsFlag behavior. High-throughput Python pipelines previously had no way to opt out of user metric kinds that add pressure to metric backends; Java already honored these three experiments, this PR brings parity to Python.

What changes

  • sdks/python/apache_beam/metrics/metric.py: new internal MetricsFlag class (singleton, idempotent set_default_pipeline_options). DelegatingCounter.inc, DelegatingStringSet.add, and DelegatingBoundedTrie.add now consult the flag and early-return when disabled, so no MetricCell is touched and no monitoring info is emitted.
  • sdks/python/apache_beam/runners/worker/sdk_worker_main.py: calls MetricsFlag.set_default_pipeline_options(sdk_pipeline_options) during harness init, before the first metric update can flow.
  • sdks/python/apache_beam/metrics/metric_test.py: ports Java MetricsTest.testMetricsFlag plus three smoke tests confirming each disabled kind short-circuits without raising.
  • CHANGES.md: entry under 2.75.0 New Features / Improvements.

Fixes #38746.

R: @Abacn


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

Anuragp22 added 3 commits May 30, 2026 11:45
…bleBoundedTrieMetrics experiments

Mirrors the Java SDK Metrics.MetricsFlag behavior so high-throughput Python
pipelines can opt out of user metric kinds that add pressure to metric
backends.

Adds a process-wide MetricsFlag singleton initialized once at worker harness
startup from pipeline experiments. When a flag is set, the corresponding
Delegating* class short-circuits its update path so no MetricCell is touched
and no monitoring info is emitted.

For apache#38746.
Ports Java MetricsTest.testMetricsFlag and adds three smoke tests confirming
the disabled Counter / StringSet / BoundedTrie short-circuit without raising
when called on a no-current-tracker path.

For apache#38746.
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds support for three specific pipeline experiments—disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics—to the Python SDK. By enabling these flags, users can prevent the emission of specific metric kinds, which is particularly useful for optimizing performance in high-throughput pipelines. The changes ensure that the Python SDK mirrors the existing behavior found in the Java SDK, providing consistent control over metric backends.

Highlights

  • Metrics Control: Introduced a new MetricsFlag class to manage and honor pipeline experiments for disabling specific metric types (Counter, StringSet, and BoundedTrie).
  • SDK Parity: Aligned Python SDK behavior with the Java SDK by allowing high-throughput pipelines to opt out of metric emission, reducing backend pressure.
  • Initialization: Updated the worker harness initialization process to read and apply these flags from pipeline options before metrics are processed.
New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces MetricsFlag to the Python SDK, allowing it to honor pipeline experiments (disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics) to opt out of emitting specific user metric kinds, matching the Java SDK behavior. The feedback suggests optimizing performance on the hot path of metric increments by exposing the disabled flags as public class attributes directly on MetricsFlag instead of using @classmethod getter methods. This avoids method call overhead and aligns with PEP 8 recommendations. The suggestions cover refactoring the MetricsFlag class, updating the delegating metric classes, and adjusting the corresponding unit tests.

Comment on lines +70 to +110
_counter_disabled = False
_string_set_disabled = False
_bounded_trie_disabled = False
_initialized = False

@classmethod
def set_default_pipeline_options(cls, options: 'PipelineOptions') -> None:
"""Initialize the flags from pipeline experiments. Idempotent."""
if cls._initialized:
return
debug_options = options.view_as(DebugOptions)
if debug_options.lookup_experiment('disableCounterMetrics'):
cls._counter_disabled = True
_LOGGER.info('Counter metrics are disabled.')
if debug_options.lookup_experiment('disableStringSetMetrics'):
cls._string_set_disabled = True
_LOGGER.info('StringSet metrics are disabled.')
if debug_options.lookup_experiment('disableBoundedTrieMetrics'):
cls._bounded_trie_disabled = True
_LOGGER.info('BoundedTrie metrics are disabled.')
cls._initialized = True

@classmethod
def reset(cls) -> None:
"""Reset flags. Test-only."""
cls._counter_disabled = False
cls._string_set_disabled = False
cls._bounded_trie_disabled = False
cls._initialized = False

@classmethod
def counter_disabled(cls) -> bool:
return cls._counter_disabled

@classmethod
def string_set_disabled(cls) -> bool:
return cls._string_set_disabled

@classmethod
def bounded_trie_disabled(cls) -> bool:
return cls._bounded_trie_disabled
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To avoid the overhead of @classmethod descriptor lookup and function calls on the hot path of metric increments, we can simplify MetricsFlag by exposing the disabled flags as public class attributes directly. This aligns with PEP 8 recommendations to avoid redundant getter/setter methods in Python and improves performance.

  counter_disabled = False
  string_set_disabled = False
  bounded_trie_disabled = False
  _initialized = False

  @classmethod
  def set_default_pipeline_options(cls, options: 'PipelineOptions') -> None:
    """Initialize the flags from pipeline experiments. Idempotent."""
    if cls._initialized:
      return
    debug_options = options.view_as(DebugOptions)
    if debug_options.lookup_experiment('disableCounterMetrics'):
      cls.counter_disabled = True
      _LOGGER.info('Counter metrics are disabled.')
    if debug_options.lookup_experiment('disableStringSetMetrics'):
      cls.string_set_disabled = True
      _LOGGER.info('StringSet metrics are disabled.')
    if debug_options.lookup_experiment('disableBoundedTrieMetrics'):
      cls.bounded_trie_disabled = True
      _LOGGER.info('BoundedTrie metrics are disabled.')
    cls._initialized = True

  @classmethod
  def reset(cls) -> None:
    """Reset flags. Test-only."""
    cls.counter_disabled = False
    cls.string_set_disabled = False
    cls.bounded_trie_disabled = False
    cls._initialized = False
References
  1. PEP 8 recommends avoiding accessor/getter/setter methods for simple attribute access in Python, preferring direct attribute access instead. (link)

Comment on lines +265 to +268
def inc(self, n: int = 1) -> None:
if MetricsFlag.counter_disabled():
return
self._updater(n)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Access the counter_disabled class attribute directly instead of calling a classmethod getter to avoid method call overhead on the hot path.

Suggested change
def inc(self, n: int = 1) -> None:
if MetricsFlag.counter_disabled():
return
self._updater(n)
def inc(self, n: int = 1) -> None:
if MetricsFlag.counter_disabled:
return
self._updater(n)

Comment on lines +293 to +307
def add(self, value: str) -> None:
if MetricsFlag.string_set_disabled():
return
self._updater(value)

class DelegatingBoundedTrie(BoundedTrie):
"""Metrics StringSet that Delegates functionality to MetricsEnvironment."""
"""Metrics BoundedTrie that Delegates functionality to MetricsEnvironment."""
def __init__(self, metric_name: MetricName) -> None:
super().__init__(metric_name)
self.add = MetricUpdater(cells.BoundedTrieCell, metric_name) # type: ignore[method-assign]
self._updater = MetricUpdater(cells.BoundedTrieCell, metric_name)

def add(self, value) -> None:
if MetricsFlag.bounded_trie_disabled():
return
self._updater(value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Access the string_set_disabled and bounded_trie_disabled class attributes directly instead of calling classmethod getters to avoid method call overhead on the hot path.

Suggested change
def add(self, value: str) -> None:
if MetricsFlag.string_set_disabled():
return
self._updater(value)
class DelegatingBoundedTrie(BoundedTrie):
"""Metrics StringSet that Delegates functionality to MetricsEnvironment."""
"""Metrics BoundedTrie that Delegates functionality to MetricsEnvironment."""
def __init__(self, metric_name: MetricName) -> None:
super().__init__(metric_name)
self.add = MetricUpdater(cells.BoundedTrieCell, metric_name) # type: ignore[method-assign]
self._updater = MetricUpdater(cells.BoundedTrieCell, metric_name)
def add(self, value) -> None:
if MetricsFlag.bounded_trie_disabled():
return
self._updater(value)
def add(self, value: str) -> None:
if MetricsFlag.string_set_disabled:
return
self._updater(value)
class DelegatingBoundedTrie(BoundedTrie):
"""Metrics BoundedTrie that Delegates functionality to MetricsEnvironment."""
def __init__(self, metric_name: MetricName) -> None:
super().__init__(metric_name)
self._updater = MetricUpdater(cells.BoundedTrieCell, metric_name)
def add(self, value) -> None:
if MetricsFlag.bounded_trie_disabled:
return
self._updater(value)

Comment on lines +126 to +164
def test_metrics_flag(self):
"""Mirrors Java MetricsTest.testMetricsFlag for the three disable* experiments."""
MetricsFlag.reset()
self.assertFalse(MetricsFlag.counter_disabled())
self.assertFalse(MetricsFlag.string_set_disabled())
self.assertFalse(MetricsFlag.bounded_trie_disabled())

options = PipelineOptions(['--experiments=disableCounterMetrics'])
MetricsFlag.set_default_pipeline_options(options)
self.assertTrue(MetricsFlag.counter_disabled())
self.assertFalse(MetricsFlag.string_set_disabled())
self.assertFalse(MetricsFlag.bounded_trie_disabled())

MetricsFlag.reset()
options = PipelineOptions(['--experiments=disableStringSetMetrics'])
MetricsFlag.set_default_pipeline_options(options)
self.assertFalse(MetricsFlag.counter_disabled())
self.assertTrue(MetricsFlag.string_set_disabled())
self.assertFalse(MetricsFlag.bounded_trie_disabled())

MetricsFlag.reset()
options = PipelineOptions(['--experiments=disableBoundedTrieMetrics'])
MetricsFlag.set_default_pipeline_options(options)
self.assertFalse(MetricsFlag.counter_disabled())
self.assertFalse(MetricsFlag.string_set_disabled())
self.assertTrue(MetricsFlag.bounded_trie_disabled())

MetricsFlag.reset()
options = PipelineOptions([
'--experiments=disableCounterMetrics',
'--experiments=disableStringSetMetrics',
'--experiments=disableBoundedTrieMetrics',
])
MetricsFlag.set_default_pipeline_options(options)
self.assertTrue(MetricsFlag.counter_disabled())
self.assertTrue(MetricsFlag.string_set_disabled())
self.assertTrue(MetricsFlag.bounded_trie_disabled())

MetricsFlag.reset()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Update the tests to assert on the public class attributes directly instead of calling the removed getter methods.

  def test_metrics_flag(self):
    """Mirrors Java MetricsTest.testMetricsFlag for the three disable* experiments."""
    MetricsFlag.reset()
    self.assertFalse(MetricsFlag.counter_disabled)
    self.assertFalse(MetricsFlag.string_set_disabled)
    self.assertFalse(MetricsFlag.bounded_trie_disabled)

    options = PipelineOptions(['--experiments=disableCounterMetrics'])
    MetricsFlag.set_default_pipeline_options(options)
    self.assertTrue(MetricsFlag.counter_disabled)
    self.assertFalse(MetricsFlag.string_set_disabled)
    self.assertFalse(MetricsFlag.bounded_trie_disabled)

    MetricsFlag.reset()
    options = PipelineOptions(['--experiments=disableStringSetMetrics'])
    MetricsFlag.set_default_pipeline_options(options)
    self.assertFalse(MetricsFlag.counter_disabled)
    self.assertTrue(MetricsFlag.string_set_disabled)
    self.assertFalse(MetricsFlag.bounded_trie_disabled)

    MetricsFlag.reset()
    options = PipelineOptions(['--experiments=disableBoundedTrieMetrics'])
    MetricsFlag.set_default_pipeline_options(options)
    self.assertFalse(MetricsFlag.counter_disabled)
    self.assertFalse(MetricsFlag.string_set_disabled)
    self.assertTrue(MetricsFlag.bounded_trie_disabled)

    MetricsFlag.reset()
    options = PipelineOptions([
        '--experiments=disableCounterMetrics',
        '--experiments=disableStringSetMetrics',
        '--experiments=disableBoundedTrieMetrics',
    ])
    MetricsFlag.set_default_pipeline_options(options)
    self.assertTrue(MetricsFlag.counter_disabled)
    self.assertTrue(MetricsFlag.string_set_disabled)
    self.assertTrue(MetricsFlag.bounded_trie_disabled)

    MetricsFlag.reset()

Drops the classmethod getter wrapper around each disabled flag and exposes
counter_disabled / string_set_disabled / bounded_trie_disabled directly as
public class attributes. The gate runs on every metric update, so swapping
a descriptor lookup + function call for a single attribute load matters in
exactly the high-throughput pipelines these experiments are designed for.
Idiomatic Python; java parity is about behavior, not internal API shape.

Addresses review feedback.

For apache#38746.
@github-actions
Copy link
Copy Markdown
Contributor

Assigning reviewers:

R: @jrmccluskey for label python.

Note: If you would like to opt out of this review, comment assign to next reviewer.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request]: Allow disable user metrics in Python SDK

1 participant