Skip to content

feat: Support Enum, @staticmethod, @classmethod mutations#476

Open
nicklafleur wants to merge 4 commits intoboxed:mainfrom
lyft:nicklafleur/enum_class_static_support
Open

feat: Support Enum, @staticmethod, @classmethod mutations#476
nicklafleur wants to merge 4 commits intoboxed:mainfrom
lyft:nicklafleur/enum_class_static_support

Conversation

@nicklafleur
Copy link
Contributor

@nicklafleur nicklafleur commented Mar 1, 2026

Mix of boy-scouting and new features along with docker improvements for non-linux systems.

Summary

  • Refactor Config to a singleton pattern — Extract configuration into a dedicated Config class with get()/reset()/ensure_loaded() API, replacing the global mutmut.config (kept behind a deprecation warning).
  • Add type annotations throughout and a safe_setproctitle wrapper for macOS Python 3.14+ fork crashes.
  • Fix timeout checker bug — Replace the pid-to-mutant lookup (which used a stale loop variable, causing hangs in parallel runs) with a min-heap based design that tracks (deadline, pid) tuples directly.
  • Add enum and static/classmethod mutation support — Introduce an external injection pattern for enum class mutation (avoiding metaclass conflicts), allow @staticmethod/@classmethod through the decorator filter, and add pragma: no mutate class/pragma: no mutate function pragmas.
  • Refactor mutation internals — Extract pragma_handling.py, enum_mutation.py, trampoline_templates.py, format_utils.py, and timeout.py into focused modules.
  • Improve Docker test image — Copy pyproject.toml/uv.lock before uv sync so source changes don't re-download dependencies.
  • /scripts/run_tests.sh now accepts multiple python versions for comprehensive testing workflows.

@nicklafleur nicklafleur marked this pull request as draft March 1, 2026 18:22
@nicklafleur nicklafleur force-pushed the nicklafleur/enum_class_static_support branch from c701b42 to 0ab4303 Compare March 1, 2026 19:03
@nicklafleur nicklafleur marked this pull request as ready for review March 1, 2026 19:36
@nicklafleur nicklafleur force-pushed the nicklafleur/enum_class_static_support branch 2 times, most recently from 072319e to b6c5fa3 Compare March 2, 2026 03:17
@Otto-AA
Copy link
Collaborator

Otto-AA commented Mar 2, 2026

Hi, thanks for the PR! I'll take a look this week, not sure yet when I'll find the time.

Fix timeout checker bug — Replace the pid-to-mutant lookup (which used a stale loop variable, causing hangs in parallel runs) with a min-heap based design that tracks (deadline, pid) tuples directly.

Can you explain which bug this solves? How did the pid lookup become stale (did someone else modify the dict in the same time?) and how does this cause a hang?

@nicklafleur
Copy link
Contributor Author

nicklafleur commented Mar 3, 2026

Fix timeout checker bug — Replace the pid-to-mutant lookup (which used a stale loop variable, causing hangs in parallel runs) with a min-heap based design that tracks (deadline, pid) tuples directly.

Can you explain which bug this solves? How did the pid lookup become stale (did someone else modify the dict in the same time?) and how does this cause a hang?

Hey sorry I should have included more details in the PR, here's what I added in the commit 40eb31d.


The old timeout_checker used the loop variable `mutant_name`
from the outer mutants iteration when looking up estimated_time_of_tests,
instead of looking up the mutant name associated with each PID, causing
some tests to hang and never be killed due to incorrect timeout
calculations.

This was easy to miss because the bug only manifested when multiple mutants
were being tested in parallel and the timing was just right for the wrong
mutant's timeout to be checked against another mutant's PID (in my case,
it only happened specifically when running the e2e tests in vscode
after my last set of changes 🤷)

Instead of trying to patch this specific bug (and probably introducing
one), I decided to go with a simpler design that avoids trying to find
the right pid at all, instead using a min-heap to track (timeout, pid)
tuples.

Copy link
Collaborator

@Otto-AA Otto-AA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. I looked through it and it looks pretty good :)

My main open questions are in regard to the trampoline setup:

  • why do we define a method multiple times? Once with the original signature, once with *args, **kwargs and then again with @wraps?
  • Could we always define the mutant dict outside the class and the mutated methods inside the class? Would be nice if we could use the same structure instead of special casing enums and staticmethod/classmethods

Judging from the E2E tests, it looks pretty backwards compatible, except for the type checking feature. Maybe I get to test the PR changes tomorrow on a small project.

I could imagine that in the short term we can add # type: ignores to mutated @staticmethod and @classmethod methods, if it's hard to get that right with typing. However, I'd prefer if the old E2E tests keep working.

"my_lib.xǁColorǁis_primary__mutmut_1": 1,
"my_lib.xǁColorǁdarken__mutmut_1": 1,
"my_lib.xǁColorǁdarken__mutmut_2": 1,
"my_lib.xǁColorǁfrom_name__mutmut_1": 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the E2E tests it's nice to have both killed and survived results. If we only test for killed mutants, the E2E tests will pass as long as either the tests correctly catch the mutant or our mutation setup raises any Error at runtime.

For instance, currently I could remove the elif is_enum_class(cls): branch in file_mutation.py and the E2E tests still pass. We could make the tests imperfect, to let some mutants slip, e.g.

def test_color_darken():
    assert Color.GREEN.darken() > 0

"type_checking.x_mutate_me__mutmut_3": 1,
"type_checking.x_mutate_me__mutmut_4": 1,
"type_checking.x_mutate_me__mutmut_3": 37,
"type_checking.x_mutate_me__mutmut_4": 37,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two should not be a type error, mutation 3 and 4 only modify the string:

def x_mutate_me__mutmut_3():
    p = Person()
    p.set_name('XXcharlieXX')
    # Verify that p.get_name keeps the return type str
    name: str = p.get_name()
    return name

Pyrefly outputs:

Argument `Literal['XXcharlieXX']` is not assignable to parameter `self` with type `Person` in function `functools._Wrapped.__call__`Pyrefly[bad-argument-type](https://pyrefly.org/en/docs/error-kinds/#bad-argument-type)

In this PR we changed from:

class Person:
    def set_name(self, name: str):
        args = [name]# type: ignore
        kwargs = {}# type: ignore
        return _mutmut_trampoline(object.__getattribute__(self, 'xǁPersonǁset_name__mutmut_orig'), object.__getattribute__(self, 'xǁPersonǁset_name__mutmut_mutants'), args, kwargs, self)
    def xǁPersonǁset_name__mutmut_orig(self, name: str):
        self.name = name
    def xǁPersonǁset_name__mutmut_1(self, name: str):
        self.name = None
    
    xǁPersonǁset_name__mutmut_mutants : ClassVar[MutantDict] = { # type: ignore
    'xǁPersonǁset_name__mutmut_1': xǁPersonǁset_name__mutmut_1
    }
    xǁPersonǁset_name__mutmut_orig.__name__ = 'xǁPersonǁset_name'

    def get_name(self):
        # return type should be inferred as "str"
        return self.name

to

class Person:
    def set_name(self, name: str):
        args = [name]# type: ignore
        kwargs = {}# type: ignore
        return _mutmut_trampoline(object.__getattribute__(self, 'xǁPersonǁset_name__mutmut_orig'), object.__getattribute__(self, 'xǁPersonǁset_name__mutmut_mutants'), args, kwargs, self)
    def xǁPersonǁset_name__mutmut_orig(self, name: str):
        self.name = name
    def xǁPersonǁset_name__mutmut_1(self, name: str):
        self.name = None
    
    xǁPersonǁset_name__mutmut_mutants : ClassVar[MutantDict] = { # mutmut: generated
    'xǁPersonǁset_name__mutmut_1': xǁPersonǁset_name__mutmut_1 # mutmut: generated
    } # mutmut: generated
    
    def set_name(self, *args, **kwargs): # mutmut: generated
        result = _mutmut_trampoline(object.__getattribute__(self, "xǁPersonǁset_name__mutmut_orig"), object.__getattribute__(self, "xǁPersonǁset_name__mutmut_mutants"), args, kwargs, self) # mutmut: generated
        return result # mutmut: generated
    
    set_name = _mutmut_wraps(xǁPersonǁset_name__mutmut_orig)(set_name) # mutmut: generated
    xǁPersonǁset_name__mutmut_orig.__name__ = 'xǁPersonǁset_name' # mutmut: generated

    def get_name(self):
        # return type should be inferred as "str"
        return self.name

What is the reason we define set_name multiple times, instead of only once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably remnants of a bad rebase, good catch I'll fix this!


CLASS_NAME_SEPARATOR = "ǁ"

GENERATED_MARKER = "# mutmut: generated"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for this? And would we need this also on mutated code, or only on trampolines and co?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a (un-creative) solution to filtering the type errors to only those that are caused by mutants (which we want to consider as being killed if a type checker catches them).

Essentially the mutated code stays "unmarked", while all trampoline functions, mutants dicts, etc. get this tag. That way when a type-checker is run against the code, we can actively filter out errors related to this. I chose this over relying on type-checker ignore markers like # noqa because it was a "tool agnostic" way of handling the problem but I'm completely open to other solutions (see file_mutation.py:filter_mutants_with_type_checker for the active filtering part)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, then I'd suggest to use # type: ignore instead of # mutmut: generated, as # type: ignore is standardized (PEP 484) and supported by all type checkers. With this, the type checkers won't report the errors and we don't need to filter them out afterwards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I had considered something like that initially, but was thinking about trying to make it a bit more explicity why we are ignoring it. I think I'll move to something like #type: ignore # mutmut generated so that we adhere to PEP while also getting that additional signal.

README.rst Outdated
left off.

Incremental Testing
~~~~~~~~~~~~~~~~~~~
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet with this PR :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol getting ahead of myself, rebasing got a little messy as you can see thanks for catching it 😅

README.rst Outdated
to continue, but it's slower.


Enum Classes and Metaclass Compatibility
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels more like something for HISTORY.rst than the README. The README is already pretty long (we should split it up into multiple files at some point imo) and mentioning everything that mutmut supports seems unnecessary (we also don't mention that we support methods, classes, subclasses, etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, probably a case of over-eager documentation here.

README.rst Outdated

# pyproject.toml
[tool.mutmut]
mutate_enums = false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an option you need? I don't feel like disabling enum mutations is a use case we need, and would probably be better part of #47

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was me trying to adhere (probably a bit too much) to leaving at least the option of having the base behaviour be relatively unaffected.

A codebase which uses mutmut and a large number of dataclasses with basic data manipulation functions, which are of generally low mutant value, may see a very large increase in the number of mutants for little gain for example, so I wanted to give them an easy out of what would become the new default which I in general I tried to leave to current behaviour. I just figured core language features like Enums merit a more "opt out" type approach, though I can see the case to remove this entirely and leave it as a defacto truth that mutmut will try and support more language features over time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the rationale, but for simplicity (both of our implementation and also number of configs+documentation) I'd remove this config. If we do this for all new features we will end up with very many options (for disabling enums, staticmethods, ternary operators, etc.). I think if we expect the mutation to work in all setups and be useful we don't need extra individual configs.

At some point, we'll have #47 and probably a config to enable/disable specific mutations, but with a common interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to me!

try:
os.environ["MUTANT_UNDER_TEST"] = "none"
namespace = {"__name__": "test_module"}
exec(mutated_code, namespace)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer the exec integration tests to be in another file. They are somewhere between the tests in this file and E2E tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, will do

@Otto-AA
Copy link
Collaborator

Otto-AA commented Mar 7, 2026

Regarding

why do we define a method multiple times? Once with the original signature, once with *args, **kwargs and then again with @wraps?

If we do something like #477 , we would have access to *args, **kwargs, likely still have a happy type checker with ParamSpec and also happy runtime type hints with the @wraps.

Copy link
Contributor Author

@nicklafleur nicklafleur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, I'll take care of the changes in the coming days.


CLASS_NAME_SEPARATOR = "ǁ"

GENERATED_MARKER = "# mutmut: generated"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a (un-creative) solution to filtering the type errors to only those that are caused by mutants (which we want to consider as being killed if a type checker catches them).

Essentially the mutated code stays "unmarked", while all trampoline functions, mutants dicts, etc. get this tag. That way when a type-checker is run against the code, we can actively filter out errors related to this. I chose this over relying on type-checker ignore markers like # noqa because it was a "tool agnostic" way of handling the problem but I'm completely open to other solutions (see file_mutation.py:filter_mutants_with_type_checker for the active filtering part)

"type_checking.x_mutate_me__mutmut_3": 1,
"type_checking.x_mutate_me__mutmut_4": 1,
"type_checking.x_mutate_me__mutmut_3": 37,
"type_checking.x_mutate_me__mutmut_4": 37,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably remnants of a bad rebase, good catch I'll fix this!

README.rst Outdated

# pyproject.toml
[tool.mutmut]
mutate_enums = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was me trying to adhere (probably a bit too much) to leaving at least the option of having the base behaviour be relatively unaffected.

A codebase which uses mutmut and a large number of dataclasses with basic data manipulation functions, which are of generally low mutant value, may see a very large increase in the number of mutants for little gain for example, so I wanted to give them an easy out of what would become the new default which I in general I tried to leave to current behaviour. I just figured core language features like Enums merit a more "opt out" type approach, though I can see the case to remove this entirely and leave it as a defacto truth that mutmut will try and support more language features over time.

README.rst Outdated
to continue, but it's slower.


Enum Classes and Metaclass Compatibility
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, probably a case of over-eager documentation here.

new_body.append(method)
continue

ext_nodes, assignment, method_mutant_names = _external_method_injection(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the method itself (static/class/enum methods) can be defined in the body to simply call the trampoline instead of having something be set to the trampoline call (I think I'll add this as a change to the secondary PR). Since the enum solution was the first I came up with and its solution could be applied to static/class methods similarly, I did it without thinking about how to have it be more generally optimized.

I'll definitely look into a more generalized way to handle all mutations.

@nicklafleur nicklafleur force-pushed the nicklafleur/enum_class_static_support branch from a8cf8dd to 06f126b Compare March 19, 2026 12:23
Changes:
- Parametrizes the dockerfile and script so that tests can be run
  against any arbitrary python version.
- Improves dockerfile so that they can be better cached between runs
- Extract Config class to dedicated src/mutmut/config.py module with
  singleton pattern (Config.get(), Config.reset(), Config.ensure_loaded())
- Replace all mutmut.config global references with Config.get() calls
    - mutmut.config global was kept, but put behind a deprecation warning
- Add type annotations throughout codebase to satisfy mypy strict checking
- Add safe_setproctitle wrapper to handle macOS Python 3.14+ fork crashes
  where setproctitle's CoreFoundation usage causes segfaults after fork()
- Fix various type: ignore comments with proper casts and annotations
- Update tests to use new Config API

Bug Fix:

The old timeout_checker used the loop variable `mutant_name`
from the outer mutants iteration when looking up estimated_time_of_tests,
instead of looking up the mutant name associated with each PID, causing
some tests to hang and never be killed due to incorrect timeout
calculations.

This was easy to miss because the bug only manifested when multiple mutants
were being tested in parallel and the timing was just right for the wrong
mutant's timeout to be checked against another mutant's PID (in my case,
it only happened specifically when running the e2e tests in vscode
after my last set of changes 🤷)

Instead of trying to patch this specific bug (and probably introducing
one), I decided to go with a simpler design that avoids trying to find
the right pid at all, instead using a min-heap to track (timeout, pid)
tuples.

Flow:
- Registers timeouts at fork time with register_timeout(pid, timeout_s)
  - Calling this function lazily starts the timeout checker thread if not
    already started
- Uses a min-heap to track the next deadline
- Processes expired timeouts in order, sending SIGXCPU to each PID
  whose deadline has passed
- If (when) a PID exits before its timeout, its entry remains in the heap
  until it reaches the top, at which point we pop it and try to kill it
  with SIGXCPU (same as before), swallowing the ProcessLookupError that
  is raised if the PID is already gone.
  - a process hanging indefinitely due to a mutation will back up the
    heap with stale entries, but each entry is small (~72 bytes) so
    even 10,000 backed up timeouts before it gets killed is less than
    1MB of memory and saves a O(n) rebuild for each backed up timeout
    compared to cancelling.
- Clean up START_TIMES_BY_PID_LOCK since it's no longer needed

Deprecation Warning:
    mutmut.config global is deprecated, use mutmut.config.Config.get()
        instead
…unction

Features:
- Add enum class detection and external injection pattern for enum mutation
- Add staticmethod/classmethod support via external trampoline pattern
- Add parse_pragma_lines() for pragma: no mutate class/function
- Add build_enum_trampoline() template

Refactoring:
- Extract pragma_handling.py: parse_pragma_lines()
- Add utils/format_utils.py: make_mutant_key(), parse_mutant_key()
- Simplify orig_function_and_class_names_from_key() using parse_mutant_key()

Tests:
- Add test_enum_handling.py mirroring enum_handling module
- Add test_pragma_handling.py mirroring pragma_handling module

Config:
- Exclude AUTHORS.rst from merge conflict check in pre-commit
@nicklafleur nicklafleur force-pushed the nicklafleur/enum_class_static_support branch from 06f126b to 7707f44 Compare March 19, 2026 12:41
@nicklafleur nicklafleur requested a review from Otto-AA March 19, 2026 12:41
@nicklafleur
Copy link
Contributor Author

nicklafleur commented Mar 19, 2026

@Otto-AA sorry about the long turnaround on this, ended up finding a bug with how the external mutation handled forward reference type-hints that needed to be fixed before pushing. Also spent a little time doing type-fu to get the trampolines to properly display types in IDEs, admittedly didn't look too much into how this could be used to augment the type-based mutant invalidation as this PR is already big enough.

I tried to separate the moved stuff from __main__.py to the other modules but the changes ended up being a bit too intermingled to do so, I'll be sure to be more conscious of that with future PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants