Improve AI decisions using simulator #10559

Open
Madwand99 wants to merge 1 commit into Card-Forge:master from Madwand99:ImproveAIdecisions

@Madwand99
Contributor

Summary

This PR improves Forge AI decision-making around plays that look locally reasonable but become disastrous because of triggered/replacement/static effects on the board.

The motivating examples were draw-punisher interactions like casting Wheel of Fortune / Timetwister / Echo of Eons / Mindmoil into Xyris, Nekusar, Kederekt Parasite, or Impact Tremors. The implementation is broader than those specific cards: it adds a one-play safety check, score-delta evaluation hooks, and lightweight threat memory so the AI can avoid more classes of self-destructive plays and better prioritize removal against permanents involved in unsafe outcomes.

Main Changes

  • Added OnePlaySafetyChecker

    • Performs cheap static checks for known dangerous play patterns.
    • Runs a bounded one-play simulation for risky action types.
    • Caches results to avoid repeatedly simulating the same bad play.
    • Allows commanders through even when they resemble draw engines, to avoid paralyzing commander decks.
  • Added SwingyPlaySimulationEvaluator

    • Uses one-play simulation score deltas to reject clearly bad plays or allow clearly good ones.
    • Helps mass-removal and swingy board-state decisions rely more on simulation instead of narrow heuristics.
  • Added SafetyThreatMemory

    • Records opponent permanents that appear implicated when a simulated/safety-checked play is unsafe.
    • Feeds those learned facts into removal priority, so cards like Xyris, Impact Tremors, or Grave Pact can become higher-value removal targets after they cause unsafe outcomes.
  • Improved simulation reliability

    • Stack resolution now installs simulation controllers for non-AI players during simulated resolution.
    • Sacrifice choices in simulation are more deterministic/value-aware.
    • Simulation failure diagnostics now use Forge’s tagged AI logger instead of raw stderr.
  • Improved AI behavior around:

    • Wheel / Timetwister / Echo of Eons effects into active draw-punishers.
    • Mindmoil / Arjun / Teferi’s Puzzle Box style repeat-draw engines.
    • Optional free-cast effects such as suspend/cascade/discover casting into dangerous board states.
    • Grave Pact / Dictate of Erebos / Butcher of Malakir style sacrifice punishment.
    • Blood Artist / Zulaport Cutthroat lethal board wipes.
    • Zo-Zu the Punisher / Polluted Bonds land-entry punishment.
    • Wrath-style board wipes where the simulation score can distinguish good wipes from bad ones.
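
A minimal sketch of how a cached one-play safety check built on score deltas could look. This is not the code in this PR: `OnePlaySafetySketch`, `Verdict`, the board fingerprint string, and the `rejectDrop` threshold are all hypothetical names, and the simulation is stood in for by a plain function that returns the board score after the play.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical sketch: none of these names are Forge APIs.
final class OnePlaySafetySketch {
    enum Verdict { SAFE, UNSAFE }

    // Verdicts keyed by "board fingerprint | play", so a repeated bad play
    // on the same board is not simulated again.
    private final Map<String, Verdict> cache = new HashMap<>();
    // Stand-in for a bounded one-play simulation: returns the score after the play.
    private final ToIntFunction<String> simulatedScoreAfter;
    // A play is rejected when the score drops by at least this much (assumed threshold).
    private final int rejectDrop;
    private int simulationsRun = 0;

    OnePlaySafetySketch(ToIntFunction<String> simulatedScoreAfter, int rejectDrop) {
        this.simulatedScoreAfter = simulatedScoreAfter;
        this.rejectDrop = rejectDrop;
    }

    Verdict check(String boardFingerprint, String play, int scoreBefore, boolean isCommander) {
        // Commanders are allowed through so commander decks are not paralyzed.
        if (isCommander) {
            return Verdict.SAFE;
        }
        String key = boardFingerprint + "|" + play;
        Verdict cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        simulationsRun++;
        int delta = simulatedScoreAfter.applyAsInt(play) - scoreBefore;
        Verdict verdict = delta <= -rejectDrop ? Verdict.UNSAFE : Verdict.SAFE;
        cache.put(key, verdict);
        return verdict;
    }

    int simulationsRun() {
        return simulationsRun;
    }
}
```

The cache key includes a board fingerprint because a verdict computed for one board is stale once the board changes. How to build a reliable fingerprint is left open here.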

Tests

Added or expanded regression coverage for:

  • Draw-punisher safety:

    • Xyris
    • Nekusar
    • Kederekt Parasite
    • Impact Tremors
    • Wheel of Fortune
    • Timetwister
    • Echo of Eons
    • Mindmoil
    • Arjun, the Shifting Flame
    • Teferi’s Puzzle Box
  • Command-zone edge cases:

    • One-shot draw effects should not be blocked merely because Xyris is in the command zone.
    • Commander spells that look risky are still allowed to be cast.
  • Simulation-only score validation:

    • Grave Pact
    • Dictate of Erebos
    • Butcher of Malakir
    • Blood Artist
    • Zulaport Cutthroat
    • Zo-Zu the Punisher
    • Polluted Bonds
  • Sacrifice decision behavior:

    • Deterministic simulated sacrifice choices.
    • Least-valuable legal permanent/creature selection.
    • Lethal life-loss sacrifice decisions.
  • Board wipes:

    • Rejects bad Wrath-style wipes.
    • Allows favorable wipes with death-trigger upside.
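
For illustration, the deterministic, least-valuable sacrifice choice covered above could be sketched roughly as follows. `SacrificeChoiceSketch`, `Permanent`, and the integer `value` field are hypothetical stand-ins, not Forge types, and the tie-break on `id` is an assumption about what "deterministic" means here.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: Permanent is a stand-in, not a Forge class.
final class SacrificeChoiceSketch {
    static final class Permanent {
        final int id;
        final String name;
        final int value; // assumed evaluation score; higher means more valuable

        Permanent(int id, String name, int value) {
            this.id = id;
            this.name = name;
            this.value = value;
        }
    }

    // Picks the lowest-value legal permanent. Ties are broken by id so the
    // result does not depend on the order of the input list.
    static Optional<Permanent> chooseSacrifice(List<Permanent> legalChoices) {
        return legalChoices.stream()
                .min(Comparator.comparingInt((Permanent p) -> p.value)
                        .thenComparingInt(p -> p.id));
    }
}
```

Because the comparator is total over `(value, id)`, the same set of legal choices yields the same pick on every simulated run, which keeps simulation scores reproducible.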

Notes / Limitations

This does not fully unify AI decision-making into a single generic “how good is this play?” evaluator. Forge AI still uses a mix of card-specific AI logic, heuristic play selection, and simulation scoring. This PR moves more decisions toward simulation-based evaluation while keeping performance safeguards in place.

The expensive one-play simulation is intentionally gated during normal spell selection. It is only used for actions likely to expose dangerous hidden consequences, such as removal, mass effects, draw under active draw-punishers, sacrifice, token creation, zone changes, and similar reactive effects. This avoids broad performance regressions from simulating every harmless candidate spell.
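
The gating described above can be sketched as a simple predicate over action types. The names below (`SimulationGateSketch`, `ActionType`, `shouldSimulate`) are hypothetical and do not come from the Forge codebase; `PUMP` is included only as an assumed example of a harmless candidate that skips simulation.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: these names are illustrative, not Forge APIs.
final class SimulationGateSketch {
    enum ActionType { REMOVAL, MASS_EFFECT, DRAW, SACRIFICE, TOKEN_CREATION, ZONE_CHANGE, PUMP }

    // Action types assumed to always warrant the expensive one-play simulation.
    private static final Set<ActionType> ALWAYS_RISKY = EnumSet.of(
            ActionType.REMOVAL,
            ActionType.MASS_EFFECT,
            ActionType.SACRIFICE,
            ActionType.TOKEN_CREATION,
            ActionType.ZONE_CHANGE);

    // Draw effects are only simulated while a draw-punisher is active;
    // everything outside the risky set skips simulation entirely.
    static boolean shouldSimulate(ActionType type, boolean drawPunisherActive) {
        if (type == ActionType.DRAW) {
            return drawPunisherActive;
        }
        return ALWAYS_RISKY.contains(type);
    }
}
```

With a gate like this, a draw spell is simulated only when the board makes it dangerous, so ordinary candidate spells cost nothing extra during normal spell selection.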

Comment thread forge-ai/src/main/java/forge/ai/ability/ChangeZoneAllAi.java
@tool4ever
Contributor

smells like AI with all these tests you've pumped out...? 🤔

anyway this is a bit too much all at once, some noteworthy points:

  • I did consider supporting a hybrid variant of simulation & heuristics in the past and this certainly has some interesting ideas
  • but at this point simulation hasn't been really worked on in quite a while and also isn't all that stable, so just suffocating it with a try-catch doesn't seem favorable
  • for cases where superior heuristics exist this risks making AI dumber because the boardstate scoring can imply the wrong truth, especially if its ability to look ahead further is denied (which can already be observed by some of the additional heuristics you've added to try and work around that)
  • such caching will always be fragile

@Madwand99
Contributor Author

Yes, I use AI to create tests. It's very helpful for finding current failure cases that can be handled better.

For the rest: thanks, that makes sense. I agree this became too broad for one PR.

The intent was not to replace existing heuristics with simulation, but I can see that the current patch blurs that line. The try/catch was added to avoid game-breaking failures during investigation, not as a satisfying long-term answer for unstable simulation.

I’m going to split this up. First I’ll pull out the lower-risk simulation/controller and deterministic sacrifice-choice fixes, since those seem useful independently. Then I’ll make the original draw-punisher fix a much narrower heuristic PR, with tests for Wheel/Mindmoil/Puzzle Box style cases. I’ll leave the broader one-play safety checker, swingy score evaluator, and threat-memory cache out for now unless we agree on a smaller experimental shape.

I also agree that board-state scoring can imply the wrong answer when simulation is shallow. I’ll avoid using it to override existing superior heuristics in the narrower PRs.
