Feature/add alert filter by database by HannahVernon · Pull Request #427 · erikdarlingdata/PerformanceMonitor

HannahVernon · 2026-03-04T20:29:17Z

What does this PR do?

Adds the ability to filter certain alert features by database name:

Deadlocks
Long Running Queries

Which component(s) does this affect?

[✅] Full Dashboard
[✅] Lite Dashboard

How was this tested?

Tested deadlock alerts with a classic "lock resources in opposite order" deadlock test in a dedicated database. With the database name included in the exclusion list, the alert is not fired when a deadlock occurs. With the database not present in the exclusion list, the deadlock alert is fired as expected.

Tested long running queries with the execution-database both in the list and not in the list - exclusion works as expected.

Checklist

[✅] I have read the contributing guide
[✅] My code builds with zero warnings (dotnet build -c Debug)
[✅] I have tested my changes against at least one SQL Server version
[✅] I have not introduced any hardcoded credentials or server names

…_DIAGNOSTIC wait types, and WAITFOR wait types.

…to allow future configurability on the number of long-running queries returned. Added optional parameter to control display of WAITFOR types in future.

…n instead of janky string addition.

…clusions Feature/long running query exclusions

…c() method. Removed System.Collections.Generic using statement as it is unnecessary.

…KUPTHREAD-to-long-running-queries Added exclusion for backup-related waits to GetLongRunningQueriesAsync() method.

…dded miscWaitsFilter to exclude XE_LIVE_TARGET_TVF waits. Removed unused parameters. Corrected minimum value for maxLongRunningQueryCount (minimum 1 instead of 5).

…KUPTHREAD-to-long-running-queries Feature/add exclusion for backupthread to long running queries

…ITFOR, BACKUPTHREAD, BACKUPIO, and XE_LIVE_TARGET_TVF wait types in Lite/Services/LocalDataService.WaitStats.cs

…ries-monitor-changes-to-lite Added filtering for various wait types to long running query code in Lite.

Adds LongRunningQueryMaxResults to UserPreferences (default 5) and exposes it in the Settings UI alongside the existing duration threshold. Threads the value through GetAlertHealthAsync and GetLongRunningQueriesAsync to replace the hardcoded TOP(5) in the DMV query. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds Math.Clamp(1, int.MaxValue) guard to GetLongRunningQueriesAsync in both Dashboard and Lite, and updates the Dashboard settings validation to use an explicit range check with a descriptive error message. Mirrors the LongRunningQueryMaxResults setting across the Lite project: adds the App property, loads/saves it to settings.json, exposes it in the Lite Settings UI, and passes it through to GetLongRunningQueriesAsync to replace the hardcoded LIMIT 5. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Prevents an OverflowException if the value in settings.json is outside the int32 range. The value is read as long, clamped to [1, int.MaxValue], then cast back to int. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replaces hardcoded wait type exclusions in GetLongRunningQueriesAsync with user-configurable booleans for SP_SERVER_DIAGNOSTICS, WAITFOR / BROKER_RECEIVE_WAITFOR, backup waits, and miscellaneous waits. All four filters default to true (existing behavior preserved). Settings are exposed in the Notifications section of both Dashboard and Lite Settings UIs and persisted to UserPreferences / settings.json. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…config-settings Feature/long running queries config settings

Users can now exclude specific databases (e.g. DBAtools, msdb) from generating alerts. The exclusion list is a comma-separated text box in Settings > Notifications for both projects. - Dashboard: filters LRQ at trigger level, blocking via SQL NOT IN clause, deadlock context filtered before email is built; list persisted in UserPreferences - Lite: filters LRQ at trigger level; list persisted to alert_excluded_databases JSON array in settings.json Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@MaxResults

The merge with dev reverted the excludedDatabases parameter from GetAlertHealthAsync and GetBlockingValuesAsync. Re-applies those changes on top of the dev merge (which also landed parameterised TOP(@MaxResults) and the 1-1000 clamp in GetLongRunningQueriesAsync). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

erikdarlingdata

Thanks for this, Hannah! Really nice feature — database-level alert exclusions are something folks have been asking about. Clean implementation across both Dashboard and Lite, and the testing sounds thorough. A few things I noticed:

1. Deadlock alert fires on unfiltered count (Dashboard)

The deadlock tray notification is triggered by health.DeadlockCount, which comes from GetDeadlockCountAsync — that query is never filtered by excluded databases. The BuildDeadlockContextAsync filtering only affects the email context body. So a deadlock in an excluded database will still trigger the tray alert, but the email will correctly omit it. This is inconsistent with how blocking and long-running queries work, where the filtering happens before the threshold check.

2. Lite is missing deadlock and blocking filtering

The Lite side only filters long-running queries. If Lite has deadlock and blocking alert paths, those aren't getting the exclusion filter applied. The Dashboard covers all three.

3. Unrelated change to max results clamp

In Lite/App.xaml.cs, the alert_long_running_query_max_results clamp was changed from Math.Clamp(v.GetInt64(), 1, 1000) to Math.Clamp(v.GetInt64(), 1, int.MaxValue). That removes the safety cap and seems unrelated to this PR.

4. Settings help text is incomplete

Both Dashboard and Lite say "applies to long-running query and blocking alerts" but the Dashboard also applies it to deadlocks (partially). Should probably mention deadlocks too, or clarify the actual scope once the filtering is consistent.

HannahVernon · 2026-03-04T21:13:12Z

Sorry for the partially incorrect PR. Will fix, likely tomorrow, and update the PR.

The deadlock tray notification was firing based on health.DeadlockCount, which comes from GetDeadlockCountAsync. That query reads from sys.dm_os_performance_counters a server-wide counter with no database-level granularity so deadlocks in excluded databases still incremented the delta and triggered the alert, even though the email body correctly omitted them via BuildDeadlockContextAsync. This was inconsistent with blocking and long-running query alerts, where filtering by excluded databases happens before the threshold check. Fix: - Add FilteredDeadlockCount (nullable long) to AlertHealthResult to carry a database-filtered deadlock count alongside the raw counter value. - Add GetFilteredDeadlockCountAsync to DatabaseService.NocHealth, which sums deadlock_count_delta from collect.blocking_deadlock_stats over a 5-minute window (matching AlertCooldown), excluding the configured databases. This mirrors the approach used by GetBlockingValuesAsync. - In GetAlertHealthAsync, run GetFilteredDeadlockCountAsync in parallel when excluded databases are configured and populate FilteredDeadlockCount. - In EvaluateAlertConditionsAsync, use effectiveDeadlockDelta = FilteredDeadlockCount ?? deadlockDelta for the threshold check, tray notification count, RecordAlert, and email currentValue. The raw counter is still stored in _previousDeadlockCounts unchanged. Result: a deadlock in an excluded database no longer triggers the tray alert or email, consistent with how blocking alerts behave. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- In CheckPerformanceAlerts, when AlertExcludedDatabases is configured, fetch the raw blocked process reports and deadlock rows then count only those not belonging to excluded databases before comparing against the threshold mirroring the existing long-running query pattern. - Blocking: filter GetRecentBlockedProcessReportsAsync results by DatabaseName; use effectiveBlockingCount for threshold check, tray notification, and email currentValue. - Deadlocks: filter GetRecentDeadlocksAsync results via new IsDeadlockExcluded helper that parses deadlock_graph_xml and excludes a deadlock only when every process node's currentdbname is in the excluded list (conservative cross-database deadlocks still fire); use effectiveDeadlockCount throughout. - Apply the same exclusion filter inside BuildBlockingContextAsync and BuildDeadlockContextAsync so email alert bodies also omit events from excluded databases. - Add using System.Xml.Linq for XElement.Parse in the new helper. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Accidentally changed Math.Clamp upper bound from 1000 to int.MaxValue in a prior commit. Restores the safety cap to prevent unbounded result sets from the long-running query alert. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Both editions now filter blocking, deadlock, and long-running query alerts by the excluded databases list. Update the settings hint text to reflect all three alert types. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

HannahVernon · 2026-03-05T18:08:41Z

@erikdarlingdata - here's a summary of the changes I made today:

Four issues addressed across three commits:

Dashboard deadlock alert firing on unfiltered count (f41d087)

GetDeadlockCountAsync reads from sys.dm_os_performance_counters, which is server-wide and cannot be filtered by database. Added a new GetFilteredDeadlockCountAsync method that instead queries collect.blocking_deadlock_stats — summing
deadlock_count_delta over a 5-minute window (matching AlertCooldown), excluding configured databases. When excluded databases are configured, EvaluateAlertConditionsAsync uses this filtered count (effectiveDeadlockDelta) for the
threshold check, tray notification, RecordAlert, and email currentValue. The raw perf counter tracking (_previousDeadlockCounts) is left untouched.

Lite missing deadlock and blocking filtering (5e41b5d)

CheckPerformanceAlerts was using summary.BlockingCount and summary.DeadlockCount (both unfiltered server-wide counts) directly against thresholds. Fixed to mirror the existing long-running query pattern:

Blocking: when excluded databases are configured and the raw count is at or above threshold, fetches GetRecentBlockedProcessReportsAsync and filters in-memory by DatabaseName. Uses effectiveBlockingCount for the threshold check,
tray notification, and email value.
Deadlocks: same approach using GetRecentDeadlocksAsync, filtered via a new IsDeadlockExcluded helper that parses each deadlock's XML and excludes a deadlock only when every process node's currentdbname is in the excluded list
(cross-database deadlocks still fire).
BuildBlockingContextAsync and BuildDeadlockContextAsync also now filter by excluded databases so email alert bodies are consistent with tray notifications.

Restore safety cap on alert_long_running_query_max_results (746591a)

The Math.Clamp upper bound had been accidentally changed from 1000 to int.MaxValue. Restored to 1000.

Settings help text updated (781069f)

Both Dashboard and Lite SettingsWindow.xaml hint text updated from "applies to long-running query and blocking alerts" to "applies to blocking, deadlock, and long-running query alerts" to accurately reflect all three alert types.

erikdarlingdata

Code Review — 4 Items

1. `GetFilteredDeadlockCountAsync` catch block should return `null`, not `0`

File: Dashboard/Services/DatabaseService.NocHealth.cs, line ~260 in the new method

Problem: The catch block returns 0, which gets assigned to FilteredDeadlockCount. In MainWindow.xaml.cs, the alert logic does:

var effectiveDeadlockDelta = health.FilteredDeadlockCount ?? deadlockDelta;

When FilteredDeadlockCount is 0 (from the catch), it's not null — so the fallback to deadlockDelta never happens. This means real deadlock alerts get silently suppressed if the query fails (e.g., older schema without collect.blocking_deadlock_stats, permissions issue, timeout).

Fix: Change the return type to Task<long?> and return null from the catch block. This way a failure falls back to the raw performance counter delta and deadlock alerts still work:

catch (Exception ex)
{
    Logger.Warning($"Failed to get filtered deadlock count: {ex.Message}");
    return null;  // Fall back to raw delta
}

Then update the method signature and the assignment in GetAlertHealthAsync accordingly.

2. SQL string interpolation should use parameterized queries

Files:

Dashboard/Services/DatabaseService.NocHealth.cs — GetBlockingValuesAsync (line ~215)
Dashboard/Services/DatabaseService.NocHealth.cs — GetFilteredDeadlockCountAsync (line ~243)

Problem: Both methods build SQL with string interpolation:

var dbFilter = string.Join(", ", excludedDatabases.Select(db => $"N'{db.Replace("'", "''")}'"));

The Replace("'", "''") escaping works, but string-built SQL is fragile and doesn't follow the pattern used everywhere else in the codebase (parameterized SqlCommand with @param placeholders). The values come directly from user input in the Settings text box.

Fix: Use sp_executesql with a comma-separated parameter and STRING_SPLIT, or build individual @db0, @db1, @db2 parameters dynamically:

var dbParams = new List<string>();
for (int i = 0; i < excludedDatabases.Count; i++)
{
    var paramName = $"@exdb{i}";
    dbParams.Add(paramName);
    cmd.Parameters.AddWithValue(paramName, excludedDatabases[i]);
}
var dbFilter = $"AND DB_NAME(s.dbid) NOT IN ({string.Join(", ", dbParams)})";

This matches the parameterized pattern used by every other query in DatabaseService.*.cs.

3. Settings UI placement is misleading

Files:

Dashboard/SettingsWindow.xaml (line ~280)
Lite/Windows/SettingsWindow.xaml (line ~207)

Problem: The "Database Exclusions" section is placed inside the long-running query options area — between the LRQ exclude checkboxes and the TempDB space checkbox. But this setting applies to blocking, deadlocks, AND long-running queries. Users will either miss it or think it only applies to long-running queries.

Fix: Move the "Database Exclusions" section to its own visually distinct group, either:

Above the individual alert type settings (blocking, deadlock, LRQ), since it's a cross-cutting filter
Or below all alert type settings as a "Global Alert Filters" section

In both XAML files, move the TextBlock + TextBox + helper text block to a position that makes its scope obvious. The helper text already says "applies to blocking, deadlock, and long-running query alerts" which is good — it just needs to not be visually nested under the LRQ section.

4. Verify Dashboard `DeadlockItem.DatabaseName` is populated

File: Dashboard/MainWindow.xaml.cs — BuildDeadlockContextAsync (line ~106 in diff)

Problem: The deadlock filtering in Dashboard does:

deadlocks = deadlocks
    .Where(d => string.IsNullOrEmpty(d.DatabaseName) ||
        !excludedDatabases.Any(e =>
            string.Equals(e, d.DatabaseName, StringComparison.OrdinalIgnoreCase)))
    .ToList();

This assumes DeadlockItem (or whatever model GetDeadlocksAsync returns) has a DatabaseName property that's populated with the database where the deadlock occurred.

Check: Does the GetDeadlocksAsync query actually return a database_name column and map it to this property? If DatabaseName is always null/empty, the string.IsNullOrEmpty(d.DatabaseName) check means every deadlock passes the filter and nothing ever gets excluded.

Compare with the Lite implementation which correctly parses the deadlock graph XML to extract currentdbname from process nodes — that approach works regardless of whether the collector stores a separate database_name column. If Dashboard's model doesn't have this field populated, you'll need the same XML parsing approach (see IsDeadlockExcluded in Lite/MainWindow.xaml.cs).

How to verify: Check the query in DatabaseService.Blocking.cs or wherever GetDeadlocksAsync is defined — look for whether database_name is in the SELECT list and mapped to the model.

…null Returning 0 from the catch block meant FilteredDeadlockCount was set to a non-null value on failure, so the null-coalescing fallback in EvaluateAlertConditionsAsync never triggered: var effectiveDeadlockDelta = health.FilteredDeadlockCount ?? deadlockDelta; A 0 count silently suppressed deadlock alerts on schema errors (e.g. missing collect.blocking_deadlock_stats), permission issues, or timeouts. Returning null lets the fallback to the raw performance counter delta kick in so alerts still fire. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…adlockCountAsync Replace string-interpolated NOT IN lists with dynamically built @exdb0, @exdb1... parameters via cmd.Parameters.AddWithValue, matching the parameterized pattern used throughout DatabaseService.*.cs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The database exclusions TextBox was nested inside the Long Running Query Filters area, making it look like it only applied to LRQ alerts. Moved it below all per-alert-type settings into its own 'Global Alert Filters' section so its cross-cutting scope (blocking, deadlock, and long-running query alerts) is visually obvious. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The deadlock filter used DB_NAME(bds.dbid) but collect.blocking_deadlock_stats has no dbid column - it has database_name (sysname). This caused the query to fail silently, falling back to the unfiltered delta every time. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…racy Replace per-row DatabaseName filtering with deadlock graph XML parsing, matching Lite's IsDeadlockExcluded approach. A deadlock is only excluded when ALL process nodes' currentdbname are in the excluded list. This prevents cross-database deadlocks from being silently suppressed when only one participant's database is excluded. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…l body The blocking threshold check already excluded filtered databases via GetBlockingValuesAsync, but the email context could still include blocking events from excluded databases. Now matches Lite's BuildBlockingContextAsync which already filters. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

HannahVernon · 2026-03-05T22:27:18Z

Thanks for bearing with me through this review, Erik.

Fixes for items 1 through 4 have been implemented.

GetFilteredDeadlockCountAsync catch block returns null, not 0, and the method signature is updated.
SQL string interpolation has been changed out with explicit T-SQL parameterization in the full Dashboard. Checked how the query would look for cases with an empty excludedDatabases list, as well as one containing msdb against both GetFilteredDeadlockCountAsync() and GetBlockingValuesAsync() methods.
Settings UI placement is misleading -> added a section called "Global Alert Filters", and moved the "Database Exclusions" text box into that section. Fired up the Dashboard and inspected the layout of the settings window - looks much better to me.
checking excluded databases in GetDeadlocksAsync() is now filtering via the deadlock graph in BuildDeadlockContextAsync() instead of just the database_name column.

I asked Github Copilot to do a comprehensive review:

The user wants a final comprehensive re-audit. Let me get the complete diff again and systematically review everything. Since we just did a thorough audit and only added two small lines, I should still be thorough but can focus on
confirming the previous findings still hold plus verifying the new changes.

● Final audit: 32 PASS, 2 WARN, 0 FAIL.

The two WARNs are design notes, not bugs:

Deadlock delta window mismatch — FilteredDeadlockCount uses a fixed 5-minute SQL window vs. the polling-interval raw delta. Already documented in the code comments as intentional.
Dashboard vs Lite filtering architecture — Dashboard filters blocking at SQL level; Lite filters client-side. Both correct, just different due to their data access patterns.

No errors found. All SQL queries, null handling, filtering logic, edge cases, and settings persistence are correct across both editions.

HannahVernon and others added 25 commits February 27, 2026 14:26

Added exclusions in GetLongRunningQueriesAsync() method for SP_SERVER…

2213d86

…_DIAGNOSTIC wait types, and WAITFOR wait types.

Added TOP parameter for query in GetLongRunningQueriesAsync() method …

9b249fc

…to allow future configurability on the number of long-running queries returned. Added optional parameter to control display of WAITFOR types in future.

Replaced waitForFilter string constructor with C# string interpolatio…

adefd25

…n instead of janky string addition.

Merge pull request #2 from HannahVernon/feature/long-running-query-ex…

481a3a6

…clusions Feature/long running query exclusions

Added exclusion for backup-related waits to GetLongRunningQueriesAsyn…

f75bb2d

…c() method. Removed System.Collections.Generic using statement as it is unnecessary.

Merge pull request #4 from HannahVernon/feature/add-exclusion-for-BAC…

9c2c5f2

…KUPTHREAD-to-long-running-queries Added exclusion for backup-related waits to GetLongRunningQueriesAsync() method.

Reverted Controls/LandingPage.xaml.cs

fd9b497

Added BROKER_RECEIVE_WAITFOR wait type to waitforFilter exclusions. A…

a98ef7b

…dded miscWaitsFilter to exclude XE_LIVE_TARGET_TVF waits. Removed unused parameters. Corrected minimum value for maxLongRunningQueryCount (minimum 1 instead of 5).

Merge pull request #5 from HannahVernon/feature/add-exclusion-for-BAC…

b6d3d3f

…KUPTHREAD-to-long-running-queries Feature/add exclusion for backupthread to long running queries

Added filtering for SP_SERVER_DIAGNOSTICS, WAITFOR, BROKER_RECEIVE_WA…

ab1c47d

…ITFOR, BACKUPTHREAD, BACKUPIO, and XE_LIVE_TARGET_TVF wait types in Lite/Services/LocalDataService.WaitStats.cs

Merge pull request #6 from HannahVernon/feature/sync-long-running-que…

a061f99

…ries-monitor-changes-to-lite Added filtering for various wait types to long running query code in Lite.

Merge branch 'erikdarlingdata:dev' into dev

4962abd

Merge pull request #7 from HannahVernon/feature/long-running-queries-…

e10771e

…config-settings Feature/long running queries config settings

merged with incoming dev branch

2f54765

Merged with incoming dev branch.

f8998d1

Merge branch 'dev' into feature/long-running-queries-config-settings

4d55c32

Merge pull request #8 from HannahVernon/feature/long-running-queries-…

d7a7e41

…config-settings Feature/long running queries config settings

Merged with incoming dev branch.

0c6e320

Fix CA1859: use List<string>? for BuildDeadlockContextAsync parameter

255cadb

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

erikdarlingdata reviewed Mar 4, 2026

View reviewed changes

HannahVernon and others added 3 commits March 5, 2026 11:16

erikdarlingdata reviewed Mar 5, 2026

View reviewed changes

HannahVernon and others added 7 commits March 5, 2026 13:56

Clear excluded databases on Restore Defaults in both editions

58c6c3e

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/add alert filter by database#427

Feature/add alert filter by database#427
HannahVernon wants to merge 36 commits intoerikdarlingdata:devfrom
HannahVernon:feature/add-alert-filter-by-database

HannahVernon commented Mar 4, 2026

Uh oh!

erikdarlingdata left a comment

Uh oh!

HannahVernon commented Mar 4, 2026

Uh oh!

HannahVernon commented Mar 5, 2026

Uh oh!

erikdarlingdata left a comment

Uh oh!

HannahVernon commented Mar 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

HannahVernon commented Mar 4, 2026

What does this PR do?

Which component(s) does this affect?

How was this tested?

Checklist

Uh oh!

erikdarlingdata left a comment

Choose a reason for hiding this comment

Uh oh!

HannahVernon commented Mar 4, 2026

Uh oh!

HannahVernon commented Mar 5, 2026

Uh oh!

erikdarlingdata left a comment

Choose a reason for hiding this comment

Code Review — 4 Items

1. GetFilteredDeadlockCountAsync catch block should return null, not 0

2. SQL string interpolation should use parameterized queries

3. Settings UI placement is misleading

4. Verify Dashboard DeadlockItem.DatabaseName is populated

Uh oh!

HannahVernon commented Mar 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `GetFilteredDeadlockCountAsync` catch block should return `null`, not `0`

4. Verify Dashboard `DeadlockItem.DatabaseName` is populated