MDEV-38305: Expose adaptive hash index statistics in ANALYZE FORMAT=JSON #5069

Draft
mariadb-TafzeelShams wants to merge 1 commit into main from main-MDEV-38305

@mariadb-TafzeelShams
Contributor

  • The Jira issue number for this PR is: MDEV-38305

Description

MDEV-38305: Expose adaptive hash index statistics in ANALYZE FORMAT=JSON

Expose InnoDB's Adaptive Hash Index (AHI) statistics through ANALYZE
FORMAT=JSON output to provide query-level visibility into AHI usage
and effectiveness. This allows DBAs and developers to monitor how well
the adaptive hash index is serving their workloads on a per-query basis.

The r_ahi_stats object (nested inside r_engine_stats) now reports four
key metrics: ahi_searches (successful AHI lookups), ahi_searches_btree
(AHI misses requiring B-tree fallback), ahi_rows_added (rows inserted
into AHI), and ahi_pages_added (pages indexed by AHI).

  • btr_ahi_inc_searches(): Increment counter when AHI lookup succeeds.
  • btr_ahi_inc_searches_btree(): Increment counter when AHI lookup fails
    and falls back to B-tree search.
  • btr_ahi_inc_rows_added(): Increment counter when rows are added to
    the adaptive hash index structure.
  • btr_ahi_inc_pages_added(): Increment counter when new pages are
    indexed by AHI.
  • btr_cur_t::search_leaf(): Call btr_ahi_inc_searches() on successful
    AHI hit and btr_ahi_inc_searches_btree() on AHI miss to track search
    outcomes at the point where AHI is utilized.
  • trace_engine_stats(): Output r_ahi_stats object with all four AHI
    counters in JSON format when any AHI activity is detected during query
    execution.
  • ha_handler_stats: Added ahi_searches, ahi_searches_btree, ahi_rows_added,
    and ahi_pages_added fields to track per-query AHI statistics.
  • ahi_stats.test: Comprehensive verification of AHI statistics reporting
    across different scenarios: insufficient accesses (no AHI build),
    threshold triggering (AHI construction), heavy warmup (full AHI
    utilization), and disabled AHI (verify zero statistics).
  • check_ahi_status.inc: Reusable include file for executing queries with
    configurable warmup repetitions and extracting AHI statistics from
    ANALYZE FORMAT=JSON output using JSON path expressions.
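
As a rough illustration of the reporting rule described for trace_engine_stats(), the sketch below emits an r_ahi_stats object only when at least one counter is nonzero. The names AhiStats and format_ahi_stats() are hypothetical and exist only for this example; the actual patch writes through the server's own JSON writer, which is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the four per-query AHI counters
// that the patch adds to ha_handler_stats.
struct AhiStats
{
  std::uint64_t ahi_searches= 0;        // successful AHI lookups
  std::uint64_t ahi_searches_btree= 0;  // AHI misses that fell back to the B-tree
  std::uint64_t ahi_rows_added= 0;      // rows inserted into the AHI
  std::uint64_t ahi_pages_added= 0;     // pages indexed by the AHI
};

// Returns the r_ahi_stats JSON fragment, or an empty string when
// no AHI activity was recorded (so the object is omitted entirely).
std::string format_ahi_stats(const AhiStats &s)
{
  if (!(s.ahi_searches | s.ahi_searches_btree |
        s.ahi_rows_added | s.ahi_pages_added))
    return std::string();

  return "\"r_ahi_stats\": {"
         "\"ahi_searches\": " + std::to_string(s.ahi_searches) +
         ", \"ahi_searches_btree\": " + std::to_string(s.ahi_searches_btree) +
         ", \"ahi_rows_added\": " + std::to_string(s.ahi_rows_added) +
         ", \"ahi_pages_added\": " + std::to_string(s.ahi_pages_added) +
         "}";
}
```

With all counters at zero the function returns an empty string, which corresponds to the "disabled AHI" and "insufficient accesses" scenarios in which no r_ahi_stats object would be expected.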

How can this PR be tested?

Added mysql-test/suite/innodb/t/ahi_stats.test, which runs queries at different stages of AHI construction and checks the statistics reported for each query.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request exposes InnoDB Adaptive Hash Index (AHI) statistics, including successful lookups, B-tree fallback searches, and rows/pages added, within the ANALYZE FORMAT=JSON output. The implementation adds new counters to the handler statistics and instruments the InnoDB storage engine to populate them. Feedback suggests initializing the new statistics fields to zero to prevent reporting garbage values and ensuring they are reset between queries to avoid data leakage. Additionally, there is a concern regarding the performance overhead of performing thread-local lookups for statistics on every AHI search, suggesting a potential optimization by passing or caching the transaction pointer in the search path.

Comment thread sql/ha_handler_stats.h
Comment on lines +44 to +47
ulonglong ahi_searches; /* Successful adaptive hash lookups */
ulonglong ahi_searches_btree; /* B-tree searches (AHI miss) */
ulonglong ahi_rows_added; /* Rows added to adaptive hash index */
ulonglong ahi_pages_added; /* Pages added to adaptive hash index */

high

The new fields in ha_handler_stats should be initialized to zero to prevent reporting garbage values in ANALYZE FORMAT=JSON. Additionally, ensure that these fields are cleared in the ha_handler_stats::reset() method (and initialized in the constructor if applicable) so that statistics do not leak between different queries or table accesses within the same session.

Suggested change
- ulonglong ahi_searches; /* Successful adaptive hash lookups */
- ulonglong ahi_searches_btree; /* B-tree searches (AHI miss) */
- ulonglong ahi_rows_added; /* Rows added to adaptive hash index */
- ulonglong ahi_pages_added; /* Pages added to adaptive hash index */
+ ulonglong ahi_searches = 0; /* Successful adaptive hash lookups */
+ ulonglong ahi_searches_btree = 0; /* B-tree searches (AHI miss) */
+ ulonglong ahi_rows_added = 0; /* Rows added to adaptive hash index */
+ ulonglong ahi_pages_added = 0; /* Pages added to adaptive hash index */
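
The combination of zero initialization and an explicit reset can be sketched as follows. HandlerStatsSketch is a hypothetical, reduced stand-in for ha_handler_stats that models only the four new counters; how the real structure is cleared between queries is not shown in this excerpt, so this is an assumption about the intended behavior rather than the actual implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, reduced stand-in for ha_handler_stats; only the
// four AHI counters from this PR are modeled.
struct HandlerStatsSketch
{
  std::uint64_t ahi_searches= 0;        // successful AHI lookups
  std::uint64_t ahi_searches_btree= 0;  // B-tree searches after an AHI miss
  std::uint64_t ahi_rows_added= 0;      // rows added to the AHI
  std::uint64_t ahi_pages_added= 0;     // pages added to the AHI

  // Clear every counter so one query's numbers cannot carry over
  // into the next query's report.
  void reset() noexcept { *this= HandlerStatsSketch(); }
};
```

Assigning a freshly constructed object in reset() keeps the zeroing logic in one place: any counter added later with a default member initializer is cleared automatically.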

Comment on lines +86 to +118
void btr_ahi_inc_searches() noexcept
{
  if (THD *thd= current_thd)
    if (trx_t *trx= thd_to_trx(thd))
      if (ha_handler_stats *stats= trx->active_handler_stats)
        stats->ahi_searches++;
}

void btr_ahi_inc_searches_btree() noexcept
{
  if (THD *thd= current_thd)
    if (trx_t *trx= thd_to_trx(thd))
      if (ha_handler_stats *stats= trx->active_handler_stats)
        stats->ahi_searches_btree++;
}

void btr_ahi_inc_rows_added(ulonglong count) noexcept
{
  MONITOR_INC_VALUE(MONITOR_ADAPTIVE_HASH_ROW_ADDED, count);
  if (THD *thd= current_thd)
    if (trx_t *trx= thd_to_trx(thd))
      if (ha_handler_stats *stats= trx->active_handler_stats)
        stats->ahi_rows_added+= count;
}

void btr_ahi_inc_pages_added() noexcept
{
  MONITOR_INC(MONITOR_ADAPTIVE_HASH_PAGE_ADDED);
  if (THD *thd= current_thd)
    if (trx_t *trx= thd_to_trx(thd))
      if (ha_handler_stats *stats= trx->active_handler_stats)
        stats->ahi_pages_added++;
}

medium

While using current_thd and thd_to_trx is the standard way to access per-query statistics in MariaDB, doing so on every Adaptive Hash Index (AHI) search lookup (both hits and misses) introduces overhead in a very performance-critical path. For high-concurrency workloads where AHI is most beneficial, this overhead might be measurable. Consider if the trx_t pointer can be passed down or cached in the cursor to avoid repeated thread-local lookups, although this would require a more significant refactoring of the B-tree search interface.
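
One possible shape of the caching idea, as a minimal sketch: the statistics pointer is resolved once when a cursor is set up, and each search then touches only the cached pointer. StatsSketch, active_stats_sketch, and CursorSketch are hypothetical names for this illustration; they do not correspond to the real btr_cur_t, THD, or trx_t interfaces, and the thread-local variable merely stands in for the current_thd to trx to stats lookup chain.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-query counters (stand-in for ha_handler_stats).
struct StatsSketch
{
  std::uint64_t ahi_searches= 0;
  std::uint64_t ahi_searches_btree= 0;
};

// Hypothetical stand-in for the current_thd -> trx -> stats chain:
// a thread-local pointer that is null when no statistics are collected.
thread_local StatsSketch *active_stats_sketch= nullptr;

// Hypothetical cursor that resolves the stats pointer once, when it is
// constructed, instead of on every search.
struct CursorSketch
{
  StatsSketch *stats;

  CursorSketch() noexcept : stats(active_stats_sketch) {}

  // Record the outcome of one search using only the cached pointer.
  void record_search(bool ahi_hit) noexcept
  {
    if (!stats)
      return;
    if (ahi_hit)
      stats->ahi_searches++;
    else
      stats->ahi_searches_btree++;
  }
};
```

The trade-off is lifetime management: a cached pointer must not outlive the statistics object it refers to, which is part of why such a change would reach into the search interface rather than stay local to the counter helpers.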
