
bbrtj (Contributor) commented Dec 9, 2025

This change refactors the Perl_av_fetch function and moves the re-calculation of the key so that it happens only when the key is not found in the range between 0 and max_index. Fetching existing non-negative elements should be much more common than the alternatives, and this change should reduce the number of instructions needed for that case.

I've run the Porting/bench.pl script against all tests that start with expr::array; here are the average results:

       hacked  blead
       ------ ------
    Ir 100.00  97.63
    Dr 100.00 100.00
    Dw 100.00  97.14
  COND 100.00 101.30
   IND 100.00 100.00

COND_m 100.00 100.00
 IND_m 100.00  98.36

 Ir_m1 100.00 102.22
 Dr_m1 100.00 100.00
 Dw_m1 100.00 100.00

 Ir_mm 100.00 100.00
 Dr_mm 100.00 100.00
 Dw_mm 100.00 100.00

The tests that contributed to COND being higher were expr::array::pkg_1const_m1 and expr::array::lex_1const_m1, which is not unexpected. The single L1 cache miss comes from the expr::arrayhash::lex_3var test; surprisingly, the pkg version of the same test did not miss the cache.


  • This set of changes does not require a perldelta entry.

leonerd (Contributor) left a comment

Looks good aside from a code formatting adjustment

@bbrtj bbrtj changed the title Refactor Perl_av_fetch, optimize path for positive keys access Refactor Perl_av_fetch, optimize path for non-negative key access Dec 9, 2025
iabyn (Contributor) commented Dec 10, 2025 via email

bbrtj (Author, Contributor) commented Dec 10, 2025

I'm sorry, I should have checked git blame to see the origin of this code.

The Porting/bench.pl script showed up to 7.5% fewer data writes and up to 6% fewer instructions in some cases. The basis for my change is the assumption that negative-key access is much less common, which could make this change worthwhile. Especially where it really matters (looping through indexes), I've never seen a loop that iterates over negative keys. However, I am not very familiar with how branch prediction works, and whether randomly accessing a negative index of some array somewhere may cause branch prediction to fail and remove all the benefit.

If you believe this change gives no performance gain in real-world scenarios, then we can scrap this PR, or I can remake it to only move the emptiness label to the end, where it is probably more readable (unless that can mess up branch prediction too).

bbrtj (Author, Contributor) commented Dec 10, 2025

@iabyn After thinking this through, I agree. Stable performance is good.

I changed the code to do things differently:

  • added some comments, in particular one that marks the early key recalculation as a deliberate performance choice
  • reduced the number of branches even further by adjusting lval using multiplication
  • removed the need for a goto, and removed the label as well

I created this benchmark case, which attempts to trigger all paths in av_fetch (it may not actually trigger lval):

    'fetch_test' => {
        desc    => 'fetch test',
        setup   => 'my @a = 1..10;',
        code    => 'for (-9 .. 9) { $a[$_]; $a[$_ - 5]; $a[$_ + 5]; $a[$_] = 42; $a[$_+20] = 42; }',
    },

The results (new = what I committed now, hacked = what I had earlier):

       hacked   blead    new
       ------ ------- ------
    Ir 100.00   98.67  98.67
    Dr 100.00  100.00 100.00
    Dw 100.00   98.83  98.83
  COND 100.00  101.79 101.79
   IND 100.00  100.00 100.00

COND_m 100.00  135.64 169.14
 IND_m 100.00  100.00 100.00

 Ir_m1 100.00  100.00 100.00
 Dr_m1 100.00  100.00 100.00
 Dw_m1 100.00  100.00 100.00

 Ir_mm 100.00  100.00 100.00
 Dr_mm 100.00  100.00 100.00
 Dw_mm 100.00  100.00 100.00

So indeed blead had fewer conditional branch misses, but after my changes the code seems to have even fewer, while not worsening any other parameter compared to blead.
