perf: implement fast Get for integral types by TerrorJack · Pull Request #216 · haskell/binary

TerrorJack · 2025-11-02T14:23:40Z

This patch implements fast `Get` logic for integral types based on:

- Use a single load operation when loading with the same endianness as the host; otherwise do a host load and a byteSwap. This avoids the overhead of multiple single-byte loads in the previous implementation.
- Use the unaligned Addr# load/store primops available since GHC 9.10 when possible; otherwise do a plain peek. This ensures the GHC backends see the right AlignmentSpec at the Cmm level and can correctly emit unaligned load instructions.
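
To make the first point concrete, here is a minimal sketch (not the code from this patch) contrasting a byte-by-byte big-endian decode with a single host-order load followed by a conditional byteSwap. The names `slowWord32be`, `hostWord32` and `fastWord32be` are hypothetical. The sketch picks the endianness via `targetByteOrder` from `GHC.ByteOrder` rather than the `WORDS_BIGENDIAN` CPP macro, and it only shows the plain `peek` fallback, which assumes the host tolerates unaligned loads.

```haskell
import Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as B
import qualified Data.ByteString.Unsafe as BU
import Data.Word (Word32, byteSwap32)
import Foreign.Ptr (castPtr)
import Foreign.Storable (peek)
import GHC.ByteOrder (ByteOrder (..), targetByteOrder)
import System.IO.Unsafe (unsafeDupablePerformIO)

-- | Byte-by-byte big-endian decode: four single-byte loads plus shifts.
-- Precondition (unchecked): the input holds at least 4 bytes.
slowWord32be :: B.ByteString -> Word32
slowWord32be s =
  (fromIntegral (s `BU.unsafeIndex` 0) `shiftL` 24)
    .|. (fromIntegral (s `BU.unsafeIndex` 1) `shiftL` 16)
    .|. (fromIntegral (s `BU.unsafeIndex` 2) `shiftL` 8)
    .|. fromIntegral (s `BU.unsafeIndex` 3)

-- | One 32-bit load in host byte order via a plain peek.
-- Assumption: the host tolerates unaligned loads (true on x86-64 and AArch64).
-- Precondition (unchecked): the input holds at least 4 bytes.
hostWord32 :: B.ByteString -> Word32
hostWord32 s =
  unsafeDupablePerformIO $ BU.unsafeUseAsCString s (peek . castPtr)

-- | Big-endian decode as one host load, byte-swapped only on
-- little-endian hosts.
fastWord32be :: B.ByteString -> Word32
fastWord32be s = case targetByteOrder of
  BigEndian -> hostWord32 s
  LittleEndian -> byteSwap32 (hostWord32 s)
```

Both decoders should agree on any input of at least four bytes; the fast variant simply replaces four loads and three shifts with one load and at most one swap.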

There's no need to change the `Put` logic: it is backed by the `FixedPrim` logic in `Data.ByteString.Builder.Prim.Binary`, which already does a similar optimization.
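
As a rough illustration of that relationship, the sketch below checks that `putWord32be` and the public `word32BE` fixed primitive from `Data.ByteString.Builder.Prim` produce the same bytes, and that `getWord32be` reads them back. The helper names `viaPut`, `viaFixedPrim` and `roundTrip` are hypothetical, and the sketch only compares observable output; it does not show how `binary` wires `Put` to `FixedPrim` internally.

```haskell
import Data.Binary.Get (getWord32be, runGet)
import Data.Binary.Put (putWord32be, runPut)
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Builder.Prim as P
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word32)

-- | Serialize through binary's Put interface.
viaPut :: Word32 -> BL.ByteString
viaPut = runPut . putWord32be

-- | Serialize through bytestring's fixed-size primitive directly.
viaFixedPrim :: Word32 -> BL.ByteString
viaFixedPrim = BB.toLazyByteString . P.primFixed P.word32BE

-- | Write with Put, read back with Get.
roundTrip :: Word32 -> Word32
roundTrip = runGet getWord32be . viaPut
```

For any `Word32` value, `viaPut` and `viaFixedPrim` are expected to yield identical byte strings, and `roundTrip` is expected to be the identity.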

Closes #215.

Bodigrim

(I'm not a maintainer here)

binary.cabal

Bodigrim · 2025-11-02T15:23:44Z

src/Data/Binary/Get.hs

-        (fromIntegral (s `B.unsafeIndex` 1))
-{-# INLINE[2] getWord16be #-}
-{-# INLINE word16be #-}
+#if defined(WORDS_BIGENDIAN)


Is it feasible to add a s390x job to CI? See https://github.com/haskell/bytestring/blob/master/.github/workflows/ci.yml#L121 for instance. Otherwise #if defined(WORDS_BIGENDIAN) tends to bit rot really quickly.

That'll be an extra source of flakiness until https://gitlab.haskell.org/ghc/ghc/-/issues/25541 is sorted out.

I think it's a good idea to run CI on a big endian arch, but that can be done in a later PR.

konsumlamm · 2026-03-11T09:07:28Z

I tried to benchmark the use of the unaligned load primops, but I couldn't notice any difference. Do you have any idea why that might be?

This is my benchmark:

import Control.Monad

import Data.Binary.Get
import Data.Binary.Put

import Test.Tasty.Bench

main :: IO ()
main = defaultMain
    [ bench "" $ whnf (runGet getData) bs
    ]
  where
    n = 100000
    getData = fmap sum $ replicateM n $ do
        w8 <- getWord8
        w16 <- getWord16host
        w32 <- getWord32host
        w64 <- getWord64host
        pure $! fromIntegral w8 + fromIntegral w16 + fromIntegral w32 + w64
    bs = runPut $ replicateM_ n $ do
        putWord8 42
        putWord16host 12
        putWord32host 0xff00ff00
        putWord64host 0x0123456789abcdef

Bodigrim reviewed Nov 2, 2025

TerrorJack force-pushed the wip/fast-binary branch from eeaa5ea to 652ee91 Compare March 10, 2026 11:11

TerrorJack requested a review from konsumlamm March 10, 2026 11:12

TerrorJack wants to merge 1 commit into haskell:master from haskell-wasm:wip/fast-binary

konsumlamm Mar 5, 2026

3 participants
