Fix possible invalid write and overflow issues in kmemmem, kstrstr and kstrnstr #1980

daviesrob · 2026-01-08T14:02:45Z

Fixes various issues in the kmemmem() interface and its
kstrstr() and kstrnstr() wrappers:

Calling kmemmem(), kstrstr() or kstrnstr() with a zero-length pattern could lead to a zero-length allocation in the ksBM_prep() function and, if that returns a non-NULL suff pointer, an invalid write to suff[-1]. Thanks to Harrison Green for reporting this.

Fixed by catching zero-length patterns and returning the correct result - a pointer to the start of the input.

As an additional optimisation, patterns of length 1 are now passed to memchr() as it's likely to be more efficient in that case.

Finally if the pattern is longer that the input, NULL is returned early as it can never match. This is also useful for some of the other changes, as it ensures subtracting the two lengths will never give a negative result.
The kmemmem() interface uses int for the sizes of the pattern and input buffers, but the kstrstr() wrapper passed in the output of strlen() which could overflow.

The implementation is renamed and altered to use size_t for the length of the str buffer. kmemmem() is now a simple wrapper around it that handles the possibility of n being negative. The length of the pat buffer is left as an int as otherwise the type and size of the prep array would need to be changed. This would be wasteful as the pattern is very likely to be short.
It was difficult to tell the difference between the algorithm failing because it couldn't allocate memory, or the pattern simply not being present (both returned NULL).

This is fixed by providing Karp-Rabin as a backup algorithm. It's fairly simple and doesn't need to allocate any memory so should always work. It's also used to cover the unlikely case of the kstrstr() pattern being longer than INT_MAX.
kstrnstr() did not check for a NUL in the first n bytes of its input. This could lead to it finding false matches beyond the strict end of its input, or possibly reads of uninitialised memory.

Fixed by checking for a NUL in the first n bytes and adjusting n if one is found. It also returns NULL if the pattern is longer than the revised n which both avoids unnecessary work and ensures the pattern length <= INT_MAX.

test/test_kstring.c is updated to add some simple kmemmem, kstrstr, and kstrnstr tests. It's also made to include kstring.c directly so that it can unit test non-exported internal functions. This allows some tests for the Karp-Rabin implementation to be added.

Fixes various issues in the kmemmem() interface and its kstrstr() and kstrnstr() wrappers: * Calling kmemmem(), kstrstr() or kstrnstr() with a zero-length pattern could lead to a zero-length allocation in the ksBM_prep() function and, if that returns a non-NULL `suff` pointer, an invalid write to suff[-1]. Thanks to Harrison Green for reporting this. Fixed by catching zero-length patterns and returning the correct result - a pointer to the start of the input. As an additional optimisation, patterns of length 1 are now passed to memchr() as it's likely to be more efficient in that case. Finally if the pattern is longer that the input, NULL is returned early as it can never match. This is also useful for some of the other changes, as it ensures subtracting the two lengths will never give a negative result. * The kmemmem() interface uses `int` for the sizes of the pattern and input buffers, but the kstrstr() wrapper passed in the output of strlen() which could overflow. The implementation is renamed and altered to use `size_t` for the length of the `str` buffer. kmemmem() is now a simple wrapper around it that handles the possibility of `n` being negative. The length of the `pat` buffer is left as an int as otherwise the type and size of the `prep` array would need to be changed. This would be wasteful as the pattern is very likely to be short. * It was difficult to tell the difference between the algorithm failing because it couldn't allocate memory, or the pattern simply not being present (both returned NULL). This is fixed by providing Karp-Rabin as a backup algorithm. It's fairly simple and doesn't need to allocate any memory so should always work. It's also used to cover the unlikely case of the kstrstr() pattern being longer than INT_MAX. * kstrnstr() did not check for a NUL in the first n bytes of its input. This could lead to it finding false matches beyond the strict end of its input, or possibly reads of uninitialised memory. Fixed by checking for a NUL in the first n bytes and adjusting n if one is found. It also returns NULL if the pattern is longer than the revised n which both avoids unnecessary work and ensures n <= INT_MAX.

Adjusts test_kstring to directly include kstring.c so that it can unit test non-exported internal functions.

daviesrob added 2 commits January 7, 2026 11:34

Add simple kmemmem, kstrstr and kstrnstr tests

77c711f

Adjusts test_kstring to directly include kstring.c so that it can unit test non-exported internal functions.

daviesrob assigned whitwham Jan 15, 2026

whitwham merged commit 1274083 into samtools:develop Jan 19, 2026
9 checks passed

daviesrob deleted the kmemmem branch January 19, 2026 12:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix possible invalid write and overflow issues in kmemmem, kstrstr and kstrnstr #1980

Fix possible invalid write and overflow issues in kmemmem, kstrstr and kstrnstr #1980

Uh oh!

daviesrob commented Jan 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fix possible invalid write and overflow issues in kmemmem, kstrstr and kstrnstr #1980

Fix possible invalid write and overflow issues in kmemmem, kstrstr and kstrnstr #1980

Uh oh!

Conversation

daviesrob commented Jan 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants