@iamleot iamleot commented Aug 31, 2025

The isascii(3), isspace(3), isupper(3), tolower(3) and toupper(3) functions only accept values that are representable as unsigned char or equal to EOF. For all other values the behavior is undefined.

Cast the arguments to unsigned char to avoid undefined behavior.
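As a minimal sketch of the cast pattern (the helper names `lower_in_place` and `ends_with_space` are hypothetical and not taken from this patch), the argument is converted to unsigned char before it reaches the ctype function:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: lowercase a byte buffer in place. */
static void lower_in_place(char *s, size_t len) {
    for (size_t i = 0; i < len; i++) {
        /* Cast the argument: where char is signed, bytes >= 0x80 are
         * negative and would otherwise be outside the accepted range. */
        s[i] = (char)tolower((unsigned char)s[i]);
    }
}

/* Hypothetical helper: returns 1 if the last byte is whitespace, else 0. */
static int ends_with_space(const char *s, size_t len) {
    return len > 0 && isspace((unsigned char)s[len - 1]);
}
```

Casting to unsigned char rather than to int matters here: a plain int conversion would keep the negative value.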


This was noticed on NetBSD.


Building commit a61f178 on NetBSD/amd64 -current (11.99.1) produces the following compiler warnings:

$ ./build.sh
[...]
--- src/ignore.o ---
In file included from /usr/include/ctype.h:100,
                 from src/ignore.c:1:
src/ignore.c: In function ‘add_ignore_pattern’:
src/ignore.c:110:29: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  110 |         if (!isspace(pattern[pattern_len - 1])) {
      |                             ^
[...]
--- src/util.o ---
  CC       src/util.o
In file included from /usr/include/ctype.h:100,
                 from src/util.c:1:
src/util.c: In function ‘generate_alpha_skip’:
src/util.c:82:52: warning: array subscript has type ‘char’ [-Wchar-subscripts]
   82 |             skip_lookup[(unsigned char)tolower(find[i])] = f_len - i;
      |                                                    ^
src/util.c:83:52: warning: array subscript has type ‘char’ [-Wchar-subscripts]
   83 |             skip_lookup[(unsigned char)toupper(find[i])] = f_len - i;
      |                                                    ^
src/util.c: In function ‘is_prefix’:
src/util.c:97:26: warning: array subscript has type ‘char’ [-Wchar-subscripts]
   97 |             if (tolower(s[i]) != tolower(s[i + pos])) {
      |                          ^
src/util.c:97:43: warning: array subscript has type ‘char’ [-Wchar-subscripts]
   97 |             if (tolower(s[i]) != tolower(s[i + pos])) {
      |                                           ^
src/util.c: In function ‘suffix_len’:
src/util.c:115:26: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  115 |             if (tolower(s[pos - i]) != tolower(s[s_len - i - 1])) {
      |                          ^
src/util.c:115:49: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  115 |             if (tolower(s[pos - i]) != tolower(s[s_len - i - 1])) {
      |                                                 ^
src/util.c: In function ‘boyer_moore_strnstr’:
src/util.c:193:68: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  193 |         for (i = f_len - 1; i >= 0 && (case_insensitive ? tolower(s[pos]) : s[pos]) == find[i]; pos--, i--) {
      |                                                                    ^
src/util.c: In function ‘hash_strnstr’:
src/util.c:220:55: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  220 |                 if ((case_sensitive ? R[i] : tolower(R[i])) != find[i])
      |                                                       ^
src/util.c:232:57: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  232 |             char s_c = case_sensitive ? R[i] : tolower(R[i]);
      |                                                         ^
src/util.c: In function ‘is_lowercase’:
src/util.c:462:40: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  462 |         if (!isascii(s[i]) || isupper(s[i])) {
      |                                        ^
[...]
--- src/main.o ---
  CC       src/main.o
In file included from /usr/include/ctype.h:100,
                 from src/main.c:1:
src/main.c: In function ‘main’:
src/main.c:120:36: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  120 |                 *c = (char)tolower(*c);
      |                                    ^
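
The warnings at src/util.c:82 and src/util.c:83 show a cast that is already present but applied to the return value of tolower()/toupper(), so the plain char still reaches the ctype function. A hedged sketch of one way to handle that shape of code follows; `fill_skip` and its parameter types are assumptions for illustration, not the project's actual function:

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical table filler modeled on the skip_lookup lines in the log.
 * The caller is expected to have initialized the table beforehand. */
static void fill_skip(size_t skip[UCHAR_MAX + 1], const char *find, size_t f_len) {
    for (size_t i = 0; i < f_len; i++) {
        /* Convert once, before any ctype call sees the value. */
        unsigned char c = (unsigned char)find[i];
        /* The outer cast only makes the index type explicit. */
        skip[(unsigned char)tolower(c)] = f_len - i;
        skip[(unsigned char)toupper(c)] = f_len - i;
    }
}
```

Converting once into a local unsigned char keeps both calls in range without repeating the cast at every call site.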

For more information about ctype(3) misuse, see the CAVEATS section of the NetBSD ctype(3) man page.

@iamleot iamleot marked this pull request as ready for review August 31, 2025 12:20