Skip to content

Fix off-by-one in label/key malloc β€” missing NUL terminator#374

Open
aaranmcguire wants to merge 1 commit intoWizardMac:devfrom
aaranmcguire:fix/label-malloc-missing-nul
Open

Fix off-by-one in label/key malloc β€” missing NUL terminator#374
aaranmcguire wants to merge 1 commit intoWizardMac:devfrom
aaranmcguire:fix/label-malloc-missing-nul

Conversation

@aaranmcguire
Copy link
Copy Markdown

PR 3: Fix off-by-one in label/key malloc β€” missing NUL terminator

Summary

Two malloc calls in readstat_writer.c allocate strlen(s) bytes and then
memcpy exactly strlen(s) bytes β€” omitting the NUL terminator. This
produces a heap allocation with no null terminator, so any code that later
calls strlen on the stored string reads past the end of the allocation.

Both sites are in the public writer API and are exercised by every caller that
registers a value label or a string-keyed label.


Site 1: readstat_copy_label β€” label string for value labels

Vulnerable code

static void readstat_copy_label(readstat_value_label_t *value_label,
        const char *label) {
    if (label && strlen(label)) {
        value_label->label_len = strlen(label);
        value_label->label = malloc(value_label->label_len);        /* allocates N bytes */
        memcpy(value_label->label, label, value_label->label_len);  /* copies N bytes, no NUL */
    }
}

What goes wrong

strlen("Male") returns 4. malloc(4) returns a 4-byte buffer.
memcpy(..., 4) fills all 4 bytes with M, a, l, e.
There is no NUL at byte 4.

If any code later does strlen(value_label->label), it reads into the next
heap chunk until it finds a zero byte β€” undefined behaviour and a potential
information leak.

The label_len field is used to avoid strlen in most internal paths, so
this is silent in the common case but breaks if the label is ever treated as a
C string (e.g. in error messages, debug output, or downstream consumers of the
readstat_value_label_t struct).

Reproduction

readstat_label_set_t *ls = readstat_add_label_set(writer,
        READSTAT_TYPE_DOUBLE, "gender");
readstat_label_double_value(ls, 1.0, "Male");
/* value_label->label is now 4 bytes with no NUL.
   Any strlen(value_label->label) reads into adjacent heap memory. */

/* Detectable with AddressSanitizer:
   compile with -fsanitize=address and write any SAV/DTA with value labels. */

Fix

-value_label->label = malloc(value_label->label_len);
-memcpy(value_label->label, label, value_label->label_len);
+value_label->label = malloc(value_label->label_len + 1);
+if (value_label->label == NULL) return READSTAT_ERROR_MALLOC;
+memcpy(value_label->label, label, value_label->label_len + 1);

The function signature changes from void to readstat_error_t to propagate
the allocation failure.


Site 2: readstat_label_string_value β€” string key for string-keyed value labels

Vulnerable code

void readstat_label_string_value(readstat_label_set_t *label_set,
        const char *value, const char *label) {
    readstat_value_label_t *new_value_label = readstat_add_value_label(label_set, label);
    if (value && strlen(value)) {
        new_value_label->string_key_len = strlen(value);
        new_value_label->string_key = malloc(new_value_label->string_key_len);        /* N bytes */
        memcpy(new_value_label->string_key, value, new_value_label->string_key_len);  /* no NUL */
    }
}

What goes wrong

Identical issue: string_key is allocated without room for a NUL terminator.
String-keyed value labels are used in SPSS SAV files for string variables.

Fix

-new_value_label->string_key = malloc(new_value_label->string_key_len);
-memcpy(new_value_label->string_key, value, new_value_label->string_key_len);
+new_value_label->string_key = malloc(new_value_label->string_key_len + 1);
+if (new_value_label->string_key == NULL) return; /* or propagate error */
+memcpy(new_value_label->string_key, value, new_value_label->string_key_len + 1);

readstat_copy_label() allocates strlen(label) bytes and memcpy's
strlen(label) bytes β€” no NUL terminator. Any consumer calling
strlen() on the stored label reads past the end of the allocation.

readstat_label_string_value() has the same bug for string_key: it
stores the key string without a NUL terminator.

Both allocations now use strlen + 1 and copy strlen + 1 bytes.
readstat_copy_label() changes return type to readstat_error_t to
propagate allocation failure.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant