Skip to content

Fix out-of-bounds array write in POR missing-value reader#376

Open
aaranmcguire wants to merge 1 commit intoWizardMac:devfrom
aaranmcguire:fix/por-oob-array-write
Open

Fix out-of-bounds array write in POR missing-value reader#376
aaranmcguire wants to merge 1 commit intoWizardMac:devfrom
aaranmcguire:fix/por-oob-array-write

Conversation

@aaranmcguire
Copy link
Copy Markdown

PR 5: Fix out-of-bounds array write in POR missing-value reader

Summary

In spss/readstat_por_read.c, the bounds check on the
missing_double_values[3] array is performed after the write, not before
it. When n_missing_values equals 3, the code writes to
missing_double_values[3] — one slot past the end of the array — before
detecting the error.


Vulnerable code

/* spss/readstat_por_read.c — read_missing_value_record (approx.) */

if (varinfo->type == READSTAT_TYPE_DOUBLE) {
    if ((retval = read_double(ctx,
            &varinfo->missing_double_values[varinfo->n_missing_values])) != READSTAT_OK) {
        goto cleanup;
    }
} else {
    if ((retval = read_string(ctx,
            varinfo->missing_string_values[varinfo->n_missing_values], ...)) != READSTAT_OK) {
        goto cleanup;
    }
}
if (varinfo->n_missing_values > 2) {   /* ← check AFTER the write */
    retval = READSTAT_ERROR_PARSE;
    goto cleanup;
}
varinfo->n_missing_values++;

The struct declaration (spss/readstat_spss.h):

typedef struct spss_varinfo_s {
    ...
    double  missing_double_values[3];   /* indices 0, 1, 2 */
    ...
} spss_varinfo_t;

Why it is a real bug

SPSS allows up to 3 discrete missing values per variable. The POR reader
calls read_missing_value_record once per missing-value token in the file.
After the third call, varinfo->n_missing_values == 3.

On the fourth call (a malformed file that encodes a fourth missing value):

  • varinfo->missing_double_values[3] is written — one element past the
    3-element array
    .
  • The subsequent if (n_missing_values > 2) check fires — but too late.

The write lands in whatever field follows missing_double_values in
spss_varinfo_t. Looking at the struct:

double   missing_double_values[3];   /* bytes  0–23 */
char     missing_string_values[3][9]; /* bytes 24–50 */

So the fourth double write (8 bytes) overwrites the first 8 bytes of
missing_string_values[0] — corrupting the string-missing storage. Whether
this matters in practice depends on what the code reads from
missing_string_values afterward, but the memory corruption is unconditional.


Reproduction

# Craft a POR file where a variable declares 4 discrete missing values.
# POR missing-value records use the character '8' followed by type.
# A conforming file has at most 3; set 4 to trigger the bug.
#
# Simplest test: any POR fuzzer will find this immediately since the
# check is right there and the array bounds are small.
Minimal trigger bytes in the POR missing-value section:
  8  (missing value record type)
  <value1>\  8  <value2>\  8  <value3>\  8  <value4>\
On the 4th record, n_missing_values == 3 and the write goes OOB.

With AddressSanitizer:

ERROR: AddressSanitizer: heap-buffer-overflow
WRITE of size 8 at offset 24 in spss_varinfo_t.missing_double_values

Fix

Move the bounds check before the write:

+if (varinfo->n_missing_values >= 3) {
+    retval = READSTAT_ERROR_PARSE;
+    goto cleanup;
+}
 if (varinfo->type == READSTAT_TYPE_DOUBLE) {
     if ((retval = read_double(ctx,
             &varinfo->missing_double_values[varinfo->n_missing_values])) != READSTAT_OK) {
         goto cleanup;
     }
 } else { ... }
-if (varinfo->n_missing_values > 2) {
-    retval = READSTAT_ERROR_PARSE;
-    goto cleanup;
-}
 varinfo->n_missing_values++;

The condition changes from > 2 (post-write, after index 3 has been used) to
>= 3 (pre-write, rejecting index 3 before it is accessed).

The bounds check on missing_double_values[3] was performed after the
write, not before. When n_missing_values == 3, the code wrote to
missing_double_values[3] — one element past the end of the array —
before detecting the error.

The struct layout (spss/readstat_spss.h):
  double  missing_double_values[3];    /* bytes  0-23 */
  char    missing_string_values[3][9]; /* bytes 24-50 */

So the fourth write (8 bytes at offset 24) overwrites the first 8
bytes of missing_string_values[0], corrupting string-missing storage.

Trigger: a POR file with a variable declaring 4 discrete missing values
(malformed, but trivially crafted; any fuzzer finds this immediately).

Fix: move the n_missing_values >= 3 check before the array write.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant