Limit console output in CharsetsTest to prevent OOM in resultsSum.pl #7000

Merged
smlambert merged 2 commits into adoptium:master from annaibm:updateVerbose
Apr 17, 2026

@annaibm
Contributor

@annaibm annaibm commented Apr 9, 2026

Without this fix, MBCS_Tests_charsets_0 -> CharsetsTest prints all encoding/decoding errors to stdout, which gets captured into TestTargetResult and TAP files. With 33M+ errors on AIX ppc64, this causes resultsSum.pl to OOM when building the TAP file.

  • Cap console error output to MAX_CONSOLE_ERRORS_PER_CHARSET (10) per charset
  • Write all errors to charset_errors.txt for full debug details
  • Add a verbose flag (EXTRA_OPTIONS=-Dcharsets.verbose=true) for debug runs. It prints all errors to the console and should not be used in CI.

related: https://github.ibm.com/runtimes/automation/issues/921
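
The capping behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class `CappedConsoleReporter` and its methods are hypothetical names, and the actual CharsetsTest inlines this logic in its decode and encode loops. Only the constant name, its value of 10, and the `charsets.verbose` property come from the description.

```java
import java.io.PrintStream;

/** Hypothetical helper for illustration; the real CharsetsTest inlines this logic. */
final class CappedConsoleReporter {
    // Name and value taken from the PR description.
    private static final int MAX_CONSOLE_ERRORS_PER_CHARSET = 10;

    private final PrintStream console;
    private final PrintStream errorFile; // null if the file could not be opened
    private final boolean verbose;
    private int charsetErrCnt;

    CappedConsoleReporter(PrintStream console, PrintStream errorFile, boolean verbose) {
        this.console = console;
        this.errorFile = errorFile;
        this.verbose = verbose;
    }

    /** Call before testing each charset so the cap applies per charset. */
    void startCharset() {
        charsetErrCnt = 0;
    }

    void report(String charsetName, String detail) {
        charsetErrCnt++;
        // Every error goes to the file, regardless of the console cap.
        if (errorFile != null) {
            errorFile.println(charsetName + ":" + detail);
        }
        if (verbose || charsetErrCnt <= MAX_CONSOLE_ERRORS_PER_CHARSET) {
            console.println("  " + detail);
        } else if (charsetErrCnt == MAX_CONSOLE_ERRORS_PER_CHARSET + 1) {
            // Print the truncation notice exactly once per charset.
            console.println("  ... further errors for " + charsetName + " not shown on console");
        }
    }
}
```

Under these assumptions, a caller would construct it with `new CappedConsoleReporter(System.out, errorFile, Boolean.getBoolean("charsets.verbose"))`, so the console stays bounded at 11 lines per charset unless verbose mode is requested.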

Signed-off-by: Anna Babu Palathingal <anna.bp@ibm.com>
@annaibm
Contributor Author

annaibm commented Apr 9, 2026

Tested:

  • Grinder #59529: without verbose, no OOM ✅

    • charset_errors.txt archived in functional_test_output.tar.gz (Artifactory)
  • Grinder #59531: with EXTRA_OPTIONS: -Dcharsets.verbose=true, OOM reproduced as expected ❌
    verbose mode should not be used in CI runs — use only in personal Grinder runs for debugging

Starting Mics Charsets Test (verbose mode enabled,

Copilot AI left a comment

Pull request overview

Reduces stdout volume from CharsetsTest to prevent downstream TAP/result aggregation (resultsSum.pl) from running out of memory when extremely large numbers of charset mismatches occur (e.g., on AIX ppc64).

Changes:

  • Caps per-charset console error output (unless -Dcharsets.verbose=true is enabled).
  • Writes mismatch details to charset_errors.txt under test.output.dir (or . by default).
  • Adds summary counters for tested/skipped charsets and total errors.
File: functional/MBCS_Tests/charsets/src/CharsetsTest.java
Description: Limits console logging of mismatches, adds file logging and a verbose mode, and prints a run summary to reduce TAP/result size.

Copilot's findings

Comments suppressed due to low confidence (1)

functional/MBCS_Tests/charsets/src/CharsetsTest.java:314

  • The console message says further errors are "written to charset_errors.txt" even when errorFile could not be created (errorFile == null). Guard this message on errorFile != null or adjust the wording so it’s not misleading.
                    System.out.println(":"+sb0.toString()+"<>"+sb1.toString());
                } else if (charset_err_cnt == MAX_CONSOLE_ERRORS_PER_CHARSET + 1) {
                    System.out.println("  ... further errors for " + ce.charset().name() + " written to charset_errors.txt");
                }
  • Files reviewed: 1/1 changed files
  • Comments generated: 5

static int tested_cnt = 0;
static int skipped_cnt = 0;
static boolean verbose = Boolean.getBoolean("charsets.verbose");
static int MAX_CONSOLE_ERRORS_PER_CHARSET = 10;
Copilot AI Apr 10, 2026

MAX_CONSOLE_ERRORS_PER_CHARSET is a constant but is declared as a mutable static int. To match the existing constant style in this class (e.g., sbcs_space/dbcs_space/mix_space), make it static final (and ideally private) so it can’t be modified at runtime.

Suggested change
- static int MAX_CONSOLE_ERRORS_PER_CHARSET = 10;
+ private static final int MAX_CONSOLE_ERRORS_PER_CHARSET = 10;
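
For context, a small sketch of how the constant and the verbose flag fit together once the constant is final. The holder class `CharsetsTestConfig` and its accessor methods are hypothetical; in the PR these are plain static fields of CharsetsTest, and the flag is read once into a field rather than on each call.

```java
/** Hypothetical holder class; in the PR these are fields of CharsetsTest. */
final class CharsetsTestConfig {
    // static final: the compiler rejects any later assignment to this field.
    private static final int MAX_CONSOLE_ERRORS_PER_CHARSET = 10;

    private CharsetsTestConfig() {
        // No instances; this class only groups configuration.
    }

    static int maxConsoleErrors() {
        return MAX_CONSOLE_ERRORS_PER_CHARSET;
    }

    /**
     * Boolean.getBoolean reads a JVM system property and returns true only
     * when the property exists and equals "true" (ignoring case), for
     * example when the JVM is started with -Dcharsets.verbose=true.
     */
    static boolean isVerbose() {
        return Boolean.getBoolean("charsets.verbose");
    }
}
```

Any other property value, or an unset property, leaves verbose mode off, which is the CI-safe default.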

Comment on lines +226 to +245
err_cnt++;
charset_err_cnt++;

// Always write to error file
if (errorFile != null) {
    errorFile.print(cd.charset().name()+":");
    for(byte b : ba) errorFile.printf("\\x%02X",(int)b&0xFF);
    errorFile.println(":"+sb0.toString()+"<>"+sb1.toString());
}

// Print to console based on verbose flag or error limit
if (verbose || charset_err_cnt <= MAX_CONSOLE_ERRORS_PER_CHARSET) {
    if (charset_err_cnt == 1) {
        System.out.println("\n=== Decoding Errors in " + cd.charset().name() + " ===");
    }
    System.out.print(" ");
    for(byte b : ba) System.out.printf("\\x%02X",(int)b&0xFF);
    System.out.println(":"+sb0.toString()+"<>"+sb1.toString());
} else if (charset_err_cnt == MAX_CONSOLE_ERRORS_PER_CHARSET + 1) {
    System.out.println(" ... further errors for " + cd.charset().name() + " written to charset_errors.txt");
Copilot AI Apr 10, 2026

charset_err_cnt is shared between decoding and encoding. If a charset has decoding mismatches first, the later encoding mismatches won’t print the "=== Encoding Errors ===" header (and may be suppressed earlier than intended) because charset_err_cnt is already > 1. Consider tracking separate counters (or header flags) for decode vs encode, or resetting before the encoder phase.
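
One way to read this suggestion is to keep one counter per phase, so each phase gets its own header and its own cap. The sketch below assumes that reading; `PhaseErrorCounter` and the `Phase` enum are hypothetical names, and output is collected into a list rather than printed so that it is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: independent error counters for decoding and encoding. */
final class PhaseErrorCounter {
    static final int MAX_CONSOLE_ERRORS_PER_CHARSET = 10;

    enum Phase {
        DECODING("Decoding"), ENCODING("Encoding");

        final String label;

        Phase(String label) {
            this.label = label;
        }
    }

    // One slot per phase, so decoding errors never consume the encoding budget.
    private final int[] counts = new int[Phase.values().length];
    final List<String> consoleLines = new ArrayList<>();

    void report(Phase phase, String charsetName, String detail) {
        int n = ++counts[phase.ordinal()];
        if (n == 1) {
            // The header is printed on the first error of each phase.
            consoleLines.add("=== " + phase.label + " Errors in " + charsetName + " ===");
        }
        if (n <= MAX_CONSOLE_ERRORS_PER_CHARSET) {
            consoleLines.add("  " + detail);
        } else if (n == MAX_CONSOLE_ERRORS_PER_CHARSET + 1) {
            consoleLines.add("  ... further " + phase.label.toLowerCase(Locale.ROOT)
                    + " errors for " + charsetName + " not shown");
        }
    }
}
```

A fresh `PhaseErrorCounter` per charset keeps the limits per charset as well as per phase.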

for(byte b : ba) System.out.printf("\\x%02X",(int)b&0xFF);
System.out.println(":"+sb0.toString()+"<>"+sb1.toString());
} else if (charset_err_cnt == MAX_CONSOLE_ERRORS_PER_CHARSET + 1) {
System.out.println(" ... further errors for " + cd.charset().name() + " written to charset_errors.txt");
Copilot AI Apr 10, 2026

The console message says further errors are "written to charset_errors.txt" even when errorFile could not be created (errorFile == null). Guard this message on errorFile != null or adjust the wording so it’s not misleading.

This issue also appears on line 311 of the same file.

Suggested change
- System.out.println(" ... further errors for " + cd.charset().name() + " written to charset_errors.txt");
+ if (errorFile != null) {
+     System.out.println(" ... further errors for " + cd.charset().name() + " written to charset_errors.txt");
+ } else {
+     System.out.println(" ... further errors for " + cd.charset().name() + " not shown");
+ }

Comment on lines +408 to +411
System.out.println("Errors found: " + err_cnt);
if (err_cnt > 0) {
    System.out.println("Full error details written to: charset_errors.txt");
}
Copilot AI Apr 10, 2026

If charset_errors.txt cannot be created, the summary still prints "Full error details written to: charset_errors.txt". This should be conditional on errorFile being successfully opened (or print the earlier warning again / include the actual path).
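
A possible shape for that conditional summary is sketched below. The class `ErrorFileSummary`, its method names, and the warning wording are assumptions for illustration. The `test.output.dir` property with a default of `.` is taken from the pull request overview above.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintStream;

/** Hypothetical sketch: only claim the file was written if it was opened. */
final class ErrorFileSummary {

    /** Resolves charset_errors.txt under test.output.dir, or "." by default. */
    static File errorFilePath() {
        return new File(System.getProperty("test.output.dir", "."), "charset_errors.txt");
    }

    /** Returns null, after printing a warning, when the file cannot be created. */
    static PrintStream open(File path) {
        try {
            return new PrintStream(path);
        } catch (FileNotFoundException e) {
            System.out.println("Warning: could not create " + path + ": " + e.getMessage());
            return null;
        }
    }

    static String summary(int errCnt, PrintStream errorFile, File path) {
        StringBuilder sb = new StringBuilder("Errors found: ").append(errCnt);
        if (errCnt > 0) {
            sb.append(System.lineSeparator());
            if (errorFile != null) {
                // Report the actual path rather than a hard-coded file name.
                sb.append("Full error details written to: ").append(path.getPath());
            } else {
                sb.append("Error details were not saved: ")
                  .append(path.getPath()).append(" could not be created");
            }
        }
        return sb.toString();
    }
}
```

Printing the resolved path also helps when `test.output.dir` points somewhere other than the working directory.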

Comment on lines +229 to +234
// Always write to error file
if (errorFile != null) {
    errorFile.print(cd.charset().name()+":");
    for(byte b : ba) errorFile.printf("\\x%02X",(int)b&0xFF);
    errorFile.println(":"+sb0.toString()+"<>"+sb1.toString());
}
Copilot AI Apr 10, 2026

This writes every mismatch to charset_errors.txt unconditionally. With the reported 33M+ errors, this can create a multi-GB file and significantly slow the test or exhaust disk in CI. Consider capping file output by default (similar to console), and only enabling full file output under an explicit debug flag.
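
A capped file writer along these lines might look like the sketch below. The class `CappedErrorFile` and the constant name `MAX_FILE_ERRORS_PER_CHARSET` are hypothetical. The default of 1000 errors per charset and the `charsets.debug.file` property are the values reported later in this thread for the follow-up commit, and the wording of the truncation line is an assumption.

```java
import java.io.PrintStream;

/** Hypothetical sketch: cap file output per charset unless a debug flag is set. */
final class CappedErrorFile {
    // Hypothetical name; 1000 matches the default reported in the test results.
    static final int MAX_FILE_ERRORS_PER_CHARSET = 1000;

    private final PrintStream errorFile; // null if the file could not be opened
    private final boolean debugFile;     // e.g. Boolean.getBoolean("charsets.debug.file")
    private int charsetErrCnt;

    CappedErrorFile(PrintStream errorFile, boolean debugFile) {
        this.errorFile = errorFile;
        this.debugFile = debugFile;
    }

    /** Call before testing each charset so the cap applies per charset. */
    void startCharset() {
        charsetErrCnt = 0;
    }

    void write(String charsetName, String detail) {
        charsetErrCnt++;
        if (errorFile == null) {
            return;
        }
        if (debugFile || charsetErrCnt <= MAX_FILE_ERRORS_PER_CHARSET) {
            errorFile.println(charsetName + ":" + detail);
        } else if (charsetErrCnt == MAX_FILE_ERRORS_PER_CHARSET + 1) {
            // Leave one marker line so readers know the file is truncated.
            errorFile.println("... further errors for " + charsetName
                    + " omitted; rerun with -Dcharsets.debug.file=true for the full list");
        }
    }
}
```

With this shape the file grows by at most 1001 lines per charset by default, which bounds disk usage even when a platform produces millions of mismatches.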

…rset

-Dcharsets.verbose=true: Uncapped console output (local debugging only)
-Dcharsets.debug.file=true: Uncapped file output, limited console (CI-safe debugging)
- Update on review comments from copilot
System.out.println(" ... further errors for " + cd.charset().name() + " written to charset_errors.txt");
} else if (charset_decode_err_cnt == MAX_CONSOLE_ERRORS_PER_CHARSET + 1) {
if (errorFile != null) {
System.out.println(" ... further decoding errors for " + cd.charset().name() + " written to charset_errors.txt");
Contributor

Do we want a proper logger as opposed to s.o.p?

Contributor Author

As this is a test file, and System.out.println is the approach used throughout the existing test suite, I kept it for consistency with the rest of the test file.

Contributor

We can look at other improvements to these tests in separate PRs.

@smlambert smlambert requested a review from pshipton April 13, 2026 17:41
@annaibm
Contributor Author

annaibm commented Apr 13, 2026

TESTED RESULTS

  1. Grinders: No flags (default)
    https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/59546/ (no verbose) ⚠️
    Console: shows first 10 errors per charset only
    File (charset_errors.txt): writes up to 1000 errors per charset (Artifactory)
    Good for normal CI runs — quiet output, limited file size

  2. EXTRA_OPTIONS: -Dcharsets.verbose=true
    https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/59547/ ❌ OOM
    Console: shows ALL errors per charset (can be huge)
    File: same as default (1000 per charset)
    Only use locally when debugging a specific charset

  3. EXTRA_OPTIONS: -Dcharsets.debug.file=true
    https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/59548/ ⚠️
    Console: same as default (first 10 only)
    File: writes ALL errors with no cap (Artifactory)
    CI-safe — doesn't affect console output, but file can get very large

@annaibm annaibm requested a review from karianna April 16, 2026 17:39
@smlambert smlambert merged commit 3b2a0eb into adoptium:master Apr 17, 2026
2 checks passed