Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,20 @@ PerlOnJava does **not** implement the following Perl features:

**NEVER modify or delete existing tests.** Tests are the source of truth. If a test fails, fix the code, not the test. When in doubt, verify expected behavior with system Perl (`perl`, not `jperl`).

**ALWAYS capture full test output to a file.** Test output can be very long and gets truncated in the terminal. Always redirect output to a file and read from there:
```bash
# For prove-based tests
prove src/test/resources/unit > /tmp/prove_output.txt 2>&1; echo "EXIT: $?" >> /tmp/prove_output.txt

# For jperl tests
./jperl test.t > /tmp/test_output.txt 2>&1

# For perl_test_runner.pl
perl dev/tools/perl_test_runner.pl perl5_t/t/op/ > /tmp/test_output.txt 2>&1

# Then read the results from the file
```

**ALWAYS use `make` commands. NEVER use raw mvn/gradlew commands.**

| Command | What it does |
Expand Down
110 changes: 110 additions & 0 deletions dev/design/fix_unit_test_parity.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
# Fix Unit Test Parity with System Perl

## Problem

Three unit test files in `src/test/resources/unit/` produce different results under jperl vs system Perl (v5.42). The goal is to fix jperl to match system Perl behavior, then update the tests accordingly.

## Failing Tests Summary

| Test File | Failing Tests | Root Cause |
|-----------|--------------|------------|
| `string_interpolation.t` | Test 4: `$\` interpolation | `"$\\"` in jperl produces only the value of `$\`, but system Perl produces `value_of($\)` + literal `\` |
| `sysread_syswrite.t` | Test 3: sysread from in-memory handle | jperl returns bytes read; system Perl returns `undef` |
| `ipc_open3.t` | Tests 3, 5, 7: stderr capture with false `$err` | jperl creates a separate stderr handle; system Perl merges stderr into stdout when `$err` is false (`""`) |

## Detailed Analysis

### 1. `$\` interpolation in double-quoted strings

**Observed behavior:**
```
# System Perl:
perl -e '$\ = "ABC"; print "[$\\]";' # → [ABC\]ABC
# jperl:
./jperl -e '$\ = "ABC"; print "[$\\]";' # → [ABC]ABC
```

In `"$\\"`, system Perl parses this as: interpolate `$\` + literal backslash (from `\\`). jperl only interpolates `$\` and loses the trailing `\`.

The issue is in how jperl's string interpolation scanner handles the `\\` escape sequence after recognizing the `$\` variable. After consuming `$` + `\` for the variable name, the second `\` should still be processed as part of the string (forming `\\` → `\` with the next char... but the next char is `"` which ends the string). Actually, the correct parse is: `$\` consumes only one `\`, then `\` + `"` → but that would be an escaped quote...

System Perl must be parsing `"$\\"` as: `$\` (variable, consumes one `\`) then `\"` → no, because the string would be unterminated.

**More likely**: system Perl parses `"$\\"` by recognizing `$\` as the variable AND recognizing `\\` as an escape producing `\`, with the variable consuming only the `$` + first `\`, and the escape consuming `\` + `\`. This means the variable name `$\` and the escape `\\` share the middle `\` character... which seems unlikely.

**Investigation needed:** Write additional unit tests to clarify exact parsing behavior for `$\` in various DQ string contexts.

### 2. sysread on in-memory file handles

System Perl's `sysread()` does not work on in-memory file handles (opened via `open(my $fh, '<', \$scalar)`). It returns `undef`. jperl currently supports this, returning actual bytes read.

**Fix:** jperl's sysread should return `undef` (and warn) when the filehandle is an in-memory handle, matching system Perl.

### 3. IPC::Open3 with false `$err` argument

Per IPC::Open3 documentation: "If CHLD_ERR is false, or the same file descriptor as CHLD_OUT, then STDOUT and STDERR of the child are on the same filehandle."

When `$err = ""` (which is false), system Perl merges stderr with stdout. jperl incorrectly creates a separate stderr handle.

**Fix:** jperl's IPC::Open3 implementation should check if the `$err` argument is false and merge stderr into stdout accordingly.

## Plan

For each issue:
1. Fix jperl to match system Perl behavior (jperl will then fail the same tests system Perl fails)
2. Fix the tests so they pass on both system Perl and jperl

Only when uncertain about system Perl behavior, add new unit tests to clarify before fixing jperl.

### Issue 1: `$\` interpolation in DQ strings

**Fix jperl:** In the string interpolation scanner, `$\` greedily captures the `\` as part of the variable name. After `$\` is consumed, the next characters are processed normally as string content (not as escape sequences starting with the consumed `\`). So `"$\\"` = value_of(`$\`) + literal `\` (from the remaining `\\` → escaped backslash). jperl currently drops the trailing `\`.

Key system Perl behavior (confirmed by exploratory tests):
- `"$\\"` with `$\ = "SEP"` → `SEP\`
- `"$\n"` → value_of(`$\`) + literal `n` (NOT newline — the `\` was consumed by `$\`)
- `"$\t"` → value_of(`$\`) + literal `t` (NOT tab)

**Then fix tests:** Update `string_interpolation.t` test 4 expected value.

### Issue 2: sysread on in-memory file handles

**Fix jperl:** `sysread()` on in-memory file handles (opened via `open($fh, '<', \$scalar)`) should return `undef`, matching system Perl.

**Then fix tests:** Update `sysread_syswrite.t` test 3 expected value.

### Issue 3: IPC::Open3 with false `$err` argument

**Fix jperl:** When the `$err` argument to `open3()` is false (`""`, `0`, `undef`), stderr should be merged into stdout, not captured separately. This matches system Perl and the IPC::Open3 documentation.

**Then fix tests:** Update `ipc_open3.t` tests 3, 5, 7 expected values.

### Final verification

- Run `make` to ensure no regressions
- Run `prove src/test/resources/unit` to confirm all tests pass

## Progress Tracking

### Current Status: Complete (2026-04-09)

PR: https://github.com/fglock/PerlOnJava/pull/476

### Completed Phases
- [x] Analysis: identified 3 issues via diff of perl vs jperl output
- [x] Exploratory tests: confirmed `$\` parsing behavior in system Perl
- [x] Issue 1: Fixed `$\` interpolation in `StringDoubleQuoted.java` — removed `\` from non-interpolating chars
- [x] Issue 1: Updated `string_interpolation.t` test 4
- [x] Issue 2: Fixed sysread in `IOOperator.java` — return undef for in-memory handles
- [x] Issue 2: Updated `sysread_syswrite.t` test 3
- [x] Issue 3: Fixed `IPCOpen3.java` `isUsableHandle()` — check truthiness not definedness
- [x] Issue 3: Updated `ipc_open3.t` tests 3, 5, 7
- [x] `make` passes, `prove src/test/resources/unit` all tests successful

### Files Modified
- `src/main/java/org/perlonjava/frontend/parser/StringDoubleQuoted.java`
- `src/main/java/org/perlonjava/runtime/operators/IOOperator.java`
- `src/main/java/org/perlonjava/runtime/perlmodule/IPCOpen3.java`
- `src/test/resources/unit/string_interpolation.t`
- `src/test/resources/unit/sysread_syswrite.t`
- `src/test/resources/unit/ipc_open3.t`
4 changes: 2 additions & 2 deletions src/main/java/org/perlonjava/core/Configuration.java
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ public final class Configuration {
* Automatically populated by Gradle/Maven during build.
* DO NOT EDIT MANUALLY - this value is replaced at build time.
*/
public static final String gitCommitId = "d057b7f17";
public static final String gitCommitId = "c065e5f5f";

/**
* Git commit date of the build (ISO format: YYYY-MM-DD).
Expand All @@ -48,7 +48,7 @@ public final class Configuration {
* Parsed by App::perlbrew and other tools via: perl -V | grep "Compiled at"
* DO NOT EDIT MANUALLY - this value is replaced at build time.
*/
public static final String buildTimestamp = "Apr 9 2026 20:20:55";
public static final String buildTimestamp = "Apr 9 2026 21:31:23";

// Prevent instantiation
private Configuration() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -468,6 +468,14 @@ private void parseDoubleQuotedEscapes() {
// Consume the character after the backslash
var escape = TokenUtils.consumeChar(parser);

// Trailing backslash at end of string content — treat as literal \
// This happens when $\ consumes a \ for the variable name, leaving
// a lone \ before end-of-string with no escape partner.
if (escape.isEmpty()) {
appendToCurrentSegment("\\");
return;
}

switch (escape) {
// Standard escapes - convert to actual characters
case "\\" -> appendToCurrentSegment("\\");
Expand Down
27 changes: 4 additions & 23 deletions src/main/java/org/perlonjava/runtime/operators/IOOperator.java
Original file line number Diff line number Diff line change
Expand Up @@ -1051,32 +1051,13 @@ public static RuntimeScalar sysread(int ctx, RuntimeBase... args) {
}

// Check for in-memory handles (ScalarBackedIO)
// System Perl does not support sysread on in-memory file handles —
// it returns undef and sets $! to "Bad file descriptor".
IOHandle baseHandle = getBaseHandle(fh.ioHandle);

if (baseHandle instanceof ScalarBackedIO scalarIO) {
RuntimeScalar result;
try {
result = scalarIO.sysread(length);
} catch (Exception e) {
getGlobalVariable("main::!").set("Bad file descriptor");
return new RuntimeScalar();
}
if (!result.getDefinedBoolean()) {
return new RuntimeScalar(0);
}
String readData = result.toString();
String existing = target.toString();
if (offset > 0) {
while (existing.length() < offset) existing += "\0";
setByteString(target, existing.substring(0, offset) + readData);
} else if (offset < 0) {
int effectiveOffset = existing.length() + offset;
if (effectiveOffset < 0) effectiveOffset = 0;
setByteString(target, existing.substring(0, effectiveOffset) + readData);
} else {
setByteString(target, readData);
}
return new RuntimeScalar(readData.length());
getGlobalVariable("main::!").set("Bad file descriptor");
return new RuntimeScalar(); // undef
}

// Try to perform the system read
Expand Down
9 changes: 6 additions & 3 deletions src/main/java/org/perlonjava/runtime/perlmodule/IPCOpen3.java
Original file line number Diff line number Diff line change
Expand Up @@ -201,16 +201,19 @@ private static boolean isInputRedirection(RuntimeScalar handleRef) {
}

/**
* Check if a handle parameter is usable (not undef or a reference to undef)
* Check if a handle parameter is usable (not false or a reference to a false value).
* Per IPC::Open3 docs: "If CHLD_ERR is false, or the same file descriptor as
* CHLD_OUT, then STDOUT and STDERR of the child are on the same filehandle."
* A false value includes undef, "", and 0.
*/
private static boolean isUsableHandle(RuntimeScalar handleRef) {
if (!handleRef.getDefinedBoolean()) {
return false;
}
// If it's a reference, check if the inner value is defined
// If it's a reference, check if the inner value is true (not just defined)
if (handleRef.type == RuntimeScalarType.REFERENCE && handleRef.value instanceof RuntimeScalar) {
RuntimeScalar inner = (RuntimeScalar) handleRef.value;
return inner.getDefinedBoolean();
return inner.getBoolean();
}
return true;
}
Expand Down
49 changes: 24 additions & 25 deletions src/test/resources/unit/ipc_open3.t
Original file line number Diff line number Diff line change
Expand Up @@ -35,23 +35,24 @@ subtest 'IPC::Open3 stdout capture with sysread' => sub {
waitpid($pid, 0);
};

subtest 'IPC::Open3 stderr capture (separate handle)' => sub {
subtest 'IPC::Open3 stderr merged with stdout (empty string err)' => sub {
# When $err is "" (false), stderr merges with stdout (same as undef)
use IPC::Open3;
my ($wtr, $rdr, $err);
$rdr = ""; $err = "";
my $pid = open3($wtr, $rdr, $err, "sh", "-c", "echo out-msg; echo err-msg >&2");
close($wtr);
# Small delay to let both streams fill
select(undef, undef, undef, 0.3);
my $out = <$rdr>;
my $errout = <$err>;
chomp $out if defined $out;
chomp $errout if defined $errout;
is($out, "out-msg", "stdout captured separately");
is($errout, "err-msg", "stderr captured separately");
my @lines;
while (my $line = <$rdr>) {
chomp $line;
push @lines, $line;
}
close($rdr);
close($err);
waitpid($pid, 0);
# Both stdout and stderr should appear in the reader (merged)
ok(scalar(@lines) >= 2, "got at least 2 lines (merged streams)");
ok(grep({ $_ eq "out-msg" } @lines), "stdout present in merged output");
ok(grep({ $_ eq "err-msg" } @lines), "stderr present in merged output");
};

subtest 'IPC::Open3 stderr merged with stdout (undef err)' => sub {
Expand All @@ -74,8 +75,9 @@ subtest 'IPC::Open3 stderr merged with stdout (undef err)' => sub {

subtest 'IPC::Open3 fileno returns defined value' => sub {
use IPC::Open3;
my ($wtr, $rdr, $err);
$rdr = ""; $err = "";
use Symbol 'gensym';
my ($wtr, $rdr);
my $err = gensym;
my $pid = open3($wtr, $rdr, $err, "cat");

my $fn_wtr = fileno($wtr);
Expand Down Expand Up @@ -118,7 +120,9 @@ subtest 'IPC::Open3 with IO::Select - single handle' => sub {
waitpid($pid, 0);
};

subtest 'IPC::Open3 with IO::Select - stdout + stderr (Net::SSH pattern)' => sub {
subtest 'IPC::Open3 with IO::Select - stdout + stderr merged (empty string err)' => sub {
# When $err is "" (false), stderr merges with stdout, so IO::Select
# should see only 1 handle with both streams' data
use IPC::Open3;
use IO::Select;
my ($wtr, $rdr, $err);
Expand All @@ -128,10 +132,10 @@ subtest 'IPC::Open3 with IO::Select - stdout + stderr (Net::SSH pattern)' => sub

my $sel = IO::Select->new();
$sel->add($rdr);
$sel->add($err);
is($sel->count, 2, "IO::Select has 2 handles");
# $err is not a real handle (false value merged stderr), so only 1 handle
is($sel->count, 1, "IO::Select has 1 handle (stderr merged)");

my ($out, $errout) = ("", "");
my $out = "";
my $iterations = 0;
while ($sel->count && $iterations < 20) {
$iterations++;
Expand All @@ -144,20 +148,15 @@ subtest 'IPC::Open3 with IO::Select - stdout + stderr (Net::SSH pattern)' => sub
$sel->remove($fh);
next;
}
if (fileno($fh) == fileno($rdr)) {
$out .= $buf;
} else {
$errout .= $buf;
}
$out .= $buf;
}
}

chomp $out; chomp $errout;
is($out, "stdout-data", "IO::Select captured stdout");
is($errout, "stderr-data", "IO::Select captured stderr");
# Both stdout and stderr data should be in the merged output
like($out, qr/stdout-data/, "merged output contains stdout");
like($out, qr/stderr-data/, "merged output contains stderr");

close($rdr);
close($err);
waitpid($pid, 0);
};

Expand Down
4 changes: 3 additions & 1 deletion src/test/resources/unit/string_interpolation.t
Original file line number Diff line number Diff line change
Expand Up @@ -76,8 +76,10 @@ subtest 'Special punctuation variable interpolation' => sub {
is("$%", "0", "\$% interpolates correctly");

# $\ - output record separator
# In DQ strings, $\ greedily captures \ as the variable name.
# "$\\" = value_of($\) + literal backslash (from remaining \)
local $\ = "";
is("$\\", "", "\$\\ interpolates correctly");
is("$\\", "\\", "\$\\ interpolates correctly");

# $( - real group ID
my $gid = "$(";
Expand Down
4 changes: 3 additions & 1 deletion src/test/resources/unit/sysread_syswrite.t
Original file line number Diff line number Diff line change
Expand Up @@ -65,11 +65,13 @@ subtest 'In-memory file handles' => sub {
ok(!defined($bytes), 'syswrite to in-memory handle returns undef (known limitation)');
close($mem_out);

# sysread on in-memory handles is not supported in Perl —
# it returns undef and sets $! to "Bad file descriptor".
$mem_content = "Test content";
open(my $mem_in, '<', \$mem_content) or die "Cannot open in-memory handle: $!";
my $buffer;
my $read = sysread($mem_in, $buffer, 1024);
is($read, length($mem_content), 'sysread from in-memory handle works');
ok(!defined($read), 'sysread from in-memory handle returns undef (not supported)');
close($mem_in);
};

Expand Down
Loading