Skip to content

test(python): cover CLI harness log helpers#574

Open
akmhatey-ai wants to merge 3 commits into
Agent-Field:mainfrom
akmhatey-ai:tests/python-node-logs-harness-cli-399
Open

test(python): cover CLI harness log helpers#574
akmhatey-ai wants to merge 3 commits into
Agent-Field:mainfrom
akmhatey-ai:tests/python-node-logs-harness-cli-399

Conversation

@akmhatey-ai
Copy link
Copy Markdown

Summary

  • Adds focused Python SDK tests for agentfield.harness._cli helper functions requested in [Python SDK] Add tests for node_logs.py + harness/_cli.py #399.
  • Extends test_node_logs.py coverage for tee write-through/buffering, stdio tee installation, and follow-mode NDJSON behavior.
  • Keeps the change scoped to tests; source files are unchanged.

Closes #399

Validation

  • uv run pytest tests/test_harness_cli.py tests/test_node_logs.py from sdk/python (47 passed, 1 warning)
  • uv run --project sdk/python pytest sdk/python/tests/test_harness_cli.py sdk/python/tests/test_node_logs.py (47 passed, 1 warning)
  • uv run --with ruff ruff check tests/test_harness_cli.py tests/test_node_logs.py
  • uv run --with ruff ruff format --check tests/test_harness_cli.py tests/test_node_logs.py
  • git diff --cached --check
  • git diff --cached -- sdk/python/tests/test_harness_cli.py sdk/python/tests/test_node_logs.py | gitleaks detect --pipe --redact --verbose --no-color

Notes

  • The pytest warning is an existing tests/conftest.py:65 deprecation warning about no current event loop.
  • AI-assisted contribution; the tests were reviewed and validated locally before submission.

@akmhatey-ai akmhatey-ai requested review from a team and AbirAbbas as code owners May 16, 2026 20:24
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 16, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 16, 2026

Performance

SDK Memory Δ Latency Δ Tests Status
Python 9.4 KB +4% 0.29 µs -17%

✓ No regressions detected

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 16, 2026

📊 Coverage gate

Thresholds from .coverage-gate.toml: per-surface ≥ 86%, aggregate ≥ 88%, max per-surface regression ≤ 1.0 pp, max aggregate regression ≤ 0.50 pp.

Surface Current Baseline Δ
control-plane 87.50% 87.30% ↑ +0.20 pp 🟡
sdk-go 91.90% 90.70% ↑ +1.20 pp 🟢
sdk-python 93.73% 93.63% ↑ +0.10 pp 🟢
sdk-typescript 92.68% 92.56% ↑ +0.12 pp 🟢
web-ui 89.91% 90.01% ↓ -0.10 pp 🟡
aggregate 89.02% 89.01% ↑ +0.01 pp 🟡

✅ Gate passed

No surface regressed past the allowed threshold and the aggregate stayed above the floor.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 16, 2026

📐 Patch coverage gate

Threshold: 80% on lines this PR touches vs origin/main (from .coverage-gate.toml:thresholds.min_patch).

Surface Touched lines Patch coverage Status
control-plane 0 ➖ no changes
sdk-go 0 ➖ no changes
sdk-python 0 ➖ no changes
sdk-typescript 0 ➖ no changes
web-ui 0 ➖ no changes

✅ Patch gate passed

Every surface whose lines were touched by this PR has patch coverage at or above the threshold.

Copy link
Copy Markdown
Member

@santoshkumarradha santoshkumarradha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review from the AgentField PR Review Harness (v5 architecture, opus-4.7-fast). 7 findings posted across 2 files: 5 HIGH coverage gaps and 2 MEDIUM quality concerns. One finding flags a likely production gap in _TeeTextIO (missing TextIO methods), not just a test gap. Source files were verified against the diff; false alarms and nitpicks have been filtered out before posting.


🤖 Reviewed by AgentField PR Review Harness

# ---------------------------------------------------------------------------


class TestTeeTextIO:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟥 [HIGH] _TeeTextIO is missing core TextIO methods (fileno, writelines, close, readable, writable, seekable)

typing.TextIO is a typing stub, not a usable base class with default method implementations. The class in agentfield/node_logs.py:122-158 only proxies write, flush, encoding, and isatty. After install_stdio_tee() runs, any caller of sys.stdout.fileno() — pytest's capfd fixture, Click, rich, subprocess.Popen(stdin=sys.stdin) — will hit AttributeError/NotImplementedError at runtime.

This is a production gap exposed by the test omission, not just a coverage hole. Suggested fix:

  1. Inherit from io.TextIOBase to get sane defaults, OR
  2. Explicitly proxy the missing methods to self._original
  3. Add a test that calls each TextIO method through sys.stdout after install_stdio_tee()

🤖 Reviewed by AgentField PR Review Harness

assert events == [{"type": "a"}, {"type": "b"}]


def test_extract_final_text_codex_style():
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟧 [HIGH] extract_final_text only tests 1 of 4 event-type branches

extract_final_text in agentfield/harness/_cli.py:68-98 handles four event types:

  • item.completed (Codex) ← tested here
  • result ← untested
  • turn.completed ← untested
  • message / assistant ← untested

Untested branches in a "final answer extraction" path can silently return None or wrong text in production for non-Codex CLI providers. Suggest adding a case per event type plus an empty-events case.


🤖 Reviewed by AgentField PR Review Harness

assert extract_final_text(events) == "final answer"


def test_estimate_cli_cost_calls_litellm():
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟧 [HIGH] estimate_cli_cost except path is untested

Source (_cli.py:113-123) has try: import litellm ... except Exception: return None. Only the success path returning 0.05 is exercised; the "litellm-missing" or "completion_cost raises" branches silently return None. Callers MUST treat None as "unknown" rather than "free" — exactly the kind of defensive code that rots untested and quietly corrupts billing/accounting if broken.

Suggested additions:

  • ImportError when litellm is absent
  • completion_cost returning 0 or None
  • completion_cost raising

🤖 Reviewed by AgentField PR Review Harness

assert entries[0].stream == "stderr"
assert entries[0].line == "partial line"

def test_install_stdio_tee_replaces_sys_stdout(self, monkeypatch):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟧 [HIGH] install_stdio_tee disabled-env early-return is untested

Source (node_logs.py:196) is if _tee_installed or not logs_enabled(): return. This test sets AGENTFIELD_LOGS_ENABLED=true and verifies wrapping happens. The disabled case — which is the default-user invocation when the env var is unset to false/0/no/off — has no coverage. A regression here silently mis-wires stdout for every default install.

Suggest adding a test that sets AGENTFIELD_LOGS_ENABLED=false and asserts sys.stdout is not wrapped.


🤖 Reviewed by AgentField PR Review Harness

monkeypatch.setattr(nl, "_tee_installed", False)

try:
install_stdio_tee()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟧 [HIGH] install_stdio_tee idempotency (_tee_installed flag) is untested

The function guards with if _tee_installed or not logs_enabled(): return. This test sets _tee_installed=False then calls install_stdio_tee() once — it does NOT call it twice to verify the second call is a no-op. Without this test, a refactor that breaks the idempotency guard would double-wrap sys.stdout, producing duplicated log entries — a silent correctness bug.

Suggest: call install_stdio_tee() twice and assert sys.stdout._original is the original (not another _TeeTextIO).


🤖 Reviewed by AgentField PR Review Harness



class TestIterTailNdjsonFollow:
def test_iter_tail_ndjson_follow_mode(self, monkeypatch):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟨 [MEDIUM] iter_tail_ndjson follow + tail_lines>0 prelude path untested

All three follow-mode tests use tail_lines=0. The combined "emit existing tail entries, then enter follow loop" path (node_logs.py:222-233) is uncovered. Most real CLI invocations of --follow combine an initial tail with follow — a broken prelude would silently drop history without anyone noticing.

Suggest a test that pre-populates the ring with N entries, opens iter_tail_ndjson(tail_lines=2, since_seq=0, follow=True), asserts the 2 tail entries are yielded, then exercises the follow loop with a new append.


🤖 Reviewed by AgentField PR Review Harness

Comment thread sdk/python/tests/test_node_logs.py Outdated

thread = threading.Thread(target=read_next)
thread.start()
deadline = time.monotonic() + 2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟨 [MEDIUM] Follow-mode test is timing-sensitive / flaky

The busy-wait pattern while not nl._follow_queues and thread.is_alive() and time.monotonic() < deadline: time.sleep(0.001) (with a 2 s deadline) plus thread.join(timeout=2) is racy under slow CI hosts or coverage instrumentation. Both deadlines can be exceeded — the test may pass without exercising the new-entry path (zero chunks, no error) or flake outright.

Suggest replacing busy-wait with a deterministic synchronization primitive: a threading.Event set inside the generator after register_follow_queue() (or wrap register_follow_queue to notify).


🤖 Reviewed by AgentField PR Review Harness

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4a4ea7408

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sdk/python/agentfield/node_logs.py Outdated
Comment on lines +179 to +180
super().close()
self._original.close()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid closing underlying stdio in tee wrapper

_TeeTextIO.close() now closes self._original, which is typically sys.__stdout__/sys.__stderr__. Because this class now subclasses io.TextIOBase, its finalizer can call close() when the wrapper is replaced or garbage-collected (for example during output redirection in tests/runtimes), unintentionally closing the real process stdio and causing later writes to fail with closed-stream errors. The tee should flush/log buffered text without owning or closing the underlying global stdio handles.

Useful? React with 👍 / 👎.

@akmhatey-ai
Copy link
Copy Markdown
Author

Follow-up pushed in a4a4ea74 and 199c9c7b to address the review items here.

Covered changes:

  • _TeeTextIO now exposes the expected TextIO methods and flushes buffered text on close without closing the underlying process stdio.
  • extract_final_text() now has coverage for result, turn.completed, message, and assistant event shapes, plus the empty-events case.
  • estimate_cli_cost() now covers missing litellm, non-positive/empty cost values, and exceptions as unknown-cost paths.
  • install_stdio_tee() now has disabled-env and idempotency coverage.
  • iter_tail_ndjson(..., follow=True) now covers the initial-tail prelude plus new followed entries, with event-based synchronization instead of timing-sensitive queue polling.

Fresh local verification:

  • python -m pytest tests/test_harness_cli.py tests/test_node_logs.py -> 61 passed
  • uvx ruff check agentfield/node_logs.py tests/test_harness_cli.py tests/test_node_logs.py -> passed
  • uvx ruff format --check agentfield/node_logs.py tests/test_harness_cli.py tests/test_node_logs.py -> passed
  • git diff --check -> passed
  • diff gitleaks scan -> no leaks found

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Python SDK] Add tests for node_logs.py + harness/_cli.py

3 participants