Scope
In-depth analysis of src/praisonai/praisonai (the wrapper layer). I focused only on key architectural gaps — not docs, tests, coverage, file sizes, or line counts.
After a full pass over the wrapper, three gaps stand out because they actively violate the project's stated philosophy ("simpler · DRY · multi-agent safe · async-safe · no global singletons · protocol-driven"). Each one is reproducible from the current main, has been read end-to-end, and the proposed fix is concrete.
Gap 1 — agents_generator.py carries ~470 lines of dead, duplicate, broken framework-runner methods
Where
src/praisonai/praisonai/agents_generator.py
Five private methods defined on AgentsGenerator:
| Method |
Lines |
_run_autogen |
673–751 |
_run_autogen_v4 |
753–859 |
_run_ag2 |
861–1005 |
_run_crewai |
1006–1143 |
_run_praisonai |
1144–EOF |
Why this is a problem
-
They are never called. The execution path is generate_crew_and_kickoff → self.framework_adapter.run(...) at line 607:
# agents_generator.py:602-615
from .framework_adapters.validators import assert_framework_available
assert_framework_available(framework)
self.logger.info(f"Using framework: {framework}")
return self.framework_adapter.run(
config,
self.config_list,
topic,
tools_dict=tools_dict,
agent_callback=getattr(self, 'agent_callback', None),
task_callback=getattr(self, 'task_callback', None),
cli_config=getattr(self, 'cli_config', None),
)
A repo-wide grep confirms no caller for _run_autogen, _run_autogen_v4, _run_ag2, _run_crewai, or _run_praisonai anywhere in src/. The only matching hit is the unrelated _run_praisonai_agents in jobs/executor.py.
-
They cannot run even if called — names are not imported in this module:
# agents_generator.py:688 (inside _run_autogen)
user_proxy = autogen.UserProxyAgent( # NameError: 'autogen' is not defined
name="User",
...
)
# agents_generator.py:770 (inside _run_autogen_v4)
model_client = OpenAIChatCompletionClient( # NameError
model=model_config.get('model', 'gpt-5-nano'),
...
)
# 796: AutoGenV4AssistantAgent — undefined
# 815: TextMentionTermination — undefined
# 816: MaxMessageTermination — undefined
# 820: RoundRobinGroupChat — undefined
-
They duplicate the canonical adapter implementations. Compare _run_autogen (lines 673–751) to framework_adapters/autogen_adapter.py:29–100:
# framework_adapters/autogen_adapter.py:55-89 (the real, working impl)
import autogen
user_proxy = autogen.UserProxyAgent(
name="User",
human_input_mode="NEVER",
is_termination_msg=lambda x: ...,
code_execution_config={"work_dir": "coding", "use_docker": False},
)
...
agents[role] = autogen.AssistantAgent(
name=agent_name,
llm_config=llm_config_dict,
system_message=...,
)
This is character-for-character the same code as agents_generator._run_autogen, except the adapter version actually imports autogen and is wired into the registry. The same pattern holds for v0.4, AG2, CrewAI, and PraisonAI.
This directly contradicts the philosophy: "DRY: reuse abstractions, no duplication" and "Protocol-driven core: protocols/hooks in core; heavy impl in wrapper/tools". Today, anyone fixing a CrewAI bug has to guess which copy is live (the adapter), and the dead copy slowly drifts and rots.
Fix
Delete the five methods. No call sites, no public API impact, no shim needed. Concretely, in src/praisonai/praisonai/agents_generator.py:
# DELETE: lines 673-EOF (the five _run_* methods)
# KEEP: generate_crew_and_kickoff(...) which already delegates to
# self.framework_adapter.run(...) at line 607
That removes ~470 lines of dead code, eliminates the drift hazard, and lets agents_generator.py shrink to what it actually does: parse the YAML, validate, pick the adapter, hand off.
Gap 2 — async_agent_scheduler.py claims to be a "backward-compatible re-export" but is a full duplicate implementation with silent feature drift
Where
src/praisonai/praisonai/async_agent_scheduler.py (424 lines — async)
src/praisonai/praisonai/scheduler/agent_scheduler.py (554 lines — sync, canonical)
src/praisonai/praisonai/scheduler/shared.py (shared helpers)
Why this is a problem
The header of async_agent_scheduler.py advertises itself as a thin shim:
# async_agent_scheduler.py:1-15
"""Backward-compatible re-export. Prefer `praisonai.scheduler`.
This module is deprecated. Use the canonical implementation in the
scheduler package for full functionality including async support.
"""
import warnings
warnings.warn(
"praisonai.async_agent_scheduler is pending deprecation; it will be moved to "
"praisonai.scheduler.async_agent_scheduler in a future release. ...",
PendingDeprecationWarning, stacklevel=2,
)
# TODO: Once AsyncAgentScheduler is moved to scheduler package, import from there
# For now, re-export the existing implementation to avoid breaking changes
from .scheduler.shared import ScheduleParser, backoff_delay, safe_call
…but everything after that TODO is a full, independent implementation of AsyncAgentExecutorInterface, AsyncPraisonAgentExecutor, and AsyncAgentScheduler (lines 30–425). Meanwhile the sync AgentScheduler lives in scheduler/agent_scheduler.py. The two share ScheduleParser / backoff_delay, and nothing else.
That has already produced silent feature drift. The async scheduler is missing features the sync one ships today:
Feature (sync, scheduler/agent_scheduler.py) |
Async (async_agent_scheduler.py) |
timeout per-execution kill switch (lines 44, 244–259) |
❌ absent |
max_cost budget guard (lines 45, 216–220) |
❌ absent |
Cost tracking (self._total_cost, lines 269–272) |
❌ absent |
_update_state_if_daemon() (lines 175–210) — daemon state file updates |
❌ absent |
from_yaml(...) constructor (lines 337–419) |
❌ absent |
from_recipe(...) constructor (lines 441–535) |
❌ absent |
start_from_yaml_config() (lines 421–438) |
❌ absent |
So praisonai.AgentScheduler and praisonai.AsyncAgentScheduler look symmetric in the public surface, but the async path silently drops budget protection, recipe support, YAML support, daemon state, and timeouts. The retry/execute loops are also separately written:
# sync — scheduler/agent_scheduler.py:233-309
def _execute_with_retry(self, max_retries: int):
self._execution_count += 1
...
for attempt in range(max_retries):
try:
if self.timeout:
signal.alarm(self.timeout)
result = self._executor.execute(self.task)
...
self._total_cost += estimated_cost
self._success_count += 1
...
except Exception as e:
...
wait_time = backoff_delay(attempt, initial=30.0, cap=300.0)
if self._stop_event.wait(wait_time):
return
# async — async_agent_scheduler.py:353-390
async def _execute_with_retry(self, max_retries: int):
self._ensure_async_primitives()
async with self._stats_lock:
self._execution_count += 1
...
for attempt in range(max_retries):
try:
result = await self._executor.execute(self.task)
async with self._stats_lock:
self._success_count += 1
safe_call(self.on_success, result)
return
except Exception as e:
last_exc = e
if attempt < max_retries - 1:
wait_time = backoff_delay(attempt)
await asyncio.sleep(wait_time)
Two implementations of the same retry/stats lifecycle, differing only in await vs blocking calls. This is exactly what the philosophy calls out: "Multi-agent + async safe by default" and "DRY: reuse abstractions" — the wrapper should have one scheduler that exposes both sync and async entry points over a shared core.
Fix
-
Move the file to its canonical home and complete the in-flight migration the TODO calls out:
- Create
src/praisonai/praisonai/scheduler/async_agent_scheduler.py containing the class.
- Reduce the root
async_agent_scheduler.py to a thin re-export, matching the existing pattern in agent_scheduler.py:1-18:
# src/praisonai/praisonai/async_agent_scheduler.py (after fix)
"""Backward-compatible re-export. Prefer `praisonai.scheduler`."""
import warnings
warnings.warn(
"praisonai.async_agent_scheduler is deprecated; "
"use 'from praisonai.scheduler import AsyncAgentScheduler' instead.",
DeprecationWarning, stacklevel=2,
)
from praisonai.scheduler.async_agent_scheduler import ( # noqa: F401
AsyncAgentScheduler, AsyncPraisonAgentExecutor, create_async_agent_scheduler,
)
-
Factor the execute/retry/stats lifecycle into a shared mixin in scheduler/shared.py so the two classes diverge only on await boundaries. Sketch:
# scheduler/shared.py (new)
class _ScheduleCore:
"""Shared lifecycle: stats, budget, timeout, retry policy."""
def _begin_attempt(self): self._execution_count += 1
def _on_success(self, result):
self._success_count += 1
self._total_cost += self._estimate_cost(result)
safe_call(self.on_success, result)
def _on_failure(self, exc):
self._failure_count += 1
safe_call(self.on_failure, exc)
def _budget_exhausted(self) -> bool:
return bool(self.max_cost and self._total_cost >= self.max_cost)
Then AgentScheduler provides sync _execute_with_retry and AsyncAgentScheduler provides async _execute_with_retry, both calling into _ScheduleCore. Port timeout, max_cost, from_yaml, from_recipe, and _update_state_if_daemon into the shared layer so the async path picks them up for free.
Gap 3 — Process-level mutable singletons in interactive_tools.py and tool_resolver.py break multi-agent and multi-project use
The philosophy is explicit: "no global singletons, no heavy module-level work" and "multi-agent safe by default". Two concrete violations in the wrapper today.
3a. _shared_runtime is global, unlocked, and shared across all concurrent agents
src/praisonai/praisonai/cli/features/interactive_tools.py:196-253
# ── Shared runtime singleton ────────────────────────────────────────────────
# ACP and LSP tools share ONE InteractiveRuntime to avoid duplicate LSP servers.
_shared_runtime = None
_shared_agent_tools = None
def _get_shared_runtime(config: ToolConfig):
"""Get or create a shared InteractiveRuntime instance."""
global _shared_runtime, _shared_agent_tools
if _shared_runtime is not None and _shared_agent_tools is not None:
return _shared_runtime, _shared_agent_tools
from .interactive_runtime import InteractiveRuntime, RuntimeConfig
from .agent_tools import create_agent_centric_tools
runtime_config = RuntimeConfig(
workspace=config.workspace,
lsp_enabled=config.lsp_enabled,
acp_enabled=config.acp_enabled,
approval_mode=config.approval_mode,
)
_shared_runtime = InteractiveRuntime(runtime_config)
_shared_agent_tools = create_agent_centric_tools(_shared_runtime)
...
Two real problems:
-
No lock. Two agents instantiating tools concurrently both pass the is not None check, both call InteractiveRuntime(...), the second overwrites the first — leaking an LSP server process and an ACP session that will never be stopped because cleanup_runtime only sees the latest reference.
-
Config is captured on first call and silently ignored after. Agent A calls with workspace="/foo", approval_mode="auto". Agent B later calls with workspace="/bar", approval_mode="manual" — B gets back A's runtime, bound to /foo and auto, with no warning. The config argument becomes a lie for everyone after the first caller.
3b. _default_resolver is anchored to the CWD at first call and never invalidated
src/praisonai/praisonai/tool_resolver.py:458-477
# Process-level lazy singleton for performance (matches profiler.py pattern)
_default_resolver: Optional[ToolResolver] = None
_default_resolver_lock = threading.Lock()
def _get_default_resolver() -> ToolResolver:
"""Process-default ToolResolver (double-checked lazy init).
Returns cached ToolResolver that is anchored to the working directory
at first call. Local tools.py resolution is CWD-dependent and cached
for the lifetime of the process.
For test isolation or multi-project CLIs, create explicit resolver
instances instead of using this cached default.
"""
global _default_resolver
if _default_resolver is None:
with _default_resolver_lock:
if _default_resolver is None:
_default_resolver = ToolResolver()
return _default_resolver
ToolResolver.__init__ defaults tools_py_path to the literal string "tools.py" (resolver line 62) and caches whatever it loaded on first access (lines 78–114). In any long-lived process — daemon, IDE plugin, MCP server, Jupyter session, the SDK as a library — the first resolve_tool("foo") call locks in that CWD's tools.py. Switch projects, change directory, run a second praisonai command in the same Python process, and you still get the original project's tools. The docstring acknowledges the hazard but recommends "create explicit resolver instances" — which nothing else in the wrapper does, so all internal call sites silently inherit the bug.
Fix
For both: replace process-level globals with contextvars.ContextVar so each agent / task / request gets its own value, and the singleton becomes the explicit fallback only at the CLI edge.
interactive_tools.py:
# Replace lines 196-253 with:
import contextvars
import threading
_runtime_var: contextvars.ContextVar = contextvars.ContextVar("interactive_runtime", default=None)
_runtime_lock = threading.Lock() # only guards the first creation per context
def _get_shared_runtime(config: ToolConfig):
bundle = _runtime_var.get()
if bundle is not None:
return bundle
with _runtime_lock:
bundle = _runtime_var.get()
if bundle is None:
from .interactive_runtime import InteractiveRuntime, RuntimeConfig
from .agent_tools import create_agent_centric_tools
runtime = InteractiveRuntime(RuntimeConfig(
workspace=config.workspace,
lsp_enabled=config.lsp_enabled,
acp_enabled=config.acp_enabled,
approval_mode=config.approval_mode,
))
bundle = (runtime, create_agent_centric_tools(runtime))
_runtime_var.set(bundle)
return bundle
def cleanup_runtime():
bundle = _runtime_var.get()
if bundle is None:
return
runtime, _ = bundle
# ... existing shutdown logic, then:
_runtime_var.set(None)
Concurrent agents now each get their own runtime when running under their own contextvars.copy_context(); the config argument is honoured on every first-in-context call instead of being silently dropped.
tool_resolver.py:
# Replace lines 458-477 with:
import contextvars
_resolver_var: contextvars.ContextVar[Optional[ToolResolver]] = contextvars.ContextVar(
"tool_resolver", default=None,
)
def _get_default_resolver() -> ToolResolver:
"""Per-context resolver. Falls back to a fresh ToolResolver per context,
so changing CWD between agents / requests is honoured."""
resolver = _resolver_var.get()
if resolver is None:
resolver = ToolResolver()
_resolver_var.set(resolver)
return resolver
def reset_default_resolver() -> None:
"""Explicit invalidation hook for daemons / IDE plugins switching projects."""
_resolver_var.set(None)
This keeps the fast path (cache once per context) but stops binding the entire process to whatever CWD the first caller happened to be in. Callers that already pass an explicit resolver= argument (the resolve_tool / resolve_tools / has_tool / validate_yaml_tools family at lines 481–547) are unaffected.
Why these three
- Gap 1 is dead — fixing it is pure removal, zero risk, and immediately stops new code from being added to the wrong copy.
- Gap 2 is drift-in-progress — the sync scheduler is gaining features (
max_cost, from_recipe, daemon state) the async one quietly lacks, so the longer it sits, the more the two diverge.
- Gap 3 is latent-but-load-bearing — the singletons work fine for one-shot CLI runs (which is most of the test surface) but quietly break the exact deployment shape the project is moving toward: long-lived agents, multi-project daemons, MCP/UI servers running many concurrent sessions.
All three are scoped to src/praisonai/praisonai. Happy to send PRs for any/all of these if useful.
Scope
In-depth analysis of
src/praisonai/praisonai(the wrapper layer). I focused only on key architectural gaps — not docs, tests, coverage, file sizes, or line counts.After a full pass over the wrapper, three gaps stand out because they actively violate the project's stated philosophy ("simpler · DRY · multi-agent safe · async-safe · no global singletons · protocol-driven"). Each one is reproducible from the current
main, has been read end-to-end, and the proposed fix is concrete.Gap 1 —
agents_generator.pycarries ~470 lines of dead, duplicate, broken framework-runner methodsWhere
src/praisonai/praisonai/agents_generator.pyFive private methods defined on
AgentsGenerator:_run_autogen_run_autogen_v4_run_ag2_run_crewai_run_praisonaiWhy this is a problem
They are never called. The execution path is
generate_crew_and_kickoff→self.framework_adapter.run(...)at line 607:A repo-wide grep confirms no caller for
_run_autogen,_run_autogen_v4,_run_ag2,_run_crewai, or_run_praisonaianywhere insrc/. The only matching hit is the unrelated_run_praisonai_agentsinjobs/executor.py.They cannot run even if called — names are not imported in this module:
They duplicate the canonical adapter implementations. Compare
_run_autogen(lines 673–751) toframework_adapters/autogen_adapter.py:29–100:This is character-for-character the same code as
agents_generator._run_autogen, except the adapter version actually importsautogenand is wired into the registry. The same pattern holds for v0.4, AG2, CrewAI, and PraisonAI.This directly contradicts the philosophy: "DRY: reuse abstractions, no duplication" and "Protocol-driven core: protocols/hooks in core; heavy impl in wrapper/tools". Today, anyone fixing a CrewAI bug has to guess which copy is live (the adapter), and the dead copy slowly drifts and rots.
Fix
Delete the five methods. No call sites, no public API impact, no shim needed. Concretely, in
src/praisonai/praisonai/agents_generator.py:That removes ~470 lines of dead code, eliminates the drift hazard, and lets
agents_generator.pyshrink to what it actually does: parse the YAML, validate, pick the adapter, hand off.Gap 2 —
async_agent_scheduler.pyclaims to be a "backward-compatible re-export" but is a full duplicate implementation with silent feature driftWhere
src/praisonai/praisonai/async_agent_scheduler.py(424 lines — async)src/praisonai/praisonai/scheduler/agent_scheduler.py(554 lines — sync, canonical)src/praisonai/praisonai/scheduler/shared.py(shared helpers)Why this is a problem
The header of
async_agent_scheduler.pyadvertises itself as a thin shim:…but everything after that
TODOis a full, independent implementation ofAsyncAgentExecutorInterface,AsyncPraisonAgentExecutor, andAsyncAgentScheduler(lines 30–425). Meanwhile the syncAgentSchedulerlives inscheduler/agent_scheduler.py. The two shareScheduleParser/backoff_delay, and nothing else.That has already produced silent feature drift. The async scheduler is missing features the sync one ships today:
scheduler/agent_scheduler.py)async_agent_scheduler.py)timeoutper-execution kill switch (lines 44, 244–259)max_costbudget guard (lines 45, 216–220)self._total_cost, lines 269–272)_update_state_if_daemon()(lines 175–210) — daemon state file updatesfrom_yaml(...)constructor (lines 337–419)from_recipe(...)constructor (lines 441–535)start_from_yaml_config()(lines 421–438)So
praisonai.AgentSchedulerandpraisonai.AsyncAgentSchedulerlook symmetric in the public surface, but the async path silently drops budget protection, recipe support, YAML support, daemon state, and timeouts. The retry/execute loops are also separately written:Two implementations of the same retry/stats lifecycle, differing only in
awaitvs blocking calls. This is exactly what the philosophy calls out: "Multi-agent + async safe by default" and "DRY: reuse abstractions" — the wrapper should have one scheduler that exposes both sync and async entry points over a shared core.Fix
Move the file to its canonical home and complete the in-flight migration the TODO calls out:
src/praisonai/praisonai/scheduler/async_agent_scheduler.pycontaining the class.async_agent_scheduler.pyto a thin re-export, matching the existing pattern inagent_scheduler.py:1-18:Factor the execute/retry/stats lifecycle into a shared mixin in
scheduler/shared.pyso the two classes diverge only onawaitboundaries. Sketch:Then
AgentSchedulerprovides sync_execute_with_retryandAsyncAgentSchedulerprovides async_execute_with_retry, both calling into_ScheduleCore. Porttimeout,max_cost,from_yaml,from_recipe, and_update_state_if_daemoninto the shared layer so the async path picks them up for free.Gap 3 — Process-level mutable singletons in
interactive_tools.pyandtool_resolver.pybreak multi-agent and multi-project useThe philosophy is explicit: "no global singletons, no heavy module-level work" and "multi-agent safe by default". Two concrete violations in the wrapper today.
3a.
_shared_runtimeis global, unlocked, and shared across all concurrent agentssrc/praisonai/praisonai/cli/features/interactive_tools.py:196-253Two real problems:
No lock. Two agents instantiating tools concurrently both pass the
is not Nonecheck, both callInteractiveRuntime(...), the second overwrites the first — leaking an LSP server process and an ACP session that will never be stopped becausecleanup_runtimeonly sees the latest reference.Config is captured on first call and silently ignored after. Agent A calls with
workspace="/foo",approval_mode="auto". Agent B later calls withworkspace="/bar",approval_mode="manual"— B gets back A's runtime, bound to/fooandauto, with no warning. Theconfigargument becomes a lie for everyone after the first caller.3b.
_default_resolveris anchored to the CWD at first call and never invalidatedsrc/praisonai/praisonai/tool_resolver.py:458-477ToolResolver.__init__defaultstools_py_pathto the literal string"tools.py"(resolver line 62) and caches whatever it loaded on first access (lines 78–114). In any long-lived process — daemon, IDE plugin, MCP server, Jupyter session, the SDK as a library — the firstresolve_tool("foo")call locks in that CWD'stools.py. Switch projects, change directory, run a secondpraisonaicommand in the same Python process, and you still get the original project's tools. The docstring acknowledges the hazard but recommends "create explicit resolver instances" — which nothing else in the wrapper does, so all internal call sites silently inherit the bug.Fix
For both: replace process-level globals with
contextvars.ContextVarso each agent / task / request gets its own value, and the singleton becomes the explicit fallback only at the CLI edge.interactive_tools.py:Concurrent agents now each get their own runtime when running under their own
contextvars.copy_context(); theconfigargument is honoured on every first-in-context call instead of being silently dropped.tool_resolver.py:This keeps the fast path (cache once per context) but stops binding the entire process to whatever CWD the first caller happened to be in. Callers that already pass an explicit
resolver=argument (theresolve_tool/resolve_tools/has_tool/validate_yaml_toolsfamily at lines 481–547) are unaffected.Why these three
max_cost,from_recipe, daemon state) the async one quietly lacks, so the longer it sits, the more the two diverge.All three are scoped to
src/praisonai/praisonai. Happy to send PRs for any/all of these if useful.