Commit c13ced4

Make limits as settings instead of implicit middlewares (#769)
1 parent ce0c7c1 commit c13ced4

9 files changed

Lines changed: 353 additions & 426 deletions

splunklib/ai/README.md

Lines changed: 30 additions & 46 deletions
@@ -696,7 +696,7 @@ triggers the retry logic described above. A custom `model_middleware` can interc
 to observe, log, or override the retry behavior. A custom `model_middleware` can also raise
 the `StructuredOutputGenerationException` manually to reject structured output and force a re-generation.
 
-The maximal number of re-tries is limited per agent loop invocation see [Default limit middlewares](#default-limit-middlewares).
+The maximal number of re-tries is limited per agent loop invocation see [Default limits](#default-limits).
 
 ### Subagents with structured output/input
 
@@ -977,103 +977,87 @@ model = OpenAIModel(...)
 service = connect(...)
 
 @before_model
-def log_usage(req: ModelRequest) -> None:
-    logger.debug(f"Steps: {req.state.total_steps}, Tokens: {req.state.token_count}")
+def log_steps(req: ModelRequest) -> None:
+    logger.debug(f"Steps: {len(req.state.messages)}")
 
 
 async with Agent(
     model=model,
     service=service,
     system_prompt="...",
-    middleware=[log_usage],
+    middleware=[log_steps],
 ) as agent: ...
 ```
 
-The hooks can stop the Agentic Loop under custom conditions by raising exceptions.
-The logic of the hook can be more advanced and include multiple conditions, for example, based on both token usage and execution time:
+The hooks can stop the Agentic Loop under custom conditions by raising exceptions, for example:
 
 ```py
 from splunklib.ai.hooks import before_model
 from splunklib.ai.middleware import AgentMiddleware, ModelRequest
 
-def token_and_step_limit(token_limit: float, step_limit: int) -> AgentMiddleware:
+def message_limit(message_limit: int) -> AgentMiddleware:
     @before_model
     def _hook(req: ModelRequest) -> None:
-        if req.state.token_count > token_limit or req.state.total_steps >= step_limit:
+        if len(req.state.messages) >= message_limit:
             raise Exception("Stopping Agentic Loop")
 
     return _hook
 
 
 async with Agent(
     ...,
-    middleware=[token_and_step_limit(token_limit=10_000, step_limit=5)],
+    middleware=[message_limit(message_limit=5)],
 ) as agent: ...
 ```
 
-## Default limit middlewares
+## Default limits
 
 Every `Agent` automatically applies sane default limits to prevent runaway execution
-or excessive token usage. Default limit middlewares are appended after any user-supplied
-middleware, so they always act on the final state of the request. If you override one of
-the defaults by passing your own instance, you are responsible for its position in the
-chain - place it last if you want the same behavior.
+or excessive token usage.
 
-| Middleware | Default | Measured |
+| Limit | Default | Measured |
 |---|---|---|
-| `TokenLimitMiddleware` | 200 000 tokens | token count of messages passed to the model |
-| `StepLimitMiddleware` | 100 steps | steps taken |
-| `TimeoutLimitMiddleware` | 600 seconds (10 minutes) | per `invoke` call |
-| `StructuredOutputRetryLimitMiddleware` | 3 retries | per `invoke` call |
+| `max_tokens` | 200 000 tokens | token count of messages passed to the model |
+| `max_steps` | 100 steps | number of messages in the conversation |
+| `timeout` | 600 seconds (10 minutes) | per `invoke` call |
+| `max_structured_output_retires` | 3 retries | per `invoke` call |
 
-`TokenLimitMiddleware` and `StepLimitMiddleware` check the values from the messages passed to the
-model on each call. `TimeoutLimitMiddleware` and `StructuredOutputRetryLimitMiddlewa` resets its
-deadline/limit on each `invoke`, so effectively these limit only the agent loop.
+`max_tokens` and `max_steps` are checked against the messages passed to the model on each call.
+`timeout` and `max_structured_output_retires` reset on each `invoke`, so they limit only the
+current agent loop invocation.
 
 When a limit is exceeded, the agent raises the corresponding exception:
-`TokenLimitExceededException`, `StepsLimitExceededException`, or `TimeoutExceededException`,
+`TokenLimitExceededException`, `StepsLimitExceededException`, `TimeoutExceededException`, or
 `StructuredOutputRetryLimitExceededException`.
 
 ### Overriding defaults
 
-To override a specific limit, pass your own instance of the corresponding middleware
-class. The default for that limit is suppressed automatically - the other defaults
-remain active:
+Limits are configured via the `AgentLimits` dataclass passed to the `Agent` constructor.
+Only the fields you specify are overridden; the rest keep their defaults:
 
 ```py
-from splunklib.ai.limits import (
-    TokenLimitMiddleware,
-    StepLimitMiddleware,
-    TimeoutLimitMiddleware,
-    StructuredOutputRetryLimitMiddleware,
-)
+from splunklib.ai.limits import AgentLimits
 
 async with Agent(
     ...,
-    middleware=[
-        TokenLimitMiddleware(50_000), # overrides default 200 000; other defaults still apply
-    ],
+    limits=AgentLimits(max_tokens=50_000), # overrides default 200 000; other defaults still apply
 ) as agent: ...
 ```
 
-To override all defaults, pass all of these to Agent's middleware list:
+To override all defaults:
 
 ```py
 async with Agent(
     ...,
-    middleware=[
-        StructuredOutputRetryLimitMiddleware(0), # no-retries.
-        TokenLimitMiddleware(50_000),
-        StepLimitMiddleware(10),
-        TimeoutLimitMiddleware(30.0),
-    ],
+    limits=AgentLimits(
+        max_tokens=50_000,
+        max_steps=10,
+        timeout=30.0,
+        max_structured_output_retires=0, # no retries
+    ),
 ) as agent: ...
 ```
 
-**Note**: When overriding limit middlewares, order matters. Place `StructuredOutputRetryLimitMiddleware`
-first and `TokenLimitMiddleware`, `StepLimitMiddleware`, and `TimeoutLimitMiddleware` last,
-otherwise the limits may not behave as expected.
-
 There is no explicit opt-out - the intent is that agents should always have some guardrails.
 
 ## Logger

splunklib/ai/agent.py

Lines changed: 9 additions & 0 deletions
@@ -26,6 +26,7 @@
 from splunklib.ai.conversation_store import ConversationStore
 from splunklib.ai.core.backend import AgentImpl
 from splunklib.ai.core.backend_registry import get_backend
+from splunklib.ai.limits import AgentLimits
 from splunklib.ai.messages import AgentResponse, BaseMessage, HumanMessage, OutputT
 from splunklib.ai.middleware import AgentMiddleware
 from splunklib.ai.model import PredefinedModel
@@ -47,6 +48,8 @@
 _testing_app_id: str | None = None
 
 DEFAULT_TOOL_SETTINGS = ToolSettings(local=False, remote=None)
+DEFAULT_AGENT_LIMITS = AgentLimits()
+
 _SPLUNK_SYSTEM_USER = "splunk-system-user"
 
 
@@ -133,6 +136,10 @@ class Agent(BaseAgent[OutputT]):
 
         Never invoke an Agent using the same thread_id more than once concurrently
         while using the same conversation_store.
+
+    limits:
+        Optional `AgentLimits` instance controlling the built-in safety limits.
+        When omitted, sane defaults are applied automatically.
     """
 
     _impl: AgentImpl[OutputT] | None
@@ -149,6 +156,7 @@ def __init__(
         output_schema: type[OutputT] | None = None,
         input_schema: type[BaseModel] | None = None,  # Only used by Subagents
         middleware: Sequence[AgentMiddleware] | None = None,
+        limits: AgentLimits = DEFAULT_AGENT_LIMITS,
         name: str = "",  # Only used by Subagents
         description: str = "",  # Only used by Subagents
         logger: Logger | None = None,
@@ -169,6 +177,7 @@ def __init__(
             logger=logger,
             conversation_store=conversation_store,
             thread_id=thread_id if thread_id is not None else str(uuid4()),
+            limits=limits,
         )
 
         self._service = service

splunklib/ai/base_agent.py

Lines changed: 9 additions & 28 deletions
@@ -22,14 +22,7 @@
 
 from splunklib.ai.conversation_store import ConversationStore
 from splunklib.ai.limits import (
-    DEFAULT_STEP_LIMIT,
-    DEFAULT_STRUCTURED_OUTPUT_RETRY_LIMIT,
-    DEFAULT_TIMEOUT_SECONDS,
-    DEFAULT_TOKEN_LIMIT,
-    StepLimitMiddleware,
-    StructuredOutputRetryLimitMiddleware,
-    TimeoutLimitMiddleware,
-    TokenLimitMiddleware,
+    AgentLimits,
 )
 from splunklib.ai.messages import AgentResponse, BaseMessage, OutputT
 from splunklib.ai.middleware import AgentMiddleware
@@ -53,6 +46,7 @@ class BaseAgent(Generic[OutputT], ABC): # noqa: UP046 TODO[BJ]
     _logger: logging.Logger
     _conversation_store: ConversationStore | None = None
     _thread_id: str
+    _limits: AgentLimits
 
     def __init__(
         self,
@@ -69,6 +63,7 @@ def __init__(
         logger: logging.Logger | None,
         conversation_store: ConversationStore | None,
         thread_id: str,
+        limits: AgentLimits,
     ) -> None:
         self._system_prompt = system_prompt
         self._model = model
@@ -79,26 +74,8 @@ def __init__(
         self._agents = tuple(agents) if agents else ()
         self._input_schema = input_schema
         self._output_schema = output_schema
-        user_middleware = tuple(middleware) if middleware else ()
-        user_middleware_types = {type(m) for m in user_middleware}
-
-        # NOTE: we're creating separate instances per agent - TimeoutLimitMiddleware is stateful
-        # and sharing one would cause agents to overwrite each other's deadline.
-        predefined_before: list[AgentMiddleware] = [
-            StructuredOutputRetryLimitMiddleware(DEFAULT_STRUCTURED_OUTPUT_RETRY_LIMIT),
-        ]
-        predefined_after: list[AgentMiddleware] = [
-            TokenLimitMiddleware(DEFAULT_TOKEN_LIMIT),
-            StepLimitMiddleware(DEFAULT_STEP_LIMIT),
-            TimeoutLimitMiddleware(DEFAULT_TIMEOUT_SECONDS),
-        ]
-
-        self._middleware = (
-            *[m for m in predefined_before if type(m) not in user_middleware_types],
-            *user_middleware,
-            *[m for m in predefined_after if type(m) not in user_middleware_types],
-        )
-
+        self._limits = limits
+        self._middleware = middleware
         self._trace_id = secrets.token_hex(16)  # 32 Hex characters
         self._conversation_store = conversation_store
         self._thread_id = thread_id
@@ -177,3 +154,7 @@ def conversation_store(self) -> ConversationStore | None:
     @property
     def default_thread_id(self) -> str:
         return self._thread_id
+
+    @property
+    def limits(self) -> AgentLimits:
+        return self._limits
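
The definition of `AgentLimits` itself is not among the files shown above. As a rough illustration only, the sketch below assumes it is a plain dataclass whose field names and defaults mirror the README table in this commit. The class here is a hypothetical stand-in, not the real `splunklib.ai.limits.AgentLimits`, and `frozen=True` and the `try_mutate` helper are assumptions added for the example.

```python
from dataclasses import FrozenInstanceError, dataclass


# Hypothetical stand-in for splunklib.ai.limits.AgentLimits (not the real class).
# Field names and defaults are copied from the README table in this commit;
# frozen=True is an assumption made for this sketch.
@dataclass(frozen=True)
class AgentLimits:
    max_tokens: int = 200_000
    max_steps: int = 100
    timeout: float = 600.0  # seconds, per invoke call
    max_structured_output_retires: int = 3  # spelling as in the commit


# Module-level default shared by every Agent that omits `limits`,
# mirroring DEFAULT_AGENT_LIMITS in agent.py.
DEFAULT_AGENT_LIMITS = AgentLimits()

# Only the overridden field changes; the rest keep their defaults.
custom = AgentLimits(max_tokens=50_000)


def try_mutate(limits: AgentLimits) -> bool:
    """Illustrative helper: return True if mutating the instance succeeded."""
    try:
        limits.max_steps = 1  # type: ignore[misc]
    except FrozenInstanceError:
        return False
    return True
```

Because `Agent.__init__` uses the single `DEFAULT_AGENT_LIMITS` instance as its default argument, an immutable dataclass would prevent one agent from changing the limits seen by others. Whether the real class is frozen is not visible in this diff.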
