@@ -696,7 +696,7 @@ triggers the retry logic described above. A custom `model_middleware` can interc
696696to observe, log, or override the retry behavior. A custom ` model_middleware ` can also raise
697697the ` StructuredOutputGenerationException ` manually to reject structured output and force a re-generation.
698698
699- The maximal number of re-tries is limited per agent loop invocation see [ Default limit middlewares ] ( #default-limit-middlewares ) .
699+ The maximum number of retries is limited per agent loop invocation; see [Default limits](#default-limits).
700700
701701### Subagents with structured output/input
702702
@@ -977,103 +977,87 @@ model = OpenAIModel(...)
977977service = connect(... )
978978
979979@before_model
980- def log_usage (req : ModelRequest) -> None :
981- logger.debug(f " Steps: { req.state.total_steps } , Tokens: { req.state.token_count } " )
980+ def log_steps (req : ModelRequest) -> None :
981+ logger.debug(f " Steps: { len ( req.state.messages) } " )
982982
983983
984984async with Agent(
985985 model = model,
986986 service = service,
987987 system_prompt = " ..." ,
988- middleware = [log_usage ],
988+ middleware = [log_steps ],
989989) as agent: ...
990990```
991991
992- The hooks can stop the Agentic Loop under custom conditions by raising exceptions.
993- The logic of the hook can be more advanced and include multiple conditions, for example, based on both token usage and execution time:
992+ The hooks can stop the Agentic Loop under custom conditions by raising exceptions, for example:
994993
995994``` py
996995from splunklib.ai.hooks import before_model
997996from splunklib.ai.middleware import AgentMiddleware, ModelRequest
998997
999- def token_and_step_limit ( token_limit : float , step_limit : int ) -> AgentMiddleware:
998+ def message_limit ( message_limit : int ) -> AgentMiddleware:
1000999 @before_model
10011000 def _hook (req : ModelRequest) -> None :
1002- if req.state.token_count > token_limit or req.state.total_steps >= step_limit :
1001+ if len ( req.state.messages) >= message_limit :
10031002 raise Exception (" Stopping Agentic Loop" )
10041003
10051004 return _hook
10061005
10071006
10081007async with Agent(
10091008 ... ,
1010- middleware = [token_and_step_limit( token_limit = 10_000 , step_limit = 5 )],
1009+ middleware = [message_limit( message_limit = 5 )],
10111010) as agent: ...
10121011```
10131012
1014- ## Default limit middlewares
1013+ ## Default limits
10151014
10161015Every ` Agent ` automatically applies sane default limits to prevent runaway execution
1017- or excessive token usage. Default limit middlewares are appended after any user-supplied
1018- middleware, so they always act on the final state of the request. If you override one of
1019- the defaults by passing your own instance, you are responsible for its position in the
1020- chain - place it last if you want the same behavior.
1016+ or excessive token usage.
10211017
1022- | Middleware | Default | Measured |
1018+ | Limit | Default | Measured |
10231019| ---| ---| ---|
1024- | ` TokenLimitMiddleware ` | 200 000 tokens | token count of messages passed to the model |
1025- | ` StepLimitMiddleware ` | 100 steps | steps taken |
1026- | ` TimeoutLimitMiddleware ` | 600 seconds (10 minutes) | per ` invoke ` call |
1027- | ` StructuredOutputRetryLimitMiddleware ` | 3 retries | per ` invoke ` call |
1020+ | ` max_tokens ` | 200 000 tokens | token count of messages passed to the model |
1021+ | ` max_steps ` | 100 steps | number of messages in the conversation |
1022+ | ` timeout ` | 600 seconds (10 minutes) | per ` invoke ` call |
1023+ | ` max_structured_output_retires ` | 3 retries | per ` invoke ` call |
10281024
1029- ` TokenLimitMiddleware ` and ` StepLimitMiddleware ` check the values from the messages passed to the
1030- model on each call. ` TimeoutLimitMiddleware ` and ` StructuredOutputRetryLimitMiddlewa ` resets its
1031- deadline/limit on each ` invoke ` , so effectively these limit only the agent loop.
1025+ ` max_tokens ` and ` max_steps ` are checked against the messages passed to the model on each call.
1026+ ` timeout ` and ` max_structured_output_retires ` reset on each ` invoke ` , so they limit only the
1027+ current agent loop invocation.
10321028
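The difference between per-call checks and per-`invoke` resets can be sketched as follows. This is a minimal illustration only: `PerInvokeLimitsSketch`, `start_invoke`, `record_retry`, and `check_timeout` are hypothetical names invented for this example and are not part of `splunklib.ai`, and the real implementation may differ.

```python
import time


class PerInvokeLimitsSketch:
    """Hypothetical stand-in; NOT the library's implementation."""

    def __init__(self, timeout: float = 600.0, max_retries: int = 3) -> None:
        self.timeout = timeout
        self.max_retries = max_retries
        self.deadline = 0.0
        self.retries_used = 0

    def start_invoke(self) -> None:
        # Both the deadline and the retry counter start fresh on every invoke.
        self.deadline = time.monotonic() + self.timeout
        self.retries_used = 0

    def record_retry(self) -> None:
        # Count one structured-output retry and fail once the budget is exceeded.
        self.retries_used += 1
        if self.retries_used > self.max_retries:
            raise RuntimeError("structured output retry limit exceeded")

    def check_timeout(self) -> None:
        # Fail if the current invoke has run past its deadline.
        if time.monotonic() > self.deadline:
            raise RuntimeError("timeout exceeded")


limits = PerInvokeLimitsSketch(timeout=30.0, max_retries=1)

limits.start_invoke()
limits.record_retry()   # uses the single retry allowed for this invoke

limits.start_invoke()   # a new invoke resets the counter and the deadline
limits.record_retry()   # allowed again
limits.check_timeout()  # well within the fresh 30-second deadline
```

Under this reading, a long-lived agent that is invoked many times never exhausts its timeout or retry budget cumulatively; only a single invocation can trip those two limits.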
10331029When a limit is exceeded, the agent raises the corresponding exception:
1034- ` TokenLimitExceededException ` , ` StepsLimitExceededException ` , or ` TimeoutExceededException ` ,
1030+ ` TokenLimitExceededException ` , ` StepsLimitExceededException ` , ` TimeoutExceededException ` , or
10351031` StructuredOutputRetryLimitExceededException ` .
10361032
10371033### Overriding defaults
10381034
1039- To override a specific limit, pass your own instance of the corresponding middleware
1040- class. The default for that limit is suppressed automatically - the other defaults
1041- remain active:
1035+ Limits are configured via the ` AgentLimits ` dataclass passed to the ` Agent ` constructor.
1036+ Only the fields you specify are overridden; the rest keep their defaults:
10421037
10431038``` py
1044- from splunklib.ai.limits import (
1045- TokenLimitMiddleware,
1046- StepLimitMiddleware,
1047- TimeoutLimitMiddleware,
1048- StructuredOutputRetryLimitMiddleware,
1049- )
1039+ from splunklib.ai.limits import AgentLimits
10501040
10511041async with Agent(
10521042 ... ,
1053- middleware = [
1054- TokenLimitMiddleware(50_000 ), # overrides default 200 000; other defaults still apply
1055- ],
1043+ limits = AgentLimits(max_tokens = 50_000 ), # overrides default 200 000; other defaults still apply
10561044) as agent: ...
10571045```
10581046
1059- To override all defaults, pass all of these to Agent's middleware list :
1047+ To override all defaults:
10601048
10611049``` py
10621050async with Agent(
10631051 ... ,
1064- middleware = [
1065- StructuredOutputRetryLimitMiddleware( 0 ), # no-retries.
1066- TokenLimitMiddleware( 50_000 ) ,
1067- StepLimitMiddleware( 10 ) ,
1068- TimeoutLimitMiddleware( 30.0 ),
1069- ] ,
1052+ limits = AgentLimits(
1053+ max_tokens = 50_000 ,
1054+ max_steps = 10 ,
1055+ timeout = 30.0 ,
1056+ max_structured_output_retires = 0 , # no retries
1057+ ) ,
10701058) as agent: ...
10711059```
10721060
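The "only the fields you specify are overridden" behavior is what a dataclass with default values gives for free. The sketch below uses a hypothetical stand-in, `AgentLimitsSketch`, with the field names and default values taken from the table above; the field types and the real `AgentLimits` definition are assumptions and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentLimitsSketch:
    """Hypothetical stand-in for illustration; not the library's AgentLimits."""

    max_tokens: int = 200_000
    max_steps: int = 100
    timeout: float = 600.0  # seconds
    max_structured_output_retires: int = 3  # spelling follows the README


# Override a single field; every other field keeps its default.
partial = AgentLimitsSketch(max_tokens=50_000)

# Override every field explicitly.
full = AgentLimitsSketch(
    max_tokens=50_000,
    max_steps=10,
    timeout=30.0,
    max_structured_output_retires=0,  # no retries
)
```

Because every field has a default, there is no way to construct the object without some value for each limit, which matches the stated intent that agents always keep some guardrails.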
1073- ** Note** : When overriding limit middlewares, order matters. Place ` StructuredOutputRetryLimitMiddleware `
1074- first and ` TokenLimitMiddleware ` , ` StepLimitMiddleware ` , and ` TimeoutLimitMiddleware ` last,
1075- otherwise the limits may not behave as expected.
1076-
10771061There is no explicit opt-out - the intent is that agents should always have some guardrails.
10781062
10791063## Logger