… env var

Replace provider-specific environment variable checks (OPENAI_API_KEY, ANTHROPIC_API_KEY) and skip markers (requires_openai, requires_anthropic) with a single EFFECTFUL_LLM_MODEL environment variable that controls which model is used for all LLM integration tests.

- Remove hardcoded model names from all test files in favor of LLM_MODEL, read from the EFFECTFUL_LLM_MODEL env var
- Replace requires_openai/requires_anthropic markers with requires_llm
- Remove cross-provider model parametrization; tests now use whichever model the env var specifies
- Use litellm.supports_vision() to conditionally skip vision tests
- Remove the default model from LiteLLMProvider (make model required)
- Update the CI workflow to pass EFFECTFUL_LLM_MODEL as a matrix parameter, making it easy to add parallel CI stages for different providers
- Rename/remove fixture files to match updated test names

Closes #589
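The matrix setup described above might look roughly like this in the workflow file. This is an illustrative sketch, not the repo's actual workflow: the job name, model names, and test path are assumptions.

```yaml
jobs:
  llm-tests:
    strategy:
      matrix:
        # One parallel CI stage per provider/model; add entries to extend.
        llm_model: ["gpt-4o-mini", "claude-3-5-haiku-latest"]
    env:
      # The single env var the test helpers read.
      EFFECTFUL_LLM_MODEL: ${{ matrix.llm_model }}
    steps:
      - uses: actions/checkout@v4
      - run: pytest tests/
```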
…ename test

- Move LLM_MODEL and requires_llm definitions to tests/conftest.py; all four LLM test files now import from there
- Fix import ordering in test_handlers_llm_encoding.py (stdlib before local)
- Rename test_agent_tool_names_are_openai_compatible_integration to test_agent_tool_names_are_valid_integration, since it is no longer OpenAI-specific
- Restore LiteLLMProvider default model via env var fallback so existing
callers (template.py, test_handlers_llm_template.py, notebook) are not
broken: model=os.environ.get("EFFECTFUL_LLM_MODEL", "gpt-4o")
- Move LLM_MODEL/requires_llm from conftest.py to tests/_llm_helpers.py
since conftest.py should not be imported directly
- Fix import placement in test_handlers_llm_provider.py (was between
module-level constants instead of grouped with imports)
- Remove redundant @requires_llm on vision tests where the skipif
condition already covers the not-LLM_MODEL case
The env var belongs in test infrastructure, not the library API. LiteLLMProvider should have a clean, explicit default.
Remove provider-specific environment variable checks and skip markers from LLM tests in favor of a single EFFECTFUL_LLM_MODEL env var.

- Add LLM_MODEL and requires_llm to tests/conftest.py; LLM_MODEL defaults to "gpt-4o-mini" and is overridable via EFFECTFUL_LLM_MODEL
- Live tests (tool calling, encoding, agent tool names) use LLM_MODEL and are gated by requires_llm, which checks for any provider API key
- Integration tests use make_provider(), which returns a live LiteLLMProvider when API keys are available and otherwise falls back to ReplayLiteLLMProvider for offline replay from fixtures
- Replay-only tests (simple_prompt_multiple_models, cross_endpoint, caching) keep hardcoded model names and always run, since they never call the API
- Update the CI workflow to pass EFFECTFUL_LLM_MODEL as a matrix parameter for easy parallel stages with different providers

Closes #589
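The gating helpers described above might be sketched as follows. The module contents are assumptions based on the commit message, not the repo's actual code; the list of provider key names in particular may differ.

```python
# Hypothetical sketch of the LLM test helpers (tests/conftest.py).
import os

import pytest

# Model under test; defaults to gpt-4o-mini, overridable per CI matrix entry.
LLM_MODEL = os.environ.get("EFFECTFUL_LLM_MODEL", "gpt-4o-mini")

# Assumed set of provider API keys; the real list may include more.
_PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")

# Skip live tests unless at least one provider API key is configured.
requires_llm = pytest.mark.skipif(
    not any(os.environ.get(k) for k in _PROVIDER_KEYS),
    reason="no LLM provider API key set",
)
```

A test module would then decorate its live tests with `@requires_llm` and pass `LLM_MODEL` wherever a model name is needed, so switching providers requires only changing one environment variable.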
…asisresearch/effectful into dn-fully-parametric-model-test
Contributor (Author)

Seems like this refactoring uncovered that we never actually tested encoding integration after putting fixtures there. Tuple encoding is failing again. Will try to fix.
Three fixes in encoding.py:

1. TupleEncodable.encode() now returns a TupleItems model instance (not a raw tuple), and deserialize() returns the model directly. This fixes pydantic validation in the litellm integration tests for NamedTuple and fixed-tuple types.
2. Add _TupleSafeJsonSchema, which overrides pydantic's tuple_schema() to produce object schemas (item_0, item_1 properties) instead of prefixItems arrays. Applied via _BoxEncoding.model_json_schema(), so dataclasses containing tuple fields produce OpenAI-compatible schemas.
3. SequenceEncodable.encode() returns a list (not a tuple) to preserve encode idempotency: nested_type on a list dispatches to the sequence encoder, avoiding a mismatch with TupleEncodable.

Also adds test_handlers_llm_encoding.py back to the CI workflow.
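The prefixItems-to-object rewrite in fix 2 can be illustrated as a plain dict transformation. The helper name here is hypothetical; the real code overrides pydantic's tuple_schema() via a GenerateJsonSchema subclass rather than post-processing a dict.

```python
def tuple_to_object_schema(schema: dict) -> dict:
    """Rewrite a JSON Schema fixed-tuple (prefixItems) into an object
    schema with item_0, item_1, ... properties, the shape OpenAI
    tool/structured-output schemas accept."""
    item_schemas = schema.get("prefixItems", [])
    props = {f"item_{i}": s for i, s in enumerate(item_schemas)}
    return {
        "type": "object",
        "properties": props,
        "required": list(props),        # every positional slot is required
        "additionalProperties": False,  # tuples have fixed arity
    }


# Roughly what pydantic emits for tuple[int, str]:
tuple_schema = {
    "type": "array",
    "prefixItems": [{"type": "integer"}, {"type": "string"}],
    "minItems": 2,
    "maxItems": 2,
}
```

Applying the helper to `tuple_schema` yields an object schema whose `item_0`/`item_1` properties carry the original positional item schemas.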
Revert encoding.py to master and remove encoding tests from CI workflow. Tuple schema fixes will be in a dedicated PR.
be46f6a to 7aad873
Closes #589