feat(fireworks_ai): add Fireworks AI provider#1
Conversation
There was a problem hiding this comment.
Review Summary
This PR adds a Fireworks AI provider and fixes a cross-cutting bug in the default ResponseBuilder. After reviewing the implementation, tests, fixtures, and documentation, I find the changes well-structured and correct.
Key Changes Reviewed
-
Bug fix in
ResponseBuilder— Correctly populatesprovider,encrypted?, andformatdefaults for reasoning details from thinking chunks. The previous implementation ignored theproviderargument and left fields asnil, which affected all OpenAI-compatible reasoning providers (DeepSeek, Groq, Fireworks, etc.). The comprehensive test assertion relaxation from a hardcoded whitelist to "non-nil atom" is the right call. -
Fireworks AI provider — Follows the established provider pattern (OpenAI-compatible endpoint with custom
build_body/1andtranslate_options/3). Provider-specific quirks are handled appropriately:max_tokenscapped to 4096 for non-streaming requests with a warning- Message-level
metadatastripped to avoid Fireworks 400s - Canonical
tool_choicetranslated to OpenAI function shape stream_options.include_usageset automatically for streaming- Default
receive_timeoutof 5 minutes for slow reasoning completions reasoning_effortmapped correctly to Fireworks string values
-
Structured output — The dual-mode implementation (
:auto/:json_schemanative path and:toolfallback) is clean and correctly delegates to the default prepare request pipeline. -
Tests & fixtures — 23 unit tests cover provider contract, request preparation, authentication, body encoding, option translation, reasoning effort rendering, tool choice translation, extended parameters, structured output modes, and stream options. The 28 live-recorded fixtures (kimi-k2p5 + glm-5 across 14 scenarios each) provide good coverage.
-
Documentation — The provider guide is thorough and accurately documents the
top_knamespace collision workaround and the"max"reasoning effort escape hatch.
Verdict
No issues found. The PR is clean, low-risk, and ready to merge.
bdf25a9 to
3e00f27
Compare
There was a problem hiding this comment.
Overall Assessment
Clean, well-structured provider implementation that follows existing codebase patterns. The cross-cutting ResponseBuilder fix is correct and the test coverage (23 unit tests, 22 coverage replay tests, 28 fixtures) is thorough.
Findings
| Severity | Finding |
|---|---|
| Moderate | fireworks_json_schema_strict: false in provider_options is ignored due to ` |
| Low | Same ` |
Positive Notes
- The
ResponseBuilderreasoning-details fix correctly uses the previously-ignoredproviderargument and supplies sensible defaults forencrypted?andformat. - Fireworks-specific quirks (
max_tokenscapping, message metadata stripping,tool_choicetranslation,stream_options.include_usage) are handled appropriately. - Provider guide is comprehensive and covers the
top_knamespace collision workaround clearly. - Fixture coverage with two models (kimi-k2p5, glm-5) is in line with other providers.
Suggested Follow-up
Consider fixing the || pattern in both structured_output_mode/1 and prepare_object_via_json_schema/4 to use Keyword.fetch/2 or a sentinel value so that explicit false values in provider_options are respected.
There was a problem hiding this comment.
Review Summary
This PR adds a well-structured Fireworks AI provider and fixes a cross-cutting bug in the default ResponseBuilder for reasoning details. After reviewing the provider module, tests, fixtures, and the ResponseBuilder fix, I found no bugs.
ResponseBuilder Fix
The change in lib/req_llm/provider/defaults/response_builder.ex correctly uses the previously-ignored provider argument and supplies sensible defaults (encrypted?: false, format: "openai-reasoning-content-v1") for ReasoningDetails. The comprehensive test assertion relaxation from a hard-coded whitelist to "non-nil atom" is the appropriate follow-up.
Fireworks AI Provider
The provider module follows existing patterns correctly:
- OpenAI-compatible base with Fireworks-specific extensions plumbed through
build_body/1 max_tokenscapping to 4096 for non-streaming requests is implemented and tested- Message-level
metadatastripping is handled defensively (both atom and string keys) tool_choicetranslation andstream_options.include_usageinjection are correct- Structured output supports both native
json_schemaand tool-call fallback modes reasoning_effortatom-to-string rendering covers the full Fireworks surface area
Tests & Fixtures
- 23 provider unit tests cover request preparation, encoding, option translation, and response decoding
- 28 live-recorded fixtures (kimi-k2p5 + glm-5 across 14 scenarios each) provide good coverage
- The comprehensive test matrix integrates cleanly with the existing provider test framework
Minor Observation (non-blocking)
add_stream_options/2 unconditionally overwrites stream_options with %{include_usage: true} for streaming requests. This is intentional per the PR description, but it prevents users from passing additional Fireworks-specific stream_options fields should they be introduced in the future. A Map.merge would be more forward-compatible.
Verdict: Clean, low-risk PR. Approving.
There was a problem hiding this comment.
Review Summary
This PR adds a Fireworks AI provider and fixes a cross-cutting bug in the default ResponseBuilder. After reviewing the provider implementation, tests, fixtures, documentation, and the ResponseBuilder fix, I find the PR to be well-structured and correct.
Overall Assessment: Approve
Key Observations
-
ResponseBuilder Fix (Correct) — The change in
lib/req_llm/provider/defaults/response_builder.exproperly uses the previously-ignoredproviderargument and supplies sensible defaults (encrypted?: false,format: "openai-reasoning-content-v1"). The comprehensive test assertion relaxation from a hard-coded 5-element whitelist to "any non-nil atom" is the right follow-up. -
Fireworks Provider Structure (Correct) — The provider follows the established OpenAI-compatible provider pattern (similar to Groq, DeepSeek, Cerebras). It correctly:
- Delegates to
ReqLLM.Provider.Defaultsfor standard operations - Overrides
prepare_request/4for:objectoperations with two structured-output strategies (:json_schemanative and:toolfallback) - Caps non-streaming
max_tokensto 4096 with a warning intranslate_options/3 - Strips message-level
metadata,reasoning_details, andreasoning_contentfrom outbound messages to avoid Fireworks 400s on multi-turn conversations - Renders canonical
reasoning_effortatoms to Fireworks string values - Defaults
receive_timeoutto 5 minutes for slow reasoning completions
- Delegates to
-
Test Coverage (Good) — 23 focused unit tests cover provider contract, request preparation, authentication, body encoding, option translation, reasoning effort rendering, tool choice translation, structured output modes, stream options, and response decoding. The 28 live-recorded fixtures for
kimi-k2p5andglm-5provide comprehensive coverage. -
Documentation (Good) — The provider guide accurately documents all Fireworks-specific options, the
max_tokens > 4096constraint, thetop_knamespace collision workaround, and the three structured-output strategies.
Minor Notes
- The
config/test.exschange raises the globalreceive_timeoutandstream_receive_timeoutto 5 minutes for the entire test suite. This appears intentional to accommodate slow reasoning models, but be aware it affects all providers' tests. - The Fireworks provider depends on
ReqLLM.Providers.OpenAI.AdapterHelpersfortranslate_tool_choice_format/0andenforce_strict_recursive/1. This is consistent with other OpenAI-compatible providers in the codebase and is not an issue. - No blocking issues found.
…ing_details
extract_reasoning_from_thinking_chunks/2 was ignoring its provider arg and
producing %ReasoningDetails{provider: nil, encrypted?: nil, format: nil} for
OpenAI-compatible reasoning providers. Now uses model.provider as the fallback
and supplies encrypted?: false and a generic format string.
The comprehensive provider-test assertion is relaxed from a hard-coded
5-element whitelist to "non-nil atom" since the fix means it now applies to
every provider that emits thinking chunks via the default builder.
OpenAI-compatible chat completions at /inference/v1/chat/completions with Fireworks-specific extensions: reasoning_effort (xhigh/max), prompt cache keys, min_p/repetition_penalty/typical_p/mirostat sampling, speculation, prompt_truncate_len, safe_tokenization, parallel_tool_calls, max_completion_tokens, and native response_format json_schema with strict enforcement (with :tool fallback mode). Quirks handled at the provider level: - caps max_tokens to 4096 for non-streaming (Fireworks 400s otherwise) - strips message-level metadata (Fireworks rejects extra message fields) - translates canonical tool_choice to OpenAI function shape - sets stream_options.include_usage so the terminal SSE chunk has usage - 5-minute default receive_timeout for slow reasoning completions Includes unit tests (23), comprehensive coverage suite, fixtures for kimi-k2p5 and glm-5, and a provider guide.
… outbound messages
Fireworks's strict OpenAI-compat envelope rejects any field outside the
documented message shape. The default ResponseBuilder populates
`reasoning_details` on every assistant message (for OpenRouter / Anthropic
multi-turn reasoning continuity), and `:thinking` content parts get
serialised as a top-level `reasoning_content` string on assistant
messages. Either field, echoed back on the next turn, blows up the
request with:
HTTP 400 "Extra inputs are not permitted, field:
'messages[2].reasoning_details', value: [...]"
Broaden the existing message-level metadata strip into a single
`strip_rejected_message_fields` step that drops `metadata`,
`reasoning_details`, and `reasoning_content` (both atom and string keys)
before encoding. Adds a regression test that constructs an assistant
message carrying all three and asserts the encoded body is clean.
There was a problem hiding this comment.
Overall Assessment
Clean, well-structured PR that adds Fireworks AI provider support and fixes a cross-cutting bug in the default ResponseBuilder. No blocking issues found.
Findings
| # | Severity | Description |
|---|---|---|
| 1 | Info | config/test.exs adds global 5-minute timeouts that affect all providers, not just Fireworks. Verify this is intentional for the test suite. |
| 2 | Info | prepare_object_via_json_schema/4 sets operation: :object in opts but delegates to prepare_request with :chat. This is correct (endpoint routing), but worth confirming telemetry/fixture tagging handles the mixed operation value as expected. |
What was reviewed
- ResponseBuilder fix — The
providerargument was previously ignored, producingnilprovider/fmt/encrypted fields for all OpenAI-compatible reasoning providers. The fix supplies sensible defaults and the comprehensive test assertion is appropriately relaxed from a hard-coded 5-element whitelist to "non-nil atom". - Fireworks AI provider — Follows the established OpenAI-compatible provider pattern. Key implementation details verified:
translate_options/3correctly drops:reasoning_token_budget, caps non-streamingmax_tokensto 4096 with a warning, and defaultsreceive_timeoutto 5 minutes.build_body/1stripsmetadata,reasoning_details, andreasoning_contentfrom outbound messages (defensive against Fireworks 400s on multi-turn conversations).render_reasoning_effort/1correctly maps canonical atoms to Fireworks string values and omits:default.- Structured output routing (
:auto→ nativejson_schema,:tool→ tool-call fallback) is consistent with the OpenAI provider's approach. tool_choicetranslation reuses the shared OpenAI adapter helper.
- Tests — Provider contract, encoding, option translation, reasoning effort rendering, tool choice translation, structured output modes, and stream options are all covered. 28 live-recorded fixtures provide good replay coverage for
kimi-k2p5andglm-5. - Documentation — The provider guide is comprehensive and accurately documents quirks (max_tokens cap, metadata stripping, top_k namespace collision, unsupported endpoints).
Verdict
APPROVE — Low risk, good test coverage, follows existing conventions.
There was a problem hiding this comment.
Overall Assessment
Clean, well-tested provider addition with a sensible cross-cutting fix for ResponseBuilder reasoning metadata. One logic bug prevents disabling strict JSON schema mode via provider_options.
Findings
| # | Severity | File | Description |
|---|---|---|---|
| 1 | Medium | lib/req_llm/providers/fireworks_ai.ex |
fireworks_json_schema_strict: false in provider_options is ignored due to ` |
What looks good
- ResponseBuilder fix correctly populates
provider,encrypted?, andformatfor OpenAI-compatible reasoning providers instead of leaving themnil. - Message scrubbing (
strip_rejected_message_fields) is thorough and covers both atom and string keys, preventing the multi-turn HTTP 400s described in the PR. - Test coverage is extensive: provider contract, encoding, option translation, structured output modes, streaming, and tool round-trips are all exercised.
- Documentation (
guides/fireworks_ai.md) is detailed and includes the importantmax_tokens > 4096streaming caveat.
Minor notes
- The
render_reasoning_effortmapping of:minimal→"low"is a design choice; consider documenting it explicitly if users might expect:minimalto map to"none". - The existing test for
fireworks_json_schema_strict: falsepasses the option at the top level, so theprovider_optionsbug is not currently covered.
| type: :boolean, | ||
| doc: "Whether the model may invoke multiple tools in parallel" | ||
| ], | ||
| max_completion_tokens: [ |
There was a problem hiding this comment.
Bug: false value in provider_options is ignored for fireworks_json_schema_strict.
The || operator treats false as falsy, so passing provider_options: [fireworks_json_schema_strict: false] falls through to the top-level option (which defaults to true). This makes it impossible to disable strict mode via provider_options.
Replace with an explicit nil check:
strict =
case Keyword.get(provider_opts, :fireworks_json_schema_strict) do
nil -> Keyword.get(opts, :fireworks_json_schema_strict, true)
val -> val
endThe same pattern exists in structured_output_mode/1 (line 108), though it is less likely to cause problems since the expected values are truthy atoms.
There was a problem hiding this comment.
Review Summary
This PR adds a well-implemented Fireworks AI provider and fixes a cross-cutting bug in the default ResponseBuilder for OpenAI-compatible reasoning providers.
Overall Assessment
Clean, comprehensive, and well-tested. No blocking issues found.
Key Changes Reviewed
-
ResponseBuilder fix (
lib/req_llm/provider/defaults/response_builder.ex):- Correctly uses the previously-ignored
providerargument as a fallback forReasoningDetails.provider. - Supplies sensible defaults (
encrypted?: false,format: "openai-reasoning-content-v1"). - The relaxed comprehensive-test assertion (
is_atom(detail.provider) and not is_nil(detail.provider)) is the right generalization.
- Correctly uses the previously-ignored
-
Fireworks AI provider (
lib/req_llm/providers/fireworks_ai.ex):- OpenAI-compatible endpoint reuse is the correct architectural choice.
build_body/1correctly stripsmetadata,reasoning_details, andreasoning_contentfrom outbound messages while preserving top-level requestmetadata— this directly addresses the Fireworks "Extra inputs are not permitted" 400s on multi-turn conversations.max_tokenscapping to 4096 for non-streaming requests is correctly guarded bystream?check.render_reasoning_effort/1mapping (:minimal→"low") is reasonable given Fireworks's accepted values.- Structured output dual-mode (
:auto/:json_schemanative vs:toolfallback) is cleanly implemented.
-
Tests & fixtures:
- Provider-level tests cover identity, schema, request preparation, auth, body encoding, option translation, reasoning effort rendering, tool choice translation, structured output modes, and stream options.
- 28 live-recorded fixtures across two models (kimi-k2p5, glm-5) provide good coverage.
-
Documentation:
guides/fireworks_ai.mdis thorough and accurately describes provider quirks (max_tokens cap, message stripping,top_knamespace collision, timeout defaults).- Amazon Bedrock prefix corrections (
bedrock:→amazon_bedrock:) are consistent with the codebase.
Minor Notes (non-blocking)
-
The
encode_body strips reasoning_details...test intest/providers/fireworks_ai_test.exsconstructs a message struct directly rather than exercising a full round-trip throughResponseBuilder+ re-encoding. This is sufficient for unit-level coverage, but an integration-style test (message → encode → decode → re-encode) would provide even stronger confidence that the stripping survives realistic conversation flows. The fixture-based coverage tests likely cover this implicitly. -
render_reasoning_effort/1passes unknown atoms through unchanged (last clause:defp render_reasoning_effort(other), do: other). Fireworks will reject invalid values, so this is acceptable, but consider logging a warning for unrecognized atoms to aid debugging.
Verdict: Approve. The bug fix is correct, the provider implementation follows established patterns, and the test coverage is extensive.
The merge-base changed after approval.
There was a problem hiding this comment.
Review Summary
This PR adds the Fireworks AI provider and fixes a cross-cutting bug in the default ResponseBuilder. After reviewing the code, tests, fixtures, and documentation, I find no blocking issues.
ResponseBuilder Fix
The fix in lib/req_llm/provider/defaults/response_builder.ex correctly addresses a bug where extract_reasoning_from_thinking_chunks/2 ignored the provider argument and left encrypted?, provider, and format as nil. The new defaults (false, the passed provider atom, and "openai-reasoning-content-v1") are sensible. The relaxed comprehensive-test assertion from a hard-coded 5-element whitelist to is_atom(detail.provider) and not is_nil(detail.provider) is the right call now that the fix applies broadly to all OpenAI-compatible reasoning providers.
Fireworks AI Provider
The provider module (lib/req_llm/providers/fireworks_ai.ex) is well-structured and follows existing OpenAI-compatible provider patterns:
- Base URL & auth wiring — correct.
translate_options/3— drops unsupportedreasoning_token_budget, sets a 5-minute defaultreceive_timeout, and caps non-streamingmax_tokensto 4096 with a warning. All correct.build_body/1— stripsmetadata,reasoning_details, andreasoning_contentfrom outbound messages (critical for multi-turn conversations with Fireworks), translates canonicaltool_choiceto OpenAI function shape, and forwards the full Fireworks-specific parameter surface. Correct.- Structured output — three modes (
:auto,:json_schema,:tool) with strict JSON schema enforcement viaReqLLM.Providers.OpenAI.AdapterHelpers.enforce_strict_recursive/1. Clean implementation. render_reasoning_effort/1— correctly maps core atoms to Fireworks string values and omits:default.
Tests & Fixtures
- Provider-level tests (
test/providers/fireworks_ai_test.exs) cover identity, schema, request preparation, body encoding, option translation, reasoning effort rendering, tool choice translation, structured output modes, and stream options. Good coverage. - 28 live-recorded fixtures across two models (
kimi-k2p5,glm-5) and multiple scenarios (basic, streaming, reasoning, tools, object generation, context append, token limit, usage). Comprehensive. - The comprehensive coverage test module follows the established pattern.
Documentation
guides/fireworks_ai.mdis thorough and covers configuration, model specs, all provider options, structured output modes, and implementation notes (max_tokens cap, metadata stripping, tool_choice translation, top_k collision, streaming usage, timeout defaults).- README and
mix.exsare updated correctly to list Fireworks as the 20th supported provider.
Minor Notes (non-blocking)
- The guide mentions passing
"max"reasoning effort viaprovider_options: [reasoning_effort: "max"]. It is worth verifying in a follow-up that values nested insideprovider_optionsare correctly surfaced tobuild_body(the current implementation readsrequest.options[:reasoning_effort]directly). This is pre-existing behavior across providers, not a bug introduced here. - Some fixture files lack a trailing newline. Cosmetic only.
Verdict: Clean, well-tested provider addition with a legitimate cross-cutting bug fix. Low risk.
Description
Adds a Fireworks AI provider backed by
/inference/v1/chat/completions— the same endpoint the official Fireworks Python SDK posts to, with Fireworks-specific extensions plumbed through. Covers ~92% of the Fireworks chat-completion parameter surface (sampling extras, prompt-cache keys, speculation, nativeresponse_format: json_schemawith strict enforcement plus a:toolfallback mode, reasoning_effort, parallel_tool_calls, max_completion_tokens, …).Also fixes a cross-cutting bug in the default
ResponseBuilderthat produced%ReasoningDetails{provider: nil, encrypted?: nil, format: nil}for every OpenAI-compatible reasoning provider (DeepSeek, Groq, Fireworks, …).Commits
fix: populate provider and defaults on default ResponseBuilder reasoning_details— uses the previously-ignoredproviderarg and supplies sane defaults forencrypted?/format. Relaxes the comprehensive provider-test assertion from a hard-coded 5-element whitelist to "non-nil atom" now that the fix applies broadly. No behaviour change for providers with their ownResponseBuilder(Anthropic, Google, Minimax, OpenAI Responses API, OpenRouter).feat(fireworks_ai): add Fireworks AI provider— provider module, comprehensive coverage suite, 28 live-recorded fixtures (kimi-k2p5 + glm-5 × 14 scenarios each), and a provider guide.fix(fireworks_ai): strip reasoning_details and reasoning_content from outbound messages— extends the outbound-message scrubber:metadata,reasoning_details, andreasoning_contentall survive ResponseBuilder round-tripping on assistant turns and cause Fireworks to 400 with "Extra inputs are not permitted" on the next call.Provider quirks handled
max_tokensto 4096 for non-streaming requests (Fireworks 400s otherwise)metadata,reasoning_details, andreasoning_contentfrom outbound assistant turns (Fireworks rejects extra fields)tool_choiceto OpenAI function shapestream_options.include_usageis supplied by the default OpenAI-compat body builder for streaming requestsreceive_timeoutto 5 minutes for slow reasoning completionsType of Change
ResponseBuilder.reasoning_detailsfor OpenAI-compatible reasoning providersguides/fireworks_ai.mdBreaking Changes
None. The
ResponseBuilderfix populates fields that were previouslynil; no provider was relying on those nils.Testing
mix test)mix quality)Detail
mix test test/providers/fireworks_ai_test.exs— 24 / 0mix test test/coverage/fireworks_ai/— 22 / 0 in replay mode (kimi-k2p5 + glm-5)mix test --exclude integration --exclude live --exclude coverage— 3010 / 0 after rebasing onto latestmain(Azure Responses API reasoning-effort fix)mix quality(format +--warnings-as-errors+credo --strict+dialyzer) — all greenmain; unrelated Azure/Google embedding and Anthropic streaming structured-output tests)response_format: json_schemapath with strict modeChecklist
@docblocks, explicitencode_body/build_bodypair, module attributes grouped near top — matches Cerebras/Zenmux/OpenRouter conventions)guides/fireworks_ai.md, registered inmix.exsextras + Providers grouping)feat:,fix:)CHANGELOG.md(it is auto-generated by git_ops)Known follow-ups (out of scope for this PR)
top_kis documented in the core schema as a top-level option but crashes at runtime for every OpenAI-compatible provider that doesn't explicitly register it (Fireworks included). Fireworks users currently pass it viaprovider_options: [top_k: 40]. The cleaner fix is a one-line addition toextra_option_keysinprovider/defaults.explus per-provider drop rules where the wire doesn't accept it — better handled as its own PR with cross-cutting test coverage.video_url,max_frames,sample_fps,spatial_limit) — only Google has end-to-end video plumbing today; would need sharedContentPartwork.Related Issues
Closes #