
Conversation


@slider23 slider23 commented Jan 28, 2026

Calculating the cost of a request based on a pre-set pricing list

Config llm.php:

    'presets' => [
        'openrouter-flash3' => [
            'driver' => 'openrouter',
            'model' => 'google/gemini-3-flash-preview',
            // ...
            'pricing' => [
                'input' => 0.5,   // $0.5 per 1M input tokens
                'output' => 3.0,  // $3 per 1M output tokens
                // cacheRead, cacheWrite, reasoning default to input price if not set
            ],
        ],
        // ...
    ],

Usage:

$response = (new StructuredOutput)
    ->using('openrouter-flash3')
    ->with(messages: $messages, responseModel: SomeSchema::class)
    ->response();
$cost = $response->usage()->calculateCost();

Usage without config changes:

$response = (new StructuredOutput)
    ->using('openrouter-flash3')
    ->with(messages: $messages, responseModel: SomeSchema::class)
    ->response();
$pricing = Pricing::fromArray([
    'input' => 0.5,
    'output' => 3,
]);
$cost = $response->usage()->calculateCost($pricing);
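For orientation, the arithmetic behind `calculateCost()` with per-1M-token rates reduces to a few multiplications. The helper below is a standalone sketch; its name and signature are illustrative, not the library API:

```php
<?php
declare(strict_types=1);

// Standalone sketch of per-1M-token cost arithmetic; estimateCost() is
// an illustrative helper, not part of the Usage/Pricing API.
function estimateCost(int $inputTokens, int $outputTokens, float $inputPerM, float $outputPerM): float
{
    return ($inputTokens / 1_000_000) * $inputPerM
         + ($outputTokens / 1_000_000) * $outputPerM;
}

// 100k input tokens at $0.5/1M plus 20k output tokens at $3/1M:
echo estimateCost(100_000, 20_000, 0.5, 3.0), PHP_EOL; // 0.11
```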
     

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * LLM token cost estimation integrated across inference flow; pricing can come from model config or be attached to responses/usages.
  * Pricing supports input, output, cache read/write, and reasoning rates.

* **Examples**
  * New PHP example demonstrating cost breakdowns, multiple pricing patterns, and model cost comparisons.

* **Tests**
  * Comprehensive unit tests for pricing creation, serialization, cost calculations, accumulation, and edge cases.

<sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

…m config

Add ability to calculate request cost based on token usage:

- New Pricing value object ($/1M tokens pricing)
- LLMConfig.pricing field for per-preset pricing configuration
- Usage::calculateCost() - works without arguments when pricing is set
- Default: cacheRead, cacheWrite, reasoning fallback to input price
- BaseInferenceDriver automatically attaches pricing from config

Usage in config:
  'openrouter-claude' => [
      'pricing' => ['input' => 3.0, 'output' => 15.0],  // $/1M
  ]

Usage in code:
  $cost = $response->usage()->calculateCost();

coderabbitai bot commented Jan 28, 2026

Warning

Rate limit exceeded

@slider23 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 12 minutes and 28 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Adds token-pricing support: a new Pricing model, cost calculation on Usage, config storage in LLMConfig, propagation of pricing through inference (PendingInference, InferenceStream, drivers), response-level pricing attachment, tests, and a PHP example demonstrating cost breakdowns and comparisons.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Pricing model**<br>`packages/polyglot/src/Inference/Data/Pricing.php` | New `Pricing` final class with readonly per-1M-token rates, constructor defaults, `fromArray()`, `none()`, `hasAnyPricing()`, validation, and `toArray()`. |
| **Usage cost API**<br>`packages/polyglot/src/Inference/Data/Usage.php` | Added pricing storage and accessor `pricing()`, `withPricing(Pricing)`, and `calculateCost(?Pricing)`; pricing is propagated through `copy()` and `withAccumulated()` merges. |
| **Config integration**<br>`packages/polyglot/src/Inference/Config/LLMConfig.php` | Added public `pricing: array` property, `getPricing(): Pricing` accessor, and pricing included in `toArray()` output. |
| **Inference response**<br>`packages/polyglot/src/Inference/Data/InferenceResponse.php` | Added `withPricing(Pricing): self` to apply pricing to the nested `Usage` and return an updated response. |
| **Driver / Stream / PendingInference**<br>`packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php`, `packages/polyglot/src/Inference/PendingInference.php`, `packages/polyglot/src/Inference/Streaming/InferenceStream.php`, `packages/polyglot/src/Inference/Inference.php` | Pricing is resolved from config (LLM resolver/provider) and threaded into `PendingInference` and `InferenceStream`, and attached to the final `InferenceResponse` when present (driver post-processing and stream finalization). |
| **Execution / Attempts helpers**<br>`packages/polyglot/src/Inference/Data/InferenceExecution.php`, `packages/polyglot/src/Inference/Collections/InferenceAttemptList.php` | Added `withUpdatedResponse(InferenceResponse)` to `InferenceExecution` and `withUpdatedAttempt(InferenceAttempt)` to `InferenceAttemptList` to support updating attempts/responses. |
| **Tests**<br>`packages/polyglot/tests/Unit/Data/UsageCostCalculationTest.php` | New comprehensive tests covering `Pricing::fromArray()`, defaults, `hasAnyPricing()`, `none()`, `toArray()`, `Usage::calculateCost()`, `withPricing()`, accumulation behavior, and realistic pricing scenarios. |
| **Example**<br>`examples/A03_Troubleshooting/CostCalculation/run.php` | New example script introducing `printCostBreakdown()` and a local `User` model; demonstrates three usage patterns: config presets, manual `Pricing::fromArray()`, and multi-model cost comparisons. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant Client
  participant Driver as BaseInferenceDriver
  participant Config as LLMConfig
  participant Resolver as LLMResolver/Provider
  participant Pending as PendingInference
  participant Stream as InferenceStream
  participant Response as InferenceResponse
  participant Usage
  participant Pricing

  Client->>Driver: send inference request
  Driver->>Resolver: resolve LLM config
  Resolver->>Config: return config (may include pricing array)
  Config->>Pricing: Pricing::fromArray(config.pricing)
  Driver->>Pending: create PendingInference(..., pricing?)
  Pending->>Stream: stream(..., pricing?)
  Stream->>Response: accumulate tokens -> Usage
  alt pricing present
    Pricing->>Usage: withPricing(Pricing) or Usage::calculateCost(Pricing)
    Stream->>Pending: finalize response with pricing
    Driver->>Response: httpResponseToInference() then withPricing()
  end
  Driver->>Client: dispatch InferenceResponseCreated (with Usage/pricing)
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I counted tokens, nibble by nibble,
Prices lined up, not a scribble.
Pricing tucked inside Usage's den,
Streams and drivers pass it then.
Cost carrots for curious devs—huzzah!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.14%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Calculating cost of request' is vague and does not clearly convey the main technical change: adding cost calculation support to the Usage API and pricing configuration to LLM presets. | Revise to be more specific, such as 'Add cost calculation support via Pricing and Usage API' or 'Implement request cost calculation with per-token pricing configuration'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@examples/A03_Troubleshooting/CostCalculation/run.php`:
- Around line 78-82: The inline comments in the Pricing::fromArray call for the
$pricing variable are wrong due to copy-paste errors; update the comments to
correctly describe each key and its unit/rate (e.g., 'input' => 0.2 means $0.2
per 1M input tokens, 'output' => 0.8 means $0.8 per 1M output tokens, and
'cacheRead' => 0.05 means $0.05 per 1M cache read tokens) so the
Pricing::fromArray invocation accurately documents the values and units for
future readers.

In `@packages/polyglot/src/Inference/Data/Pricing.php`:
- Around line 43-57: The fromArray factory currently casts possibly non-numeric
values to 0.0; update Pricing::fromArray to validate each numeric field
(input/inputPerMToken, output/outputPerMToken, cacheRead/cacheReadPerMToken,
cacheWrite/cacheWritePerMToken, reasoning/reasoningPerMToken) with is_numeric
and ensure values are non-negative before casting, and throw an
InvalidArgumentException with a clear message naming the offending field when
validation fails; keep the same field names (inputPerMToken, outputPerMToken,
cacheReadPerMToken, cacheWritePerMToken, reasoningPerMToken) and only proceed to
cast to float and call the constructor if validation passes.

In `@packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php`:
- Around line 77-81: Streaming responses never get pricing attached, causing
Usage::calculateCost() to throw when pricing is null; update the streaming path
to attach pricing from $this->config->getPricing(). Specifically, locate
httpResponseToInferenceStream() and/or
PartialInferenceResponse::fromAccumulatedPartial() and ensure the final
InferenceResponse/PartialInferenceResponse instance calls
->withPricing($this->config->getPricing()) when
$this->config->getPricing()->hasAnyPricing() is true (mirroring the
non-streaming httpResponseToInference() behavior) so that calculateCost() always
has pricing available.
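The validation requested above amounts to two checks per field: numeric type and non-negativity. A standalone illustration of that pattern follows (the helper name is hypothetical; it is not the actual patch to `Pricing::fromArray`):

```php
<?php
declare(strict_types=1);

// Hedged sketch of the validation requested above: reject non-numeric or
// negative pricing values with a message naming the offending field.
// validatePricingField() is an illustrative helper, not the actual patch.
function validatePricingField(array $data, string $key): float
{
    $value = $data[$key] ?? 0.0;
    if (!is_numeric($value)) {
        throw new InvalidArgumentException(
            "Pricing field '{$key}' must be numeric, got: " . var_export($value, true)
        );
    }
    $value = (float)$value;
    if ($value < 0.0) {
        throw new InvalidArgumentException(
            "Pricing field '{$key}' must be non-negative, got: {$value}"
        );
    }
    return $value;
}
```

Note that `is_numeric()` accepts numeric strings such as `'3.0'`, matching the follow-up commit's stated behavior.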
🧹 Nitpick comments (3)
packages/polyglot/src/Inference/Config/LLMConfig.php (1)

18-73: Prefer storing Pricing as a value object instead of a raw array.

Keeping pricing as an array reduces type safety and forces repeated parsing. Consider accepting a Pricing instance in the config and normalizing arrays in fromArray(), then serialize via $pricing->toArray().
As per coding guidelines: "Do not use arrays as collections; use dedicated collection classes instead."

♻️ Possible refactor
-        public array  $pricing = [],
+        public ?Pricing $pricing = null,
     ) {
+        $this->pricing = $pricing ?? Pricing::none();
         $this->assertNoRetryPolicyInOptions($this->options);
     }

     public static function fromArray(array $config) : LLMConfig {
         try {
+            if (isset($config['pricing']) && is_array($config['pricing'])) {
+                $config['pricing'] = Pricing::fromArray($config['pricing']);
+            }
             $instance = new self(...$config);
         } catch (Throwable $e) {
             $data = json_encode($config, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
             throw new ConfigurationException(
                 message: "Invalid configuration for LLMConfig: {$e->getMessage()}\nData: {$data}",
                 previous: $e,
             );
         }
         return $instance;
     }

     public function toArray() : array {
         return [
             'apiUrl' => $this->apiUrl,
             'apiKey' => $this->apiKey,
             'endpoint' => $this->endpoint,
             'queryParams' => $this->queryParams,
             'metadata' => $this->metadata,
             'model' => $this->model,
             'maxTokens' => $this->maxTokens,
             'contextLength' => $this->contextLength,
             'maxOutputLength' => $this->maxOutputLength,
             'driver' => $this->driver,
             'options' => $this->options,
-            'pricing' => $this->pricing,
+            'pricing' => $this->pricing->toArray(),
         ];
     }

-    public function getPricing(): Pricing {
-        return Pricing::fromArray($this->pricing);
-    }
+    public function getPricing(): Pricing {
+        return $this->pricing;
+    }
packages/polyglot/src/Inference/Data/Usage.php (1)

95-105: Consider handling pricing conflicts during accumulation.

If both usages have different Pricing, the current logic silently keeps the left one. Consider asserting equality or adding an explicit override strategy to avoid mixed-model cost calculations.

examples/A03_Troubleshooting/CostCalculation/run.php (1)

17-18: Missing declare(strict_types=1); declaration.

Per coding guidelines, PHP files should use strict types. Add the declaration after the opening tag.

Suggested fix
 <?php
+declare(strict_types=1);
+
 require 'examples/boot.php';

As per coding guidelines: "Use strict types and add type hints for all function/method parameters and return values".

Comment on lines +77 to +81
// Attach pricing from config if available
$pricing = $this->config->getPricing();
if ($pricing->hasAnyPricing()) {
    $inferenceResponse = $inferenceResponse->withPricing($pricing);
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's find the file structure and understand the Inference module
find packages/polyglot/src/Inference -type f -name "*.php" | head -20

Repository: cognesy/instructor-php

Length of output: 1364


🏁 Script executed:

#!/bin/bash
# Search for pricing-related code in the Inference module
rg -n "withPricing|getPricing|hasAnyPricing" packages/polyglot/src/Inference --type php -C 2

Repository: cognesy/instructor-php

Length of output: 2541


🏁 Script executed:

#!/bin/bash
# Search for PartialInferenceResponse and streaming-related methods
rg -n "class PartialInferenceResponse|httpResponseToInferenceStream|fromAccumulatedPartial" packages/polyglot/src/Inference --type php -C 3

Repository: cognesy/instructor-php

Length of output: 8272


🏁 Script executed:

#!/bin/bash
# Check the BaseInferenceDriver file structure and methods
rg -n "public function|private function" packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php | head -30

Repository: cognesy/instructor-php

Length of output: 1156


🏁 Script executed:

#!/bin/bash
# Get the full implementation of httpResponseToInferenceStream
sed -n '91,116p' packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php

Repository: cognesy/instructor-php

Length of output: 1010


🏁 Script executed:

#!/bin/bash
# Get the full implementation of fromAccumulatedPartial
sed -n '70,85p' packages/polyglot/src/Inference/Data/InferenceResponse.php

Repository: cognesy/instructor-php

Length of output: 720


🏁 Script executed:

#!/bin/bash
# Check if PartialInferenceResponse has pricing-related methods
grep -n "pricing\|withPricing" packages/polyglot/src/Inference/Data/PartialInferenceResponse.php

Repository: cognesy/instructor-php

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check how finalization happens and if pricing is attached there
rg -n "finalized|pricing" packages/polyglot/src/Inference/Data/InferenceAttempt.php -C 2

Repository: cognesy/instructor-php

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check PartialInferenceResponse class structure
wc -l packages/polyglot/src/Inference/Data/PartialInferenceResponse.php

Repository: cognesy/instructor-php

Length of output: 135


🏁 Script executed:

#!/bin/bash
# Read PartialInferenceResponse to see its full structure
cat -n packages/polyglot/src/Inference/Data/PartialInferenceResponse.php | head -100

Repository: cognesy/instructor-php

Length of output: 4144


🏁 Script executed:

#!/bin/bash
# Search for pricing attachment in the entire streaming flow
rg -n "pricing" packages/polyglot/src/Inference/Data/PartialInferenceResponse.php

Repository: cognesy/instructor-php

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check if pricing is attached anywhere after streaming finalization
rg -n "makeStreamResponsesFor|makeResponseFor" packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php -A 10

Repository: cognesy/instructor-php

Length of output: 1015


🏁 Script executed:

#!/bin/bash
# Check how InferenceAttempt uses finalization and if pricing is attached there
cat -n packages/polyglot/src/Inference/Data/InferenceAttempt.php | sed -n '165,185p'

Repository: cognesy/instructor-php

Length of output: 934


🏁 Script executed:

#!/bin/bash
# Check if there's a calculateCost method that might fail without pricing
rg -n "calculateCost|pricing" packages/polyglot/src/Inference/Data/Usage.php -C 3

Repository: cognesy/instructor-php

Length of output: 2378


🏁 Script executed:

#!/bin/bash
# Verify the complete finalization flow for streaming
rg -n "makeStreamResponsesFor\|withFinalizedPartialResponse\|fromAccumulatedPartial" packages/polyglot/src/Inference -B 2 -A 5

Repository: cognesy/instructor-php

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check if pricing is ever attached to PartialInferenceResponse or during finalization anywhere
rg -n "withPricing" packages/polyglot/src/Inference --type php

Repository: cognesy/instructor-php

Length of output: 573


Pricing is not attached in streaming responses, causing calculateCost() to throw RuntimeException.

Confirmed: withPricing() is only called in httpResponseToInference() (non-streaming). Streaming responses go through httpResponseToInferenceStream() → PartialInferenceResponse → fromAccumulatedPartial() without pricing attachment. Since Usage::calculateCost() throws \RuntimeException when pricing is null, any streaming caller invoking it will fail. Attach pricing in the streaming path, either in httpResponseToInferenceStream() or during finalization via fromAccumulatedPartial().
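The shape of the suggested fix, reduced to a self-contained sketch with stand-in classes (the real code would act on `InferenceResponse` and `LLMConfig`, which are stubbed here):

```php
<?php
declare(strict_types=1);

// Hedged sketch of the fix suggested above: mirror the non-streaming path by
// attaching config pricing when the stream finalizes. FakeResponse and
// finalizeStream() are stand-ins, not the actual driver code.
final class FakeResponse {
    public ?array $pricing = null;

    public function withPricing(array $pricing): self {
        $copy = clone $this;
        $copy->pricing = $pricing;
        return $copy;
    }
}

function finalizeStream(FakeResponse $response, array $configPricing): FakeResponse
{
    // Only attach when the config defines any rates, mirroring the
    // hasAnyPricing() guard in the non-streaming httpResponseToInference().
    $hasAnyPricing = ($configPricing['input'] ?? 0) > 0 || ($configPricing['output'] ?? 0) > 0;
    return $hasAnyPricing ? $response->withPricing($configPricing) : $response;
}
```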

🤖 Prompt for AI Agents
In `@packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php` around lines
77 - 81, Streaming responses never get pricing attached, causing
Usage::calculateCost() to throw when pricing is null; update the streaming path
to attach pricing from $this->config->getPricing(). Specifically, locate
httpResponseToInferenceStream() and/or
PartialInferenceResponse::fromAccumulatedPartial() and ensure the final
InferenceResponse/PartialInferenceResponse instance calls
->withPricing($this->config->getPricing()) when
$this->config->getPricing()->hasAnyPricing() is true (mirroring the
non-streaming httpResponseToInference() behavior) so that calculateCost() always
has pricing available.

…fromArray

- Validate all pricing fields are numeric using is_numeric()
- Reject negative values with clear error messages
- Accept numeric strings (e.g., '3.0')
- Add 5 new validation tests

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/polyglot/tests/Unit/Data/UsageCostCalculationTest.php`:
- Around line 138-140: Replace strict float equality checks in the test using
expect($cost)->toBe(12.975) with the project's tolerant float matcher: use
expect($cost)->toBeCloseTo(12.975, 3) (or an appropriate precision) after the
call to Usage::calculateCost($pricing) so minor binary float differences won't
make the test flaky; update any similar assertions (e.g., 0.0825) to use
toBeCloseTo with suitable precision as well.
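Outside of Pest's expectation API, the tolerance check reduces to comparing within an epsilon. A standalone illustration of why strict equality is flaky and what `toBeCloseTo` effectively does (the helper here is illustrative, not Pest's implementation):

```php
<?php
declare(strict_types=1);

// Binary floats cannot represent most decimal fractions exactly, so a
// computed cost may differ from the literal 12.975 in the last bits.
// A tolerant comparison checks |actual - expected| < 10^-precision instead.
function isCloseTo(float $actual, float $expected, int $precision): bool
{
    return abs($actual - $expected) < 10 ** -$precision;
}

// 1.5M input tokens at $3/1M plus 565k output tokens at $15/1M:
$cost = (1_500_000 / 1_000_000) * 3.0 + (565_000 / 1_000_000) * 15.0;
var_dump(isCloseTo($cost, 12.975, 3)); // bool(true)
```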
🧹 Nitpick comments (1)
packages/polyglot/tests/Unit/Data/UsageCostCalculationTest.php (1)

6-7: Add explicit return types to Pest closures.

Add : void to describe() and it() closures to align with the PHP typing guideline for strict types and explicit type hints:

Example pattern
-describe('Pricing', function () {
-    it('creates pricing from array with short keys', function () {
+describe('Pricing', function (): void {
+    it('creates pricing from array with short keys', function (): void {

Apply this pattern to all closures in the file.

Streaming responses now get pricing attached during finalization:
- InferenceStream accepts optional Pricing parameter
- PendingInference passes pricing to InferenceStream
- Inference.create() extracts pricing from LLMConfig
- InferenceExecution.withUpdatedResponse() for post-finalization updates
- InferenceAttemptList.withUpdatedAttempt() for attempt updates

This ensures calculateCost() works for both sync and streaming responses.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/polyglot/src/Inference/Collections/InferenceAttemptList.php`:
- Around line 70-88: The method InferenceAttemptList::withUpdatedAttempt
currently falls back to replacing the last item when no matching ID is found;
change this to fail-fast: remove the fallback block that overwrites the last
element and instead throw an InvalidArgumentException (or a domain-specific
exception) when $updated is false, including the missing attempt ID in the
message; keep the rest of the logic (iterating $this->attempts->all(), replacing
by index, and returning new self via ArrayList::fromArray($items)) unchanged.
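The fail-fast pattern the bot describes, reduced to a self-contained sketch over plain arrays (the real method operates on `InferenceAttemptList` and `ArrayList`, which are not reproduced here):

```php
<?php
declare(strict_types=1);

// Hedged sketch of the fail-fast behavior requested above: replace the item
// matching by ID, and throw instead of silently overwriting the last element
// when no match exists. withUpdatedItem() is a stand-in, not the actual
// InferenceAttemptList code.
function withUpdatedItem(array $items, string $id, array $replacement): array
{
    foreach ($items as $index => $item) {
        if ($item['id'] === $id) {
            $items[$index] = $replacement;
            return $items;
        }
    }
    throw new InvalidArgumentException("No attempt found with ID '{$id}'");
}
```

Failing fast here surfaces mismatched-ID bugs at the call site instead of corrupting the attempt list.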
🧹 Nitpick comments (1)
packages/polyglot/src/Inference/Collections/InferenceAttemptList.php (1)

70-72: Docblock is misleading.

The docblock says "Updates the last attempt" but the method updates whichever attempt matches by ID, not necessarily the last one.

Suggested fix
-    /**
-     * Updates the last attempt with the given one (by matching ID).
-     */
+    /**
+     * Replaces an existing attempt with the given one, matched by ID.
+     */

Replace silent fallback with InvalidArgumentException to catch bugs early.