Calculating cost of request #52
base: main
Conversation
…m config
Add ability to calculate request cost based on token usage:
- New Pricing value object ($/1M tokens pricing)
- LLMConfig.pricing field for per-preset pricing configuration
- Usage::calculateCost() - works without arguments when pricing is set
- Default: cacheRead, cacheWrite, reasoning fallback to input price
- BaseInferenceDriver automatically attaches pricing from config
Usage in config:

```php
'openrouter-claude' => [
    'pricing' => ['input' => 3.0, 'output' => 15.0], // $ per 1M tokens
],
```
Usage in code:

```php
$cost = $response->usage()->calculateCost();
```
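For illustration, the same call with an explicit Pricing value and the arithmetic it implies — the token counts here are invented, and only `Pricing::fromArray()` and `calculateCost(Pricing)` come from this PR:

```php
// Illustration only — token counts are made up; keys follow the PR description ($ per 1M tokens).
$pricing = Pricing::fromArray(['input' => 3.0, 'output' => 15.0]);

// With 1,000 input tokens and 500 output tokens:
//   (1_000 / 1_000_000) * 3.0 + (500 / 1_000_000) * 15.0 = 0.003 + 0.0075 = 0.0105
$cost = $response->usage()->calculateCost($pricing);
```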
…to feature/usage-calculate-cost
📝 Walkthrough

Adds token-pricing support: a new …

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Driver as BaseInferenceDriver
    participant Config as LLMConfig
    participant Resolver as LLMResolver/Provider
    participant Pending as PendingInference
    participant Stream as InferenceStream
    participant Response as InferenceResponse
    participant Usage
    participant Pricing

    Client->>Driver: send inference request
    Driver->>Resolver: resolve LLM config
    Resolver->>Config: return config (may include pricing array)
    Config->>Pricing: Pricing::fromArray(config.pricing)
    Driver->>Pending: create PendingInference(..., pricing?)
    Pending->>Stream: stream(..., pricing?)
    Stream->>Response: accumulate tokens -> Usage
    alt pricing present
        Pricing->>Usage: withPricing(Pricing) or Usage::calculateCost(Pricing)
        Stream->>Pending: finalize response with pricing
        Driver->>Response: httpResponseToInference() then withPricing()
    end
    Driver->>Client: dispatch InferenceResponseCreated (with Usage/pricing)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@examples/A03_Troubleshooting/CostCalculation/run.php`:
- Around line 78-82: The inline comments in the Pricing::fromArray call for the
$pricing variable are wrong due to copy-paste errors; update the comments to
correctly describe each key and its unit/rate (e.g., 'input' => 0.2 means $0.2
per 1M input tokens, 'output' => 0.8 means $0.8 per 1M output tokens, and
'cacheRead' => 0.05 means $0.05 per 1M cache read tokens) so the
Pricing::fromArray invocation accurately documents the values and units for
future readers.
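A sketch of what the corrected comments could look like, using the rates named above (the surrounding code of run.php is not shown here):

```php
$pricing = Pricing::fromArray([
    'input'     => 0.2,   // $0.20 per 1M input tokens
    'output'    => 0.8,   // $0.80 per 1M output tokens
    'cacheRead' => 0.05,  // $0.05 per 1M cache-read tokens
]);
```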
In `@packages/polyglot/src/Inference/Data/Pricing.php`:
- Around line 43-57: The fromArray factory currently casts possibly non-numeric
values to 0.0; update Pricing::fromArray to validate each numeric field
(input/inputPerMToken, output/outputPerMToken, cacheRead/cacheReadPerMToken,
cacheWrite/cacheWritePerMToken, reasoning/reasoningPerMToken) with is_numeric
and ensure values are non-negative before casting, and throw an
InvalidArgumentException with a clear message naming the offending field when
validation fails; keep the same field names (inputPerMToken, outputPerMToken,
cacheReadPerMToken, cacheWritePerMToken, reasoningPerMToken) and only proceed to
cast to float and call the constructor if validation passes.
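A minimal sketch of the requested validation, assuming the constructor takes the five per-1M-token floats as named parameters and that missing keys default to 0 (both assumptions):

```php
public static function fromArray(array $data): self {
    $fields = [
        'inputPerMToken'      => $data['input']      ?? $data['inputPerMToken']      ?? 0,
        'outputPerMToken'     => $data['output']     ?? $data['outputPerMToken']     ?? 0,
        'cacheReadPerMToken'  => $data['cacheRead']  ?? $data['cacheReadPerMToken']  ?? 0,
        'cacheWritePerMToken' => $data['cacheWrite'] ?? $data['cacheWritePerMToken'] ?? 0,
        'reasoningPerMToken'  => $data['reasoning']  ?? $data['reasoningPerMToken']  ?? 0,
    ];
    foreach ($fields as $name => $value) {
        // is_numeric() also accepts numeric strings such as '3.0'
        if (!is_numeric($value) || (float) $value < 0) {
            throw new \InvalidArgumentException("Pricing field '{$name}' must be a non-negative number.");
        }
    }
    // Cast and construct only after validation passes
    return new self(...array_map(fn ($v) => (float) $v, $fields));
}
```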
In `@packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php`:
- Around line 77-81: Streaming responses never get pricing attached, causing
Usage::calculateCost() to throw when pricing is null; update the streaming path
to attach pricing from $this->config->getPricing(). Specifically, locate
httpResponseToInferenceStream() and/or
PartialInferenceResponse::fromAccumulatedPartial() and ensure the final
InferenceResponse/PartialInferenceResponse instance calls
->withPricing($this->config->getPricing()) when
$this->config->getPricing()->hasAnyPricing() is true (mirroring the
non-streaming httpResponseToInference() behavior) so that calculateCost() always
has pricing available.
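A rough sketch of the mirrored behavior in the streaming finalization path; only the names quoted above (getPricing(), hasAnyPricing(), withPricing(), fromAccumulatedPartial()) come from the codebase, the rest is assumed:

```php
// Sketch: finalize the accumulated streaming response, then attach pricing
// the same way the non-streaming httpResponseToInference() path does.
$inferenceResponse = InferenceResponse::fromAccumulatedPartial($accumulatedPartial);

$pricing = $this->config->getPricing();
if ($pricing->hasAnyPricing()) {
    $inferenceResponse = $inferenceResponse->withPricing($pricing);
}

return $inferenceResponse;
```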
🧹 Nitpick comments (3)
packages/polyglot/src/Inference/Config/LLMConfig.php (1)
18-73: Prefer storing Pricing as a value object instead of a raw array.

Keeping pricing as an array reduces type safety and forces repeated parsing. Consider accepting a `Pricing` instance in the config and normalizing arrays in `fromArray()`, then serialize via `$pricing->toArray()`.

As per coding guidelines: "Do not use arrays as collections; use dedicated collection classes instead."

♻️ Possible refactor
```diff
-        public array $pricing = [],
+        public ?Pricing $pricing = null,
     ) {
+        $this->pricing = $pricing ?? Pricing::none();
         $this->assertNoRetryPolicyInOptions($this->options);
     }

     public static function fromArray(array $config) : LLMConfig {
         try {
+            if (isset($config['pricing']) && is_array($config['pricing'])) {
+                $config['pricing'] = Pricing::fromArray($config['pricing']);
+            }
             $instance = new self(...$config);
         } catch (Throwable $e) {
             $data = json_encode($config, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
             throw new ConfigurationException(
                 message: "Invalid configuration for LLMConfig: {$e->getMessage()}\nData: {$data}",
                 previous: $e,
             );
         }
         return $instance;
     }

     public function toArray() : array {
         return [
             'apiUrl' => $this->apiUrl,
             'apiKey' => $this->apiKey,
             'endpoint' => $this->endpoint,
             'queryParams' => $this->queryParams,
             'metadata' => $this->metadata,
             'model' => $this->model,
             'maxTokens' => $this->maxTokens,
             'contextLength' => $this->contextLength,
             'maxOutputLength' => $this->maxOutputLength,
             'driver' => $this->driver,
             'options' => $this->options,
-            'pricing' => $this->pricing,
+            'pricing' => $this->pricing->toArray(),
         ];
     }

-    public function getPricing(): Pricing {
-        return Pricing::fromArray($this->pricing);
-    }
+    public function getPricing(): Pricing {
+        return $this->pricing;
+    }
```

packages/polyglot/src/Inference/Data/Usage.php (1)
95-105: Consider handling pricing conflicts during accumulation.

If both usages have different `Pricing`, the current logic silently keeps the left one. Consider asserting equality or adding an explicit override strategy to avoid mixed-model cost calculations (a sketch follows below).
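One possible shape for that guard, sketched under the assumption that Usage exposes its token counts and pricing as shown (field names and the accumulate() signature are illustrative, not the actual API):

```php
// Illustrative only — field names and the accumulate() signature are assumptions.
public function accumulate(Usage $other): self {
    if ($this->pricing !== null && $other->pricing !== null
        && $this->pricing->toArray() !== $other->pricing->toArray()) {
        throw new \LogicException('Cannot accumulate Usage values with different Pricing.');
    }
    $summed = new self(
        inputTokens: $this->inputTokens + $other->inputTokens,
        outputTokens: $this->outputTokens + $other->outputTokens,
    );
    $pricing = $this->pricing ?? $other->pricing;
    return $pricing === null ? $summed : $summed->withPricing($pricing);
}
```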
examples/A03_Troubleshooting/CostCalculation/run.php (1)

17-18: Missing `declare(strict_types=1);` declaration.

Per coding guidelines, PHP files should use strict types. Add the declaration after the opening tag.
Suggested fix
```diff
 <?php
+declare(strict_types=1);
+
 require 'examples/boot.php';
```

As per coding guidelines: "Use strict types and add type hints for all function/method parameters and return values".
```php
// Attach pricing from config if available
$pricing = $this->config->getPricing();
if ($pricing->hasAnyPricing()) {
    $inferenceResponse = $inferenceResponse->withPricing($pricing);
}
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# First, let's find the file structure and understand the Inference module
find packages/polyglot/src/Inference -type f -name "*.php" | head -20
```

Repository: cognesy/instructor-php
Length of output: 1364

🏁 Script executed:

```bash
#!/bin/bash
# Search for pricing-related code in the Inference module
rg -n "withPricing|getPricing|hasAnyPricing" packages/polyglot/src/Inference --type php -C 2
```

Repository: cognesy/instructor-php
Length of output: 2541

🏁 Script executed:

```bash
#!/bin/bash
# Search for PartialInferenceResponse and streaming-related methods
rg -n "class PartialInferenceResponse|httpResponseToInferenceStream|fromAccumulatedPartial" packages/polyglot/src/Inference --type php -C 3
```

Repository: cognesy/instructor-php
Length of output: 8272

🏁 Script executed:

```bash
#!/bin/bash
# Check the BaseInferenceDriver file structure and methods
rg -n "public function|private function" packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php | head -30
```

Repository: cognesy/instructor-php
Length of output: 1156

🏁 Script executed:

```bash
#!/bin/bash
# Get the full implementation of httpResponseToInferenceStream
sed -n '91,116p' packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php
```

Repository: cognesy/instructor-php
Length of output: 1010
🏁 Script executed:

```bash
#!/bin/bash
# Get the full implementation of fromAccumulatedPartial
sed -n '70,85p' packages/polyglot/src/Inference/Data/InferenceResponse.php
```

Repository: cognesy/instructor-php
Length of output: 720

🏁 Script executed:

```bash
#!/bin/bash
# Check if PartialInferenceResponse has pricing-related methods
grep -n "pricing\|withPricing" packages/polyglot/src/Inference/Data/PartialInferenceResponse.php
```

Repository: cognesy/instructor-php
Length of output: 48

🏁 Script executed:

```bash
#!/bin/bash
# Check how finalization happens and if pricing is attached there
rg -n "finalized|pricing" packages/polyglot/src/Inference/Data/InferenceAttempt.php -C 2
```

Repository: cognesy/instructor-php
Length of output: 48

🏁 Script executed:

```bash
#!/bin/bash
# Check PartialInferenceResponse class structure
wc -l packages/polyglot/src/Inference/Data/PartialInferenceResponse.php
```

Repository: cognesy/instructor-php
Length of output: 135

🏁 Script executed:

```bash
#!/bin/bash
# Read PartialInferenceResponse to see its full structure
cat -n packages/polyglot/src/Inference/Data/PartialInferenceResponse.php | head -100
```

Repository: cognesy/instructor-php
Length of output: 4144
🏁 Script executed:

```bash
#!/bin/bash
# Search for pricing attachment in the entire streaming flow
rg -n "pricing" packages/polyglot/src/Inference/Data/PartialInferenceResponse.php
```

Repository: cognesy/instructor-php
Length of output: 48

🏁 Script executed:

```bash
#!/bin/bash
# Check if pricing is attached anywhere after streaming finalization
rg -n "makeStreamResponsesFor|makeResponseFor" packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php -A 10
```

Repository: cognesy/instructor-php
Length of output: 1015

🏁 Script executed:

```bash
#!/bin/bash
# Check how InferenceAttempt uses finalization and if pricing is attached there
cat -n packages/polyglot/src/Inference/Data/InferenceAttempt.php | sed -n '165,185p'
```

Repository: cognesy/instructor-php
Length of output: 934

🏁 Script executed:

```bash
#!/bin/bash
# Check if there's a calculateCost method that might fail without pricing
rg -n "calculateCost|pricing" packages/polyglot/src/Inference/Data/Usage.php -C 3
```

Repository: cognesy/instructor-php
Length of output: 2378

🏁 Script executed:

```bash
#!/bin/bash
# Verify the complete finalization flow for streaming
rg -n "makeStreamResponsesFor\|withFinalizedPartialResponse\|fromAccumulatedPartial" packages/polyglot/src/Inference -B 2 -A 5
```

Repository: cognesy/instructor-php
Length of output: 48

🏁 Script executed:

```bash
#!/bin/bash
# Check if pricing is ever attached to PartialInferenceResponse or during finalization anywhere
rg -n "withPricing" packages/polyglot/src/Inference --type php
```

Repository: cognesy/instructor-php
Length of output: 573
Pricing is not attached in streaming responses, causing calculateCost() to throw RuntimeException.
Confirmed: withPricing() is only called in httpResponseToInference() (non-streaming). Streaming responses go through httpResponseToInferenceStream() → PartialInferenceResponse → fromAccumulatedPartial() without pricing attachment. Since Usage::calculateCost() throws \RuntimeException when pricing is null, any streaming caller invoking it will fail. Attach pricing in the streaming path, either in httpResponseToInferenceStream() or during finalization via fromAccumulatedPartial().
🤖 Prompt for AI Agents
In `@packages/polyglot/src/Inference/Drivers/BaseInferenceDriver.php` around lines
77-81: Streaming responses never get pricing attached, causing
Usage::calculateCost() to throw when pricing is null; update the streaming path
to attach pricing from $this->config->getPricing(). Specifically, locate
httpResponseToInferenceStream() and/or
PartialInferenceResponse::fromAccumulatedPartial() and ensure the final
InferenceResponse/PartialInferenceResponse instance calls
->withPricing($this->config->getPricing()) when
$this->config->getPricing()->hasAnyPricing() is true (mirroring the
non-streaming httpResponseToInference() behavior) so that calculateCost() always
has pricing available.
…fromArray
- Validate all pricing fields are numeric using is_numeric()
- Reject negative values with clear error messages
- Accept numeric strings (e.g., '3.0')
- Add 5 new validation tests
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/polyglot/tests/Unit/Data/UsageCostCalculationTest.php`:
- Around line 138-140: Replace strict float equality checks in the test using
expect($cost)->toBe(12.975) with the project's tolerant float matcher: use
expect($cost)->toBeCloseTo(12.975, 3) (or an appropriate precision) after the
call to Usage::calculateCost($pricing) so minor binary float differences won't
make the test flaky; update any similar assertions (e.g., 0.0825) to use
toBeCloseTo with suitable precision as well.
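For example, using the 12.975 figure from the comment above — the Usage constructor shape and the concrete token counts are assumptions chosen to produce that total:

```php
it('calculates total cost with tolerant float comparison', function (): void {
    $pricing = Pricing::fromArray(['input' => 3.0, 'output' => 15.0]); // $ per 1M tokens
    $usage   = new Usage(inputTokens: 325_000, outputTokens: 800_000); // 0.975 + 12.0 = 12.975

    $cost = $usage->calculateCost($pricing);

    // Tolerant float matcher instead of strict ->toBe(12.975)
    expect($cost)->toBeCloseTo(12.975, 3);
});
```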
🧹 Nitpick comments (1)
packages/polyglot/tests/Unit/Data/UsageCostCalculationTest.php (1)
6-7: Add explicit return types to Pest closures.

Add `: void` to `describe()` and `it()` closures to align with the PHP typing guideline for strict types and explicit type hints:

Example pattern
```diff
-describe('Pricing', function () {
-    it('creates pricing from array with short keys', function () {
+describe('Pricing', function (): void {
+    it('creates pricing from array with short keys', function (): void {
```

Apply this pattern to all closures in the file.
Streaming responses now get pricing attached during finalization:
- InferenceStream accepts optional Pricing parameter
- PendingInference passes pricing to InferenceStream
- Inference.create() extracts pricing from LLMConfig
- InferenceExecution.withUpdatedResponse() for post-finalization updates
- InferenceAttemptList.withUpdatedAttempt() for attempt updates

This ensures calculateCost() works for both sync and streaming responses.
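A sketch of what this enables from the caller's side; the fluent Inference/stream API shown here is an assumption, only calculateCost() and the pricing-in-preset idea come from this PR:

```php
// Sketch only — method names on Inference/InferenceStream are assumed.
$stream = (new Inference())
    ->using('openrouter-claude')      // preset that defines 'pricing' in llm.php
    ->withMessages('Hello!')
    ->stream();

foreach ($stream->responses() as $partial) {
    // ... consume partial responses ...
}

// Pricing is attached during finalization, so this works for streaming too:
$cost = $stream->final()->usage()->calculateCost();
```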
…CostCalculationTest
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/polyglot/src/Inference/Collections/InferenceAttemptList.php`:
- Around line 70-88: The method InferenceAttemptList::withUpdatedAttempt
currently falls back to replacing the last item when no matching ID is found;
change this to fail-fast: remove the fallback block that overwrites the last
element and instead throw an InvalidArgumentException (or a domain-specific
exception) when $updated is false, including the missing attempt ID in the
message; keep the rest of the logic (iterating $this->attempts->all(), replacing
by index, and returning new self via ArrayList::fromArray($items)) unchanged.
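A sketch of the fail-fast variant described above; apart from the names quoted in the comment ($this->attempts->all(), ArrayList::fromArray(), InvalidArgumentException), the details, including the id() accessor, are assumptions:

```php
/**
 * Replaces an existing attempt with the given one, matched by ID.
 */
public function withUpdatedAttempt(InferenceAttempt $attempt): self {
    $items = $this->attempts->all();
    $updated = false;
    foreach ($items as $index => $existing) {
        if ($existing->id() === $attempt->id()) {  // id() accessor is an assumption
            $items[$index] = $attempt;
            $updated = true;
            break;
        }
    }
    if (!$updated) {
        // Fail fast instead of silently replacing the last attempt
        throw new \InvalidArgumentException("No attempt found with ID '{$attempt->id()}'.");
    }
    return new self(ArrayList::fromArray($items));
}
```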
🧹 Nitpick comments (1)
packages/polyglot/src/Inference/Collections/InferenceAttemptList.php (1)
70-72: Docblock is misleading.

The docblock says "Updates the last attempt" but the method updates whichever attempt matches by ID, not necessarily the last one.
Suggested fix
```diff
-    /**
-     * Updates the last attempt with the given one (by matching ID).
-     */
+    /**
+     * Replaces an existing attempt with the given one, matched by ID.
+     */
```
Replace silent fallback with InvalidArgumentException to catch bugs early.
Calculating the cost of a request based on a pre-set pricing list
Config — llm.php: add a pricing entry to the preset (see the sketch below).
Usage: call calculateCost() on the response usage; pricing comes from the preset.
Usage without config changes: pass a Pricing instance explicitly to calculateCost().
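Hedged sketches of the three items above, assembled from the PR description — the preset name and config file layout are examples, not the project's actual defaults:

```php
// config/llm.php — pricing is $ per 1M tokens for the given preset
'openrouter-claude' => [
    // ... apiUrl, apiKey, model, and other preset settings ...
    'pricing' => ['input' => 3.0, 'output' => 15.0],
],
```

```php
// Usage: pricing comes from the preset, so no arguments are needed
$cost = $response->usage()->calculateCost();

// Usage without config changes: pass a Pricing instance explicitly
$pricing = Pricing::fromArray(['input' => 3.0, 'output' => 15.0]);
$cost = $response->usage()->calculateCost($pricing);
```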