Fill in Inference Optimization TODOs #8

Open
UmangShankar wants to merge 1 commit into victorsteeb:main from UmangShankar:umang/inference-optimization-solutions
@UmangShankar

Completes all 6 TODO exercises in day2/02_inference-optimization/Inference_Optimization.py.

  • compute_otps: output_tokens / (total_time - ttft)
  • calculate_cost: per-token pricing for input/output
  • CALCULATOR_TOOL schema: operation enum + operands array
  • measure_tool_use_latency: full two-turn tool-use round-trip
  • Prompt caching: system_block with cache_control ephemeral
  • Multi-turn chat caching: marks last assistant message before each turn
  • Also cleaned up Jupyter magic lines and wired the API key to an environment variable
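
For reviewers, the first few items above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the function signatures, the per-million-token price parameters, the enum values in the schema, and the `make_system_block` helper are all assumptions made for the example. The dictionary shapes assume an Anthropic-style Messages API, which the `cache_control` ephemeral marker suggests but the description does not state.

```python
from typing import Any


def compute_otps(output_tokens: int, total_time: float, ttft: float) -> float:
    """Output tokens per second, measured over the time after the first token."""
    decode_time = total_time - ttft
    if decode_time <= 0:
        raise ValueError("total_time must be greater than ttft")
    return output_tokens / decode_time


def calculate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,   # hypothetical: USD per million input tokens
    output_price_per_mtok: float,  # hypothetical: USD per million output tokens
) -> float:
    """Request cost with separate per-token prices for input and output."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


# Tool schema with an operation enum and an operands array.
# The specific enum values and descriptions are assumed for illustration.
CALCULATOR_TOOL: dict[str, Any] = {
    "name": "calculator",
    "description": "Apply a basic arithmetic operation to a list of numbers.",
    "input_schema": {
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"],
            },
            "operands": {
                "type": "array",
                "items": {"type": "number"},
            },
        },
        "required": ["operation", "operands"],
    },
}


def make_system_block(system_prompt: str) -> list[dict[str, Any]]:
    """Hypothetical helper: wrap a system prompt as a cacheable content block."""
    return [
        {
            "type": "text",
            "text": system_prompt,
            "cache_control": {"type": "ephemeral"},
        }
    ]
```

With these definitions, a response of 200 output tokens that took 5.0 s in total with a 1.0 s TTFT gives `compute_otps(200, 5.0, 1.0)`, i.e. 200 tokens over 4.0 s of decode time. Subtracting TTFT keeps the prefill latency from dragging down the reported generation rate.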
