Go SDK for Opik - an open-source LLM observability platform by Comet ML.
Current Version: v0.5.0 - See Release Notes
```bash
go get github.com/agentplexus/go-opik
```

```go
package main

import (
    "context"
    "log"

    "github.com/agentplexus/go-opik"
)

func main() {
    // Create client (uses OPIK_API_KEY and OPIK_WORKSPACE env vars for Opik Cloud)
    client, err := opik.NewClient(
        opik.WithProjectName("My Project"),
    )
    if err != nil {
        log.Fatal(err)
    }

    ctx := context.Background()

    // Create a trace
    trace, _ := client.Trace(ctx, "my-trace",
        opik.WithTraceInput(map[string]any{"prompt": "Hello"}),
    )

    // Create a span for an LLM call
    span, _ := trace.Span(ctx, "llm-call",
        opik.WithSpanType(opik.SpanTypeLLM),
        opik.WithSpanModel("gpt-4"),
    )

    // Do your LLM call here...

    // End span with output
    span.End(ctx, opik.WithSpanOutput(map[string]any{"response": "Hi!"}))

    // End trace
    trace.End(ctx)
}
```

The SDK includes a built-in `llmops` subpackage that implements the OmniObserve `llmops.Provider` interface. This allows you to use Opik through a unified observability abstraction alongside other providers such as Phoenix and Langfuse.
```go
package main

import (
    "context"

    "github.com/agentplexus/omniobserve/llmops"

    _ "github.com/agentplexus/go-opik/llmops" // Register Opik provider
)

func main() {
    // Open the Opik provider through OmniObserve
    provider, err := llmops.Open("opik",
        llmops.WithAPIKey("your-api-key"),
        llmops.WithWorkspace("your-workspace"),
        llmops.WithProjectName("my-project"),
    )
    if err != nil {
        panic(err)
    }
    defer provider.Close()

    ctx := context.Background()

    // Start a trace
    ctx, trace, _ := provider.StartTrace(ctx, "my-operation")
    defer trace.End()

    // Start a span
    ctx, span, _ := provider.StartSpan(ctx, "llm-call",
        llmops.WithSpanType(llmops.SpanTypeLLM),
    )
    span.SetModel("gpt-4")
    span.SetInput("Hello, world!")
    span.SetOutput("Hi there!")
    span.End()
}
```

This pattern allows you to:
- Switch between observability providers (Opik, Phoenix, Langfuse) without code changes
- Use a consistent API across different LLM observability platforms
- Build provider-agnostic observability tooling
| Feature | Opik (Python) | go-opik | omniobserve/llmops | Tests | Notes |
|---|---|---|---|---|---|
| **Tracing** | | | | | |
| StartTrace | ✅ | ✅ | ✅ | ✅ | |
| StartSpan | ✅ | ✅ | ✅ | ✅ | |
| SetInput/Output | ✅ | ✅ | ✅ | ✅ | |
| SetModel/Provider | ✅ | ✅ | ✅ | ✅ | |
| SetUsage (tokens) | ✅ | ✅ | ✅ | ✅ | |
| AddFeedbackScore | ✅ | ✅ | ✅ | ✅ | |
| TraceFromContext | ✅ | ✅ | ✅ | ✅ | |
| SpanFromContext | ✅ | ✅ | ✅ | ✅ | |
| Nested Spans | ✅ | ✅ | ✅ | ✅ | |
| Span Types | ✅ | ✅ | ✅ | ✅ | general, llm, tool, guardrail |
| Duration/Timing | ✅ | ✅ | ✅ | ✅ | |
| **Prompts** | | | | | |
| CreatePrompt | ✅ | ✅ | ✅ | ✅ | |
| GetPrompt | ✅ | ✅ | ✅ | ✅ | By name + optional version |
| ListPrompts | ✅ | ✅ | ✅ | ✅ | |
| CreatePromptVersion | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| ListPromptVersions | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| **Datasets** | | | | | |
| CreateDataset | ✅ | ✅ | ✅ | ✅ | |
| GetDataset | ✅ | ✅ | ✅ | | By name |
| AddDatasetItems | ✅ | ✅ | ✅ | ✅ | |
| ListDatasets | ✅ | ✅ | ✅ | ✅ | |
| DeleteDataset | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| **Experiments** | | | | | |
| CreateExperiment | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| LogExperimentItem | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| ListExperiments | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| **Projects** | | | | | |
| CreateProject | ✅ | ✅ | ✅ | ✅ | |
| GetProject | ✅ | ✅ | ✅ | | |
| ListProjects | ✅ | ✅ | ✅ | ✅ | |
| SetProject | ✅ | ✅ | ✅ | ✅ | |
| **Evaluation** | | | | | |
| Evaluate | ✅ | ✅ | ✅ | ✅ | Run metrics |
| AddFeedbackScore | ✅ | ✅ | ✅ | ✅ | Record results |
| **Advanced** | | | | | |
| Distributed Tracing | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| Streaming Spans | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| Attachments | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| HTTP Middleware | ❌ | ✅ | ❌ | | Go SDK extension |
| Local Recording | ❌ | ✅ | ❌ | | Go SDK extension |
| Batching Client | ✅ | ✅ | ❌ | | Not in omniobserve interface |
Running the omniobserve/llmops tests:

```bash
# Tests are skipped automatically when no API key is set
go test -v ./llmops/

# Run the tests against Opik Cloud
export OPIK_API_KEY=your-api-key
export OPIK_WORKSPACE=your-workspace  # optional
go test -v ./llmops/
```

| Variable | Description |
|---|---|
| `OPIK_URL_OVERRIDE` | API endpoint URL |
| `OPIK_API_KEY` | API key for Opik Cloud |
| `OPIK_WORKSPACE` | Workspace name for Opik Cloud |
| `OPIK_PROJECT_NAME` | Default project name |
| `OPIK_TRACK_DISABLE` | Set to `"true"` to disable tracing |
Create `~/.opik.config`:

```ini
[opik]
url_override = https://www.comet.com/opik/api
api_key = your-api-key
workspace = your-workspace
project_name = My Project
```

Or configure the client explicitly in code:

```go
client, err := opik.NewClient(
    opik.WithURL("https://www.comet.com/opik/api"),
    opik.WithAPIKey("your-api-key"),
    opik.WithWorkspace("your-workspace"),
    opik.WithProjectName("My Project"),
)
```

// Create a trace
trace, _ := client.Trace(ctx, "my-trace",
opik.WithTraceInput(input),
opik.WithTraceMetadata(map[string]any{"key": "value"}),
opik.WithTraceTags("tag1", "tag2"),
)
// Create spans (supports nesting)
span1, _ := trace.Span(ctx, "outer-span")
span2, _ := span1.Span(ctx, "inner-span")
// End spans and traces
span2.End(ctx, opik.WithSpanOutput(output))
span1.End(ctx)
trace.End(ctx)

// LLM spans
span, _ := trace.Span(ctx, "llm-call",
opik.WithSpanType(opik.SpanTypeLLM),
opik.WithSpanModel("gpt-4"),
opik.WithSpanProvider("openai"),
)
// Tool spans
span, _ := trace.Span(ctx, "tool-call",
opik.WithSpanType(opik.SpanTypeTool),
)
// General spans (default)
span, _ := trace.Span(ctx, "processing",
opik.WithSpanType(opik.SpanTypeGeneral),
)

// Add feedback to traces
trace.AddFeedbackScore(ctx, "accuracy", 0.95, "High accuracy")
// Add feedback to spans
span.AddFeedbackScore(ctx, "relevance", 0.87, "Mostly relevant")

// Start trace with context
ctx, trace, _ := opik.StartTrace(ctx, client, "my-trace")
// Start nested spans
ctx, span1, _ := opik.StartSpan(ctx, "span-1")
ctx, span2, _ := opik.StartSpan(ctx, "span-2") // Automatically nested under span1
// Get current trace/span from context
currentTrace := opik.TraceFromContext(ctx)
currentSpan := opik.SpanFromContext(ctx)

// List recent traces
traces, _ := client.ListTraces(ctx, page, size)
for _, t := range traces {
fmt.Printf("Trace: %s (ID: %s)\n", t.Name, t.ID)
}
// List spans for a specific trace
spans, _ := client.ListSpans(ctx, traceID, page, size)
for _, s := range spans {
fmt.Printf("Span: %s (Type: %s, Model: %s)\n", s.Name, s.Type, s.Model)
}

// Inject trace headers into outgoing requests
opik.InjectDistributedTraceHeaders(ctx, req)
// Extract trace headers from incoming requests
headers := opik.ExtractDistributedTraceHeaders(req)
// Continue a distributed trace
ctx, span, _ := client.ContinueTrace(ctx, headers, "handle-request")
// Use propagating HTTP client
httpClient := opik.PropagatingHTTPClient()

// Start a streaming span
ctx, streamSpan, _ := opik.StartStreamingSpan(ctx, "stream-response",
opik.WithSpanType(opik.SpanTypeLLM),
)
// Add chunks as they arrive
for chunk := range chunks {
streamSpan.AddChunk(chunk.Content,
opik.WithChunkTokenCount(chunk.Tokens),
)
}
// Mark final chunk
streamSpan.AddChunk(lastChunk, opik.WithChunkFinishReason("stop"))
// End with accumulated data
streamSpan.End(ctx)

// Create a dataset
dataset, _ := client.CreateDataset(ctx, "my-dataset",
opik.WithDatasetDescription("Test data for evaluation"),
opik.WithDatasetTags("test", "evaluation"),
)
// Insert items
dataset.InsertItem(ctx, map[string]any{
"input": "What is the capital of France?",
"expected": "Paris",
})
// Insert multiple items
items := []map[string]any{
{"input": "2+2", "expected": "4"},
{"input": "3+3", "expected": "6"},
}
dataset.InsertItems(ctx, items)
// Retrieve items (a fresh variable name, since `items` is already declared above)
fetched, _ := dataset.GetItems(ctx, 1, 100)
// List datasets
datasets, _ := client.ListDatasets(ctx, 1, 100)
// Get dataset by name
dataset, _ := client.GetDatasetByName(ctx, "my-dataset")
// Delete dataset
dataset.Delete(ctx)

// Create an experiment
experiment, _ := client.CreateExperiment(ctx, "my-dataset",
opik.WithExperimentName("gpt-4-evaluation"),
opik.WithExperimentMetadata(map[string]any{"model": "gpt-4"}),
)
// Log experiment items
experiment.LogItem(ctx, datasetItemID, traceID,
opik.WithExperimentItemInput(input),
opik.WithExperimentItemOutput(output),
)
// Complete or cancel experiments
experiment.Complete(ctx)
experiment.Cancel(ctx)
// List experiments
experiments, _ := client.ListExperiments(ctx, datasetID, 1, 100)
// Delete experiment
experiment.Delete(ctx)

// Create a prompt
prompt, _ := client.CreatePrompt(ctx, "greeting-prompt",
opik.WithPromptDescription("Greeting template"),
opik.WithPromptTemplate("Hello, {{name}}! Welcome to {{place}}."),
opik.WithPromptTags("greeting", "template"),
)
// Get prompt by name
version, _ := client.GetPromptByName(ctx, "greeting-prompt", "")
// Render template with variables
rendered := version.Render(map[string]string{
"name": "Alice",
"place": "Wonderland",
})
// Result: "Hello, Alice! Welcome to Wonderland."
// Extract variables from template
vars := version.ExtractVariables()
// Result: ["name", "place"]
// Create new version
newVersion, _ := prompt.CreateVersion(ctx, "Hi, {{name}}!",
opik.WithVersionChangeDescription("Simplified greeting"),
)
// List all versions
versions, _ := prompt.GetVersions(ctx, 1, 100)
// List all prompts
prompts, _ := client.ListPrompts(ctx, 1, 100)

import "github.com/agentplexus/go-opik/middleware"
// Wrap HTTP handlers with automatic tracing
handler := middleware.TracingMiddleware(client, "api-request")(yourHandler)
// Use tracing HTTP client for outgoing requests
httpClient := middleware.TracingHTTPClient("external-call")
resp, _ := httpClient.Get("https://api.example.com/data")
// Or wrap an existing transport
transport := middleware.NewTracingRoundTripper(http.DefaultTransport, "api-call")
httpClient := &http.Client{Transport: transport}

// Record traces locally without sending to server
client := opik.RecordTracesLocally("my-project")
trace, _ := client.Trace(ctx, "test-trace")
span, _ := trace.Span(ctx, "test-span")
span.End(ctx)
trace.End(ctx)
// Access recorded data
traces := client.Recording().Traces()
spans := client.Recording().Spans()

// Create a batching client for efficient API calls
client, _ := opik.NewBatchingClient(
opik.WithProjectName("My Project"),
)
// Operations are batched automatically
client.AddFeedbackAsync("trace", traceID, "score", 0.95, "reason")
// Flush pending operations
client.Flush(5 * time.Second)
// Close and flush on shutdown
defer client.Close(10 * time.Second)

// Create attachment from file
attachment, _ := opik.NewAttachmentFromFile("/path/to/image.png")
// Create attachment from bytes
attachment := opik.NewAttachmentFromBytes("data.json", jsonBytes, "application/json")
// Create text attachment
attachment := opik.NewTextAttachment("notes.txt", "Some text content")
// Get data URL for embedding
dataURL := attachment.ToDataURL()

import (
"github.com/agentplexus/go-opik/evaluation"
"github.com/agentplexus/go-opik/evaluation/heuristic"
)
// Create metrics
metrics := []evaluation.Metric{
heuristic.NewEquals(false), // Case-insensitive equality
heuristic.NewContains(false), // Substring check
heuristic.NewIsJSON(), // JSON validation
heuristic.NewLevenshteinSimilarity(false), // Edit distance
heuristic.NewBLEU(4), // BLEU score
heuristic.NewROUGE(1.0), // ROUGE-L score
heuristic.MustRegexMatch(`\d+`), // Regex matching
}
// Evaluate
engine := evaluation.NewEngine(metrics, evaluation.WithConcurrency(4))
input := evaluation.NewMetricInput("What is 2+2?", "The answer is 4.")
input = input.WithExpected("4")
result := engine.EvaluateOne(ctx, input)
fmt.Printf("Average score: %.2f\n", result.AverageScore())

import (
"github.com/agentplexus/go-opik/evaluation/llm"
"github.com/agentplexus/go-opik/integrations/openai"
)
// Create LLM provider
provider := openai.NewProvider(openai.WithModel("gpt-4o"))
// Create LLM-based metrics
metrics := []evaluation.Metric{
llm.NewAnswerRelevance(provider),
llm.NewHallucination(provider),
llm.NewFactuality(provider),
llm.NewCoherence(provider),
llm.NewHelpfulness(provider),
}
// Custom judge with custom prompt
customJudge := llm.NewCustomJudge("tone_check", `
Evaluate whether the response maintains a professional tone.
User message: {{input}}
AI response: {{output}}
Return JSON: {"score": <0.0-1.0>, "reason": "<explanation>"}
`, provider)

geval := llm.NewGEval(provider, "fluency and coherence").
WithEvaluationSteps([]string{
"Check if the response is grammatically correct",
"Evaluate logical flow of ideas",
"Assess clarity of expression",
})
score := geval.Score(ctx, input)

import "github.com/agentplexus/go-opik/integrations/openai"
// Create provider for evaluation
provider := openai.NewProvider(
openai.WithAPIKey("your-api-key"),
openai.WithModel("gpt-4o"),
)
// Create tracing provider (auto-traces all calls)
tracingProvider := openai.TracingProvider(opikClient,
openai.WithModel("gpt-4o"),
)
// Use tracing HTTP client with existing code
httpClient := openai.TracingHTTPClient(opikClient)

import "github.com/agentplexus/go-opik/integrations/anthropic"
// Create provider for evaluation
provider := anthropic.NewProvider(
anthropic.WithAPIKey("your-api-key"),
anthropic.WithModel("claude-sonnet-4-20250514"),
)
// Create tracing provider (auto-traces all calls)
tracingProvider := anthropic.TracingProvider(opikClient)
// Use tracing HTTP client with existing code
httpClient := anthropic.TracingHTTPClient(opikClient)

# Install CLI
go install github.com/agentplexus/go-opik/cmd/opik@latest
# Configure
opik configure -api-key=your-key -workspace=your-workspace
# List projects
opik projects -list
# Create project
opik projects -create="New Project"
# List traces
opik traces -list -project="My Project" -limit=20
# List datasets
opik datasets -list
# Create dataset
opik datasets -create="evaluation-data"
# List experiments
opik experiments -list -dataset="my-dataset"

For advanced usage, access the underlying ogen-generated API client:
api := client.API()
// Use api.* methods for full API access

// Check for specific errors
if opik.IsNotFound(err) {
// Handle not found
}
if opik.IsUnauthorized(err) {
// Handle auth failure
}
if opik.IsRateLimited(err) {
// Handle rate limiting
}

For integrating Opik with agent frameworks like Google ADK and Eino, see the Agentic Observability Tutorial. This tutorial covers:
- Tracing Google ADK agents with tools
- Tracing Eino workflow graphs
- Multi-agent orchestration observability
- Best practices for agent debugging and monitoring
go test ./...

golangci-lint run

The SDK uses ogen to generate a type-safe API client from the Opik OpenAPI specification. When the upstream API changes, regenerate the client:
Prerequisites:
# Install ogen
go install github.com/ogen-go/ogen/cmd/ogen@latest

Update the OpenAPI spec:
# Download latest spec from Opik repository
curl -o openapi/openapi.yaml \
https://raw.githubusercontent.com/comet-ml/opik/main/sdks/code_generation/fern/openapi/openapi.yaml

Generate the client:
./generate.sh

This script runs ogen, applies the necessary fixes, and verifies the build. For detailed documentation on the generation process and troubleshooting, see the Development Guide.
MIT License - see LICENSE for details.
- Opik - Open-source LLM observability platform
- Opik Python SDK - Official Python SDK
- Opik Documentation - Official documentation