Go SDK for Opik - an open-source LLM observability platform by Comet ML.
Current Version: v0.5.0 - See Release Notes
```bash
go get github.com/agentplexus/go-opik
```

```go
package main

import (
    "context"
    "log"

    "github.com/agentplexus/go-opik"
)

func main() {
    // Create client (uses OPIK_API_KEY and OPIK_WORKSPACE env vars for Opik Cloud)
    client, err := opik.NewClient(
        opik.WithProjectName("My Project"),
    )
    if err != nil {
        log.Fatal(err)
    }

    ctx := context.Background()

    // Create a trace
    trace, _ := client.Trace(ctx, "my-trace",
        opik.WithTraceInput(map[string]any{"prompt": "Hello"}),
    )

    // Create a span for an LLM call
    span, _ := trace.Span(ctx, "llm-call",
        opik.WithSpanType(opik.SpanTypeLLM),
        opik.WithSpanModel("gpt-4"),
    )

    // Do your LLM call here...

    // End span with output
    span.End(ctx, opik.WithSpanOutput(map[string]any{"response": "Hi!"}))

    // End trace
    trace.End(ctx)
}
```

The SDK includes a built-in `llmops` subpackage that implements the OmniObserve `llmops.Provider` interface. This allows you to use Opik through a unified observability abstraction alongside other providers such as Phoenix and Langfuse.
```go
package main

import (
    "context"

    "github.com/agentplexus/omniobserve/llmops"

    _ "github.com/agentplexus/go-opik/llmops" // Register Opik provider
)

func main() {
    // Open the Opik provider through OmniObserve
    provider, err := llmops.Open("opik",
        llmops.WithAPIKey("your-api-key"),
        llmops.WithWorkspace("your-workspace"),
        llmops.WithProjectName("my-project"),
    )
    if err != nil {
        panic(err)
    }
    defer provider.Close()

    ctx := context.Background()

    // Start a trace
    ctx, trace, _ := provider.StartTrace(ctx, "my-operation")
    defer trace.End()

    // Start a span
    ctx, span, _ := provider.StartSpan(ctx, "llm-call",
        llmops.WithSpanType(llmops.SpanTypeLLM),
    )
    span.SetModel("gpt-4")
    span.SetInput("Hello, world!")
    span.SetOutput("Hi there!")
    span.End()
}
```

This pattern allows you to:
- Switch between observability providers (Opik, Phoenix, Langfuse) without code changes
- Use a consistent API across different LLM observability platforms
- Build provider-agnostic observability tooling
| Feature | Opik (Python) | go-opik | omniobserve/llmops | Tests | Notes |
|---|---|---|---|---|---|
| **Tracing** | | | | | |
| StartTrace | ✅ | ✅ | ✅ | ✅ | |
| StartSpan | ✅ | ✅ | ✅ | ✅ | |
| SetInput/Output | ✅ | ✅ | ✅ | ✅ | |
| SetModel/Provider | ✅ | ✅ | ✅ | ✅ | |
| SetUsage (tokens) | ✅ | ✅ | ✅ | ✅ | |
| AddFeedbackScore | ✅ | ✅ | ✅ | ✅ | |
| TraceFromContext | ✅ | ✅ | ✅ | ✅ | |
| SpanFromContext | ✅ | ✅ | ✅ | ✅ | |
| Nested Spans | ✅ | ✅ | ✅ | ✅ | |
| Span Types | ✅ | ✅ | ✅ | ✅ | general, llm, tool, guardrail |
| Duration/Timing | ✅ | ✅ | ✅ | ✅ | |
| **Prompts** | | | | | |
| CreatePrompt | ✅ | ✅ | ✅ | ✅ | |
| GetPrompt | ✅ | ✅ | ✅ | ✅ | By name + optional version |
| ListPrompts | ✅ | ✅ | ✅ | ✅ | |
| CreatePromptVersion | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| ListPromptVersions | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| **Datasets** | | | | | |
| CreateDataset | ✅ | ✅ | ✅ | ✅ | |
| GetDataset | ✅ | ✅ | ✅ | | By name |
| AddDatasetItems | ✅ | ✅ | ✅ | ✅ | |
| ListDatasets | ✅ | ✅ | ✅ | ✅ | |
| DeleteDataset | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| **Experiments** | | | | | |
| CreateExperiment | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| LogExperimentItem | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| ListExperiments | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| **Projects** | | | | | |
| CreateProject | ✅ | ✅ | ✅ | ✅ | |
| GetProject | ✅ | ✅ | ✅ | | |
| ListProjects | ✅ | ✅ | ✅ | ✅ | |
| SetProject | ✅ | ✅ | ✅ | ✅ | |
| **Evaluation** | | | | | |
| Evaluate | ✅ | ✅ | ✅ | ✅ | Run metrics |
| AddFeedbackScore | ✅ | ✅ | ✅ | ✅ | Record results |
| **Advanced** | | | | | |
| Distributed Tracing | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| Streaming Spans | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| Attachments | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| HTTP Middleware | ❌ | ✅ | ❌ | | Go SDK extension |
| Local Recording | ❌ | ✅ | ❌ | | Go SDK extension |
| Batching Client | ✅ | ✅ | ❌ | | Not in omniobserve interface |
Running the omniobserve/llmops tests:

```bash
# Tests are skipped automatically when no API key is set
go test -v ./llmops/

# Run the tests against Opik Cloud
export OPIK_API_KEY=your-api-key
export OPIK_WORKSPACE=your-workspace  # optional
go test -v ./llmops/
```

| Variable | Description |
|---|---|
| `OPIK_URL_OVERRIDE` | API endpoint URL |
| `OPIK_API_KEY` | API key for Opik Cloud |
| `OPIK_WORKSPACE` | Workspace name for Opik Cloud |
| `OPIK_PROJECT_NAME` | Default project name |
| `OPIK_TRACK_DISABLE` | Set to `"true"` to disable tracing |
Create `~/.opik.config`:

```ini
[opik]
url_override = https://www.comet.com/opik/api
api_key = your-api-key
workspace = your-workspace
project_name = My Project
```

Or configure the client explicitly in code:

```go
client, err := opik.NewClient(
    opik.WithURL("https://www.comet.com/opik/api"),
    opik.WithAPIKey("your-api-key"),
    opik.WithWorkspace("your-workspace"),
    opik.WithProjectName("My Project"),
)
```

// Create a trace
trace, _ := client.Trace(ctx, "my-trace",
opik.WithTraceInput(input),
opik.WithTraceMetadata(map[string]any{"key": "value"}),
opik.WithTraceTags("tag1", "tag2"),
)
// Create spans (supports nesting)
span1, _ := trace.Span(ctx, "outer-span")
span2, _ := span1.Span(ctx, "inner-span")
// End spans and traces
span2.End(ctx, opik.WithSpanOutput(output))
span1.End(ctx)
trace.End(ctx)

// LLM spans
span, _ := trace.Span(ctx, "llm-call",
opik.WithSpanType(opik.SpanTypeLLM),
opik.WithSpanModel("gpt-4"),
opik.WithSpanProvider("openai"),
)
// Tool spans
span, _ := trace.Span(ctx, "tool-call",
opik.WithSpanType(opik.SpanTypeTool),
)
// General spans (default)
span, _ := trace.Span(ctx, "processing",
opik.WithSpanType(opik.SpanTypeGeneral),
)

// Add feedback to traces
trace.AddFeedbackScore(ctx, "accuracy", 0.95, "High accuracy")
// Add feedback to spans
span.AddFeedbackScore(ctx, "relevance", 0.87, "Mostly relevant")

// Start trace with context
ctx, trace, _ := opik.StartTrace(ctx, client, "my-trace")
// Start nested spans
ctx, span1, _ := opik.StartSpan(ctx, "span-1")
ctx, span2, _ := opik.StartSpan(ctx, "span-2") // Automatically nested under span1
// Get current trace/span from context
currentTrace := opik.TraceFromContext(ctx)
currentSpan := opik.SpanFromContext(ctx)

// List recent traces
traces, _ := client.ListTraces(ctx, page, size)
for _, t := range traces {
fmt.Printf("Trace: %s (ID: %s)\n", t.Name, t.ID)
}
// List spans for a specific trace
spans, _ := client.ListSpans(ctx, traceID, page, size)
for _, s := range spans {
fmt.Printf("Span: %s (Type: %s, Model: %s)\n", s.Name, s.Type, s.Model)
}

// Inject trace headers into outgoing requests
opik.InjectDistributedTraceHeaders(ctx, req)
// Extract trace headers from incoming requests
headers := opik.ExtractDistributedTraceHeaders(req)
// Continue a distributed trace
ctx, span, _ := client.ContinueTrace(ctx, headers, "handle-request")
// Use propagating HTTP client
httpClient := opik.PropagatingHTTPClient()

// Start a streaming span
ctx, streamSpan, _ := opik.StartStreamingSpan(ctx, "stream-response",
opik.WithSpanType(opik.SpanTypeLLM),
)
// Add chunks as they arrive
for chunk := range chunks {
streamSpan.AddChunk(chunk.Content,
opik.WithChunkTokenCount(chunk.Tokens),
)
}
// Mark final chunk
streamSpan.AddChunk(lastChunk, opik.WithChunkFinishReason("stop"))
// End with accumulated data
streamSpan.End(ctx)

// Create a dataset
dataset, _ := client.CreateDataset(ctx, "my-dataset",
opik.WithDatasetDescription("Test data for evaluation"),
opik.WithDatasetTags("test", "evaluation"),
)
// Insert items
dataset.InsertItem(ctx, map[string]any{
"input": "What is the capital of France?",
"expected": "Paris",
})
// Insert multiple items
items := []map[string]any{
{"input": "2+2", "expected": "4"},
{"input": "3+3", "expected": "6"},
}
dataset.InsertItems(ctx, items)
// Retrieve items (a fresh variable name, since `items` is already declared above)
fetched, _ := dataset.GetItems(ctx, 1, 100)
// List datasets
datasets, _ := client.ListDatasets(ctx, 1, 100)
// Get dataset by name
dataset, _ := client.GetDatasetByName(ctx, "my-dataset")
// Delete dataset
dataset.Delete(ctx)

// Create an experiment
experiment, _ := client.CreateExperiment(ctx, "my-dataset",
opik.WithExperimentName("gpt-4-evaluation"),
opik.WithExperimentMetadata(map[string]any{"model": "gpt-4"}),
)
// Log experiment items
experiment.LogItem(ctx, datasetItemID, traceID,
opik.WithExperimentItemInput(input),
opik.WithExperimentItemOutput(output),
)
// Complete or cancel experiments
experiment.Complete(ctx)
experiment.Cancel(ctx)
// List experiments
experiments, _ := client.ListExperiments(ctx, datasetID, 1, 100)
// Delete experiment
experiment.Delete(ctx)

// Create a prompt
prompt, _ := client.CreatePrompt(ctx, "greeting-prompt",
opik.WithPromptDescription("Greeting template"),
opik.WithPromptTemplate("Hello, {{name}}! Welcome to {{place}}."),
opik.WithPromptTags("greeting", "template"),
)
// Get prompt by name
version, _ := client.GetPromptByName(ctx, "greeting-prompt", "")
// Render template with variables
rendered := version.Render(map[string]string{
"name": "Alice",
"place": "Wonderland",
})
// Result: "Hello, Alice! Welcome to Wonderland."
// Extract variables from template
vars := version.ExtractVariables()
// Result: ["name", "place"]
// Create new version
newVersion, _ := prompt.CreateVersion(ctx, "Hi, {{name}}!",
opik.WithVersionChangeDescription("Simplified greeting"),
)
// List all versions
versions, _ := prompt.GetVersions(ctx, 1, 100)
// List all prompts
prompts, _ := client.ListPrompts(ctx, 1, 100)

import "github.com/agentplexus/go-opik/middleware"
// Wrap HTTP handlers with automatic tracing
handler := middleware.TracingMiddleware(client, "api-request")(yourHandler)
// Use tracing HTTP client for outgoing requests
httpClient := middleware.TracingHTTPClient("external-call")
resp, _ := httpClient.Get("https://api.example.com/data")
// Or wrap an existing transport
transport := middleware.NewTracingRoundTripper(http.DefaultTransport, "api-call")
httpClient := &http.Client{Transport: transport}

// Record traces locally without sending to server
client := opik.RecordTracesLocally("my-project")
trace, _ := client.Trace(ctx, "test-trace")
span, _ := trace.Span(ctx, "test-span")
span.End(ctx)
trace.End(ctx)
// Access recorded data
traces := client.Recording().Traces()
spans := client.Recording().Spans()

// Create a batching client for efficient API calls
client, _ := opik.NewBatchingClient(
opik.WithProjectName("My Project"),
)
// Operations are batched automatically
client.AddFeedbackAsync("trace", traceID, "score", 0.95, "reason")
// Flush pending operations
client.Flush(5 * time.Second)
// Close and flush on shutdown
defer client.Close(10 * time.Second)

// Create attachment from file
attachment, _ := opik.NewAttachmentFromFile("/path/to/image.png")
// Create attachment from bytes
attachment := opik.NewAttachmentFromBytes("data.json", jsonBytes, "application/json")
// Create text attachment
attachment := opik.NewTextAttachment("notes.txt", "Some text content")
// Get data URL for embedding
dataURL := attachment.ToDataURL()

import (
"github.com/agentplexus/go-opik/evaluation"
"github.com/agentplexus/go-opik/evaluation/heuristic"
)
// Create metrics
metrics := []evaluation.Metric{
heuristic.NewEquals(false), // Case-insensitive equality
heuristic.NewContains(false), // Substring check
heuristic.NewIsJSON(), // JSON validation
heuristic.NewLevenshteinSimilarity(false), // Edit distance
heuristic.NewBLEU(4), // BLEU score
heuristic.NewROUGE(1.0), // ROUGE-L score
heuristic.MustRegexMatch(`\d+`), // Regex matching
}
// Evaluate
engine := evaluation.NewEngine(metrics, evaluation.WithConcurrency(4))
input := evaluation.NewMetricInput("What is 2+2?", "The answer is 4.")
input = input.WithExpected("4")
result := engine.EvaluateOne(ctx, input)
fmt.Printf("Average score: %.2f\n", result.AverageScore())

import (
"github.com/agentplexus/go-opik/evaluation/llm"
"github.com/agentplexus/go-opik/integrations/openai"
)
// Create LLM provider
provider := openai.NewProvider(openai.WithModel("gpt-4o"))
// Create LLM-based metrics
metrics := []evaluation.Metric{
llm.NewAnswerRelevance(provider),
llm.NewHallucination(provider),
llm.NewFactuality(provider),
llm.NewCoherence(provider),
llm.NewHelpfulness(provider),
}
// Custom judge with custom prompt
customJudge := llm.NewCustomJudge("tone_check", `
Evaluate whether the response maintains a professional tone.
User message: {{input}}
AI response: {{output}}
Return JSON: {"score": <0.0-1.0>, "reason": "<explanation>"}
`, provider)

geval := llm.NewGEval(provider, "fluency and coherence").
WithEvaluationSteps([]string{
"Check if the response is grammatically correct",
"Evaluate logical flow of ideas",
"Assess clarity of expression",
})
score := geval.Score(ctx, input)

import "github.com/agentplexus/go-opik/integrations/openai"
// Create provider for evaluation
provider := openai.NewProvider(
openai.WithAPIKey("your-api-key"),
openai.WithModel("gpt-4o"),
)
// Create tracing provider (auto-traces all calls)
tracingProvider := openai.TracingProvider(opikClient,
openai.WithModel("gpt-4o"),
)
// Use tracing HTTP client with existing code
httpClient := openai.TracingHTTPClient(opikClient)

import "github.com/agentplexus/go-opik/integrations/anthropic"
// Create provider for evaluation
provider := anthropic.NewProvider(
anthropic.WithAPIKey("your-api-key"),
anthropic.WithModel("claude-sonnet-4-20250514"),
)
// Create tracing provider (auto-traces all calls)
tracingProvider := anthropic.TracingProvider(opikClient)
// Use tracing HTTP client with existing code
httpClient := anthropic.TracingHTTPClient(opikClient)

# Install CLI
go install github.com/agentplexus/go-opik/cmd/opik@latest
# Configure
opik configure -api-key=your-key -workspace=your-workspace
# List projects
opik projects -list
# Create project
opik projects -create="New Project"
# List traces
opik traces -list -project="My Project" -limit=20
# List datasets
opik datasets -list
# Create dataset
opik datasets -create="evaluation-data"
# List experiments
opik experiments -list -dataset="my-dataset"

For advanced usage, access the underlying ogen-generated API client:
api := client.API()
// Use api.* methods for full API access

// Check for specific errors
if opik.IsNotFound(err) {
// Handle not found
}
if opik.IsUnauthorized(err) {
// Handle auth failure
}
if opik.IsRateLimited(err) {
// Handle rate limiting
}

For integrating Opik with agent frameworks like Google ADK and Eino, see the Agentic Observability Tutorial. This tutorial covers:
- Tracing Google ADK agents with tools
- Tracing Eino workflow graphs
- Multi-agent orchestration observability
- Best practices for agent debugging and monitoring
go test ./...

golangci-lint run

The SDK uses ogen to generate a type-safe API client from the Opik OpenAPI specification. When the upstream API changes, regenerate the client:
Prerequisites:
# Install ogen
go install github.com/ogen-go/ogen/cmd/ogen@latest

Update the OpenAPI spec:
# Download latest spec from Opik repository
curl -o openapi/openapi.yaml \
https://raw.githubusercontent.com/comet-ml/opik/main/sdks/code_generation/fern/openapi/openapi.yaml

Generate the client:
./generate.sh

This script runs ogen, applies the necessary fixes, and verifies the build. For detailed documentation on the generation process and troubleshooting, see the Development Guide.
MIT License - see LICENSE for details.
- Opik - Open-source LLM observability platform
- Opik Python SDK - Official Python SDK
- Opik Documentation - Official documentation