Traces and Spans
Lumina uses OpenTelemetry’s trace model.

Trace

A trace represents a complete request or operation in your application. Properties:

- Unique trace_id
- One or more spans
- Start and end timestamps
- Service name
Span
A span represents a single operation within a trace. Properties:

- Unique span_id
- Parent span_id (for child spans)
- Operation name
- Start and end timestamps
- Attributes (key-value metadata)
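Conceptually, a span carrying the properties above can be modeled like this (a minimal sketch for orientation only; these are not Lumina's actual types):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Span:
    """Minimal span model matching the properties listed above (illustrative)."""
    span_id: str
    trace_id: str
    name: str                                       # operation name
    start_ns: int                                   # start timestamp (nanoseconds)
    end_ns: int                                     # end timestamp (nanoseconds)
    parent_span_id: Optional[str] = None            # set on child spans
    attributes: dict = field(default_factory=dict)  # key-value metadata
```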
Single-Span vs Multi-Span
Single-Span Trace: One operation per trace (simple LLM calls).

Multi-Span Trace: Multiple operations per trace (multi-step workflows such as chains or agent loops).

Attributes

Attributes are key-value pairs attached to spans.

Standard Attributes
Automatically extracted by Lumina:

| Attribute | Type | Description |
|---|---|---|
| model | string | LLM model name |
| prompt_tokens | int | Input token count |
| completion_tokens | int | Output token count |
| cost_usd | float | Calculated cost |
| latency_ms | int | Duration in milliseconds |
| status | string | ok, error, degraded |
Custom Attributes
Add your own metadata:

- User identification
- Session tracking
- Feature flags
- A/B test variants
- Custom dimensions for analytics
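Custom dimensions are just extra key-value pairs merged into a span's attribute map alongside the standard ones. A minimal sketch — the key names (user.id, session.id, ab.variant) are hypothetical conventions, not names Lumina requires:

```python
def annotate_span(attributes, user_id, session_id, ab_variant):
    """Merge custom dimensions into a span's attribute map (key names illustrative)."""
    attributes.update({
        "user.id": user_id,        # user identification
        "session.id": session_id,  # session tracking
        "ab.variant": ab_variant,  # A/B test variant
    })
    return attributes

span_attrs = {"model": "gpt-4", "prompt_tokens": 120}  # standard attributes
annotate_span(span_attrs, "u_42", "s_7", "prompt_v2")
```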
Cost Calculation
Lumina automatically calculates costs based on provider pricing.

Supported Providers
| Provider | Models | Pricing Source |
|---|---|---|
| OpenAI | GPT-4, GPT-3.5, GPT-4 Turbo | openai.com/pricing |
| Anthropic | Claude 3 Opus/Sonnet/Haiku, Claude 3.5 Sonnet, Claude Sonnet 4.5 | anthropic.com/pricing |
How It Works
1. Extract the model name from span attributes
2. Look up per-token pricing for that model
3. Multiply prompt and completion token counts by their rates

Custom Pricing
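A sketch of that lookup. The pricing table is illustrative (real provider rates change often), and the same table could hold manually specified rates for unlisted or fine-tuned models — "my-custom-finetune" below is a hypothetical entry:

```python
# Illustrative USD prices per million tokens -- not real, current rates.
PRICING = {
    "gpt-4": {"prompt": 30.0, "completion": 60.0},
    "my-custom-finetune": {"prompt": 5.0, "completion": 15.0},  # manually specified
}

def calculate_cost(span_attributes):
    """Compute cost_usd from a span's model name and token counts."""
    rates = PRICING[span_attributes["model"]]        # extract model, look up rates
    cost = (span_attributes["prompt_tokens"] * rates["prompt"]
            + span_attributes["completion_tokens"] * rates["completion"]) / 1_000_000
    return round(cost, 6)                            # tokens x per-token rates

calculate_cost({"model": "gpt-4", "prompt_tokens": 1000, "completion_tokens": 500})
```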
For unlisted models, specify costs manually.

Quality Monitoring
Lumina tracks response quality with hybrid detection.

Exact Match (Fast)

Hash-based comparison for identical responses. Use case: detect unexpected changes.

Semantic Similarity (Accurate)

AI-powered semantic comparison for meaning preservation. Use case: detect quality degradation.

When to Use Each
Exact match:

- Regression detection
- Template responses
- Structured output (JSON)

Semantic similarity:

- Natural language responses
- Prompt engineering
- A/B testing
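A minimal sketch of the exact-match side, assuming responses are fingerprinted with a content hash:

```python
import hashlib

def response_fingerprint(text):
    """Stable content hash for cheap exact-match comparison."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

baseline = response_fingerprint('{"status": "ok", "items": []}')
candidate = response_fingerprint('{"status": "ok", "items": []}')
print(candidate == baseline)  # identical responses -> identical fingerprints: True
```

Any changed byte produces a different fingerprint, which is why exact match suits structured output but not free-form natural language.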
Alerting
Lumina detects anomalies and sends alerts.

Cost Alerts

Triggered when costs spike above baseline.

Quality Alerts

Triggered when response quality drops.

Configuration
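A hypothetical sketch of how both alert types might be evaluated — the service names, field names, and threshold numbers are illustrative, not Lumina's actual schema:

```python
# Hypothetical per-service thresholds; Lumina's real configuration may differ.
THRESHOLDS = {
    "chat-api": {"cost_spike_pct": 50, "min_quality_score": 0.85},
    "search-api": {"cost_spike_pct": 100, "min_quality_score": 0.75},
}

def check_alerts(service, hourly_cost, baseline_cost, quality_score):
    """Return the list of alerts that fired for this service."""
    t = THRESHOLDS[service]
    alerts = []
    # Cost alert: spend exceeds baseline by more than the allowed spike.
    if hourly_cost > baseline_cost * (1 + t["cost_spike_pct"] / 100):
        alerts.append("cost_spike")
    # Quality alert: quality score fell below the floor.
    if quality_score < t["min_quality_score"]:
        alerts.append("quality_drop")
    return alerts

check_alerts("chat-api", hourly_cost=9.0, baseline_cost=5.0, quality_score=0.9)
```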
Thresholds are set per service.

Replay Testing
Test changes against real production data.

Workflow
1. Capture a baseline replay set from production traces
2. Re-execute it with the new parameters
3. Compare results against the baseline

Use Cases
Prompt engineering:

- Test prompt variations
- Optimize system messages
- Validate few-shot examples

Model evaluation:

- Compare model outputs
- Validate cost/quality tradeoffs
- Test provider switching

Infrastructure changes:

- Test rate limiting
- Validate caching
- Load testing
Sampling
For high-volume workloads, sample a subset of traces.

Head-Based Sampling
Decide at trace start whether to record.

Pros:

- Low overhead
- Predictable costs

Cons:

- May miss rare errors
- No dynamic adjustment
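A minimal head-based sampler sketch. Hashing the trace_id keeps the decision deterministic, so every span in a trace gets the same record/drop outcome; the rate and hashing scheme are illustrative:

```python
import hashlib

def should_record(trace_id, rate=0.10):
    """Head-based decision, made once at trace start (scheme illustrative)."""
    # Map the trace_id to a stable bucket in [0, 10000).
    bucket = int(hashlib.sha256(trace_id.encode()).hexdigest(), 16) % 10_000
    return bucket < rate * 10_000
```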
Conditional Sampling
Always sample important traces.

- Sample routine operations at 10%
- Capture all important events at 100%
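A sketch of that policy using the standard status and cost_usd attributes from the table above; the $1.00 "expensive call" threshold is illustrative:

```python
import random

def should_sample(span_attributes, base_rate=0.10):
    """Always keep important traces; sample routine ones at base_rate."""
    if span_attributes.get("status") == "error":    # capture all errors: 100%
        return True
    if span_attributes.get("cost_usd", 0.0) > 1.0:  # capture expensive calls: 100%
        return True
    return random.random() < base_rate              # routine operations: 10%
```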
Data Retention
Lumina automatically manages trace lifecycle.

Self-Hosted Defaults
- Daily limit: 50,000 traces
- Retention: 7 days
- Cleanup: Automatic at midnight UTC
Custom Configuration
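For self-hosted deployments, the defaults above could be overridden with environment-style settings. The variable names below are hypothetical, not Lumina's documented names — consult your deployment's own reference:

```shell
# Hypothetical overrides (names illustrative; check your deployment docs):
LUMINA_RETENTION_DAYS=30         # keep traces longer than the 7-day default
LUMINA_DAILY_TRACE_LIMIT=200000  # raise the 50,000 traces/day default
```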
Archival
Export traces before deletion.

Architecture Components
Understanding the system architecture helps with deployment and troubleshooting.

Ingestion Service
Purpose: Receive and validate traces

Responsibilities:

- Accept OTLP/HTTP traces
- Validate schema
- Publish to NATS queue
Worker Pool
Purpose: Process traces asynchronously

Responsibilities:

- Calculate costs
- Extract metadata
- Store in PostgreSQL
Query API
Purpose: Retrieve traces and analytics

Responsibilities:

- Serve dashboard queries
- Provide REST API
- Cache frequent queries
Replay Engine
Purpose: Re-execute traces with new parameters

Responsibilities:

- Capture replay sets
- Execute with LLM APIs
- Compare results
Dashboard
Purpose: Visualization and management

Responsibilities:

- Display traces
- Show analytics
- Manage alerts
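The pipeline these components form (ingestion → queue → worker pool → storage) can be sketched with a stdlib queue standing in for NATS — simplified, with a placeholder for the cost-calculation step:

```python
import queue
import threading

trace_queue = queue.Queue()  # stands in for the NATS queue
stored = []                  # stands in for the PostgreSQL table

def worker():
    """Worker pool member: enrich each trace, then store it."""
    while True:
        span = trace_queue.get()
        if span is None:                                  # sentinel: shut down
            break
        span["cost_usd"] = span["prompt_tokens"] * 3e-05  # placeholder cost step
        stored.append(span)
        trace_queue.task_done()

t = threading.Thread(target=worker)
t.start()
trace_queue.put({"model": "gpt-4", "prompt_tokens": 100})  # ingestion publishes
trace_queue.put(None)
t.join()
```

Decoupling ingestion from processing through the queue is what lets the ingestion service stay fast while workers absorb bursts.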
Best Practices
Attribute Naming
Use consistent naming conventions, e.g. lowercase, dot-delimited keys such as user.id and session.id.

Span Naming
Use descriptive, hierarchical names (e.g. llm.chat.completion rather than a generic label).

Error Handling
Always capture errors: record exception details as span attributes and set the status attribute to error.

Cost Management
Monitor costs proactively:

- Set up cost alerts
- Review expensive queries daily
- Optimize high-cost endpoints
- Consider model downgrading for simple tasks