Lumina Documentation
Open-source, OpenTelemetry-native observability for AI systems
Lumina is a lightweight observability platform for LLM applications that helps you track costs, latency, and quality across your AI systems with minimal overhead.
Quick Start
Get Lumina running in 5 minutes:
- Quickstart Guide - Step-by-step installation
- Integration Guides - Connect your LLM apps
- API Reference - Complete API docs
Documentation
Getting Started
- Quickstart Guide - Install and run Lumina in 5 minutes
- FAQ - Common questions answered
- Troubleshooting - Fix common issues
Integration Guides
- Integrations Overview - All supported providers
- OpenAI Integration
- Anthropic (Claude) Integration
- LangChain Integration
- Vercel AI SDK Integration
Architecture & Advanced
- Architecture Overview - System design and components
- Multi-Span Tracing - Complex trace patterns
- RAG Integration - RAG pipeline observability
- Best Practices - Production tips
API Reference
- API Reference - Complete REST API docs
- OpenAPI Spec - Swagger/OpenAPI specification
Features
Real-Time Trace Ingestion
Track every LLM call with OpenTelemetry-compatible trace collection. Sub-500ms from trace to dashboard.
Cost & Quality Monitoring
Get alerted when costs spike or quality drops. Automatic baseline detection and anomaly alerting.
Replay Testing
Re-run production traces against new prompts or models. Semantic diffing shows you exactly what changed.
Semantic Comparison
Hybrid quality detection: instant hash-based checks + AI semantic scoring when needed.
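The hybrid approach can be pictured with a small sketch: hash the baseline and candidate outputs first, and only invoke a semantic scorer when they differ. Everything here (hashOutput, the semanticScore callback, the return shape) is illustrative, not the Lumina SDK's actual interface.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical sketch of hybrid comparison: cheap hash check first,
// then a semantic score only when the outputs actually differ.
function hashOutput(text: string): string {
  return createHash('sha256').update(text.trim()).digest('hex');
}

async function compareOutputs(
  baseline: string,
  candidate: string,
  semanticScore: (a: string, b: string) => Promise<number>, // 0..1 similarity, e.g. an LLM judge
): Promise<{ identical: boolean; score: number }> {
  // Fast path: byte-identical outputs need no semantic scoring.
  if (hashOutput(baseline) === hashOutput(candidate)) {
    return { identical: true, score: 1 };
  }
  // Slow path: ask the semantic scorer how close the two answers really are.
  return { identical: false, score: await semanticScore(baseline, candidate) };
}
```

The hash path makes unchanged outputs effectively free to compare, while the semantic path catches paraphrased or subtly degraded answers.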
All Features Included
Everything is free forever in self-hosted mode. No feature gating, no artificial limits.
Self-Hosted vs Managed Cloud
Self-Hosted (Free Forever)
- 50,000 traces per day - Resets at midnight UTC
- 7-day retention - Automatic cleanup
- All features - Alerts, replay, semantic scoring
- Community support - GitHub Discussions
Perfect for:
- Individual developers
- Small teams
- Side projects
- Proof of concepts
Managed Cloud
- Unlimited traces - No daily limits
- Unlimited retention - Keep your data forever
- SSO & SAML - Enterprise authentication
- SLA support - 99.9% uptime guarantee
- Dedicated support - Email + Slack
Perfect for:
- Production applications
- Enterprise teams
- Companies needing compliance
- Teams wanting hassle-free hosting
Why Lumina?
Built for Backend Engineers
Unlike existing AI observability platforms built for data scientists, Lumina is designed for backend/SRE teams. We use the same observability patterns you already know: traces, P95 latencies, and PagerDuty alerts.
OpenTelemetry-First
Lumina is built on OpenTelemetry from day one (not retrofitted). This means:
- Works with your existing OTEL stack
- Can send to multiple backends simultaneously (see the sketch after this list)
- No vendor lock-in
- Industry standard instrumentation
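Here is a minimal sketch of that fan-out with the OpenTelemetry JS SDK, assuming Lumina's /v1/traces endpoint accepts standard OTLP over HTTP (which the OTEL-first design implies) and a second OTLP backend at an example address. Note that newer SDK releases pass span processors to the provider constructor instead of addSpanProcessor.

```typescript
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

const provider = new NodeTracerProvider();

// Ship the same spans to Lumina...
provider.addSpanProcessor(
  new BatchSpanProcessor(
    new OTLPTraceExporter({ url: 'http://localhost:8080/v1/traces' })
  )
);

// ...and to an existing OTLP backend (example address) at the same time.
provider.addSpanProcessor(
  new BatchSpanProcessor(
    new OTLPTraceExporter({ url: 'http://otel-collector:4318/v1/traces' })
  )
);

provider.register();
```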
Infrastructure-Grade Architecture
Built with production in mind:
- NATS JetStream for high-throughput ingestion
- PostgreSQL for analytical queries
- Redis for caching
- Real-time alerting (<500ms)
- Horizontal scaling (handles 10M+ traces/day)
Cost + Quality Correlation
The only platform that can query cost > $0.50 AND quality < 0.8 in a single dashboard. When your /chat endpoint gets expensive and broken, you know immediately.
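As an illustration, that kind of combined filter might look roughly like the sketch below. The /v1/query path and the filter field names are assumptions for illustration only; the real endpoint and parameters are documented in the API Reference.

```typescript
// Hypothetical sketch: find traces that are both expensive and low quality.
// The /v1/query path and filter fields are illustrative, not the real API.
const res = await fetch('http://localhost:8080/v1/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    endpoint: '/chat',
    filters: [
      { field: 'cost_usd', op: '>', value: 0.5 },      // cost > $0.50
      { field: 'quality_score', op: '<', value: 0.8 }, // quality < 0.8
    ],
  }),
});
const expensiveAndBroken = await res.json();
```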
Installation
Docker Compose (Recommended)
git clone https://github.com/use-lumina/Lumina
cd Lumina/infra/docker
cp ../../.env.docker.example ../../.env.docker
# Add your ANTHROPIC_API_KEY to .env.docker
docker-compose --env-file ../../.env.docker up -d
Dashboard available at: http://localhost:3000
Kubernetes (Coming Soon)
Helm charts for production Kubernetes deployments coming soon.
Quick Integration
import { Lumina } from '@lumina/sdk';
import Anthropic from '@anthropic-ai/sdk';

// Initialize Lumina (no API key needed for self-hosted!)
const lumina = new Lumina({
  endpoint: 'http://localhost:8080/v1/traces',
  serviceName: 'my-app',
});

// Initialize your LLM client
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

// Wrap your LLM call - that's it!
const response = await lumina.traceLLM(
  async () => {
    return await anthropic.messages.create({
      model: 'claude-sonnet-4-5',
      max_tokens: 1024,
      messages: [{ role: 'user', content: 'Hello!' }],
    });
  },
  {
    name: 'chat-completion',
    provider: 'anthropic',
    model: 'claude-sonnet-4-5',
    prompt: 'Hello!',
  }
);
See Integration Guides for more examples.
Comparisons
vs LangSmith
- Lumina: OTEL-first, built for backend engineers, infrastructure-grade
- LangSmith: Tight LangChain integration, evaluation focus
vs Langfuse
- Lumina: 50k traces/day free, OTEL-native, real-time alerts
- Langfuse: Unlimited traces (self-hosted), prompt management
vs Helicone
- Lumina: End-to-end RAG tracing, cost+quality correlation
- Helicone: Gateway approach, simple cost tracking
vs Datadog
- Lumina: Purpose-built for AI, startup-friendly pricing
- Datadog: General APM, enterprise pricing ($100k+/year)
Contributing
We welcome contributions! Check out the GitHub Repository to get started and ask questions in GitHub Discussions.
Resources
- GitHub Repository - Star us!
- Example Applications - Working examples
- Changelog - What's new
- Security Policy - Reporting vulnerabilities
- License - Apache 2.0
Community
- GitHub Discussions - Ask questions
- Discord - Join the community (coming soon)
- Twitter - Follow for updates
- Email - Contact us
Security
Found a security vulnerability? Please email us at security@yourdomain.com. Do not open a public issue.
See our Security Policy for details.
Free Forever • Self-Hosted • All Features Included
Self-hosted Lumina includes all features with 50k traces/day and 7-day retention for $0. Need more? Upgrade to managed cloud →