What Lumina Does
Cost Tracking
Automatic cost calculation for OpenAI, Anthropic, and other major providers. Track spending per service, model, user, or query.
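In essence, per-call cost is token counts multiplied by per-token prices. A minimal sketch of the idea; the price table below is an illustrative placeholder, not Lumina's real pricing data:

```typescript
// Illustrative placeholder prices (USD per million tokens), not Lumina's
// actual pricing table.
const PRICE_PER_MILLION_TOKENS: Record<
  string,
  { prompt: number; completion: number }
> = {
  "gpt-4o": { prompt: 2.5, completion: 10 },
  "claude-sonnet": { prompt: 3, completion: 15 },
};

// Cost of one LLM call: tokens consumed times the per-token price.
function callCostUsd(
  model: string,
  promptTokens: number,
  completionTokens: number,
): number {
  const price = PRICE_PER_MILLION_TOKENS[model];
  if (!price) throw new Error(`no price entry for model: ${model}`);
  return (
    (promptTokens * price.prompt + completionTokens * price.completion) /
    1_000_000
  );
}

// Example: callCostUsd("gpt-4o", 1200, 300)
//   = (1200 * 2.5 + 300 * 10) / 1_000_000 = 0.006 USD
```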
Distributed Tracing
Visualize complex AI workflows including RAG pipelines, agent loops, and multi-model systems with hierarchical span support.
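Because Lumina is OpenTelemetry-native (see Trust Signals below), hierarchical traces can be expressed with standard OTel spans. A sketch of a two-step RAG pipeline, where `retrieve` and `generate` are hypothetical stand-ins for your own retriever and LLM client:

```typescript
import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("rag-pipeline");

// Hypothetical stand-ins so the sketch is self-contained.
async function retrieve(query: string): Promise<string[]> {
  return [`context for: ${query}`];
}
async function generate(query: string, docs: string[]): Promise<string> {
  return `answer using ${docs.length} documents`;
}

// Retrieval and generation become child spans of one root span, so the
// whole pipeline shows up as a single hierarchical trace.
async function answerQuestion(question: string): Promise<string> {
  return tracer.startActiveSpan("rag.request", async (root) => {
    const docs = await tracer.startActiveSpan("rag.retrieve", async (span) => {
      const results = await retrieve(question);
      span.setAttribute("rag.documents.count", results.length);
      span.end();
      return results;
    });
    const answer = await tracer.startActiveSpan("llm.generate", async (span) => {
      const text = await generate(question, docs);
      span.end();
      return text;
    });
    root.end();
    return answer;
  });
}
```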
Quality Monitoring
Detect semantic degradation and quality regressions with built-in similarity scoring and custom evaluation metrics.
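Lumina's built-in scorers are not reproduced here, but similarity scoring over response embeddings commonly reduces to cosine similarity. A minimal sketch:

```typescript
// Cosine similarity between two embedding vectors. Scores near 1.0 mean
// the responses are semantically close; lower scores flag drift.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```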
Replay Testing
Test prompt changes against real production queries before deployment. Compare responses with semantic diff visualization.
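Conceptually, a replay run re-executes a captured production query with the candidate prompt and scores how far the new response drifts from the captured one. A sketch with hypothetical `llm` and `embed` callbacks, reusing `cosineSimilarity` from the Quality Monitoring sketch above:

```typescript
// Hypothetical callback types you would supply.
type LLM = (systemPrompt: string, userQuery: string) => Promise<string>;
type Embedder = (text: string) => Promise<number[]>;

// Re-run a captured query under a candidate prompt and score the drift
// between the captured response and the new one.
async function replayWithPrompt(
  llm: LLM,
  embed: Embedder,
  captured: { query: string; response: string },
  candidatePrompt: string,
): Promise<number> {
  const rerun = await llm(candidatePrompt, captured.query);
  const [before, after] = await Promise.all([
    embed(captured.response),
    embed(rerun),
  ]);
  // Reuses cosineSimilarity from the Quality Monitoring sketch above.
  return cosineSimilarity(before, after);
}
```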
Why Lumina
Traditional APM tools treat LLM calls as opaque HTTP requests. Lumina provides native observability for AI systems.

The Problem: AI applications are fundamentally different from traditional software. Token costs accumulate rapidly. Response quality degrades silently. Latency compounds across multi-step workflows. Production incidents require full trace context to debug.

The Solution: Lumina provides automatic cost calculation, quality tracking, and hierarchical tracing for complex pipelines like RAG and agents. Built on OpenTelemetry standards, Lumina integrates into your existing infrastructure without vendor lock-in.

Quick Links
5-Minute Quickstart
Get Lumina running locally with Docker Compose
SDK Reference
Complete API documentation for TypeScript/JavaScript
Production Deployment
Deploy to production with Kubernetes
Architecture
System design and component details
Trust Signals
OpenTelemetry-Native
Built on OpenTelemetry from day one. Works with your existing observability stack without vendor lock-in.
Production Ready
PostgreSQL backend with a production-tested schema. NATS-based reliable ingestion pipeline handles 10M+ traces/day.
Self-Hostable
Full data control with self-hosting. Deploy on your infrastructure with Docker, Kubernetes, or bare metal.
Open Source
Apache 2.0 licensed. Free forever with all features included. No paywalled functionality.

How It Works
Every traced LLM call automatically captures:
- Automatic cost calculation
- Token tracking (prompt, completion)
- Latency measurement
- Error tracking
- Custom metadata
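A sketch of what one tracked call might record, using standard OpenTelemetry span attributes. The attribute keys and the `callModel` helper are hypothetical; the exact conventions come from the Lumina SDK:

```typescript
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("my-service");

// Hypothetical LLM client, included to keep the sketch self-contained.
async function callModel(prompt: string): Promise<string> {
  return `echo: ${prompt}`;
}

// Attribute keys below are illustrative, not the Lumina SDK's exact names.
async function trackedCompletion(prompt: string): Promise<string> {
  return tracer.startActiveSpan("llm.call", async (span) => {
    const start = Date.now();
    try {
      const completion = await callModel(prompt);
      span.setAttributes({
        "llm.model": "gpt-4o",
        "llm.tokens.prompt": 128,        // from the provider's usage payload
        "llm.tokens.completion": 256,
        "llm.latency_ms": Date.now() - start,
        "app.user_id": "user-42",        // custom metadata
      });
      return completion;
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR }); // error tracking
      throw err;
    } finally {
      span.end();
    }
  });
}
```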
Features
Single-Span Tracing
Track individual LLM calls with automatic attribute extraction.

Self-Hosted Limits
The open-source version includes usage limits to prevent abuse:
- 50,000 traces per day — Resets at midnight UTC
- 7-day retention — Automatic cleanup of older traces
- All features enabled — No paywalled functionality
Architecture
- Ingestion Service — Receives OTLP traces over HTTP. Publishes to NATS for async processing.
- Worker Pool — Consumes traces from queue. Calculates costs and persists to PostgreSQL.
- Query API — REST endpoints for trace retrieval and analytics. Powers the dashboard.
- Replay Engine — Captures production traces and re-executes with modified parameters.
- Dashboard — Next.js application for visualization and trace inspection.
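Since the Ingestion Service accepts OTLP over HTTP, a standard OTLP exporter can send traces to it. A minimal Node.js setup sketch; the endpoint URL is an assumed local default, not a documented value:

```typescript
import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

// Point a standard OTLP/HTTP exporter at the Ingestion Service. The URL
// below is an assumed local default; substitute your deployment's
// ingest endpoint.
const sdk = new NodeSDK({
  serviceName: "my-ai-service",
  traceExporter: new OTLPTraceExporter({
    url: "http://localhost:4318/v1/traces",
  }),
});

sdk.start();
```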
Use Cases
Production Monitoring
Track all LLM calls across microservices. Identify expensive queries, slow endpoints, and quality degradations.
Cost Optimization
Analyze spending by model, service, and user. Find opportunities to switch models or optimize prompts.
Regression Testing
Test prompt changes against real production queries before deployment. Catch quality regressions with semantic scoring.
Debugging
Reproduce production issues with full trace context. View the prompt, response, model parameters, and execution timeline.
Compliance
Audit all AI interactions with complete logs. Filter by user, timestamp, or custom metadata.

Next Steps
Install Lumina
Get running in 5 minutes with Docker Compose
Instrument Your App
Add the SDK to your application
View Examples
Explore example applications
Join Community
Ask questions and get help
Support
- Documentation: docs.uselumina.io
- GitHub Issues: Bug reports & feature requests
- Discussions: Community forum
- Email: support@uselumina.io