The Problem
Traditional observability tools fall short for AI applications. APM platforms like Datadog and New Relic treat LLM calls as opaque HTTP requests. They capture latency and status codes but miss critical AI-specific metrics:
- Cost per request — LLM calls have variable costs based on token usage
- Quality degradation — Responses can become less accurate without errors
- Complex workflows — RAG pipelines and agents involve multiple steps
- Prompt debugging — Debugging requires seeing full prompts and responses, not just status codes
Purpose-built LLM observability tools, meanwhile, have problems of their own:
- Vendor lock-in — Proprietary formats and APIs
- Cloud-only — Cannot self-host with data control
- Expensive — Usage-based pricing scales unpredictably
- Incomplete — Missing critical features like replay testing
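The first point — variable cost per request — comes down to token arithmetic. A minimal sketch; the per-1k-token rates below are made-up placeholders, not real provider pricing:

```python
# Sketch: why per-request cost tracking matters. Token counts vary per
# request, so cost varies too. Rates are illustrative, NOT real pricing.
PRICE_PER_1K = {
    "model-a": {"input": 0.0025, "output": 0.01},
    "model-b": {"input": 0.003, "output": 0.015},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one LLM call, in dollars."""
    rates = PRICE_PER_1K[model]
    return (input_tokens / 1000) * rates["input"] + (output_tokens / 1000) * rates["output"]

# Two requests to the same endpoint can differ widely in cost:
cheap = request_cost("model-a", 200, 50)
pricey = request_cost("model-a", 6000, 2000)
```

An APM tool sees both calls as identical 200-status HTTP requests; only token-aware tracing surfaces the 35x cost difference.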
The Lumina Approach
OpenTelemetry-Native
Built on OpenTelemetry from day one, not retrofitted. Benefits:
- Works with your existing observability stack
- Send traces to multiple backends simultaneously
- Industry-standard instrumentation
- No vendor lock-in
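The multi-backend point can be sketched with a standard OpenTelemetry Collector config that fans the same traces out to two OTLP endpoints. Endpoint addresses and the `otlp/lumina` exporter name are placeholders:

```yaml
# Sketch: one pipeline, two destinations. Same traces reach both
# backends simultaneously; no code changes in the application.
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  otlp/lumina:
    endpoint: lumina.internal:4317        # hypothetical self-hosted endpoint
  otlp/existing:
    endpoint: otel-gateway.example.com:4317   # your current backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/lumina, otlp/existing]
```

Because the wire format is plain OTLP, swapping or adding backends is a config change, not an instrumentation rewrite.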
Self-Hostable
Deploy on your infrastructure with full data control. Benefits:
- Complete data ownership
- Deploy in regulated environments
- No external data transfer
- Predictable costs
Deployment options:
- Docker Compose (development)
- Kubernetes (production)
- Bare metal (custom setups)
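A development deployment along these lines might look like the Compose sketch below. The `lumina/lumina` image name is hypothetical; the PostgreSQL, Redis, and NATS services mirror the architecture described elsewhere in this document:

```yaml
# Sketch of a development stack. Image names and ports are placeholders.
services:
  lumina:
    image: lumina/lumina:latest    # hypothetical image name
    ports: ["3000:3000"]
    depends_on: [postgres, redis, nats]
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me
  redis:
    image: redis:7
  nats:
    image: nats:2
    command: ["-js"]    # enable JetStream for durable ingestion
```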
Cost + Quality Correlation
The only platform that connects cost and quality metrics for the same requests, queryable together.
Production-Grade Architecture
Reliable ingestion:
- NATS JetStream for high-throughput message queue
- Handles 10M+ traces/day per cluster
- Automatic retry and backpressure
Fast queries:
- PostgreSQL with optimized indexes
- Redis caching for frequent queries
- <100ms P95 latency on 10M+ trace tables
Real-time alerting:
- <500ms from trace to alert
- Webhook integration (Slack, PagerDuty)
- Automatic baseline detection
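Automatic baseline detection can be sketched as a rolling mean-plus-stddev check. This is an illustrative approach, not necessarily Lumina's actual algorithm:

```python
# Sketch: alert when a metric's current value breaches its historical
# baseline (mean + k standard deviations). Illustrative only.
from statistics import mean, stdev

def breaches_baseline(history: list[float], current: float, k: float = 3.0) -> bool:
    """True when `current` exceeds the baseline mean by k standard deviations."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu, sigma = mean(history), stdev(history)
    return current > mu + k * max(sigma, 1e-9)

latencies = [120.0, 130.0, 125.0, 128.0, 122.0]  # historical P95 samples, ms
assert not breaches_baseline(latencies, 131.0)   # normal variation
assert breaches_baseline(latencies, 400.0)       # clear regression → alert
```

A baseline learned from history means no hand-tuned static thresholds per endpoint.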
Built for Backend Engineers
Unlike AI observability platforms built for data scientists, Lumina sticks to patterns backend engineers already know:
- Traces and spans (not “runs” or “chains”)
- P95 latency (not “average duration”)
- PagerDuty webhooks (not email reports)
- SQL queries (not proprietary DSLs)
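The SQL-first point can be sketched end to end: the query below correlates cost and quality per model, run here against an in-memory SQLite table. The `traces` schema and the data are hypothetical:

```python
# Sketch: cost/quality correlation as plain SQL — no proprietary DSL.
# Schema and rows are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE traces (model TEXT, cost_usd REAL, quality_score REAL)
""")
conn.executemany(
    "INSERT INTO traces VALUES (?, ?, ?)",
    [("large-model", 0.031, 0.92), ("large-model", 0.029, 0.90),
     ("small-model", 0.002, 0.81), ("small-model", 0.003, 0.79)],
)

# Average cost vs. average quality per model, in one query:
rows = conn.execute("""
    SELECT model, AVG(cost_usd), AVG(quality_score)
    FROM traces
    GROUP BY model
    ORDER BY AVG(cost_usd) DESC
""").fetchall()
for model, avg_cost, avg_quality in rows:
    print(f"{model}: ${avg_cost:.4f}/req, quality {avg_quality:.2f}")
```

The same shape of query answers questions like "is the cheaper model's quality drop worth the savings?" directly against trace data.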
Feature Comparison
| Feature | Lumina | LangSmith | Langfuse | Helicone | Datadog |
|---|---|---|---|---|---|
| Self-Hosted | ✓ | ✗ | ✓ | ✗ | ✗ |
| OpenTelemetry | Native | Adapter | No | No | Yes |
| Multi-Span Tracing | ✓ | ✓ | ✓ | ✗ | ✓ |
| Cost Tracking | Automatic | Automatic | Manual | Automatic | Manual |
| Replay Testing | ✓ | ✗ | ✗ | ✗ | ✗ |
| Semantic Diff | ✓ | ✗ | ✗ | ✗ | ✗ |
| Real-Time Alerts | <500ms | Minutes | Minutes | N/A | Seconds |
| Free Tier | 50k traces/day | 5k traces/month | Unlimited | 100k requests/month | Limited |
Use Cases
Startup to Scale
Early Stage:
- Self-host for free (50k traces/day)
- Track costs from day one
- Iterate quickly with replay testing
Growth:
- Scale to millions of traces
- Configure advanced alerting
- Deploy multi-region
Enterprise:
- Upgrade to managed cloud
- SSO integration
- SLA guarantees
Regulated Industries
Healthcare, Finance, Government:
- Self-host for data sovereignty
- Audit trail for compliance
- On-premises deployment
Multi-Model Applications
Routing and Fallback:
- Track costs across providers
- Compare model quality
- Optimize routing logic
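The routing-and-fallback pattern can be sketched as an ordered provider list with per-model spend tracking. Model names, per-call costs, and the failure mode are illustrative:

```python
# Sketch: cheapest-first routing with fallback, recording spend per model.
ROUTES = [
    {"model": "small-model", "cost_per_call": 0.002},
    {"model": "large-model", "cost_per_call": 0.030},
]

def route(call, spent: dict) -> str:
    """Try providers in order; record spend for the one that answers."""
    last_error = None
    for r in ROUTES:
        try:
            response = call(r["model"])
        except RuntimeError as e:   # provider failure → fall back
            last_error = e
            continue
        spent[r["model"]] = spent.get(r["model"], 0.0) + r["cost_per_call"]
        return response
    raise RuntimeError("all providers failed") from last_error

# Simulated provider where the cheap model is down:
def flaky(model):
    if model == "small-model":
        raise RuntimeError("timeout")
    return f"ok from {model}"

spent = {}
result = route(flaky, spent)   # falls back to large-model
```

With spend recorded per model, the trace data directly shows how often fallback fires and what it costs, which is the input you need to optimize routing logic.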
What Makes Lumina Different
Replay Testing
Capture production traces and replay them with new prompts before deployment. Example workflow:
- Capture baseline from production
- Modify prompt in your codebase
- Replay against captured traces
- Compare responses with semantic diff
- Deploy with confidence
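The compare step can be sketched with a dependency-free textual stand-in for semantic diff; a real implementation would compare embeddings rather than character sequences:

```python
# Sketch: score how far a replayed response drifted from its baseline.
# difflib's textual similarity stands in for true semantic comparison.
from difflib import SequenceMatcher

def response_drift(baseline: str, replayed: str) -> float:
    """0.0 = identical, 1.0 = completely different."""
    return 1.0 - SequenceMatcher(None, baseline, replayed).ratio()

baseline = "The refund was processed on June 3rd."
replayed = "The refund was processed on June 3rd, 2024."
drift = response_drift(baseline, replayed)
assert drift < 0.2   # small wording change → low drift, safe to ship
assert response_drift(baseline, "I cannot help with that.") > 0.4  # regression
```

Running this score over every captured trace turns "deploy with confidence" into a concrete gate: block the deploy when too many replayed responses drift past a threshold.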