AI applications require specialized observability. Lumina provides native support for LLM-specific concerns that traditional APM tools miss.

The Problem

Traditional observability tools fall short for AI applications. APM platforms like Datadog and New Relic treat LLM calls as opaque HTTP requests. They capture latency and status codes but miss critical AI-specific metrics:
  • Cost per request — LLM calls have variable costs based on token usage
  • Quality degradation — Responses can become less accurate without errors
  • Complex workflows — RAG pipelines and agents involve multiple steps
  • Prompt debugging — Diagnosing failures requires the full prompt and response, not just request metadata
Existing AI observability tools have limitations:
  • Vendor lock-in — Proprietary formats and APIs
  • Cloud-only — Cannot self-host with data control
  • Expensive — Usage-based pricing scales unpredictably
  • Incomplete — Missing critical features like replay testing

The Lumina Approach

OpenTelemetry-Native

Built on OpenTelemetry from day one, not retrofitted. Benefits:
  • Works with your existing observability stack
  • Send traces to multiple backends simultaneously
  • Industry-standard instrumentation
  • No vendor lock-in
Example:
// Send the same traces to both Lumina and Datadog
import { initLumina } from '@uselumina/sdk';

const lumina = initLumina({
  endpoint: 'http://lumina:9411/v1/traces',
  service_name: 'my-service',
});

// The OpenTelemetry SDK fans spans out to every configured exporter,
// so an existing Datadog exporter keeps receiving the same data

Self-Hostable

Deploy on your infrastructure with full data control. Benefits:
  • Complete data ownership
  • Deploy in regulated environments
  • No external data transfer
  • Predictable costs
Deployment options:
  • Docker Compose (development)
  • Kubernetes (production)
  • Bare metal (custom setups)

Cost + Quality Correlation

The only platform that connects cost and quality metrics. Example query:
SELECT * FROM traces
WHERE cost_usd > 0.50
AND quality_score < 0.8
AND timestamp > NOW() - INTERVAL '1 hour'
When your endpoint becomes expensive AND broken, you know immediately.

Production-Grade Architecture

Reliable ingestion:
  • NATS JetStream for high-throughput message queue
  • Handles 10M+ traces/day per cluster
  • Automatic retry and backpressure
Fast queries:
  • PostgreSQL with optimized indexes
  • Redis caching for frequent queries
  • <100ms P95 query latency on tables with 10M+ traces
Real-time alerting:
  • <500ms from trace to alert
  • Webhook integration (Slack, PagerDuty)
  • Automatic baseline detection

Built for Backend Engineers

Unlike AI observability platforms built for data scientists, Lumina speaks the language of traditional observability. Familiar concepts:
  • Traces and spans (not “runs” or “chains”)
  • P95 latency (not “average duration”)
  • PagerDuty webhooks (not email reports)
  • SQL queries (not proprietary DSLs)
Example:
// Standard OpenTelemetry trace
await lumina.trace('api_request', async (span) => {
  span.setAttribute('http.method', 'POST');
  span.setAttribute('http.route', '/api/chat');

  const response = await lumina.traceLLM(
    () => llm.generate(prompt),
    { name: 'llm_call' }
  );

  return response;
});

Feature Comparison

Feature            | Lumina         | LangSmith       | Langfuse  | Helicone            | Datadog
-------------------|----------------|-----------------|-----------|---------------------|--------
Self-Hosted        | Yes            |                 |           |                     |
OpenTelemetry      | Native         | Adapter         | No        | No                  | Yes
Multi-Span Tracing | Yes            |                 |           |                     |
Cost Tracking      | Automatic      | Automatic       | Manual    | Automatic           | Manual
Replay Testing     | Yes            | No              | No        | No                  | No
Semantic Diff      | Yes            |                 |           |                     |
Real-Time Alerts   | <500ms         | Minutes         | Minutes   | N/A                 | Seconds
Free Tier          | 50k traces/day | 5k traces/month | Unlimited | 100k requests/month | Limited

Use Cases

Startup to Scale

Early Stage:
  • Self-host for free (50k traces/day)
  • Track costs from day one
  • Iterate quickly with replay testing
Growth Stage:
  • Scale to millions of traces
  • Configure advanced alerting
  • Deploy multi-region
Enterprise:
  • Upgrade to managed cloud
  • SSO integration
  • SLA guarantees

Regulated Industries

Healthcare, Finance, Government:
  • Self-host for data sovereignty
  • Audit trail for compliance
  • On-premises deployment

Multi-Model Applications

Routing and Fallback:
  • Track costs across providers
  • Compare model quality
  • Optimize routing logic
Example:
await lumina.trace('chat_with_fallback', async () => {
  try {
    return await lumina.traceLLM(
      () => anthropic.create({ model: 'claude-opus-4' }),
      { name: 'primary', system: 'anthropic' }
    );
  } catch (error) {
    return await lumina.traceLLM(
      () => openai.create({ model: 'gpt-4' }),
      { name: 'fallback', system: 'openai' }
    );
  }
});

What Makes Lumina Different

Replay Testing

Capture production traces and replay with new prompts before deployment. Example workflow:
  1. Capture baseline from production
  2. Modify prompt in your codebase
  3. Replay against captured traces
  4. Compare responses with semantic diff
  5. Deploy with confidence
No other platform offers this.

Semantic Scoring

Automatic quality detection uses a hybrid approach:
  • Fast path — hash-based exact match (instant)
  • Slow path — AI semantic similarity (when needed)
Example:
// Automatically scored
const score = await lumina.compareResponses(
  originalResponse,
  newResponse
);

// score.method: 'exact_match' | 'semantic'
// score.similarity: 0.0 - 1.0

Multi-Span Visibility

See the complete execution tree for complex workflows. RAG pipeline example:
rag_pipeline (1200ms, $0.05)
├── retrieval (800ms, $0.01)
│   ├── embedding (100ms, $0.001)
│   └── vector_search (700ms, $0.009)
└── synthesis (400ms, $0.04)
    └── llm_call (380ms, $0.039)
Identify bottlenecks at a glance.

Migration Path

From LangSmith

// Before (LangSmith)
import { Client } from "langsmith";
const client = new Client();
await client.traceRun({ name: "chat", inputs, outputs });

// After (Lumina)
import { initLumina } from '@uselumina/sdk';
const lumina = initLumina({ endpoint, service_name });
await lumina.traceLLM(() => llm.generate(prompt), { name: 'chat' });

From Langfuse

// Before (Langfuse)
import Langfuse from "langfuse";
const langfuse = new Langfuse();
const trace = langfuse.trace({ name: "chat" });

// After (Lumina)
import { initLumina } from '@uselumina/sdk';
const lumina = initLumina({ endpoint, service_name });
await lumina.trace('chat', async () => { /* ... */ });

From Custom Logging

// Before (Custom)
logger.info('LLM call', { prompt, response, cost });

// After (Lumina)
await lumina.traceLLM(
  () => llm.generate(prompt),
  { name: 'chat', prompt }
);
// Automatic cost, latency, token tracking

Getting Started