# Harper Eye

AI-Powered Knowledge Assistant for Harper

## What is Harper Eye?
Harper Eye is an AI-powered assistant that helps Harper engineers diagnose production incidents and answer questions about the Harper platform. Describe an issue or ask a question in plain English — via Slack or the web UI — and Harper Eye searches across multiple internal and external data sources in parallel, then synthesizes the results into an actionable response with cited sources.
Harper Eye is built on Harper itself, running as a Resource API application deployed to Harper Fabric. It uses Claude as its reasoning engine and Google Gemini for embedding generation, backed by a generalized indexer framework that continuously indexes knowledge from multiple sources into Harper tables with vector search.
The codebase can be found here: https://github.com/HarperFast/harper-eye
## How to Use
Mention @harper-eye in any Slack channel or use the web UI to describe an incident or ask a question. Harper Eye automatically classifies your query and routes it to the right model and data sources — fast answers for documentation questions, deeper investigation for debugging or customer issues, full Opus analysis for production incidents.
For example: `@harper-eye replication is falling behind on us-east cluster` or `@harper-eye how does Harper handle schema migrations?`
## Query Classification & Tiers
Every query is automatically classified before being routed. The tier determines which Claude model is used and which sources are prioritized:
| Tier | Query Types | Model |
|---|---|---|
| T1 — Docs | How-to questions, documentation lookups, feature explanations | Claude Haiku 4.5 (fast) |
| T2 — Debug / Support | Debugging sessions, customer issue analysis, general investigation | Claude Sonnet 4.6 |
| T3 — Incident | Production incidents, outages, performance degradation, on-call escalations | Claude Opus 4.6 (full power) |
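As an illustration of the routing above, here is a minimal sketch of tier classification. The keyword heuristics and model identifier strings are assumptions for illustration only; the actual classifier is model-based and its API is internal.

```typescript
// Illustrative tier router. The regex heuristics and model ID strings are
// hypothetical; the real classifier uses an AI model, not keyword matching.
type Tier = "T1" | "T2" | "T3";

const MODEL_FOR_TIER: Record<Tier, string> = {
  T1: "claude-haiku-4-5",  // fast: docs lookups
  T2: "claude-sonnet-4-6", // debugging / customer issues
  T3: "claude-opus-4-6",   // full power: production incidents
};

function classifyQuery(query: string): Tier {
  const q = query.toLowerCase();
  // Incident signals escalate straight to T3
  if (/\b(outage|incident|degrad|on-call|escalat)/.test(q)) return "T3";
  // Debugging / customer-issue signals route to T2
  if (/\b(error|debug|failing|behind|customer|investigate)/.test(q)) return "T2";
  return "T1"; // default: documentation / how-to
}
```

A query like "how does Harper handle schema migrations?" would fall through to T1, while "replication is falling behind" trips a T2 signal.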
## Data Sources
Harper Eye combines live queries with pre-indexed knowledge from a generalized indexer framework:
### Live Sources (queried at request time by the investigation planner)
| Source | Type | What It Provides |
|---|---|---|
| Datadog | Monitoring | Monitors, events, host metrics, logs, and per-container status |
| Grafana/Prometheus | Monitoring | Per-customer Grafana Cloud metrics for customer-specific issue analysis |
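To make the live-query path concrete, here is a sketch of how the planner might construct a Datadog request. The Datadog v1 monitor-search endpoint and its auth headers are real; the builder function, its name, and the host value are illustrative assumptions, and the real planner's request construction is internal.

```typescript
// Hypothetical request builder for a live Datadog monitor search.
// Building (rather than sending) the request keeps this sketch illustrative.
interface LiveQuery {
  url: string;
  headers: Record<string, string>;
}

function buildDatadogMonitorSearch(
  hostname: string,
  apiKey: string,
  appKey: string,
): LiveQuery {
  // Scope the monitor search to the host extracted from the user's query
  const params = new URLSearchParams({ query: `host:${hostname}` });
  return {
    url: `https://api.datadoghq.com/api/v1/monitor/search?${params}`,
    headers: {
      "DD-API-KEY": apiKey,         // Datadog API key
      "DD-APPLICATION-KEY": appKey, // Datadog application key
    },
  };
}
```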
### Indexed Knowledge (pre-embedded, updated on schedule)
| Source | Schedule | What It Indexes |
|---|---|---|
| Confluence | Daily | Full page content from configured spaces |
| Zendesk | Daily | Resolved tickets (last 90 days) with all comments and customer metadata |
| Harper Docs | Daily | Sitemap-crawled documentation from docs.harperdb.io |
| Datadog Docs | Weekly | Metrics, monitors, logs, API, agent, and infrastructure documentation |
| Grafana Docs | Weekly | Grafana Cloud, Alloy, and Loki documentation |
| GitHub Repos | Daily | Org repo READMEs and open bug issues |
| Slack Threads | Daily | Resolution discussions from joined channels (3+ replies or resolution signals) |
| Datadog Config | Daily | All monitor definitions (paginated) and available host tags/dimensions |
| Prometheus Metrics | Weekly | Metric metadata (names, types, help text) per Grafana Cloud stack |
| PromQL Reference | Weekly | Static PromQL query language guide |
| Domain Knowledge | Curated | Admin-maintained articles about Harper architecture, known issues, and best practices |
| Code Knowledge | On demand | Pre-indexed architecture context from key source files |
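The "generalized indexer framework" behind the table above implies a shared contract: each source fetches documents, and common code embeds and upserts them. The following is a hypothetical sketch of that contract; every name here is an assumption about the framework, not its actual API.

```typescript
// Assumed shape of the generalized indexer contract (illustrative only).
interface IndexedDoc {
  id: string;     // stable ID so re-runs upsert instead of duplicating
  source: string; // e.g. "confluence", "zendesk"
  title: string;
  content: string;
  url?: string;
}

interface Indexer {
  name: string;
  schedule: "daily" | "weekly" | "curated" | "on-demand";
  fetch(): Promise<IndexedDoc[]>; // pull fresh content from the source
}

// Shared driver: embed each document and write it to a vector-indexed table.
async function runIndexer(
  indexer: Indexer,
  embed: (text: string) => Promise<number[]>,
  upsert: (doc: IndexedDoc, vector: number[]) => Promise<void>,
): Promise<number> {
  const docs = await indexer.fetch();
  for (const doc of docs) {
    const vector = await embed(`${doc.title}\n${doc.content}`);
    await upsert(doc, vector);
  }
  return docs.length; // number of documents indexed this run
}
```

Adding a new source then means implementing one `fetch()` method; scheduling, embedding, and storage stay shared.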
## Feedback Loop
Engineers can rate responses as helpful or not helpful directly from Slack. Positive feedback on T1 (documentation) queries saves the answer as a domain knowledge article for future context. Negative feedback is tracked with vector embeddings and injected as warnings into future prompts for similar queries, so Claude avoids repeating past mistakes. T2/T3 answers with live metrics are not cached, since they go stale.
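The negative-feedback lookup described above can be sketched as a cosine-similarity search over stored embeddings. The logic and the threshold value here are illustrative assumptions; the real system's matching and storage details are internal.

```typescript
// Sketch: find past "not helpful" answers similar to the new query and
// surface their warnings for prompt injection. Threshold is illustrative.
interface NegativeFeedback {
  query: string;
  warning: string;     // what went wrong last time
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function warningsForQuery(
  queryEmbedding: number[],
  past: NegativeFeedback[],
  threshold = 0.85, // hypothetical similarity cutoff
): string[] {
  return past
    .filter((f) => cosine(queryEmbedding, f.embedding) >= threshold)
    .map((f) => f.warning);
}
```

The returned warnings would be prepended to the prompt so Claude knows which past answers on similar topics were marked unhelpful.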
## Using Harper Eye in Slack

### @harper-eye Mentions
Mention @harper-eye in any channel followed by your question or incident description. Harper Eye responds in a thread with a synthesized diagnosis, resolution steps, and source citations. Multi-turn follow-ups are supported — just keep replying in the thread.
### Admin Slash Commands

| Command | Description |
|---|---|
| `/index-repos` | Re-index GitHub repos into the code knowledge base |
## Using the Web UI
Harper Eye also has a web interface accessible via Google OAuth. The web UI provides:
- Search — Ask questions or describe incidents from your browser, with the same multi-source search as Slack
- Dashboard — View usage stats, feedback analytics, and query classification breakdowns by tier
- Knowledge — Manage curated domain knowledge articles
- Ideas — Submit and track feature requests
- Learn — This page
## How a Query Works

When you submit a question or incident description, here is what happens:

1. Your query is received via Slack @mention or the web UI (Slack events are deduped to prevent duplicate responses).
2. The query is classified by tier (T1 docs / T2 debug / T3 incident) and an embedding is generated; for T2+ queries, AI entity extraction (Haiku) runs in parallel, extracting the hostname, customer name, and cluster prefix.
3. A 5-minute response cache is checked — if a similar query (same tier, same host) was recently answered, the cached response is returned immediately.
4. T1 (docs): indexed knowledge and domain knowledge articles are searched via vector similarity, then Claude Haiku synthesizes a response.
5. T2/T3 (investigation): an iterative Plan → Execute → Evaluate loop runs. The planner searches indexed knowledge (Datadog monitors, host tags, metric metadata, Confluence, Zendesk, Slack threads, docs) for context, then constructs and executes live queries against Datadog and Grafana/Prometheus. Past negative feedback for similar queries is injected so the planner avoids repeating mistakes. Iterations continue until a sufficient answer is found (up to 2 rounds for T2, up to 3 for T3).
6. Claude synthesizes all gathered evidence into a concise, actionable response with linked sources.
7. The response is posted back to Slack or displayed in the web UI.
8. The query and response are logged. "Helpful" T1 answers are saved as domain knowledge articles; "not helpful" answers are tracked with embeddings for the negative feedback loop. An operational monitor checks indexer health and query quality hourly.
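The investigation loop above (bounded rounds per tier, with early exit once the evidence is sufficient) can be sketched as follows. The planner, executor, and evaluator signatures are assumptions for illustration; only the round limits come from the description above.

```typescript
// Minimal sketch of the iterative Plan → Execute → Evaluate loop.
// Round caps match the doc (T2: 2, T3: 3); function shapes are hypothetical.
type InvestigationTier = "T2" | "T3";

const MAX_ROUNDS: Record<InvestigationTier, number> = { T2: 2, T3: 3 };

async function investigate(
  tier: InvestigationTier,
  plan: (evidence: string[]) => Promise<string[]>, // next queries to run
  execute: (query: string) => Promise<string>,     // run one live query
  sufficient: (evidence: string[]) => boolean,     // evaluator
): Promise<string[]> {
  const evidence: string[] = [];
  for (let round = 0; round < MAX_ROUNDS[tier]; round++) {
    const queries = await plan(evidence);          // Plan
    for (const q of queries) {
      evidence.push(await execute(q));             // Execute
    }
    if (sufficient(evidence)) break;               // Evaluate: stop early
  }
  return evidence; // handed to Claude for final synthesis
}
```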
## Architecture

### System Layers

The Slack layer is a Harper Resource (`SlackEvents`) that handles slash commands, event callbacks (`app_mention`), and URL verification. It uses the Slack Web API client for posting responses and updating messages.
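A hedged sketch of that Slack events handler follows. Slack's one-time `url_verification` challenge handshake and the `app_mention` event type are real parts of the Slack Events API; the handler body is a simplification, and the stand-in base class replaces the real `import { Resource } from "harperdb"` so the sketch is self-contained.

```typescript
// Stand-in for Harper's Resource base class; the real app imports it
// from "harperdb" and exports SlackEvents as a Resource API endpoint.
class Resource {}

class SlackEvents extends Resource {
  async post(data: any): Promise<object> {
    // Slack sends this once when the events URL is configured: echo it back.
    if (data.type === "url_verification") {
      return { challenge: data.challenge };
    }
    if (data.type === "event_callback" && data.event?.type === "app_mention") {
      // Real handler: dedupe by event_id, classify, investigate,
      // then reply in-thread via the Slack Web API (omitted here).
      return { ok: true, handled: data.event_id };
    }
    return { ok: true };
  }
}
```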
### Tech Stack
| Component | Technology |
|---|---|
| Application Platform | Harper Resource API on Harper Fabric |
| AI Reasoning | Claude API (Haiku 4.5 / Sonnet 4.6 / Opus 4.6, selected per query tier) |
| Embeddings | Google Gemini (text-embedding-004) |
| Vector Search | Harper HNSW indexes |
| Slack Integration | Slack Web API (@slack/web-api) |
| Web UI Auth | Google OAuth 2.0 |
| Live Sources | Datadog, Grafana/Prometheus REST APIs |
| Indexed Sources | Confluence, Zendesk, GitHub, Harper Docs, Slack, Datadog config, Prometheus metadata |
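For the embeddings row above, here is a sketch of a call to Gemini's `text-embedding-004` model via Google's Generative Language API. The endpoint and request shape follow Google's published API; the builder function is a hypothetical helper, and building (rather than sending) the request keeps the sketch illustrative.

```typescript
// Hypothetical helper that builds an embedContent request for
// text-embedding-004 (Google Generative Language API).
function buildEmbedRequest(text: string, apiKey: string) {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:embedContent?key=${apiKey}`,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "models/text-embedding-004",
      content: { parts: [{ text }] }, // response carries the embedding vector
    }),
  };
}
```

The response's `embedding.values` array would then be stored in a Harper table column backed by an HNSW index for similarity search.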
## Harper-Specific Intelligence
Harper Eye is tuned for Harper's specific platform and architecture. All responses are grounded in Harper-specific data and use correct terminology:
- Current Architecture Awareness — Knows the difference between current (Plexus, Resource API, Fabric) and deprecated (NATS, Custom Functions, Studio) features
- Negative Feedback Learning — When an answer is marked unhelpful, future queries on similar topics avoid repeating the same mistakes
- Source Citations — Every response cites where its information came from, so engineers can drill deeper
- No Generic Advice — Will not give general database or infrastructure advice; all guidance is specific to Harper's architecture and tooling