Live demo · public SEC data · read-only

A messy data room goes in.
A fully cited credit memo comes out.

Memo Engine is a production deal-analysis system for institutional credit investing. This is the real pipeline, pointed at a real target: 79 public SEC filings for AMC Entertainment, ingested and analyzed end to end, drafted into an investment memorandum where every figure traces back to the exact source page. Nothing here is mocked. Open it and click any citation.

Open the live workspace →
How it works ↓

79: SEC filings ingested
3,336: chunks embedded · 1024-dim
~14,000: words, cited memo
~40: fields extracted
126: source citations
1: unattended run

Ingest→Embed→Retrieve→Reason→Extract→Draft→Cite→Export

What you’ll be looking at

The run already completed. You land in the analyst workspace, where the whole thing is clickable and read-only.

The workspace

The six-stage pipeline, the analysis, and the chat review gate that sits in front of every generated artifact.

Open workspace→

The tear sheet

A dense, structured analyst view: credit snapshot, capital structure, covenants, comparables, scenarios, and an auto-populated public-issuer profile from SEC EDGAR.

Open tear sheet→

The memorandum

A ~14,000-word investment-bank-format credit memo with inline citations — click any superscript to jump to the source page or cell.

Open workspace→

How it works

Eight stages, each one a deliberate engineering choice

The hard part of analyzing a credit deal with an LLM is not the prose — it is grounding every claim in a large, messy corpus without blowing the context window or hallucinating. Each stage below is built against the way the naive version fails.

01
Ingest a real data room
Drag a folder of mixed, badly-named documents onto the page. The system takes it as-is.
Folder drop walks the tree at any depth. Sixteen concurrent workers stream each file straight to blob storage rather than buffering whole workbooks in function memory, and a SHA-256 hash per file means re-uploading a data room after two new documents costs two uploads, not the whole set. A parser registry dispatches by MIME type: PDFs (text extraction with an OCR fallback for scans), Excel (each sheet emits two chunks — a CSV chunk for search and a structural Markdown chunk that captures formulas, named ranges, and cell comments), Word, Outlook .msg and .eml correspondence, and images handed to the model as vision input. Ten files parse in parallel behind a per-file timeout, and a defensive sweep marks any file that never finalized as failed so one corrupt scan cannot stall the deal.
vs. the traditional approach
An associate spends the first day just opening and renaming files. A naive loader chokes on a 50 MB styled workbook, or silently drops the email thread that holds the real terms.
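The re-upload saving in this stage comes down to comparing content hashes rather than file names. A minimal sketch, assuming an in-memory set of already-stored digests: `LocalFile`, `sha256Hex`, and `planUploads` are hypothetical names for illustration, not the system's own, and the real pipeline streams files instead of holding their bytes in memory.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape for a file picked up by the folder walk.
interface LocalFile {
  path: string;
  bytes: Uint8Array;
}

// Hex-encoded SHA-256 digest of the file contents.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Return only the files whose content is not already stored.
// Renamed copies of a stored file hash identically, so they are skipped too.
function planUploads(
  files: LocalFile[],
  alreadyStored: ReadonlySet<string>,
): LocalFile[] {
  const seen = new Set<string>(alreadyStored);
  const toUpload: LocalFile[] = [];
  for (const file of files) {
    const digest = sha256Hex(file.bytes);
    if (seen.has(digest)) continue;
    seen.add(digest);
    toUpload.push(file);
  }
  return toUpload;
}
```

A streaming variant would call `update()` once per chunk as the file is read, so a large workbook never has to sit in memory whole.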
02
Ground every chunk before embedding
A bare passage is unsearchable. Each chunk is told where it lives first.
Before embedding, every ~1,500-token chunk receives a two-to-three-sentence prefix, written by the model, that explains how the passage fits its source document. The first 400K characters of each document are held in the prompt cache, so generating those prefixes bills at the cached input rate instead of the full rate. Chunks are then embedded with Voyage AI voyage-3 (1024 dimensions) into Postgres with pgvector. This follows Anthropic’s contextual-retrieval research, and it matters most on dense financial text.
vs. the traditional approach
Naive RAG embeds the raw chunk. “Leverage steps down to 3.5x in FY2” loses which document and which facility it came from — so retrieval surfaces a confident wrong answer instead of the right clause.
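The prefixing step can be sketched as two small functions: one builds the prefix request so that the document portion is identical, and therefore cacheable, across every chunk of the same file, and one joins the returned prefix to the chunk before embedding. This is an illustration under assumptions. The block shape with `cache_control` is modeled on Anthropic-style prompt caching, and the instruction text, `buildPrefixRequest`, and `toEmbeddingInput` are hypothetical.

```typescript
// From the text: only the first 400K characters of a document are cached.
const CACHED_DOC_CHARS = 400_000;

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface PrefixRequest {
  system: TextBlock[];
  user: string;
}

// The system blocks depend only on the document, never on the chunk,
// so every chunk of one document can reuse the same cached input.
function buildPrefixRequest(documentText: string, chunkText: string): PrefixRequest {
  return {
    system: [
      {
        type: "text",
        text: "In two to three sentences, explain how the chunk fits within the document.",
      },
      {
        type: "text",
        text: documentText.slice(0, CACHED_DOC_CHARS),
        cache_control: { type: "ephemeral" },
      },
    ],
    user: `<chunk>\n${chunkText}\n</chunk>`,
  };
}

// The string that is actually embedded: model-written prefix, then the raw chunk.
function toEmbeddingInput(prefix: string, chunkText: string): string {
  return `${prefix.trim()}\n\n${chunkText}`;
}
```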
03
Retrieve in parallel, score by relevance and recency
One similarity search is not enough for a credit file.
Eight retrieval queries fan out at once, each targeting a different dimension of the deal (financials, covenants, capital structure, management, comparables, and so on), and their results are deduplicated per source file into a single context set. Scoring blends vector similarity with a recency boost, score = similarity × (1 + e^(−ageDays / halfLife)), where ageDays is the age of the source filing, so that between two equally similar clauses the one from the more recent filing wins.
vs. the traditional approach
A single similarity search returns five paragraphs that all say the same thing and misses the covenant table entirely. Recency-blind retrieval cites a figure that a later amendment already superseded.
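The scoring formula and the merge step translate almost directly into code. A minimal sketch, assuming each hit carries its file id, chunk id, similarity, and filing age. The `Hit` shape and `mergeResults` are hypothetical, and keeping one copy of each chunk per file is only one reading of "dedupe per source file".

```typescript
interface Hit {
  chunkId: string;
  fileId: string;
  similarity: number; // vector similarity from the search
  ageDays: number; // age of the source filing in days
}

interface ScoredHit extends Hit {
  score: number;
}

// score = similarity * (1 + e^(-ageDays / halfLife)), as given in the text.
function recencyScore(similarity: number, ageDays: number, halfLifeDays: number): number {
  return similarity * (1 + Math.exp(-ageDays / halfLifeDays));
}

// Merge the fan-out results: a chunk returned by several queries is kept once,
// with its best score, and the merged set is ordered by score.
function mergeResults(perQuery: Hit[][], halfLifeDays: number): ScoredHit[] {
  const best = new Map<string, ScoredHit>();
  for (const hits of perQuery) {
    for (const hit of hits) {
      const key = `${hit.fileId}:${hit.chunkId}`;
      const score = recencyScore(hit.similarity, hit.ageDays, halfLifeDays);
      const previous = best.get(key);
      if (previous === undefined || score > previous.score) {
        best.set(key, { ...hit, score });
      }
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```

With the formula as written, a brand-new filing doubles its similarity and a very old one keeps roughly its raw similarity. Because the base is e, the boost falls to about 37% of its initial value after one halfLife; a strict half-life would use a base of 2 instead.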
04
Reason first, then extract — on two models, by design
You cannot get deep reasoning and a rigid 40-field schema out of one call. So it is two.
Analysis runs in two passes. Pass one runs on Claude Fable 5, where thinking is always on: free-form analytical prose with inline citations across every dimension the schema will need. Pass two is mechanical: forced tool use on Sonnet 4.6 extracts roughly forty Zod-validated fields out of the pass-one prose. They are split because the API rejects extended thinking combined with forced tool choice, and a forty-field schema overruns the structured-output grammar compiler in a single shot. Four analyst “desks” extract in parallel and a fifth reads their output to form the recommendation. The system blocks are byte-identical across passes, so every call after the first reads from the prompt cache.
vs. the traditional approach
Ask one model for “a structured forty-field analysis with deep reasoning” and you get reasoning bent to fit the schema, or an outright schema violation. Splitting the work gets you both: the thinking is genuine, the output is typed.
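The split can be pictured as two request builders that share one system array. The sketch below only constructs request payloads and does not call any API. The field names (`system`, `tools`, `tool_choice`) follow the Anthropic Messages API shape as an assumption about what the pipeline sends; the model ids, system text, tool name, and the three schema fields are placeholders.

```typescript
interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface ToolDef {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

interface ReasoningRequest {
  model: string;
  max_tokens: number;
  system: readonly SystemBlock[];
  messages: { role: "user"; content: string }[];
}

interface ExtractionRequest extends ReasoningRequest {
  tools: ToolDef[];
  tool_choice: { type: "tool"; name: string };
}

// One shared array, so both passes send byte-identical system blocks.
const SHARED_SYSTEM: readonly SystemBlock[] = [
  {
    type: "text",
    text: "You are a credit analyst. Cite every claim as [F#:location].",
    cache_control: { type: "ephemeral" },
  },
];

// Hypothetical three-field stand-in for the roughly forty-field schema.
const EXTRACT_TOOL: ToolDef = {
  name: "record_credit_fields",
  description: "Record structured fields found in the analysis prose.",
  input_schema: {
    type: "object",
    properties: {
      totalDebtUsdMm: { type: "number" },
      leverageRatio: { type: "number" },
      nearestMaturity: { type: "string" },
    },
    required: ["totalDebtUsdMm", "leverageRatio", "nearestMaturity"],
  },
};

// Pass one: free-form reasoning, no tools and no forced tool choice.
function buildReasoningRequest(model: string, context: string): ReasoningRequest {
  return {
    model,
    max_tokens: 16000,
    system: SHARED_SYSTEM,
    messages: [{ role: "user", content: context }],
  };
}

// Pass two: forced tool use over the pass-one prose.
function buildExtractionRequest(model: string, analysisProse: string): ExtractionRequest {
  return {
    model,
    max_tokens: 4000,
    system: SHARED_SYSTEM,
    messages: [{ role: "user", content: analysisProse }],
    tools: [EXTRACT_TOOL],
    tool_choice: { type: "tool", name: EXTRACT_TOOL.name },
  };
}
```

In the described system the tool input would then be validated against the Zod schema before anything downstream reads it.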
05
Draft the memos under a house-style linter
A triage memo to decide if the deal is worth time, and a full memorandum for the file.
The internal memo answers one question — is this worth real diligence? — while the external memo is a full investment-bank-format credit memorandum. Both stream from the same eight-query retrieval, with the completed analysis facts woven into every query. A linter checks the finished draft against a house style guide (banned words, table and formatting rules) and logs violations, and that style guidance also lives in the system prompt so the model gets it up front rather than through an expensive second pass.
vs. the traditional approach
A junior analyst drafts for two days and still drifts off house style. An ungrounded model writes fluent prose that is quietly detached from what the documents actually say.
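A banned-word check is the simplest piece of such a linter to illustrate. This is a sketch under assumptions, not the project's linter: `StyleRule`, `bannedWordRules`, and `lintDraft` are hypothetical names, and the table and formatting rules mentioned above are not modeled.

```typescript
interface StyleRule {
  id: string;
  pattern: RegExp;
  message: string;
}

interface Violation {
  ruleId: string;
  line: number; // 1-based line number in the draft
  message: string;
}

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// One whole-word, case-insensitive rule per banned word.
function bannedWordRules(words: string[]): StyleRule[] {
  return words.map((word) => ({
    id: `banned:${word}`,
    pattern: new RegExp(`\\b${escapeRegExp(word)}\\b`, "i"),
    message: `"${word}" is on the banned-word list`,
  }));
}

// Check every line against every rule and collect the violations for logging.
function lintDraft(draft: string, rules: StyleRule[]): Violation[] {
  const violations: Violation[] = [];
  draft.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      // search() always scans from the start, so rule patterns stay stateless.
      if (text.search(rule.pattern) !== -1) {
        violations.push({ ruleId: rule.id, line: index + 1, message: rule.message });
      }
    }
  });
  return violations;
}
```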
06
Cite everything; gate what is not cited
In a credit memo, an unsourced number is a liability. Here, every claim traces to a page or a cell.
Every factual claim carries an inline [F#:location] marker bound to a specific source chunk. In the workspace these render as clickable superscripts that open a side panel showing the source text and jump to the exact PDF page or Excel cell. A validation pass strips any citation that points at a document id the model invented. A coverage gate then scans the finished draft for numeric, date, and proper-noun claims that lack an adjacent citation and surfaces them as warnings — an analyst can still approve with a reason, but nothing unsourced slips by unnoticed.
vs. the traditional approach
In a hand-written memo you take the figure on faith and chase the footnote later, if ever. Here the unsourced figure is flagged before sign-off and every sentence is traceable to its origin.
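Both checks in this stage reduce to pattern matching over the draft. The sketch below assumes markers look like `[F3:p.12]` and covers only numeric claims; date and proper-noun detection, and the sentence splitter a real gate would need, are left out. The function names are hypothetical.

```typescript
// Validation pass: drop any marker whose file id is not in the known set.
function stripUnknownCitations(text: string, knownIds: ReadonlySet<string>): string {
  return text.replace(/\s?\[F(\d+):[^\]]+\]/g, (marker: string, id: string) =>
    knownIds.has(`F${id}`) ? marker : "",
  );
}

// Coverage gate: return sentences that contain a digit but carry no citation.
function findUncitedNumericClaims(text: string): string[] {
  // Naive sentence split: break after ., ! or ? when whitespace follows.
  const sentences = text.replace(/([.!?])\s+/g, "$1\n").split("\n");
  return sentences.filter((sentence) => {
    const hasCitation = /\[F\d+:[^\]]+\]/.test(sentence);
    // Remove markers first so the digits inside them do not count as claims.
    const withoutMarkers = sentence.replace(/\[F\d+:[^\]]+\]/g, "");
    return /\d/.test(withoutMarkers) && !hasCitation;
  });
}
```

Running the validation pass first matters: a sentence whose only citation pointed at an invented document then shows up in the coverage warnings instead of passing as sourced.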
07
Run it durably — it takes 40 minutes, and that is fine
A real run is long. The orchestration is built to survive that.
Parse, analysis, research, and both memo stages each run as a step in a durable workflow (Vercel Workflow DevKit) with its own 800-second budget and crash recovery, so the whole pipeline resumes from the step that failed rather than from the beginning. Parsing once ran inside a serverless after() hook and hit a hard 300-second ceiling on long filings; moving it into a workflow step removed that ceiling and made the run resumable.
vs. the traditional approach
A 40-minute job stuffed into one serverless function dies at the platform timeout and you start over. Here a crashed step picks up exactly where it stopped.
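The resume behavior can be illustrated without the workflow runtime. The sketch below is not the Vercel Workflow DevKit API: it is an in-memory stand-in in which a `Checkpoint` object plays the role of durable state, the step names are taken from the paragraph above, and the per-step 800-second budget is not modeled.

```typescript
type StepName = "parse" | "analysis" | "research" | "internalMemo" | "externalMemo";

const STEP_ORDER: StepName[] = ["parse", "analysis", "research", "internalMemo", "externalMemo"];

type StepFn = () => Promise<string>;

// Stand-in for durable state: the result of every step that has finished.
interface Checkpoint {
  completed: Partial<Record<StepName, string>>;
}

// Run the steps in order, skipping any that an earlier attempt already finished.
// If a step throws, the error propagates and the checkpoint keeps prior results.
async function runPipeline(
  steps: Record<StepName, StepFn>,
  checkpoint: Checkpoint,
): Promise<void> {
  for (const name of STEP_ORDER) {
    if (checkpoint.completed[name] !== undefined) continue;
    checkpoint.completed[name] = await steps[name]();
  }
}
```

Calling `runPipeline` again with the same checkpoint after a failure re-runs only the failed step and the ones after it, which is the property the durable workflow provides across process crashes.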
08
Export bank-quality artifacts
The output is not a chat transcript. It is a document set you could send.
PDF is rendered through headless Chromium (the @sparticuz/chromium build that runs on serverless) with a branded cover, numbered footnotes, and a bibliography. The Excel export is a six-worksheet ExcelJS financial model with live SUM and margin formulas, a two-variable sensitivity table, and conditional formatting. There is a DOCX with real footnote references, and a ZIP bundle of all of it.
vs. the traditional approach
The usual finish line is copy-pasting model output into a Word template and rebuilding the spreadsheet by hand.
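What makes the exported model "live" is that cells hold formulas rather than pasted numbers. A small sketch of helpers that build formula cell values; the `{ formula }` object shape is assumed to match what ExcelJS accepts as a cell value, and the helper names and the `IFERROR` guard are illustrative choices, not the project's code.

```typescript
interface FormulaCell {
  formula: string; // formula text without the leading "="
}

// Convert a 1-based column index to its letter form: 1 -> A, 27 -> AA.
function columnLetter(index: number): string {
  let remaining = index;
  let letters = "";
  while (remaining > 0) {
    const offset = (remaining - 1) % 26;
    letters = String.fromCharCode(65 + offset) + letters;
    remaining = Math.floor((remaining - 1) / 26);
  }
  return letters;
}

// SUM over a vertical range in one column.
function sumFormula(column: number, firstRow: number, lastRow: number): FormulaCell {
  const letter = columnLetter(column);
  return { formula: `SUM(${letter}${firstRow}:${letter}${lastRow})` };
}

// Margin as numerator / denominator, blank when the denominator is empty or zero.
function marginFormula(numeratorCell: string, denominatorCell: string): FormulaCell {
  return { formula: `IFERROR(${numeratorCell}/${denominatorCell},"")` };
}
```

Assigning such a value to a worksheet cell keeps the total and the margin recalculating when an analyst edits an input.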

Built from scratch, direct SDK calls, no framework wrappers

Reasoning: Claude Fable 5 (always-on thinking) + Sonnet 4.6 extraction
Framework: Next.js 16 App Router, React 19
Retrieval: Voyage voyage-3 embeddings, Postgres + pgvector
Orchestration: Vercel Workflow DevKit (durable steps)
Structured output: Forced tool use, Zod → JSON schema
Export: Headless Chromium PDF, ExcelJS, DOCX

See it for yourself

The workspace is read-only and runs on public filings, so you can open anything. Editing and pipeline actions are disabled — no tokens are spent by visitors.

Open the live workspace →
Read the source
