rcd-maildir-rag.el: Private Offline RAG for Emacs Maildir Emails


Talking to Your Emails: A Local RAG Solution for Emacs Users

Introduction

Imagine being able to have a conversation with all your emails—asking questions like “What did I discuss with John last week?” or “Find all emails about the project deadline”—and getting intelligent answers powered by your own private AI. This is now possible with rcd-maildir-rag.el, an Emacs package that enables Retrieval-Augmented Generation (RAG) queries against your local Maildir emails.

This article explores how to set up a completely local embedding and reranking pipeline, allowing you to chat with your emails without sending any data to external cloud services.

The Vision: Doug Engelbart’s Influence

The ability to talk to your personal knowledge base aligns with Doug Engelbart’s vision of augmenting human intelligence through technology. As described in the Technology Template Project OHS Framework, Engelbart envisioned systems that help humans manage and make sense of increasing amounts of information.

“The capability to interact with information at all levels of detail, to derive new understanding from existing knowledge, and to collaborate across both time and space.”

Our implementation of talking to emails is a practical realization of this vision—using AI to help manage the overwhelming volume of personal communications we all face.

The Database Schema

Embedding Types Table

The embeddingtypes table defines what kinds of content can be embedded and searched:

embeddingtypes_id   embeddingtypes_name   embeddingtypes_table
1                   Elementary objects    hyobjects
2                   People                people
3                   Files                 files
4                   LLM Responses         llm
5                   Speech                speech
6                   Org Mode Headings     org
7                   Emacs Lisp            elisp
14                  E-mail                (via messageids table)

This extensible design allows adding new content types as needed—for example, type 14 was added specifically for email support.

Embeddings Table Definition

CREATE TABLE public.embeddings (
    embeddings_id integer NOT NULL,
    embeddings_uuid uuid DEFAULT gen_random_uuid() NOT NULL,
    embeddings_datecreated timestamp with time zone DEFAULT CURRENT_TIMESTAMP NOT NULL,
    embeddings_datemodified timestamp with time zone,
    embeddings_usercreated text DEFAULT CURRENT_USER NOT NULL,
    embeddings_usermodified text DEFAULT CURRENT_USER NOT NULL,
    embeddings_referencedid integer,
    embeddings_referenceduuid uuid,
    embeddings_embeddingtypes integer NOT NULL,
    embeddings_embeddings jsonb,
    embeddings_text text,
    embeddings_name text,
    embeddings_files integer,
    embeddings_chunkid integer,
    embeddings_embedding_id integer,
    FOREIGN KEY (embeddings_embeddingtypes) REFERENCES public.embeddingtypes(embeddingtypes_id)
);
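
The essentials of this pair of tables can be modeled in miniature. The following sketch uses Python's stdlib sqlite3 purely for illustration (the real schema is PostgreSQL with jsonb and pgvector, and its audit columns are omitted here); it shows the foreign-key relationship and the registration of type 14 for email:

```python
import json
import sqlite3

# Scaled-down SQLite model of the embeddingtypes/embeddings pair; the
# production schema is PostgreSQL (jsonb + pgvector) and carries extra
# audit columns (uuid, datecreated, usercreated) omitted here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""CREATE TABLE embeddingtypes (
    embeddingtypes_id INTEGER PRIMARY KEY,
    embeddingtypes_name TEXT NOT NULL,
    embeddingtypes_table TEXT)""")
conn.execute("""CREATE TABLE embeddings (
    embeddings_id INTEGER PRIMARY KEY,
    embeddings_referencedid INTEGER,
    embeddings_embeddingtypes INTEGER NOT NULL
        REFERENCES embeddingtypes(embeddingtypes_id),
    embeddings_embeddings TEXT,  -- JSON-encoded vector (jsonb in PostgreSQL)
    embeddings_text TEXT)""")

# Register the email content type, then store one embedded email.
conn.execute("INSERT INTO embeddingtypes VALUES (14, 'E-mail', NULL)")
conn.execute(
    "INSERT INTO embeddings (embeddings_referencedid,"
    " embeddings_embeddingtypes, embeddings_embeddings, embeddings_text)"
    " VALUES (?, 14, ?, ?)",
    (1, json.dumps([0.1, 0.2, 0.3]), "Subject: project deadline"))
conn.commit()
```

Storing the vector as JSON keeps the sketch portable; pgvector's native vector type is what makes similarity search fast in the real system.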

Key columns:

  1. embeddings_embeddingtypes: content type of the row (14 for email); a foreign key into embeddingtypes
  2. embeddings_referencedid and embeddings_referenceduuid: link back to the source record (the messageids table for email)
  3. embeddings_embeddings: the vector itself, stored as jsonb
  4. embeddings_text: the exact text that was embedded

Local Embedding Model

The embedding generation happens entirely locally using a self-hosted model:

(defun rcd-llm-get-embedding-single (input &optional prefix)
  "Fetch a single embedding for INPUT, optionally prepending PREFIX."
  (let* ((url "http://192.168.1.68:9999/v1/embeddings")
         (url-request-method "POST")
         (url-request-extra-headers '(("Content-Type" . "application/json")))
         (url-request-data
          (json-encode `((model . "any") (input . ,(concat (or prefix "") input))))))
    ;; POST to the local server; return data[0].embedding from the response
    (with-current-buffer (url-retrieve-synchronously url)
      (goto-char url-http-end-of-headers)
      (alist-get 'embedding (aref (alist-get 'data (json-read)) 0)))))

Details:

  1. The endpoint follows the OpenAI /v1/embeddings convention, so any OpenAI-compatible server works
  2. The model name "any" is a placeholder; the local server serves whichever single model it was started with
  3. The optional PREFIX supports embedding models that expect task prefixes such as "query: " or "passage: "

Setting Up the Local Embedding Server

You can run your own embedding server using:

  1. llama.cpp with embedding server enabled
  2. Ollama, self-hosted, with an embedding model
  3. FastAPI with sentence-transformers

The server exposes an OpenAI-compatible endpoint at port 9999, accepting POST requests with the model name and input text, returning a 1536-dimensional (or model-specific) vector.
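
For the simplest possible starting point, the endpoint's contract can even be served from Python's standard library alone. In this sketch the hash-derived pseudo-vector is a stand-in for a real model, and the 8-dimensional output is for illustration only (a real server returns e.g. 1536 dimensions):

```python
import hashlib
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

DIM = 8  # for illustration; a real model returns e.g. 1536 dimensions

def fake_embed(text):
    """Derive a deterministic pseudo-vector from a SHA-256 digest."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:DIM]]

class EmbeddingHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/v1/embeddings":
            self.send_error(404)
            return
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        inputs = body["input"] if isinstance(body["input"], list) else [body["input"]]
        data = [{"object": "embedding", "index": i, "embedding": fake_embed(t)}
                for i, t in enumerate(inputs)]
        payload = json.dumps({"object": "list",
                              "model": body.get("model", "any"),
                              "data": data}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the sketch quiet

# To serve for real (blocking):
# HTTPServer(("0.0.0.0", 9999), EmbeddingHandler).serve_forever()
```

Swapping fake_embed for a call into a real model (for example a sentence-transformers encoder) turns this into a working option 3.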

Local Reranking

After retrieving candidate embeddings, a cross-encoder reranker reorders results for better relevance:

(defun rcd-llm-rerank (query documents &optional top-n)
  "Rerank DOCUMENTS based on relevance to QUERY, keeping TOP-N results."
  (let* ((url-request-method "POST")
         (url-request-extra-headers '(("Content-Type" . "application/json")))
         (url-request-data
          (json-encode (list :model "llama-server"
                             :query query
                             :top_n (or top-n (length documents))
                             :return_documents t
                             :documents documents))))
    ;; POST to the local reranker at port 7676
    (with-current-buffer
        (url-retrieve-synchronously "http://192.168.1.68:7676/v1/rerank")
      (goto-char url-http-end-of-headers)
      (alist-get 'results (json-read)))))

Details:

  1. The reranker listens on port 7676 and follows the common /v1/rerank request shape (query, documents, top_n, return_documents)
  2. Unlike embedding similarity, a cross-encoder scores each query and document pair jointly, which is slower but more accurate
  3. Results come back sorted by relevance_score, each carrying the index of the original document

Setting Up the Local Reranker

A local reranker can be implemented using:

  1. llama.cpp's llama-server started with reranking enabled and a reranker model
  2. A small FastAPI service wrapping a sentence-transformers CrossEncoder

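Before wiring up a real model, it helps to pin down the request and response contract. The sketch below mimics the common /v1/rerank shape; its token-overlap score is a hypothetical placeholder for a genuine cross-encoder:

```python
def rerank(query, documents, top_n=None, return_documents=True):
    """Toy /v1/rerank: score each document against the query, best first."""
    q_tokens = set(query.lower().split())

    def score(doc):
        # Jaccard token overlap: a placeholder for a real cross-encoder,
        # which would score the (query, document) pair jointly with a model.
        d_tokens = set(doc.lower().split())
        return len(q_tokens & d_tokens) / max(len(q_tokens | d_tokens), 1)

    results = sorted(
        ({"index": i, "relevance_score": score(d),
          **({"document": d} if return_documents else {})}
         for i, d in enumerate(documents)),
        key=lambda r: r["relevance_score"], reverse=True)
    return {"results": results[:top_n] if top_n else results}
```

The Emacs side only depends on this response shape, so any backend that preserves it can be dropped in later.
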
Generating Email Embeddings

Before you can query your emails, you need to generate embeddings for them:

(require 'rcd-maildir-embeddings)

;; Process all emails in a Maildir
(rcd-maildir-process-email-embeddings "me@example.com")

This function:

  1. Scans ~/Maildir/me@example.com/cur/ and new/
  2. For each email, extracts Subject, From, and body
  3. Sends text to the local embedding server
  4. Stores vectors in the embeddings table with type 14
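
The same indexing pass can be prototyped outside Emacs with Python's standard mailbox module; the function below is illustrative and not part of the package:

```python
import mailbox

def email_texts(maildir_path):
    """Yield (key, text) pairs joining Subject, From, and the plain-text body."""
    for key, msg in mailbox.Maildir(maildir_path, create=False).items():
        part = msg
        if msg.is_multipart():  # prefer the first text/plain part
            for candidate in msg.walk():
                if candidate.get_content_type() == "text/plain":
                    part = candidate
                    break
        payload = part.get_payload(decode=True)
        body = (payload or b"").decode(part.get_content_charset() or "utf-8",
                                       errors="replace")
        yield key, "Subject: %s\nFrom: %s\n\n%s" % (
            msg.get("Subject", ""), msg.get("From", ""), body)
```

Each yielded text is what would be sent to the embedding server and stored with type 14.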

Querying Emails with RAG

Once embeddings exist, use the RAG functions:

(require 'rcd-maildir-rag)

;; Get context without asking LLM
(rcd-maildir-rag-query "me@example.com" "meeting scheduler")

;; Chat with your emails - opens buffer with LLM response
(rcd-maildir-rag-chat "me@example.com")

The system:

  1. Converts your query to an embedding
  2. Finds similar emails from your Maildir using vector similarity
  3. Reranks results for better relevance
  4. Presents context to the LLM for answer generation
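
Steps 2 and 3 reduce to vector similarity followed by reordering. Here is a minimal sketch of the similarity stage, assuming embeddings are plain Python lists (in the real system this search runs inside PostgreSQL via pgvector):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, corpus, top_k=5):
    """corpus: list of (text, vector) pairs; return top_k texts, best first."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

The top_k candidates then go to the reranker, and the survivors are pasted into the LLM prompt as context.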

Privacy Benefits

This setup provides significant privacy advantages:

  1. No cloud dependency: All processing happens on your local machines
  2. Your data stays yours: Email contents never leave your network
  3. No API key exposure: No external service sees what you’re searching
  4. Offline capable: Works without internet (if models are local)

Architecture Summary

┌─────────────────────────────────────────────────────────────┐
│                    Emacs (rcd-maildir-rag)                  │
└────────────────────────┬────────────────────────────────────┘
                         │
         ┌───────────────┴───────────────┐
         │                               │
         ▼                               ▼
┌─────────────────────┐      ┌─────────────────────┐
│ Local Embedding     │      │ Local Reranker      │
│ Server (port 9999)  │      │ (port 7676)         │
│                     │      │                     │
│ • Text → Vector     │      │ • Re-rank results   │
│ • pgvector format   │      │ • Better relevance  │
└─────────────────────┘      └─────────────────────┘
         │                               │
         └───────────────┬───────────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │   PostgreSQL        │
              │   (pgvector)        │
              │                     │
              │ • embeddings table  │
              │ • messageids table  │
              │ • files table       │
              └─────────────────────┘

Conclusion

By combining Emacs, local LLM servers, PostgreSQL with pgvector, and a self-hosted reranker, you can create a powerful, private system for chatting with your emails. This approach keeps your data on your own network, works offline, and builds on tools you already run.

Start by installing the required packages, setting up your local embedding and reranking servers, processing your Maildir, and then enjoy having conversations with your email archive—all within the comfortable embrace of Emacs.