MemoryRelay

MemoryRelay is a persistent memory service for AI agents. It gives your agents long-term memory that survives across sessions, projects, and deployments. Store observations, retrieve them by meaning with semantic search, automatically extract entities and relationships, and build knowledge graphs that grow with every interaction.

Key Features

Semantic Search -- Store memories as natural language and retrieve them by meaning, not keywords. MemoryRelay generates vector embeddings for every memory and uses cosine similarity to find the most relevant results.
Entity Extraction -- Automatically identify people, organizations, technologies, projects, and concepts mentioned in memories. Entities are extracted and linked without any manual tagging.
Knowledge Graph -- Entities are connected through relationships, forming a knowledge graph that your agent can traverse. Understand how people relate to projects, which technologies are used together, and how decisions connect across time.
Session Tracking -- Group related memories into sessions that represent bounded interaction periods. Summarize sessions to create compressed records of what happened and why.
Decision Records -- Track architectural and design decisions with full lifecycle management. Decisions can be active, superseded, or reverted, with links to the decisions that replaced them.
Reusable Patterns -- Discover and share patterns across projects. When your agent learns something useful in one codebase, that knowledge can be adopted in others.
Multi-Project Context -- Manage memory across multiple projects with dependency tracking, impact analysis, and shared pattern discovery.
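
The decision lifecycle described under Decision Records can be pictured as a small data structure. The sketch below is illustrative only: the `Decision` class, the `superseded_by` field, and the `supersede` and `current_decision` helpers are hypothetical names invented for this example, not part of the MemoryRelay API. It shows one way a superseded decision might link to its replacement.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class DecisionStatus(Enum):
    # The three states named in the feature description.
    ACTIVE = "active"
    SUPERSEDED = "superseded"
    REVERTED = "reverted"


@dataclass
class Decision:
    # Hypothetical record shape; field names are assumptions.
    id: str
    title: str
    status: DecisionStatus = DecisionStatus.ACTIVE
    superseded_by: Optional[str] = None  # id of the replacing decision, if any


def supersede(decisions: Dict[str, Decision], old_id: str, new: Decision) -> None:
    """Record `new` and mark the decision `old_id` as superseded by it."""
    old = decisions[old_id]
    if old.status is not DecisionStatus.ACTIVE:
        raise ValueError(f"decision {old_id} is not active")
    old.status = DecisionStatus.SUPERSEDED
    old.superseded_by = new.id
    decisions[new.id] = new


def current_decision(decisions: Dict[str, Decision], start_id: str) -> Decision:
    """Follow superseded_by links from `start_id` to the latest decision."""
    decision = decisions[start_id]
    while decision.superseded_by is not None:
        decision = decisions[decision.superseded_by]
    return decision
```

Keeping the link on the older record means an agent that recalls an outdated decision can walk forward to whatever replaced it.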

How It Works

flowchart LR
    A[Your Agent] -->|store| B[MemoryRelay API]
    B -->|embed| C[Vector Embedding]
    C -->|index| D[(PostgreSQL + pgvector)]
    A -->|search| B
    B -->|similarity query| D
    D -->|ranked results| B
    B -->|memories| A

    B -->|extract| E[Entity Extraction]
    E -->|link| F[Knowledge Graph]

1. Your agent stores memories through the API -- plain text content with optional metadata.
2. MemoryRelay embeds the content into a 384-dimensional vector using sentence-transformers.
3. When your agent needs context, it searches by meaning. The query is embedded and compared against stored vectors using cosine similarity.
4. Entities are extracted automatically and linked into a knowledge graph that grows over time.
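
The store-embed-search loop above can be sketched in a few lines of plain Python. This is a minimal in-memory illustration, not the MemoryRelay implementation: `toy_embed` is a hashed bag-of-words stand-in for sentence-transformers (it captures word overlap, not meaning), and `InMemoryStore` with its `store` and `search` methods is a hypothetical class standing in for the API and database.

```python
import hashlib
import math
from typing import Dict, List, Optional, Tuple

DIM = 384  # same vector size as mentioned in the steps above


def toy_embed(text: str) -> List[float]:
    """Stand-in embedder: hashed bag of words, L2-normalized.

    Assumption for illustration only; a real deployment would use a
    sentence-transformers model instead.
    """
    vec = [0.0] * DIM
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryStore:
    """Hypothetical store: keeps (content, metadata, vector) rows in a list."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, Dict, List[float]]] = []

    def store(self, content: str, metadata: Optional[Dict] = None) -> None:
        # Embed at write time so searches only embed the query.
        self._rows.append((content, metadata or {}, toy_embed(content)))

    def search(self, query: str, limit: int = 3) -> List[Tuple[float, str]]:
        # Embed the query, score every stored vector, return the best matches.
        q = toy_embed(query)
        scored = [(cosine_similarity(q, vec), content) for content, _, vec in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:limit]


store = InMemoryStore()
store.store("deploy with docker", {"project": "infra"})
store.store("alice prefers dark mode")
results = store.search("deploy with docker", limit=1)
```

A production system would replace the linear scan in `search` with an indexed similarity query in the database, but the ranking rule (highest cosine similarity first) is the same.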

Quick Links

Resource	Description
Quickstart	Get up and running in 5 minutes
Core Concepts	Understand memories, agents, entities, and sessions
Authentication	API keys, scopes, and rate limits
API Reference	Full REST API documentation
Python SDK	Official Python client library
Self-Hosting	Deploy MemoryRelay on your own infrastructure

Architecture

MemoryRelay is built on four widely used open-source components:

FastAPI -- Async Python API with automatic OpenAPI documentation
PostgreSQL + pgvector -- Relational storage with native vector similarity search
Redis -- Caching and rate limiting
sentence-transformers -- Local embedding generation (no external API calls for embeddings)

Embeddings are generated locally, so no content is sent to third-party services for vectorization. In a self-hosted deployment, all data stays within your infrastructure boundary.
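
To make the PostgreSQL + pgvector piece concrete, the sketch below builds a parameterized similarity query of the kind pgvector supports. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives cosine similarity. The table name `memories`, the columns `id`, `content`, and `embedding`, and both helper functions are assumptions made for this example; the actual MemoryRelay schema is not described here.

```python
from typing import Sequence, Tuple

EMBEDDING_DIM = 384  # vector size stated in How It Works


def to_pgvector_literal(vector: Sequence[float]) -> str:
    """Format a vector in pgvector's text input form, e.g. '[0.1,0.2]'."""
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return "[" + ",".join(repr(float(v)) for v in vector) + "]"


def build_search_query(query_vector: Sequence[float], limit: int = 5) -> Tuple[str, tuple]:
    """Return (sql, params) for a cosine-similarity search.

    Table and column names are hypothetical. The %s placeholders follow the
    DB-API style used by drivers such as psycopg.
    """
    sql = (
        "SELECT id, content, 1 - (embedding <=> %s::vector) AS similarity "
        "FROM memories "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    literal = to_pgvector_literal(query_vector)
    return sql, (literal, literal, limit)


sql, params = build_search_query([0.0] * EMBEDDING_DIM, limit=3)
```

Passing the vector as a bound parameter rather than interpolating it into the SQL string keeps the query safe to reuse and lets the driver handle quoting.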
