Building an AI Agent with Persistent Memory
Build an AI agent that learns user preferences over time and uses them to personalize responses. This guide walks through the core MemoryRelay SDK patterns: storing memories, semantic search, metadata filtering, sessions, and entities.
What You'll Build
A preference-learning agent that:
- Stores user preferences as structured memories with metadata
- Recalls relevant preferences using semantic search before responding
- Tracks conversations with sessions
- Builds a knowledge graph with entities
- Uses OpenAI to generate personalized responses informed by memory
Prerequisites
- Python 3.9+
- A MemoryRelay API key (get one here)
- An OpenAI API key (for the LLM)
Installation
pip install memoryrelay openai
export MEMORYRELAY_API_KEY="mem_your_key_here"
export OPENAI_API_KEY="sk-your_key_here"
Step 1: Create an Agent
Every agent in MemoryRelay is an isolated memory namespace. Create one for your preference-learning bot:
import os
from memoryrelay import MemoryRelay
client = MemoryRelay(api_key=os.environ["MEMORYRELAY_API_KEY"])
agent = client.agents.create(name="preference-agent")
print(f"Agent created: {agent.id}")
Agent names are not unique: creating two agents with the same name yields two distinct agents. Always use agent.id (a UUID) to reference an agent programmatically.
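Because names are not unique, a long-running app should persist the agent's UUID between runs rather than calling create each time. A minimal sketch, where `create_agent` stands in for a call like `lambda: client.agents.create(name="preference-agent").id` (the file path and helper name are illustrative, not part of the SDK):

```python
from pathlib import Path


def load_or_create_agent_id(create_agent, path="agent_id.txt"):
    """Reuse a saved agent ID if present; otherwise create one and save it.

    `create_agent` is any zero-argument callable that returns a new agent ID,
    e.g. lambda: client.agents.create(name="preference-agent").id
    """
    id_file = Path(path)
    if id_file.exists():
        # A previous run already created the agent; reuse its UUID.
        return id_file.read_text().strip()
    agent_id = str(create_agent())
    id_file.write_text(agent_id)
    return agent_id
```

On the first run this creates the agent and records its ID; every later run reuses the same memory namespace.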
Step 2: Store User Preferences
Store preferences as memories with structured metadata. The metadata lets you filter and categorize memories later, while the content text is what gets embedded for semantic search.
agent_id = str(agent.id)

# UI preference
client.memories.create(
    content="User prefers dark mode interfaces",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "ui"},
)

# Locale preference
client.memories.create(
    content="User's timezone is EST (UTC-5)",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "locale"},
)

# Communication preference
client.memories.create(
    content="User prefers concise responses with code examples over long explanations",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "communication"},
)

# Technical preference
client.memories.create(
    content="User uses Python 3.12, prefers type hints, and uses pytest for testing",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "development"},
)

# Dietary preference
client.memories.create(
    content="User is vegetarian and allergic to nuts",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "food"},
)

print("Stored 5 preferences.")
Each memory is automatically embedded (converted to a 384-dimensional vector) by MemoryRelay, enabling semantic search.
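A consistent metadata schema is what makes filtering reliable later, so it can pay to validate metadata before storing it. A hedged sketch (this helper and the allowed-type set are illustrative, not part of the SDK):

```python
# Hypothetical helper (not part of the SDK): enforce a consistent
# metadata schema before storing memories.
ALLOWED_TYPES = {"preference", "fact", "conversation", "user_provided"}


def validate_metadata(metadata: dict) -> dict:
    """Require a known 'type' key so later filtering stays reliable."""
    mem_type = metadata.get("type")
    if mem_type not in ALLOWED_TYPES:
        raise ValueError(f"Unknown memory type: {mem_type!r}")
    return metadata
```

You could call this on the metadata dict just before each client.memories.create, so typos like "prefernce" fail loudly instead of silently fragmenting your categories.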
Step 3: Search Memories with Semantic Similarity
The power of MemoryRelay is semantic search — you query in natural language and get back the most relevant memories, even if the wording is different.
results = client.memories.search(
    query="What display settings does the user prefer?",
    agent_id=agent_id,
    limit=5,
)

print("Search results for 'display settings':\n")
for r in results.data:
    print(f" [{r.score:.2f}] {r.content}")
    print(f" metadata: {r.metadata}\n")
Output:
Search results for 'display settings':
[0.82] User prefers dark mode interfaces
metadata: {'type': 'preference', 'category': 'ui'}
[0.51] User prefers concise responses with code examples over long explanations
metadata: {'type': 'preference', 'category': 'communication'}
Notice that "display settings" matched "dark mode interfaces" even though the words are completely different — that is semantic search at work.
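The guide's intro also promised metadata filtering. Whether the search API accepts a server-side filter parameter isn't shown here, so one option is to filter results client-side on the metadata you stored in Step 2. A sketch, using a small dataclass to stand in for the SDK's result objects (illustration only):

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Stand-in for an SDK search result (illustration only)."""
    content: str
    score: float
    metadata: dict = field(default_factory=dict)


def filter_by_metadata(results, **criteria):
    """Keep only results whose metadata matches every given key/value pair."""
    return [
        r for r in results
        if all(r.metadata.get(k) == v for k, v in criteria.items())
    ]


hits = [
    Result("User prefers dark mode interfaces", 0.82,
           {"type": "preference", "category": "ui"}),
    Result("User prefers concise responses with code examples", 0.51,
           {"type": "preference", "category": "communication"}),
]
ui_only = filter_by_metadata(hits, category="ui")
# ui_only keeps just the dark-mode result
```

Against real results.data the same filter works unchanged, since the doc's result objects expose .metadata the same way.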
Step 4: Build a Preference-Aware Response Generator
Combine MemoryRelay with OpenAI to build an agent that checks its memory before responding:
from openai import OpenAI

openai_client = OpenAI()


def respond_with_memory(user_message: str, agent_id: str) -> str:
    """Generate a response informed by relevant memories."""
    # Step 1: Search for relevant memories
    results = client.memories.search(
        query=user_message,
        agent_id=agent_id,
        limit=5,
    )

    # Step 2: Build context from memories
    memory_context = ""
    if results.data:
        memory_lines = [f"- {r.content}" for r in results.data if r.score >= 0.4]
        if memory_lines:
            memory_context = (
                "You know the following about this user:\n"
                + "\n".join(memory_lines)
                + "\n\n"
            )

    # Step 3: Generate response with memory-augmented prompt
    system_prompt = (
        "You are a helpful personal assistant. Use what you know about the "
        "user to personalize your responses. Be concise.\n\n"
        f"{memory_context}"
    )
    completion = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    )
    assistant_message = completion.choices[0].message.content

    # Step 4: Store this exchange as a new memory
    client.memories.create(
        content=f"User asked: {user_message}\nAssistant responded: {assistant_message}",
        agent_id=agent_id,
        metadata={"type": "conversation"},
    )
    return assistant_message
Try it:
print(respond_with_memory("Can you recommend an IDE theme?", agent_id))
# Response will reference dark mode preference — pulled from memory!
print(respond_with_memory("Help me set up a new Python project", agent_id))
# Response will mention Python 3.12, type hints, pytest — all from memory!
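The 0.4 score cutoff inside respond_with_memory is doing quiet but important work: it keeps weakly related memories out of the prompt. Factoring that logic into a standalone helper makes the cutoff easy to tune and test in isolation (the helper name and default threshold are illustrative choices, not SDK API):

```python
from types import SimpleNamespace


def build_memory_context(results, min_score=0.4):
    """Format memories above a relevance cutoff as a prompt preamble."""
    lines = [f"- {r.content}" for r in results if r.score >= min_score]
    if not lines:
        return ""
    return "You know the following about this user:\n" + "\n".join(lines) + "\n\n"


# Demo with stand-in result objects (real code would pass results.data):
memories = [
    SimpleNamespace(content="User prefers dark mode interfaces", score=0.82),
    SimpleNamespace(content="Unrelated low-relevance note", score=0.12),
]
context = build_memory_context(memories)
# Only the 0.82 memory clears the 0.4 cutoff
```

Raising min_score makes the agent stricter about what counts as relevant; lowering it pulls in more context at the cost of noise.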
Step 5: Track Conversations with Sessions
Sessions group related memories together and provide a timeline of interactions:
# Start a session
session = client.sessions.create(
    agent_id=agent_id,
    title="Onboarding call",
)

# Store memories within the session
client.memories.create(
    content="User's name is Jordan. Works at TechFlow as a senior engineer.",
    agent_id=agent_id,
    session_id=str(session.id),
    metadata={"type": "fact", "category": "identity"},
)
client.memories.create(
    content="User is building a real-time data pipeline with Kafka and Flink",
    agent_id=agent_id,
    session_id=str(session.id),
    metadata={"type": "fact", "category": "project"},
)
client.memories.create(
    content="User wants weekly check-ins on Mondays at 10am EST",
    agent_id=agent_id,
    session_id=str(session.id),
    metadata={"type": "preference", "category": "scheduling"},
)

# End the session with a summary
client.sessions.end(
    session_id=str(session.id),
    summary="Onboarding call with Jordan from TechFlow. Discussed their Kafka/Flink "
    "pipeline project and set up weekly Monday check-ins.",
)
print(f"Session ended: {session.id}")
Session summaries are themselves searchable. When you end a session with a good summary, future queries like "what did we discuss last time?" can match against it.
Step 6: Build a Knowledge Graph with Entities
Entities let you organize memories around people, projects, organizations, and concepts:
# Create entities
jordan = client.entities.create(
    name="Jordan",
    entity_type="person",
    agent_id=agent_id,
    metadata={"role": "senior engineer", "company": "TechFlow"},
)
techflow = client.entities.create(
    name="TechFlow",
    entity_type="organization",
    agent_id=agent_id,
)
pipeline_project = client.entities.create(
    name="Real-time Data Pipeline",
    entity_type="project",
    agent_id=agent_id,
    metadata={"tech": ["kafka", "flink"]},
)
print(f"Created entities: Jordan ({jordan.id}), TechFlow ({techflow.id}), Pipeline ({pipeline_project.id})")
Link memories to entities to build the graph:
# Search for memories about Jordan to link them
jordan_memories = client.memories.search(
    query="Jordan TechFlow engineer",
    agent_id=agent_id,
    limit=5,
)

for memory in jordan_memories.data:
    if memory.score >= 0.5:
        client.entities.link(
            entity_id=str(jordan.id),
            memory_id=str(memory.id),
            relationship="mentioned_in",
        )
        print(f"Linked memory to Jordan: {memory.content[:60]}...")
MemoryRelay can automatically extract entities from memory content. Check the entities field on memory responses to see auto-detected people, places, and organizations.
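Since the exact shape of that entities field isn't documented in this guide, a defensive accessor avoids assumptions about whether it is absent, None, or an empty list (the helper name is illustrative, not SDK API):

```python
from types import SimpleNamespace


def auto_entities(memory):
    """Return auto-detected entities from a memory response, if any.

    Treats a missing or empty `entities` field as "none detected",
    since this guide does not document the field's exact shape.
    """
    return list(getattr(memory, "entities", None) or [])


# Demo with stand-in memory objects:
with_ents = SimpleNamespace(content="Jordan works at TechFlow",
                            entities=["Jordan", "TechFlow"])
without = SimpleNamespace(content="User prefers dark mode")
detected = auto_entities(with_ents)
none_detected = auto_entities(without)
```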
Step 7: Check Agent Stats
Monitor how your agent's memory is growing:
stats = client.agents.stats(agent_id=agent_id)
print(f"Total memories: {stats.memory_count}")
print(f"Total entities: {stats.entity_count}")
print(f"Total sessions: {stats.session_count}")
Full Working Example
Here is the complete agent as a runnable script:
"""preference_agent.py — AI agent with persistent memory via MemoryRelay."""
import os
from memoryrelay import MemoryRelay
from openai import OpenAI
def main():
mr = MemoryRelay(api_key=os.environ["MEMORYRELAY_API_KEY"])
oai = OpenAI()
# Create or reuse an agent
agent = mr.agents.create(name="preference-agent")
agent_id = str(agent.id)
print(f"Agent ID: {agent_id}\n")
# Start a session
session = mr.sessions.create(agent_id=agent_id, title="Interactive chat")
session_id = str(session.id)
print("Chat with the preference-learning agent.")
print("Commands: /remember <fact> | /search <query> | /stats | /quit\n")
while True:
user_input = input("You: ").strip()
if not user_input:
continue
# --- Special commands ---
if user_input.lower() == "/quit":
mr.sessions.end(session_id=session_id, summary="Interactive chat session.")
print("Session ended. Goodbye!")
break
if user_input.startswith("/remember "):
fact = user_input[len("/remember "):]
mr.memories.create(
content=fact,
agent_id=agent_id,
session_id=session_id,
metadata={"type": "user_provided"},
)
print(f"Stored: {fact}\n")
continue
if user_input.startswith("/search "):
query = user_input[len("/search "):]
results = mr.memories.search(query=query, agent_id=agent_id, limit=5)
if not results.data:
print("No memories found.\n")
else:
for r in results.data:
print(f" [{r.score:.2f}] {r.content}")
print()
continue
if user_input.lower() == "/stats":
stats = mr.agents.stats(agent_id=agent_id)
print(f" Memories: {stats.memory_count}")
print(f" Entities: {stats.entity_count}")
print(f" Sessions: {stats.session_count}\n")
continue
# --- Memory-augmented response ---
results = mr.memories.search(query=user_input, agent_id=agent_id, limit=5)
memory_lines = [f"- {r.content}" for r in results.data if r.score >= 0.4]
memory_context = ""
if memory_lines:
memory_context = (
"You know the following about this user:\n"
+ "\n".join(memory_lines)
+ "\n\n"
)
completion = oai.chat.completions.create(
model="gpt-4o",
messages=[
{
"role": "system",
"content": (
"You are a helpful personal assistant with memory. "
"Use what you know to personalize responses. Be concise.\n\n"
f"{memory_context}"
),
},
{"role": "user", "content": user_input},
],
)
response = completion.choices[0].message.content
print(f"Bot: {response}\n")
# Store the exchange
mr.memories.create(
content=f"User: {user_input}\nAssistant: {response}",
agent_id=agent_id,
session_id=session_id,
metadata={"type": "conversation"},
)
if __name__ == "__main__":
main()
Run it:
python preference_agent.py
Example session:
Agent ID: 8f3a1b2c-...
Chat with the preference-learning agent.
Commands: /remember <fact> | /search <query> | /stats | /quit
You: /remember I prefer dark mode and use VS Code
Stored: I prefer dark mode and use VS Code
You: /remember I'm a Python developer working on data pipelines
Stored: I'm a Python developer working on data pipelines
You: Can you suggest a good VS Code extension for my work?
Bot: Since you're working on data pipelines in Python, I'd recommend:
- **Pylance** — best-in-class Python IntelliSense
- **Python Indent** — fixes Python indentation
- **Data Wrangler** — preview and explore data inline
All work great with your dark mode setup.
You: /search data pipelines
[0.89] I'm a Python developer working on data pipelines
[0.62] User: Can you suggest a good VS Code extension...
You: /stats
Memories: 4
Entities: 0
Sessions: 1
You: /quit
Session ended. Goodbye!
Key Concepts Recap
| Concept | What It Does | SDK Method |
|---|---|---|
| Agent | Isolated memory namespace for one bot/user | client.agents.create() |
| Memory | A piece of information with an embedding | client.memories.create() |
| Search | Semantic similarity lookup across memories | client.memories.search() |
| Session | Groups memories from a single interaction | client.sessions.create() |
| Entity | Knowledge graph node (person, project, org, etc.) | client.entities.create() |
| Metadata | Structured tags on memories for filtering | metadata={"type": "..."} param |