Building an AI Agent with Persistent Memory
Build an AI agent that learns user preferences over time and uses them to personalize responses. This guide walks through the core MemoryRelay SDK patterns: storing memories, semantic search, metadata filtering, sessions, and entities.
What You'll Build
A preference-learning agent that:
- Stores user preferences as structured memories with metadata
- Recalls relevant preferences using semantic search before responding
- Tracks conversations with sessions
- Builds a knowledge graph with entities
- Uses OpenAI to generate personalized responses informed by memory
Prerequisites
- Python 3.9+
- A MemoryRelay API key (get one here)
- An OpenAI API key (for the LLM)
Installation
pip install memoryrelay openai
export MEMORYRELAY_API_KEY="mem_your_key_here"
export OPENAI_API_KEY="sk-your_key_here"
Step 1: Create an Agent
Every agent in MemoryRelay is an isolated memory namespace. Create one for your preference-learning bot:
import os
from memoryrelay import MemoryRelay
client = MemoryRelay(api_key=os.environ["MEMORYRELAY_API_KEY"])
agent = client.agents.create(name="preference-agent")
print(f"Agent created: {agent.id}")
Agent names are not unique: creating two agents with the same name yields two distinct agents. Always use agent.id (a UUID) to reference an agent programmatically.
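Because names are not unique, a long-running app should persist the agent's UUID between runs rather than calling create each time. A minimal sketch, where `create_agent` stands in for a call like `lambda: client.agents.create(name="preference-agent").id` (the file path and helper name are illustrative, not part of the SDK):

```python
from pathlib import Path


def load_or_create_agent_id(create_agent, path="agent_id.txt"):
    """Reuse a saved agent ID if present; otherwise create one and save it.

    `create_agent` is any zero-argument callable that returns a new agent ID,
    e.g. lambda: client.agents.create(name="preference-agent").id
    """
    id_file = Path(path)
    if id_file.exists():
        # A previous run already created the agent; reuse its UUID.
        return id_file.read_text().strip()
    agent_id = str(create_agent())
    id_file.write_text(agent_id)
    return agent_id
```

On the first run this creates the agent and records its ID; every later run reuses the same memory namespace.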
Step 2: Store User Preferences
Store preferences as memories with structured metadata. The metadata lets you filter and categorize memories later, while the content text is what gets embedded for semantic search.
agent_id = str(agent.id)

# UI preference
client.memories.create(
    content="User prefers dark mode interfaces",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "ui"},
)

# Locale preference
client.memories.create(
    content="User's timezone is EST (UTC-5)",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "locale"},
)

# Communication preference
client.memories.create(
    content="User prefers concise responses with code examples over long explanations",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "communication"},
)

# Technical preference
client.memories.create(
    content="User uses Python 3.12, prefers type hints, and uses pytest for testing",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "development"},
)

# Dietary preference
client.memories.create(
    content="User is vegetarian and allergic to nuts",
    agent_id=agent_id,
    metadata={"type": "preference", "category": "food"},
)

print("Stored 5 preferences.")
Each memory is automatically embedded (converted to a 384-dimensional vector) by MemoryRelay, enabling semantic search.
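A consistent metadata schema is what makes filtering reliable later, so it can pay to validate metadata before storing it. A hedged sketch (this helper and the allowed-type set are illustrative, not part of the SDK):

```python
# Hypothetical helper (not part of the SDK): enforce a consistent
# metadata schema before storing memories.
ALLOWED_TYPES = {"preference", "fact", "conversation", "user_provided"}


def validate_metadata(metadata: dict) -> dict:
    """Require a known 'type' key so later filtering stays reliable."""
    mem_type = metadata.get("type")
    if mem_type not in ALLOWED_TYPES:
        raise ValueError(f"Unknown memory type: {mem_type!r}")
    return metadata
```

You could call this on the metadata dict just before each client.memories.create, so typos like "prefernce" fail loudly instead of silently fragmenting your categories.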
Step 3: Search Memories with Semantic Similarity
The power of MemoryRelay is semantic search — you query in natural language and get back the most relevant memories, even if the wording is different.
results = client.memories.search(
    query="What display settings does the user prefer?",
    agent_id=agent_id,
    limit=5,
)

print("Search results for 'display settings':\n")
for r in results.data:
    print(f" [{r.score:.2f}] {r.content}")
    print(f" metadata: {r.metadata}\n")
Output:
Search results for 'display settings':
[0.82] User prefers dark mode interfaces
metadata: {'type': 'preference', 'category': 'ui'}
[0.51] User prefers concise responses with code examples over long explanations
metadata: {'type': 'preference', 'category': 'communication'}
Notice that "display settings" matched "dark mode interfaces" even though the words are completely different — that is semantic search at work.
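The guide's intro also promised metadata filtering. Whether the search API accepts a server-side filter parameter isn't shown here, so one option is to filter results client-side on the metadata you stored in Step 2. A sketch, using a small dataclass to stand in for the SDK's result objects (illustration only):

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Stand-in for an SDK search result (illustration only)."""
    content: str
    score: float
    metadata: dict = field(default_factory=dict)


def filter_by_metadata(results, **criteria):
    """Keep only results whose metadata matches every given key/value pair."""
    return [
        r for r in results
        if all(r.metadata.get(k) == v for k, v in criteria.items())
    ]


hits = [
    Result("User prefers dark mode interfaces", 0.82,
           {"type": "preference", "category": "ui"}),
    Result("User prefers concise responses with code examples", 0.51,
           {"type": "preference", "category": "communication"}),
]
ui_only = filter_by_metadata(hits, category="ui")
# ui_only keeps just the dark-mode result
```

Against real results.data the same filter works unchanged, since the doc's result objects expose .metadata the same way.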
Step 4: Build a Preference-Aware Response Generator
Combine MemoryRelay with OpenAI to build an agent that checks its memory before responding:
from openai import OpenAI

openai_client = OpenAI()


def respond_with_memory(user_message: str, agent_id: str) -> str:
    """Generate a response informed by relevant memories."""
    # Step 1: Search for relevant memories
    results = client.memories.search(
        query=user_message,
        agent_id=agent_id,
        limit=5,
    )

    # Step 2: Build context from memories
    memory_context = ""
    if results.data:
        memory_lines = [f"- {r.content}" for r in results.data if r.score >= 0.4]
        if memory_lines:
            memory_context = (
                "You know the following about this user:\n"
                + "\n".join(memory_lines)
                + "\n\n"
            )

    # Step 3: Generate response with memory-augmented prompt
    system_prompt = (
        "You are a helpful personal assistant. Use what you know about the "
        "user to personalize your responses. Be concise.\n\n"
        f"{memory_context}"
    )
    completion = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    )
    assistant_message = completion.choices[0].message.content

    # Step 4: Store this exchange as a new memory
    client.memories.create(
        content=f"User asked: {user_message}\nAssistant responded: {assistant_message}",
        agent_id=agent_id,
        metadata={"type": "conversation"},
    )
    return assistant_message
Try it:
print(respond_with_memory("Can you recommend an IDE theme?", agent_id))
# Response will reference dark mode preference — pulled from memory!
print(respond_with_memory("Help me set up a new Python project", agent_id))
# Response will mention Python 3.12, type hints, pytest — all from memory!
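The 0.4 score cutoff inside respond_with_memory is doing quiet but important work: it keeps weakly related memories out of the prompt. Factoring that logic into a standalone helper makes the cutoff easy to tune and test in isolation (the helper name and default threshold are illustrative choices, not SDK API):

```python
from types import SimpleNamespace


def build_memory_context(results, min_score=0.4):
    """Format memories above a relevance cutoff as a prompt preamble."""
    lines = [f"- {r.content}" for r in results if r.score >= min_score]
    if not lines:
        return ""
    return "You know the following about this user:\n" + "\n".join(lines) + "\n\n"


# Demo with stand-in result objects (real code would pass results.data):
memories = [
    SimpleNamespace(content="User prefers dark mode interfaces", score=0.82),
    SimpleNamespace(content="Unrelated low-relevance note", score=0.12),
]
context = build_memory_context(memories)
# Only the 0.82 memory clears the 0.4 cutoff
```

Raising min_score makes the agent stricter about what counts as relevant; lowering it pulls in more context at the cost of noise.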
Step 5: Track Conversations with Sessions
Sessions group related memories together and provide a timeline of interactions:
# Start a session
session = client.sessions.create(
    agent_id=agent_id,
    title="Onboarding call",
)

# Store memories within the session
client.memories.create(
    content="User's name is Jordan. Works at TechFlow as a senior engineer.",
    agent_id=agent_id,
    session_id=str(session.id),
    metadata={"type": "fact", "category": "identity"},
)
client.memories.create(
    content="User is building a real-time data pipeline with Kafka and Flink",
    agent_id=agent_id,
    session_id=str(session.id),
    metadata={"type": "fact", "category": "project"},
)
client.memories.create(
    content="User wants weekly check-ins on Mondays at 10am EST",
    agent_id=agent_id,
    session_id=str(session.id),
    metadata={"type": "preference", "category": "scheduling"},
)

# End the session with a summary
client.sessions.end(
    session_id=str(session.id),
    summary="Onboarding call with Jordan from TechFlow. Discussed their Kafka/Flink "
    "pipeline project and set up weekly Monday check-ins.",
)
print(f"Session ended: {session.id}")
Session summaries are themselves searchable. When you end a session with a good summary, future queries like "what did we discuss last time?" can match against it.
Step 6: Build a Knowledge Graph with Entities
Entities let you organize memories around people, projects, organizations, and concepts:
# Create entities
jordan = client.entities.create(
    name="Jordan",
    entity_type="person",
    agent_id=agent_id,
    metadata={"role": "senior engineer", "company": "TechFlow"},
)
techflow = client.entities.create(
    name="TechFlow",
    entity_type="organization",
    agent_id=agent_id,
)
pipeline_project = client.entities.create(
    name="Real-time Data Pipeline",
    entity_type="project",
    agent_id=agent_id,
    metadata={"tech": ["kafka", "flink"]},
)
print(f"Created entities: Jordan ({jordan.id}), TechFlow ({techflow.id}), Pipeline ({pipeline_project.id})")
Link memories to entities to build the graph:
# Search for memories about Jordan to link them
jordan_memories = client.memories.search(
    query="Jordan TechFlow engineer",
    agent_id=agent_id,
    limit=5,
)

for memory in jordan_memories.data:
    if memory.score >= 0.5:
        client.entities.link(
            entity_id=str(jordan.id),
            memory_id=str(memory.id),
            relationship="mentioned_in",
        )
        print(f"Linked memory to Jordan: {memory.content[:60]}...")
MemoryRelay can automatically extract entities from memory content. Check the entities field on memory responses to see auto-detected people, places, and organizations.
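Since the exact shape of that entities field isn't documented in this guide, a defensive accessor avoids assumptions about whether it is absent, None, or an empty list (the helper name is illustrative, not SDK API):

```python
from types import SimpleNamespace


def auto_entities(memory):
    """Return auto-detected entities from a memory response, if any.

    Treats a missing or empty `entities` field as "none detected",
    since this guide does not document the field's exact shape.
    """
    return list(getattr(memory, "entities", None) or [])


# Demo with stand-in memory objects:
with_ents = SimpleNamespace(content="Jordan works at TechFlow",
                            entities=["Jordan", "TechFlow"])
without = SimpleNamespace(content="User prefers dark mode")
detected = auto_entities(with_ents)
none_detected = auto_entities(without)
```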
Step 7: Check Agent Stats
Monitor how your agent's memory is growing:
stats = client.agents.stats(agent_id=agent_id)
print(f"Total memories: {stats.memory_count}")
print(f"Total entities: {stats.entity_count}")
print(f"Total sessions: {stats.session_count}")
Full Working Example
Here is the complete agent as a runnable script:
"""preference_agent.py — AI agent with persistent memory via MemoryRelay."""
import os
from memoryrelay import MemoryRelay
from openai import OpenAI
def main():
mr = MemoryRelay(api_key=os.environ["MEMORYRELAY_API_KEY"])
oai = OpenAI()
# Create or reuse an agent
agent = mr.agents.create(name="preference-agent")
agent_id = str(agent.id)
print(f"Agent ID: {agent_id}\n")
# Start a session
session = mr.sessions.create(agent_id=agent_id, title="Interactive chat")
session_id = str(session.id)
print("Chat with the preference-learning agent.")
print("Commands: /remember <fact> | /search <query> | /stats | /quit\n")
while True:
user_input = input("You: ").strip()
if not user_input:
continue
# --- Special commands ---
if user_input.lower() == "/quit":
mr.sessions.end(session_id=session_id, summary="Interactive chat session.")
print("Session ended. Goodbye!")
break
if user_input.startswith("/remember "):
fact = user_input[len("/remember "):]
mr.memories.create(
content=fact,
agent_id=agent_id,
session_id=session_id,
metadata={"type": "user_provided"},
)
print(f"Stored: {fact}\n")
continue
if user_input.startswith("/search "):
query = user_input[len("/search "):]
results = mr.memories.search(query=query, agent_id=agent_id, limit=5)
if not results.data:
print("No memories found.\n")
else:
for r in results.data:
print(f" [{r.score:.2f}] {r.content}")
print()
continue
if user_input.lower() == "/stats":
stats = mr.agents.stats(agent_id=agent_id)
print(f" Memories: {stats.memory_count}")
print(f" Entities: {stats.entity_count}")
print(f" Sessions: {stats.session_count}\n")
continue
# --- Memory-augmented response ---
results = mr.memories.search(query=user_input, agent_id=agent_id, limit=5)
memory_lines = [f"- {r.content}" for r in results.data if r.score >= 0.4]
memory_context = ""
if memory_lines:
memory_context = (
"You know the following about this user:\n"
+ "\n".join(memory_lines)
+ "\n\n"
)
completion = oai.chat.completions.create(
model="gpt-4o",
messages=[
{
"role": "system",
"content": (
"You are a helpful personal assistant with memory. "
"Use what you know to personalize responses. Be concise.\n\n"
f"{memory_context}"
),
},
{"role": "user", "content": user_input},
],
)
response = completion.choices[0].message.content
print(f"Bot: {response}\n")
# Store the exchange
mr.memories.create(
content=f"User: {user_input}\nAssistant: {response}",
agent_id=agent_id,
session_id=session_id,
metadata={"type": "conversation"},
)
if __name__ == "__main__":
main()
Run it:
python preference_agent.py
Example session:
Agent ID: 8f3a1b2c-...
Chat with the preference-learning agent.
Commands: /remember <fact> | /search <query> | /stats | /quit
You: /remember I prefer dark mode and use VS Code
Stored: I prefer dark mode and use VS Code
You: /remember I'm a Python developer working on data pipelines
Stored: I'm a Python developer working on data pipelines
You: Can you suggest a good VS Code extension for my work?
Bot: Since you're working on data pipelines in Python, I'd recommend:
- **Pylance** — best-in-class Python IntelliSense
- **Python Indent** — fixes Python indentation
- **Data Wrangler** — preview and explore data inline
All work great with your dark mode setup.
You: /search data pipelines
[0.89] I'm a Python developer working on data pipelines
[0.62] User: Can you suggest a good VS Code extension...
You: /stats
Memories: 4
Entities: 0
Sessions: 1
You: /quit
Session ended. Goodbye!
Key Concepts Recap
| Concept | What It Does | SDK Method |
|---|---|---|
| Agent | Isolated memory namespace for one bot/user | client.agents.create() |
| Memory | A piece of information with an embedding | client.memories.create() |
| Search | Semantic similarity lookup across memories | client.memories.search() |
| Session | Groups memories from a single interaction | client.sessions.create() |
| Entity | Knowledge graph node (person, project, org, etc.) | client.entities.create() |
| Metadata | Structured tags on memories for filtering | metadata={"type": "..."} param |