As AI agents become more integrated into our workflows, their ability to remember past interactions is essential for providing personalized and intelligent assistance. While simple memory gives an agent context within a single conversation, the real value comes from a long-term memory system that functions like a human's, recalling facts and relationships over time. One proven way to build such a system is with knowledge graphs.
This guide walks through the evolution of an AI agent's memory system, as demonstrated by a set of automated n8n workflows. We will explore three distinct methods, from a basic setup to a highly optimized hybrid model, using tools like Zep and PostgreSQL to build a truly intelligent agent.
The Foundation: A Simple Memory and Its Limits
The first part of the workflow demonstrates the most basic approach to AI memory, labeled “Zep Memory”.
Download workflow: https://romhub.io/n8n/ZEP_Memory
In this setup, a `Telegram Trigger` node captures a user's message. The message is sent to an `AI Agent` node, which is directly connected to a `Zep` memory node responsible for storing and retrieving the conversation history. The agent processes the request and sends a response back to the user via Telegram.
However, this method has a significant drawback, as hinted by a sticky note in the workflow labeled “Ew… simple memory”. The native Zep integration sends the entire conversation history to the AI model with every new message. This can lead to high token consumption and increased costs, especially as the conversation grows longer. It also risks confusing the model with old, irrelevant information.
The Optimized Approach: Selective Memory with HTTP Requests
To overcome the limitations of the basic setup, the workflow introduces a more sophisticated method labeled “Zep Memory (HTTP)”. This approach uses direct HTTP requests to the Zep API, allowing for precise control over what data is sent to the AI model.
This refined process works as follows:
- Parallel Data Retrieval: When the `Telegram Trigger1` node receives a message, it initiates two parallel requests to gather different types of memory.
- Retrieve Short-Term Context: A `Context Window` node sends an HTTP GET request to Zep's `/messages` endpoint to retrieve the last 10 messages, which serve as the short-term conversation history.
- Query Long-Term Knowledge Graph: Simultaneously, a `User Graph` node sends an HTTP POST request to Zep's `/graph/search` endpoint. The query is highly specific: it searches the user's knowledge graph for the top 3 facts with a relevance score of 0.7 or higher against the current message.
- Clean and Format Data: The raw JSON output from both requests is processed by `Code` nodes, which use JavaScript to format the conversation history and extract only the relevant facts.
- Merge and Send to Agent: A `Merge` node combines the cleaned conversation history and the filtered facts. This focused context is then passed to the `AI Agent 2` node within a structured system prompt: `Here is some additional information about Nate:\n{{ $json.facts.join("\\n") }}\n\n5 most recent interactions with Nate:\n{{ $json.conversations.join("\\n\\n") }}`. This ensures the model only receives the most pertinent information.
- Update Memory: After the agent generates a response, an `Add Memory` node sends a POST request back to Zep's `/memory` endpoint, saving the new interaction to update the knowledge graph for future use.
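The filtering and formatting performed by the `Code` nodes can be sketched in plain JavaScript. This is a minimal sketch, not the workflow's actual code; the response field names (`edges`, `fact`, `score`, `role`, `content`) are assumptions about the shape of Zep's `/graph/search` and `/messages` payloads, so adapt them to the real responses:

```javascript
// Sketch of the fact-filtering step: keep only facts scoring at or
// above the threshold, best-first, capped at `limit`. Field names
// (edges, fact, score) are assumed, not taken from Zep's docs.
function selectRelevantFacts(searchResults, minScore = 0.7, limit = 3) {
  return (searchResults.edges || [])
    .filter((edge) => edge.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((edge) => edge.fact);
}

// Sketch of the history-formatting step: flatten each message into a
// "role: content" line for the system prompt.
function formatHistory(messages) {
  return messages.map((m) => `${m.role}: ${m.content}`);
}

// Example inputs shaped like the (assumed) API responses.
const facts = selectRelevantFacts({
  edges: [
    { fact: "Nate prefers dark mode", score: 0.91 },
    { fact: "Nate lives in Austin", score: 0.74 },
    { fact: "Nate once asked about CSV import", score: 0.42 },
  ],
});

const conversations = formatHistory([
  { role: "user", content: "How do I export my data?" },
  { role: "assistant", content: "Use the Export button under Settings." },
]);
```

In n8n, logic like this would live inside the `Code` nodes, with the input objects arriving from the upstream HTTP request nodes, and the resulting `facts` and `conversations` arrays feeding the `{{ $json.facts }}` and `{{ $json.conversations }}` expressions in the system prompt.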
This selective filtering method drastically reduces token usage, making the agent more efficient and cost-effective without sacrificing the benefits of long-term memory.
The Hybrid Approach: Combining Zep and PostgreSQL
The final workflow, “Zep & Postgres Hybrid,” presents the most advanced strategy by combining the strengths of two different systems.
- Zep for Long-Term Memory (Knowledge Graph): Zep is used exclusively for what it does best, managing the complex, relational data of the user's knowledge graph. This is handled by the `User Graph1` HTTP request node.
- PostgreSQL for Short-Term Memory (Context Window): A `Postgres Chat Memory` node handles the short-term conversation history. It connects directly to the `AI Agent 3` node and provides the most recent conversational turns.
In this hybrid model, the AI agent receives long-term facts from Zep via its system prompt while getting its short-term conversational context from PostgreSQL. After the interaction, the memory is updated in both systems: an `Add Memory1` node sends the new information to Zep to enrich the knowledge graph, while the `Postgres Chat Memory` node automatically saves the conversation to its database.
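The Zep side of that update can be sketched as a plain HTTP request. This is a hedged sketch only: the full URL, the session-scoped path, and the body fields (`role_type`, `content`) are assumptions about Zep's memory API rather than details taken from the workflow, so verify them against your Zep version's documentation:

```javascript
// Sketch of the "Add Memory" step: build the POST request that saves
// the latest user/assistant exchange to Zep. URL and body shape are
// assumptions, not confirmed API details.
function buildMemoryUpdate(sessionId, userText, assistantText) {
  return {
    method: "POST",
    url: `https://api.getzep.com/api/v2/sessions/${sessionId}/memory`,
    body: {
      messages: [
        { role_type: "user", content: userText },
        { role_type: "assistant", content: assistantText },
      ],
    },
  };
}

// Example: the Telegram chat ID doubles as the Zep session ID.
const update = buildMemoryUpdate(
  "123456789",
  "What's my name?",
  "You're Nate."
);
```

In n8n this request would be issued by the `Add Memory1` HTTP request node; the PostgreSQL side needs no equivalent, since the `Postgres Chat Memory` node persists the turn automatically.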
Real-World Applications and Session Management
The key to making these memory systems work for individual users is Session Management. In all three workflows, the system uniquely identifies each user by using their Telegram Chat ID as the `session_id` or `sessionKey`. This ensures that the memory, whether the conversation history or the knowledge graph, is kept separate and distinct for each user.
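Deriving that key is a one-liner. The sketch below uses the standard Telegram Bot API update shape (`message.chat.id`); in n8n it would typically be an expression such as `{{ $json.message.chat.id }}` rather than a separate function:

```javascript
// Per-user session scoping: the Telegram chat ID becomes the Zep
// session_id / sessionKey, so each user's memory stays isolated.
function sessionIdFromTelegram(update) {
  // chat.id is numeric in Telegram updates; Zep session keys are strings.
  return String(update.message.chat.id);
}

const sessionId = sessionIdFromTelegram({
  message: { chat: { id: 987654321 }, text: "Hello!" },
});
```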
This capability for per-user memory unlocks a wide range of powerful applications:
- Personalized Customer Service: An agent can recall a customer’s entire interaction history and product preferences to provide faster, more relevant support.
- Customized Onboarding: An onboarding bot can remember a new employee’s progress, questions, and learning pace to tailor the experience.
- Adaptive Tutoring: An educational agent can track a student’s strengths and weaknesses over time to adjust its teaching methods accordingly.
By combining this persistent, user-specific memory with an agent’s other capabilities, we can build assistants that are not just responsive, but truly intelligent and personalized partners.
Conclusion
The evolution from a simple, send-everything memory to an optimized, hybrid system demonstrates a critical principle in building advanced AI agents: controlling the context is key. By selectively retrieving relevant long-term facts and recent conversation history, you can create powerful, cost-effective, and scalable agents that learn and grow with every interaction.