AI agents are systems that pursue and achieve goals by harnessing the reasoning capabilities of large language models (LLMs) to plan, observe, and execute actions. In 2025, AI agents are expected to drive transformative changes across the workforce, from supporting employees in their daily work to deploying digital humans for critical business functions, and even replacing enterprise SaaS as we know it.
Building agentic systems is still an evolving field, and it comes with challenges that researchers and industry experts are actively working to solve. One key challenge is developing models specialized in reasoning tasks, as opposed to the language-focused tasks like summarization that characterized the first wave of GenAI apps. Another critical hurdle is managing the memory of AI agents, which often requires adopting sophisticated methodologies to achieve the desired level of agentic performance.
Memory is key to making AI agents work. This guide covers why it matters, the different types, best practices for managing it, and why Redis stands out as the ideal data platform for agentic memory. We’ll also cover practical implementation to help you integrate agentic memory effectively.
AI agent memory is crucial for enhancing efficiency and capabilities because large language models (LLMs) do not inherently remember anything between calls; they are stateless. Memory allows AI agents to learn from past interactions, retain information, and maintain context, leading to more coherent and personalized responses.
Imagine an AI agent designed to plan and book work trips. Without memory, it won’t remember personal preferences (e.g., “Do you like direct flights or flights with layovers?”); it will make procedural mistakes due to a lack of understanding (e.g., booking a hotel that doesn’t offer the amenities required for business trips, like meeting rooms or reliable Wi-Fi); and it will fail to recall previously provided details like passport information. This leads to a frustrating user experience with repetitive questions, inconsistent behavior, and a lack of personalization.
AI agents, like humans, rely on both short-term and long-term memory to function effectively.
Short-term memory works like a computer’s RAM, holding onto relevant details for an ongoing task or conversation. This working memory exists only within a conversation thread and is usually limited, both by the constrained context windows of LLMs and by the need to filter out less relevant information. That’s where agentic frameworks like LangGraph come in: they simplify short-term memory management with tools like Checkpointers, which maintain thread-specific context and let agents store short-term memory efficiently in high-performance databases like Redis.
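As a minimal sketch of that pattern, here’s a LangGraph agent whose thread-scoped state is checkpointed in Redis. It assumes the langgraph-checkpoint-redis package, which provides a RedisSaver checkpointer; the node logic and the thread_id are illustrative placeholders, not a prescribed setup.

```python
# Sketch: a LangGraph agent with short-term memory checkpointed in Redis.
# Assumes the langgraph-checkpoint-redis package is installed.
from langgraph.checkpoint.redis import RedisSaver
from langgraph.graph import StateGraph, MessagesState, START, END

def respond(state: MessagesState):
    # Placeholder node; a real agent would call an LLM here.
    return {"messages": [("ai", "Noted - I'll keep that in mind for this thread.")]}

builder = StateGraph(MessagesState)
builder.add_node("respond", respond)
builder.add_edge(START, "respond")
builder.add_edge("respond", END)

with RedisSaver.from_conn_string("redis://localhost:6379") as checkpointer:
    checkpointer.setup()  # create the Redis indices on first use
    graph = builder.compile(checkpointer=checkpointer)

    # Everything under this thread_id is checkpointed and restored automatically,
    # so follow-up turns in the same thread see the earlier messages.
    config = {"configurable": {"thread_id": "trip-planning-42"}}
    graph.invoke({"messages": [("user", "I prefer direct flights.")]}, config)
```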
Long-term memory works more like a hard drive, storing vast amounts of information to be accessed later. This is information that persists across multiple task runs or conversations, allowing agents to learn from feedback and adapt to user preferences. These memories can further be divided into three types (for more on the nuances of different memory types, we suggest reviewing the well-known CoALA framework paper):

- Semantic memory: facts and knowledge about the world or the user, such as a user’s home airport.
- Episodic memory: recollections of specific past interactions or events, such as a trip the agent previously booked.
- Procedural memory: knowledge of how to perform tasks, such as the agent’s instructions or learned rules.
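As a quick illustration of how typed memories might be laid out in Redis, the sketch below tags each record with its CoALA-style type using redis-py’s JSON support. The key layout and field names are illustrative assumptions, not a required schema.

```python
# Sketch: storing long-term memories tagged by type (requires RedisJSON).
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

memories = [
    {"type": "episodic", "text": "Booked the Hilton Midtown for the March trip."},
    {"type": "semantic", "text": "User's home airport is SFO."},
    {"type": "procedural", "text": "Always confirm visa requirements before booking."},
]

for i, memory in enumerate(memories):
    # Storing the type as a field lets retrieval filter by memory kind later.
    r.json().set(f"memory:user:42:{i}", "$", memory)
```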
Managing long-term memory is complex: you have to decide which types of memories to store, what information to extract, how to decay older memories, and how to retrieve them effectively into working memory.
There are four high-level decisions you need to make when planning your memory management architecture:

- What types of memories your agent needs to store
- How to store those memories efficiently
- How to retrieve relevant memories into working memory
- How to decay memories that are no longer useful
The types of memories you need to store and manage depend on the type of application. For example, a conversational AI agent would be expected to remember information across threads about user preferences (and therefore store episodic memory). On the other hand, a retail AI assistant may be required to store information about products and recall relevant facts from a product knowledge database (and therefore store semantic memory).
Given constraints in LLM context windows and the risk of context pollution, it’s critical to store memories efficiently. There are four common strategies we see developers use to store relevant memories. These techniques are not mutually exclusive, and for most production deployments we expect AI agents to use a combination of them.
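As one illustration, a common approach is to have the LLM itself distill a conversation into compact memory statements before writing them to the store. The sketch below assumes the OpenAI Python client; the prompt, model name, and line-based parsing are illustrative placeholders rather than a prescribed implementation.

```python
# Sketch: distilling a conversation into compact, durable memories with an LLM.
from openai import OpenAI

client = OpenAI()

def extract_memories(conversation: str) -> list[str]:
    """Ask the LLM for a short list of facts worth remembering long-term."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Extract durable facts about the user from this "
                        "conversation. Return one fact per line, or nothing "
                        "if there is nothing worth remembering."},
            {"role": "user", "content": conversation},
        ],
    )
    content = response.choices[0].message.content or ""
    return [line.strip("- ").strip() for line in content.splitlines() if line.strip()]

facts = extract_memories("User: I always fly out of SFO and hate layovers.")
# e.g. ["The user's home airport is SFO.", "The user prefers direct flights."]
```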
Imagine you have memory chunks stored in a database like Redis, along with their embeddings and text descriptions. How does the agent know how to retrieve the most relevant memories? This is an emerging area of research. For example, the MemGPT paper uses the LLM as a query generator: the model decides when to retrieve long-term memory, generates a search query (by emitting function-calling tokens), and then uses vector search to retrieve relevant chunks. For most applications, we recommend starting with a vector search of the memory database and layering on additional sophistication as needed.
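For a concrete starting point, here’s a minimal sketch of that recommended vector search using the redisvl library. The index name, schema, field names, and embedding model are illustrative assumptions, not part of any particular deployment.

```python
# Sketch: retrieving the most relevant memories via vector search in Redis.
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery
from redisvl.utils.vectorize import OpenAITextVectorizer

schema = {
    "index": {"name": "agent_memory", "prefix": "memory"},
    "fields": [
        {"name": "text", "type": "text"},
        {"name": "embedding", "type": "vector",
         "attrs": {"dims": 1536, "distance_metric": "cosine",
                   "algorithm": "hnsw", "datatype": "float32"}},
    ],
}

index = SearchIndex.from_dict(schema, redis_url="redis://localhost:6379")
index.create(overwrite=False)

vectorizer = OpenAITextVectorizer(model="text-embedding-3-small")

# Embed the agent's question and pull back the three closest memory chunks.
query = VectorQuery(
    vector=vectorizer.embed("What are the user's flight preferences?"),
    vector_field_name="embedding",
    return_fields=["text"],
    num_results=3,
)
for memory in index.query(query):
    print(memory["text"])
```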
It’s crucial to decay stored memories in AI systems to prevent memory bloat and maintain efficiency. As an AI agent interacts over time, it accumulates a massive amount of information, some of which becomes irrelevant or outdated. Without a mechanism for forgetting, the agent’s memory becomes cluttered with useless data, leading to slower retrieval, less accurate responses, and wasted resources. If you store memories in Redis, you can use its built-in eviction and expiration strategies to manage memory decay efficiently. You can also add a timestamp as another field on the memory object and factor recency into the final search ranking.
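Here’s a minimal sketch of expiration-based decay using redis-py; the key names and the 30-day TTL are illustrative choices, not recommendations.

```python
# Sketch: letting Redis expire stale memories automatically.
import json
import time
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

memory = {
    "text": "User prefers direct flights.",
    "created_at": time.time(),  # timestamp field enables recency-aware ranking
}

# Expiration-based decay: the key disappears 30 days after its last refresh.
r.set("memory:user:42:prefs", json.dumps(memory), ex=60 * 60 * 24 * 30)

# Touching a memory on retrieval can extend its lifetime ("use it or lose it").
r.expire("memory:user:42:prefs", 60 * 60 * 24 * 30)
```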
There are several reasons why developers prefer Redis as the platform for storing and managing AI agent memories. These include:

- Speed: in-memory data access keeps memory reads and writes fast enough for real-time agent loops.
- Vector search: built-in vector search makes semantic retrieval of memories straightforward.
- Memory decay: native eviction and expiration policies handle forgetting out of the box.
- Ecosystem: integrations with agentic frameworks like LangGraph, including a Redis checkpointer for short-term memory.
We make managing memory simpler with our open source Redis Agent Memory Server.
This notebook demonstrates how to manage short-term and long-term agent memory using LangGraph and Redis. In it, we explore:
In the notebook, we build two versions of a travel agent, one that manages long-term memory manually and one that does so using tools the LLM calls.
Here are two diagrams showing the components used in both agents:
Want to make your own agent? Try the LangGraph Quickstart. Then add our Redis checkpointer to give your agent fast, persistent memory. Redis Agent Memory Server is our open source tool for managing memory for agents and AI apps.
Using Redis to manage memory for your AI Agent lets you build a flexible and scalable system that can store and retrieve memories fast. Check out the resources below to start building with Redis today, or connect with our team to chat about AI Agents.