Karpathy says LLM memory features "trying too hard"

Every major AI chatbot now remembers things about you. OpenAI's ChatGPT, Anthropic's Claude and Google's Gemini all store details from past conversations and use them to personalize future responses. In a pair of posts on X on March 25, Andrej Karpathy noted that the memory feature has an issue: models can treat stale, one-off queries as deep ongoing interests.

"One common issue with personalization in all LLMs is how distracting memory seems to be for the models," Karpathy wrote. "A single question from 2 months ago about some topic can keep coming up as some kind of a deep interest of mine with undue mentions in perpetuity. Some kind of trying too hard".

The first post drew 15,000 likes and 1.5 million views within its first day.

Users recognize the pattern

The thread quickly filled with specific examples of the behavior Karpathy described. One user reported that a background in quantitative finance caused their chatbot to shoehorn financial concepts into unrelated conversations. "No you don't need to frame my kids education in terms of Sharpe ratios to help me understand it," they wrote.

Another said a vitamin deficiency they asked about a year ago continued to surface in responses, eventually prompting them to disable memory entirely.

A researcher noted that a single mention of an interpretability method they had worked on caused Claude to reference it in every subsequent project, regardless of relevance.

Several users described turning memory off as the most reliable workaround. "Every single offering sucks. It's better to start with a clean slate or use compartmentalized projects," one wrote.

The frustration spans languages and geographies. Replies in Japanese, Korean and Spanish echoed the same complaints, with one Japanese user noting that they had disabled memory across all AI products because the feature was more hindrance than help.

A training problem, not an implementation problem

In a follow-up post, Karpathy proposed an explanation rooted in how LLMs are trained rather than how any specific product implements memory. He noted that he cycles through all major LLMs and sees the same behavior across providers.

His hypothesis: during training, most information placed in the context window is genuinely relevant to the task at hand. This teaches models to treat everything in context as important. At inference time, when a memory system retrieves old conversation fragments via RAG (retrieval-augmented generation), the model applies that same learned bias, overfitting to whatever happens to appear, regardless of whether the user still cares about it.

This is consistent with an earlier observation Karpathy made about long context windows more broadly. In a February 2025 post about the "One Thread" approach to LLM conversations, he flagged that too many tokens competing for attention can decrease performance by diffusing the signal-to-noise ratio. The memory problem may be a specific case of a more general issue: models that struggle to distinguish high-signal information from noise when given a large context.

The timing matters

Karpathy's posts arrive at a moment when AI companies are doubling down on personalization as a competitive moat. The competitive logic is clear: the more an AI assistant knows about you, the harder it is to switch.

But the feedback loop Karpathy describes suggests that more memory does not automatically mean better memory. Without mechanisms to weight recent interests over stale ones, or to distinguish between a one-off question and a genuine ongoing interest, memory features risk making models less useful rather than more.

Karpathy says LLM memory features "trying too hard"

Users recognize the pattern

A training problem, not an implementation problem

The timing matters

More in Large Language Models

Related stories

Google releases Gemini 3.1 Flash-Lite

Alibaba launches Qwen 3.5 small model series with sub-1B edge options

Claude hits #1 on the US App Store after Pentagon dispute

Gemini 3.1 Pro claims top-tier reasoning gains