Does SillyTavern have unlimited memory?

No. SillyTavern memory is limited by the context window of your chosen API model. It uses summaries and vector retrieval to extend recall, but it cannot retain information beyond the token limit without manual configuration.

How do I enable long-term memory in SillyTavern?

Enable the ChromaDB vector database in Settings > Vector Storage. You need a running ChromaDB instance and an embedding model (e.g., OpenAI's ada-002). Then set retrieval count and similarity threshold to automatically inject relevant past messages.

What is the default context size in SillyTavern?

The default context size is 4096 tokens. You can increase it in the user settings, but you must ensure your API provider supports that limit. Common safe values are 8192 or 16384.

Can I use SillyTavern memory without an API key?

No. SillyTavern is a front-end that requires a backend API (e.g., OpenAI, KoboldAI, or local text-generation-webui) to generate responses. Memory features like ChromaDB also need an embedding API key or a local model.

How do I add permanent facts to SillyTavern memory?

Use the character note. Edit it in the character settings to include key facts, relationship status, or plot points. It is included in every prompt and won't be forgotten as long as it fits within the token limit.

Does SillyTavern automatically summarize old messages?

Yes, if you enable 'Automatic Summary' in the settings. It generates a summary every N messages (default 10) and injects it into the context. You can adjust the frequency and summary length.

Why does my SillyTavern AI forget things after a few messages?

Most likely your context size is too small, or automatic summary is disabled. Increase context size to at least 8192, enable summaries, and consider using ChromaDB for long-term recall.

Is SillyTavern memory better than Character.AI memory?

SillyTavern offers more control (context size, summaries, vector DB) but requires setup. Character.AI has a simpler, automatic memory that degrades over time. SillyTavern can be more powerful if configured well, but it's not plug-and-play.

SillyTavern Memory: AIAngels vs. Setup Friction

How SillyTavern Memory Works: Context, Summaries, and Vectors

SillyTavern's memory does not work on its own; you have to configure it. It combines four mechanisms: the context window, the automatic summary, the character note, and optional vector storage. The context window is the number of tokens the AI can 'see' at once (default 4096, adjustable to 8192 or more depending on your API provider). Once a chat outgrows it, older messages fall out. To compensate, SillyTavern can generate a summary of the conversation every N messages (configurable, default 10) when Automatic Summary is enabled, and inject that summary into the context. The character note is a short, always-present block of text for key facts, relationship status, or plot points. For long-term memory, you can enable ChromaDB, a vector database that stores past messages as embeddings and retrieves the relevant ones when needed. This requires an embedding source (an API key or a local model) and a local or cloud Chroma instance. Without that extra setup, memory is limited to what fits in the context window plus the summary.
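The layering described above can be sketched in a few lines of Python. This is an illustration only, not SillyTavern's actual code: the function names `should_summarize` and `build_prompt`, the bracketed section labels, and the ordering of the blocks are all assumptions made for the example.

```python
from typing import List, Optional


def should_summarize(message_count: int, interval: int = 10) -> bool:
    """Return True when a new summary is due (every `interval` messages)."""
    return message_count > 0 and message_count % interval == 0


def build_prompt(
    character_note: str,
    summary: Optional[str],
    retrieved: List[str],
    recent_messages: List[str],
) -> str:
    """Concatenate the memory layers in a fixed order (hypothetical layout)."""
    parts = [f"[Character note]\n{character_note}"]
    if summary:
        parts.append(f"[Summary so far]\n{summary}")
    if retrieved:
        parts.append("[Relevant past messages]\n" + "\n".join(retrieved))
    parts.append("[Recent chat]\n" + "\n".join(recent_messages))
    return "\n\n".join(parts)


prompt = build_prompt(
    character_note="Mira is a retired pilot. She distrusts the Guild.",
    summary="Mira and the user escaped the port city by night.",
    retrieved=["User: I promised to return the compass."],
    recent_messages=["User: Where are we heading?", "Mira: North, to the pass."],
)
print(prompt)
```

The point of the sketch is that only the last block changes on every turn; the note, summary, and retrieved messages are what carry information forward once older messages have left the context window.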

“SillyTavern memory refers to the system of context management, character notes, and vector database features that allow the AI to retain information across conversations. It relies on manual configuration of memory size (token limits), summary injection, and optional use of ChromaDB for long-term recall, but has no built-in persistent memory without user setup.”

Configuring Context Size and Token Limits for Better Recall

The most direct way to improve SillyTavern memory is to increase the context size. In the user settings, set 'Context Size' to a value your API provider supports. For example, OpenAI GPT-4 Turbo supports 128,000 tokens, while Claude 3 Opus supports 200,000. Larger contexts increase cost and latency, though, because more tokens are sent with every request. SillyTavern also lets you set a 'Response Length' limit separately. A common recommendation is a context of 8192 and a response length of 1024 as a balance between recall and cost. The 'Max Tokens' setting in the prompt generation menu controls how many tokens are actually sent to the API; setting it lower than the context size saves money but costs memory. You should also adjust the 'Summary' settings: how often summaries are generated (every 5-20 messages) and whether the summary is placed at the beginning or end of the context. For roleplay, placing the summary at the top is often better, as it primes the AI before it reads the recent messages.
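To make the trade-off concrete, here is a rough sketch of the token arithmetic, assuming the 8192 / 1024 values above and a hypothetical 512 tokens reserved for the summary and note. The four-characters-per-token estimate is a crude heuristic chosen for the example; real tokenizers vary by model, and none of these function names come from SillyTavern.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Very rough heuristic: about 4 characters per token (assumption)."""
    return max(1, len(text) // 4)


def history_budget(context_size: int, response_length: int, reserved: int) -> int:
    """Tokens left for chat history after reserving room for the reply and fixed blocks."""
    return context_size - response_length - reserved


def trim_history(messages: List[str], budget: int) -> List[str]:
    """Keep the newest messages that fit in `budget` tokens, dropping the oldest first."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


budget = history_budget(context_size=8192, response_length=1024, reserved=512)
print(budget)  # 6656
```

Under these assumptions, roughly 6,656 tokens remain for chat history. Anything older than what `trim_history` keeps is exactly the material that summaries or vector retrieval would have to bring back.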

Using the Character Note as Persistent Memory

The character note is a small text field (usually 500-2000 characters) that is included in every prompt sent to the API, so it acts as a persistent memory bank. You can update it manually during a conversation to record important events, character traits, or relationship changes. For example, if your character reveals a secret, you can add 'Secret: character X knows Y' to the note. Unlike the automatic summary, the character note is not rewritten or compressed over time, and it stays in the prompt as long as you keep it within the token limit. Many users treat the note like a small lorebook, listing key facts about the setting and characters. Because it is static unless you edit it, it does not capture new information automatically. To make it more dynamic, you can use SillyTavern's 'Regex Scripts' or 'Quick Replies' to prompt the AI to suggest updates to the note, then apply them manually.
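If you keep the note as a list of short facts, a small helper can guard against it growing past your budget. The sketch below is a hypothetical external helper, not a SillyTavern feature: `add_fact`, `render_note`, and the 2000-character cap (taken from the upper end of the range mentioned above) are assumptions for illustration.

```python
from typing import List

NOTE_CHAR_LIMIT = 2000  # hypothetical cap, not a SillyTavern constant


def add_fact(facts: List[str], fact: str) -> List[str]:
    """Return a new fact list with `fact` appended, skipping exact duplicates."""
    if fact in facts:
        return list(facts)
    return facts + [fact]


def render_note(facts: List[str], limit: int = NOTE_CHAR_LIMIT) -> str:
    """Join facts into one note, raising if the result would exceed the limit."""
    note = "\n".join(facts)
    if len(note) > limit:
        raise ValueError(f"Note is {len(note)} characters; limit is {limit}")
    return note


facts = ["Setting: a harbor town in winter."]
facts = add_fact(facts, "Secret: character X knows Y")
print(render_note(facts))
```

You would then paste the rendered text into the character note field by hand. Failing loudly when the note is too long is a deliberate choice here: silently dropping facts would defeat the purpose of a persistent note.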

Real monthly cost and features: AIAngels vs SillyTavern
Feature	AIAngels	SillyTavern
Free tier	Unlimited free text chat with all AI companions, no credit card	Free, open-source front-end; the backend API or local hardware is separate
Real monthly cost (active)	$0 or $2.99/mo annual flat	Depends on backend API token usage
Image generation	Included on premium	Requires a separately configured image backend
Voice messages	Included on premium	Requires a separately configured TTS provider
Memory persistence	Permanent, never resets	Limited by the context window unless summaries and vector storage are configured
Filter / restrictions	Uncensored for verified adults	Depends on the backend model
Public promo code	Not needed (75% off baked in)	Not applicable


Setting Up ChromaDB for Long-Term Vector Memory

ChromaDB integration gives SillyTavern a long-term memory that can recall messages from days ago. To set it up, run a ChromaDB server locally or host one remotely. In SillyTavern's settings, open the 'Vector Storage' tab, enable Chroma, and provide the endpoint URL. You also need an embedding model: SillyTavern supports OpenAI's text-embedding-ada-002 or local models via the text-generation-webui extension. Once configured, every message from the AI and the user can be embedded and stored. When generating a response, SillyTavern can retrieve the top K most relevant past messages (configurable, default 5) and inject them into the context, which effectively gives the AI a searchable memory. The downsides: embedding costs money per token (if using OpenAI), and retrieval is imperfect, so it can pull up irrelevant messages. Tuning the 'similarity threshold' and 'max retrieval count' is essential to avoid noise.
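The interaction between retrieval count and similarity threshold is easier to see in a toy example. The sketch below uses plain Python and hand-written three-dimensional vectors in place of a real embedding model and a real ChromaDB instance; it illustrates the general top-K-with-threshold technique and is not how SillyTavern or ChromaDB implement it. The `retrieve` function and the sample data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query_vec: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    top_k: int = 5,
    threshold: float = 0.75,
) -> List[str]:
    """Return up to `top_k` stored messages whose similarity meets `threshold`."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    relevant = [pair for pair in scored if pair[0] >= threshold]
    relevant.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return [text for _, text in relevant[:top_k]]


# Toy "embeddings": real ones have hundreds of dimensions.
store = [
    ("User: My sister's name is Ana.", [0.9, 0.1, 0.0]),
    ("Mira: The storm sank two ships.", [0.0, 0.2, 0.9]),
    ("User: Ana lives in the capital.", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]  # stands in for an embedded question about Ana
print(retrieve(query, store, top_k=2))
```

With these vectors, the two messages about Ana pass the threshold and the message about the storm is filtered out. Raising the threshold trims noise but risks dropping useful matches; lowering it does the opposite, which is why both knobs need tuning together.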

Limitations of SillyTavern Memory: Token Ceilings and No Native Persistence

Despite its flexibility, SillyTavern's memory has hard limits. The context window is finite: even with a 128k-token model, extremely long conversations will eventually lose early details. The automatic summary helps but is lossy, since a summary is a compression, not a perfect record. ChromaDB retrieval is statistical, not deterministic; it may miss crucial context if the embedding similarity isn't high enough. Furthermore, SillyTavern does not have a built-in 'permanent memory' that persists across chat sessions. Each conversation is isolated unless you manually copy character notes or summaries between them. The system also requires significant manual tinkering: you must adjust context sizes, summary intervals, note content, and vector settings yourself. For users who want a 'set and forget' memory solution, SillyTavern's reliance on API keys, local hosting, and constant configuration can be a barrier. Alternatives like AIAngels offer persistent memory that works out of the box, with no token juggling or local setup required.

Practical Tips for Maximizing SillyTavern Memory

To get the most out of SillyTavern's memory, start by setting the context size as high as your API allows; if you are unsure of the limit, 8192 is a safe bet for most users. Enable 'Automatic Summary' with a frequency of every 8-10 messages and set the summary length to 512 tokens. Place the summary at the start of the context in the 'Prompt Item Order' settings. Use the character note to store immutable facts and update it manually after major plot events. If you're tech-savvy, set up ChromaDB with a local embedding model (like all-MiniLM-L6-v2) to avoid API costs. Tune the retrieval count to 3-5 and the similarity threshold to 0.75. Finally, consider using 'World Info' (SillyTavern's lorebook feature) to store lore that the AI can reference. Remember that no amount of configuration can fully overcome finite context windows. For conversations exceeding 100k tokens, consider splitting into separate chats or using an external memory tool like MemGPT.
