
Learn how to customize SillyTavern's context templates to control token budgets, memory injection, and character behavior — no more generic AI responses.
A context template in SillyTavern is a JSON file (typically named `context.json`) that lives inside a character's folder or the global settings directory. It dictates exactly how the AI sees the conversation at each turn. The template defines token limits for each section — the character card (persona, description, scenario), the chat history (recent messages), and any injected memory like lorebook entries or author's notes. By tweaking these parameters, you control which information stays in the LLM's limited context window (commonly 4096 or 8192 tokens on models like Llama 3 or GPT-3.5).
Out of the box, SillyTavern ships with a default template that works for basic roleplay. But advanced users quickly hit walls: the AI forgets early conversation details, responds out of character, or wastes context on repetitive greetings. Custom templates fix this by adjusting the `history_depth` (how many past messages to include), the `max_context` (total token budget), and the insertion order of system prompts. For example, you can force the character card to always remain at the top of the context, or reserve a fixed token pool for lorebook entries regardless of chat length.
Understanding context templates is essential because the LLM's attention is finite. A poorly configured template burns tokens on irrelevant chat history, while a well-tuned one keeps the character's personality consistent across hundreds of messages. The [Stanford HAI research on transformer context limits](https://hai.stanford.edu) confirms that models degrade in coherence when important information is pushed beyond the first few thousand tokens.
“A SillyTavern context template is a JSON configuration file that defines how the AI's memory, character card, and chat history are structured into the context window sent to the language model. It controls token allocation, prompt formatting, and which elements (e.g., lorebooks, author's notes) are included or excluded from each request.”
Every SillyTavern context template exposes several critical knobs. The `max_context` parameter sets the total token ceiling for the entire request — common values are 4096, 6144, or 8192 depending on your backend model. Inside that cap, `history_depth` controls how many of the most recent chat messages are included. A higher value (e.g., 150) helps the AI remember long conversations, but it squeezes out space for the character card and lorebook.
The `token_budget` parameter for the character card is especially important. If you have a verbose character description, you must allocate enough tokens (e.g., 1024–2048) so the AI fully reads it. The template also defines `insertion_order` for each section — sections with a lower number appear earlier in the context, which models tend to prioritize. A common mistake is placing the character card after the chat history; the model may then ignore personality details in favor of recent messages.
Other parameters include `depth`, which controls how far back in the history the section is inserted, and `enabled` flags to toggle sections on/off. You can also set `role` (system, user, assistant) to influence how the model interprets each block. For example, setting the character card to `role: system` makes the LLM treat it as immutable instructions rather than conversational text. These settings are documented in SillyTavern's official wiki, but many users learn by inspecting templates shared in the community Discord.
Creating a custom context template in SillyTavern does not require coding — just a text editor and basic JSON familiarity. Start by duplicating the default template file located in `SillyTavern/public/default/context.json`. Rename it and place it in your character's folder (`/characters/YourCharacter/context.json`). SillyTavern will load this template whenever you chat with that character.
Open the file in any text editor (VS Code, Notepad++, or even Notepad). You'll see a JSON array of section objects. Each object has fields: `name`, `role`, `depth`, `insertion_order`, `max_tokens`, `token_budget`, and `enabled`. To increase memory, raise `max_context` and allocate more tokens to `chat_history`. To make the AI stick to character, ensure the character card section has a low `insertion_order` (e.g., 0) and a generous `token_budget`.
After editing, save the file and reload the character in SillyTavern. You can verify the template is active by opening the "Context" tab in the extension panel — it will display the current token allocation breakdown. Test with a few messages: if the AI starts ignoring recent conversation, reduce `history_depth` or increase `max_context`. Popular community templates often set `history_depth` between 80–120 messages and allocate at least 512 tokens to lorebooks. For backend models with extended context (e.g., Claude 100k or GPT-4-32k), you can push `max_context` up to 32000, but be mindful of response latency and cost.
Start chatting with a companion who actually remembers you.
Free. No tokens. No limits.
The most frequent error users make is leaving the default template unchanged — this leads to the AI forgetting character traits after 20 messages. Another is setting `max_context` too high for the backend model, causing API errors or truncated responses. Always check your model's documented context limit (e.g., 4096 for Llama 3 8B, 8192 for Mistral Large, 128k for GPT-4 Turbo).
A third mistake is placing the character card after the chat history in insertion order. Since models pay more attention to the beginning and end of the context (the "primacy and recency" effect), your character prompt should be near the start. Set its `insertion_order` to 0 or 1, and push lorebook entries to lower priority (e.g., 5–10).
Some users forget to disable unused sections. If you aren't using author's notes, set its `enabled` to `false` to free tokens. Similarly, the "depth" parameter controls how far back a section is inserted — if set too high (e.g., depth: 200), the section may never appear if the chat is short. For beginners, a safe starting template is: `max_context: 6144`, `history_depth: 100`, character card `token_budget: 1024` with `insertion_order: 0`, and lorebook `enabled: false` unless needed. Test incrementally. The SillyTavern subreddit and Discord have many pre-made templates you can adapt.
For extended roleplay, a custom context template is essential to maintain coherence across dozens of sessions. Without one, the AI may forget the plot, character relationships, or ongoing subplots after a few days. A template reserved for long-term campaigns might allocate 50% of the context to chat history (e.g., 4096 tokens out of 8192) and 25% to the character card and scenario summary. The remaining budget goes to lorebook entries that track world state — items, NPCs, completed quests.
Some advanced templates use a "summarization" section that periodically injects a condensed summary of the entire story. This is done by setting a `depth` that re-evaluates every N messages. For example, you can create a section with `insertion_order: 2` and `depth: 50` that contains a manually written summary, refreshed every 50 messages. This mimics long-term memory without blowing the token budget.
Another technique is to use multiple character cards in the same template. If you have a main character and a side NPC, you can include both cards with different `insertion_order` values. The main character gets priority (lower number), while the NPC appears later. This is common in multi-character roleplays. The key is to avoid duplicate information — if both cards describe the same setting, you waste tokens. Instead, move shared world details to a lorebook. The [MIT Technology Review](https://www.technologyreview.com) has covered how context window management is the biggest challenge in interactive AI storytelling.
SillyTavern's context templates give you granular control, but they require a steep learning curve — you need an API key, a backend model (local or cloud), and the patience to tune JSON parameters. AIAngels eliminates all of that. Our AI companions use a proprietary memory system that preserves every conversation detail without token budgets or context templates. You never have to allocate tokens, set insertion orders, or debug truncated responses.
AIAngels offers 70+ pre-built companions with permanent memory that doesn't degrade over time. There's no context window limit to manage — the AI remembers your first message as clearly as your last, even after months of daily chats. For users who want a custom companion, our built-in character builder lets you design appearance, personality, and interests without touching a JSON file. All premium plans ($2.99/mo on annual) include unlimited messages, image generation, and voice messages with no hidden per-use costs.
If you enjoy the DIY flexibility of SillyTavern, the context template system is powerful. But if you'd rather focus on the conversation itself — without configuring token budgets or worrying about the model forgetting — AIAngels provides a frictionless alternative. Our free tier already includes unlimited text chat with all companions, no credit card required. No templates, no API keys, no context limits.
Learn how to customize SillyTavern's context templates to control token budgets, memory injection, and character behavior — no more generic AI responses.
Start Chatting FreeEverything you need to know about our companions.
Place your `context.json` file in the character's folder inside `SillyTavern/data/characters/CharacterName/`. Reload the character in SillyTavern — the template loads automatically.
Yes, but you must ensure `max_context` does not exceed the model's token limit. Check your backend's documentation (e.g., 4096 for Llama 3, 8192 for Mistral, 128k for GPT-4).
For most backends, 6144–8192 tokens works well. Allocate 25–40% to chat history and the rest to character card and lorebook. Adjust based on model performance.
Yes, it ships with a basic template in `public/default/context.json`. It works for short chats but often needs tweaking for long-term memory or complex characters.
Increase `history_depth` to 100–150 messages and raise `max_context` proportionally. Use a summarization section (with `depth` set to 50) to inject periodic plot summaries.
Likely the character card has a high `insertion_order` (appearing later in context). Set its `insertion_order` to 0 and ensure its `token_budget` is at least 1024 tokens.
Yes. Copy your `context.json` file and share it. Many users share templates on the SillyTavern Discord and subreddit. Always note the recommended `max_context` value.
The API or local backend will throw an error or truncate the prompt, often losing the end of your chat history. Always stay within the model's documented limit.
Verified reviews from real customers
I've tried a few AI companion platforms, and AI Angels stands out for how immersive and customizable it feels. The conversations are surprisingly natural, and the AI personalities actually maintain context better than most similar apps I've used. The uncensored chat and roleplay features are a big plus if you're looking for creative freedom without constant restrictions. The image generation is also impressive — fast, detailed, and customizable enough to create unique characters and scenarios. I especially liked the variety of companion personalities and how easy the interface is to use, even for beginners. That said, there's still room for improvement. Some responses can feel repetitive after long conversations, and a few premium features are a bit pricey compared to competitors. But overall, the experience feels polished, entertaining, and consistently improving with updates. If you enjoy AI companionship, virtual roleplay, or interactive fantasy experiences, AI Angels is definitely worth checking out.
AI Angels is a remarkable AI companion site offering vividly realistic experiences. The large variety of companions available will suit every imaginable taste. Pricing is reasonable and transparent. I highly recommend AI Angels.
Fun, life like , sexy , created the perfect girl
It's worth looking into for sure, you won't regret it!
Choice of features
Honestly one of the best AI girlfriend apps I've tried. The conversations feel surprisingly natural and the girls actually have personality. Definitely worth checking out if you're into AI companions.
well I love how they call me things like baby and love how it shows nudes and sex/porn.
realstic ai images and chats! amazing pics and nice girls to chat with
Amazing it is so emersave
The roleplay is very flexible. The AI will adjust to your attitude and no kink is out of bounds. I just wish you could customize a little more.
The best ! I love it
Definitely addicted to this. You will not feel lonely and great prices
It's okay tho