Stop Your Claude Code Skill List From Hogging Your Context Window: The Router Pattern
Custom Claude Code skills cost tokens on every turn — even the ones you don't invoke. Here's the router pattern that cut my catalog from 78 skills to 40 without losing a single workflow.
I had 78 custom slash commands in my Claude Code vault, and every prompt I sent paid for all 78, even when I used only one. Here’s what was happening and how I fixed it.
The hidden cost
Every custom skill in Claude Code ships with a name and a description field. Both get injected into the system prompt on every turn, regardless of whether you ever invoke that skill in the session. That’s how the model knows the skill exists and what it does — but it’s also how a quiet folder full of helpers turns into a tax you didn’t realize you were paying.
The math is simple and unflattering:
- Average description length: ~40 tokens.
- 78 skills × 40 tokens = ~3,100 tokens per turn.
- A 100-turn session: ~310,000 tokens spent on skill metadata before Claude does any actual work.
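The arithmetic above can be reproduced with a few lines of Python. This is a back-of-envelope sketch: the 40-token average is the estimate quoted above rather than a measured value, and the function name `ambient_tokens` is made up for illustration. It assumes the metadata is counted in full on every turn.

```python
# Rough estimate of ambient skill-metadata cost.
# Assumption: every skill's description costs about the same number of tokens
# and is counted in full on each turn.

def ambient_tokens(skill_count: int, avg_description_tokens: int = 40, turns: int = 1) -> int:
    """Tokens spent on skill metadata across `turns` turns."""
    return skill_count * avg_description_tokens * turns


per_turn = ambient_tokens(78)
per_session = ambient_tokens(78, turns=100)

print(per_turn)     # 3120, i.e. the ~3,100 tokens per turn quoted above
print(per_session)  # 312000, i.e. roughly 310,000 over a 100-turn session
```

Swapping in your own skill count and a measured average description length gives a quick read on whether the catalog is worth trimming.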
You don’t see this in your editor. The skills sit in claude/skills/<name>/SKILL.md. Adding one costs nothing visible. Adding ten doesn’t feel different. By the time you notice the catalog is dominating the system prompt, you have 78 of them and no obvious place to cut.
The fix in one sentence
Replace clusters of near-identical skills with one router skill that dispatches to sub-files.
The router’s description is the only thing in the system prompt. The sub-files — the actual procedures — are read on invocation, not at session start. One router replaces N near-clones, and the ambient token cost drops to that of a single skill.
Worked example: /p — project briefing
Before consolidation I had 12 skills: /p-cto, /p-zufa, /p-recall, /p-rankrush, /p-jobhunter, /p-cc, /p-sa, /p-telegram, /p-obsidian, /p-keeper, /p-automations, /p-apprabbit.
Every one of them ran the same 6-step briefing algorithm: load task cache, find today’s focus, pull recent decisions from the project index, identify the last session’s edits, optionally compute billable hours, render the briefing. The only thing that differed was the project slug and a handful of file path patterns.
So I collapsed them:
- claude/skills/p/SKILL.md: the dispatcher. A table of 12 alias blocks, each with a parameters table (project name, slug, file paths, billable rate, status banner).
- claude/skills/_p-template/SKILL.md: the canonical 6-step algorithm, parameterized.
Invocation: /p cto → dispatcher looks up cto, reads _p-template, substitutes the cto parameters, runs the algorithm, renders the briefing.
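The lookup-and-substitute flow can be sketched in Python. In the real setup the dispatcher is a markdown file that the model reads and follows, so this is an analogy for the logic, not the implementation. The alias values, the `dispatch` function, and the template wording are all hypothetical.

```python
from string import Template

# Hypothetical alias table. In the article's setup this is a markdown table in
# p/SKILL.md; the values here are invented for illustration.
ALIASES = {
    "cto": {"project": "CTO", "slug": "cto", "billable_rate": "n/a"},
    "recall": {"project": "Recall", "slug": "recall", "billable_rate": "hourly"},
}

# Hypothetical stand-in for _p-template/SKILL.md: the six steps, parameterized.
BRIEFING_TEMPLATE = Template(
    "Briefing for $project\n"
    "1. Load the task cache for $slug\n"
    "2. Find today's focus\n"
    "3. Pull recent decisions from the $slug project index\n"
    "4. Identify the last session's edits\n"
    "5. Compute billable hours (rate: $billable_rate)\n"
    "6. Render the briefing\n"
)


def dispatch(alias: str) -> str:
    """Look up the alias, then fill the shared template with its parameters."""
    params = ALIASES.get(alias)
    if params is None:
        known = ", ".join(sorted(ALIASES))
        raise ValueError(f"unknown alias {alias!r}; known aliases: {known}")
    return BRIEFING_TEMPLATE.substitute(params)


print(dispatch("cto"))
```

Adding a thirteenth project means adding one row to the alias table. The template and the ambient description stay the same size.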
The model never sees _p-template in the system prompt: its folder name starts with _, and the skill list excludes underscore-prefixed entries. The dispatcher’s description is all that loads ambiently. Cost: ~40 tokens regardless of how many aliases exist.
Before: 12 × 40 = 480 tokens/turn, every turn, even when none were invoked.
After: 40 tokens/turn for the dispatcher. Sub-file reads happen only when /p <alias> actually fires.
The five consolidations I shipped
I applied the same pattern across the catalog. Here’s the scorecard:
| Router | Replaced | Skill count delta |
|---|---|---|
| /p <alias> | 12 per-project briefing skills | −11 |
| /eo <s\|d\|w\|m\|q\|y> | 6 end-of-period skills (/eos, /eod, /eow, /eom, /eoq, /eoy) | −5 |
| /so <d\|w\|m\|q\|y> | 5 start-of-period skills | −4 |
| /lifecycle <close\|reopen> <client\|project> | 4 lifecycle skills | −3 |
| /update <machine\|skills\|tools-index> | 3 refresh skills | −2 |
Net: 25 fewer top-level skills. Zero workflows lost. Every original slash command still works; it just dispatches through a router now.
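As a quick check on the scorecard, the deltas can be summed in Python. The sketch covers only the routers listed in the table and reuses the ~40-token average from earlier, which is an estimate; the variable names are illustrative.

```python
# Skills replaced by each router, taken from the scorecard table.
REPLACED = {"/p": 12, "/eo": 6, "/so": 5, "/lifecycle": 4, "/update": 3}

AVG_DESCRIPTION_TOKENS = 40  # assumption: the estimate used earlier in the article

# Each router removes its clones but adds one top-level entry of its own.
skills_removed = sum(count - 1 for count in REPLACED.values())
tokens_saved_per_turn = skills_removed * AVG_DESCRIPTION_TOKENS

print(skills_removed)         # 25
print(tokens_saved_per_turn)  # 1000
```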
The /eo consolidation is a good second example: end-of-session, end-of-day, end-of-week, end-of-month, end-of-quarter, end-of-year. Six distinct procedures that share the “wrap up the period, write a retrospective, update caches” skeleton but differ in the details. Six sub-files under claude/skills/eo/, one dispatcher table at the top of eo/SKILL.md. Same pattern, different domain.
The 40-skill cap
The cap isn’t arbitrary. It’s the line where the catalog stops dominating the system prompt. I codified it in a vault rules file and wrote a /skill-audit skill that runs monthly as part of the end-of-month routine. Any new skill proposal has to answer one of four questions:
- Does it replace an existing skill? (Archive the old one in the same change.)
- Does it consolidate into an existing router? (Add as a sub-file, not a top-level skill.)
- Is it project-local? (Lives under projects/<slug>/claude/skills/, only loads when working in that repo.)
- Does it justify the slot at a catalog count of ≤39? (Net new vault-wide skill: must be exercised more than once a month and not fit any router.)
If the answer is “none of the above,” the workflow becomes a manual checklist in the relevant project index, not a skill. The catalog stays lean because /skill-audit blocks the regrowth — same way a linter blocks the import sprawl you’d otherwise accumulate.
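The counting half of such an audit might look like the sketch below. It is not the /skill-audit skill itself, which is a prompt rather than a script. It assumes each skill is a folder containing a SKILL.md, and it follows the convention described earlier that underscore-prefixed folders are not counted. The function names are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple

SKILL_CAP = 40  # the cap described above


def top_level_skills(skills_dir: Path) -> List[str]:
    """Folders that contain a SKILL.md and do not start with an underscore."""
    return sorted(
        entry.name
        for entry in skills_dir.iterdir()
        if entry.is_dir()
        and not entry.name.startswith("_")
        and (entry / "SKILL.md").is_file()
    )


def audit(skills_dir: Path, cap: int = SKILL_CAP) -> Tuple[bool, int]:
    """Return (within_cap, count) for the given skills folder."""
    count = len(top_level_skills(skills_dir))
    return count <= cap, count
```

Calling `audit(Path("claude/skills"))` from the vault root would report whether the catalog is still within the cap; what to archive or consolidate when it is not remains a judgment call.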
When NOT to consolidate
Two anti-patterns I had to learn the hard way:
Distinct workflows that share early steps but diverge. /tdd and /debug both start with “read the code first.” But the dispatch logic afterwards is genuinely different — TDD is RED-GREEN-REFACTOR; debug is reproduce-pattern-hypothesize-fix. One router with a 200-line if tree would be worse than two skills. Keep them separate.
One-shot rituals you invoke once a year. /eo y shouldn’t share a template with /eo m even though both are end-of-period retrospectives. The procedures differ by more than the alias: annual review pulls quarterly notes, monthly review pulls weekly notes. They can sit behind the same dispatcher, but each keeps its own sub-file. The router pattern is for near-clones, not for anything that shares a vague theme.
How to apply this to your setup
- Grep your claude/skills/ folder for description fields starting with the same 5 words. That’s the cleanest signal of clone candidates. If 12 skill descriptions all start with “Generate a briefing for…”, those 12 want to be a router.
- Look for adjacent skill names with the same prefix or suffix. /p-foo, /p-bar, /p-baz is the obvious signal. So is /close-client, /close-project, /reopen-client, /reopen-project: that’s the /lifecycle shape.
- One dispatcher + one template + N sub-files. Preserve the original procedures verbatim inside the sub-files; only the entry point changes. Don’t rewrite logic while you consolidate. That’s two refactors in one PR and a recipe for regressions.
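The first check can be automated with a short script. This is a sketch under two assumptions: each SKILL.md carries a single-line `description:` field, and skills sit one folder deep under the skills directory. The function names are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Optional

# Assumption: the description is a single "description: ..." line in SKILL.md.
DESCRIPTION_RE = re.compile(r"^description:[ \t]*(.+)$", re.MULTILINE)


def read_description(skill_md: Path) -> Optional[str]:
    """Return the description text, or None if the file has no such field."""
    match = DESCRIPTION_RE.search(skill_md.read_text(encoding="utf-8"))
    return match.group(1).strip().strip("\"'") if match else None


def clone_candidates(skills_dir: Path, words: int = 5, min_group: int = 2) -> Dict[str, List[str]]:
    """Group skills whose descriptions start with the same `words` words."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        description = read_description(skill_md)
        if description is None:
            continue
        key = " ".join(description.lower().split()[:words])
        groups[key].append(skill_md.parent.name)
    return {key: names for key, names in groups.items() if len(names) >= min_group}
```

Any group the function returns is a candidate list, not a verdict: skills that open with the same words can still diverge enough to stay separate.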
Result
Catalog at 40. System prompt ~1k tokens lighter per turn. New skills default to routers, not top-level entries. The audit runs monthly and surfaces drift before it compounds. If your Claude Code setup feels heavier than it should, this is usually where the weight is hiding.
Gabe Giro
Fractional CTO & Android Engineer · 12+ years · 150M+ users impacted
I help startups and scale-ups build better software faster — as a fractional CTO or hands-on Android consultant. Notable clients include HBO GO / Max, AppRabbit, and Recall.