docs-gap-finder
Surfaces which documentation pages should be created next, based on real user signals (search misses, unanswered AI-chat questions, top external queries) instead of guesswork. Cross-references these signals against the live doc graph so already-covered topics are filtered out, then returns a prioritized list of the top pages to create (7 by default). Optionally opens a GitHub Issue per gap with a draft outline.
Arguments
- $ARGUMENTS.workspace: string, required. Workspace ID or owner/repo.
- $ARGUMENTS.period: string, optional. Analytics window: 7d | 14d | 30d (default: 30d).
- $ARGUMENTS.open_issues: boolean, optional (default: false). If true, create one GitHub Issue per gap in the workspace's source repository with a draft outline.
- $ARGUMENTS.limit: number, optional (default: 7). Max number of gaps to return.
Step 1: Pull signals from Docsbook MCP
Call the following MCP tools in parallel for the target workspace:
- get_failed_searches({ workspace, period }): internal search queries that returned zero or low-relevance results.
- get_ai_unanswered({ workspace, period }): AI-chat questions where the model could not answer (no grounded citation, or chat.no_answer event fired).
- get_popular_searches({ workspace, period }): top search queries by volume, including ones that did return results (used as demand signal).
For each query/question, retain:
- normalized text (lowercased, trimmed)
- frequency / count
- representative example phrasings
- source signal type (failed_search | ai_unanswered | popular_search)
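The retention rules above can be sketched as a small aggregation step. This is a minimal sketch, not the skill's actual implementation: the `(raw_query, count, signal_type)` row shape, the `Signal` class, and the `max_examples` cap are all assumptions made for illustration, since the response format of the MCP tools is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Signal:
    text: str  # normalized query text
    # per-signal-type counts, e.g. {"failed_search": 9}
    counts: dict = field(default_factory=lambda: defaultdict(int))
    # a few raw phrasings kept as representative examples
    examples: list = field(default_factory=list)


def normalize(query: str) -> str:
    """Lowercase and trim, as described in the list above."""
    return query.strip().lower()


def collect_signals(rows, max_examples=3):
    """Merge rows of (raw_query, count, signal_type) by normalized text.

    The row shape is hypothetical; adapt it to whatever the MCP tools return.
    """
    signals = {}
    for raw, count, signal_type in rows:
        key = normalize(raw)
        sig = signals.setdefault(key, Signal(text=key))
        sig.counts[signal_type] += count
        if raw not in sig.examples and len(sig.examples) < max_examples:
            sig.examples.append(raw)
    return signals
```

Keeping the raw phrasings alongside the normalized key means the report can later quote what users actually typed.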
Step 2: Cluster and score
Group near-duplicate queries into topic clusters (e.g. "how to set custom domain", "custom domain SSL", "docs.mycompany.com setup" → one cluster: custom domain setup).
Compute a priority score per cluster:
score = (failed_search_count * 3) + (ai_unanswered_count * 3) + (popular_search_count * 1)
Failed searches and AI-unanswered questions carry a weight of 3 because they represent confirmed gaps (a user actively tried to find an answer and failed). Popular searches carry a weight of 1 and add demand magnitude.
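As a worked example of the formula, the sketch below scores two clusters and ranks them. The cluster names and counts are made up for illustration, and the clustering itself is assumed to have already happened.

```python
# Weights taken from the score formula above.
WEIGHTS = {"failed_search": 3, "ai_unanswered": 3, "popular_search": 1}


def priority_score(counts: dict) -> int:
    """Weighted sum of per-signal counts; missing signal types count as 0."""
    return sum(weight * counts.get(kind, 0) for kind, weight in WEIGHTS.items())


# Hypothetical clusters with hypothetical counts.
clusters = {
    "custom domain setup": {"failed_search": 12, "ai_unanswered": 5, "popular_search": 40},
    "api rate limits": {"failed_search": 3, "popular_search": 90},
}

# Highest score first.
ranked = sorted(clusters, key=lambda topic: priority_score(clusters[topic]), reverse=True)
```

With these numbers, "custom domain setup" scores 12*3 + 5*3 + 40 = 91 and "api rate limits" scores 3*3 + 90 = 99, so a very popular query can still outrank a cluster with more confirmed misses.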
Step 3: Cross-reference with the doc graph
Call get_doc_graph({ workspace, format: "toon" }) and drop clusters whose topic is already covered. A cluster is considered "covered" when:
- An existing page title, H1, or H2 closely matches the cluster topic (case-insensitive token overlap ≥ 0.6), and
- That page has non-trivial content (more than a stub).
Keep clusters where:
- No matching page exists, or
- A matching page exists but is a stub / only mentions the topic in passing.
For "stub" clusters, mark them as expand_existing (pointing to the existing page) rather than create_new.
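One way to sketch the coverage check is shown below. Several details are assumptions rather than part of the skill: "token overlap" is read here as the fraction of the cluster's tokens that appear in a heading, a stub is any page under 150 words, and the page dictionaries (`path`, `headings`, `word_count`) are a hypothetical shape, not the actual get_doc_graph output.

```python
import re


def tokens(text: str) -> set:
    """Case-insensitive alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(topic: str, heading: str) -> float:
    """Fraction of the topic's tokens that also appear in the heading."""
    topic_tokens = tokens(topic)
    if not topic_tokens:
        return 0.0
    return len(topic_tokens & tokens(heading)) / len(topic_tokens)


def classify(topic, pages, threshold=0.6, stub_words=150):
    """Return ('covered' | 'expand_existing' | 'create_new', path or None).

    pages: list of dicts with 'path', 'headings' (title, H1, H2 strings)
    and 'word_count'. This shape and stub_words are assumptions.
    """
    stub_path = None
    for page in pages:
        if any(overlap(topic, h) >= threshold for h in page["headings"]):
            if page["word_count"] >= stub_words:
                return ("covered", page["path"])
            # Remember the first matching stub, but keep looking for a full page.
            stub_path = stub_path or page["path"]
    if stub_path:
        return ("expand_existing", stub_path)
    return ("create_new", None)
```

A lexical check like this will miss paraphrases (for example "docs.mycompany.com setup" versus "custom domain"), so it is best treated as a first pass before a semantic comparison.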
Step 4: Produce the prioritized report
Sort the surviving clusters by score in descending order, take the top $ARGUMENTS.limit (default: 7), and emit a markdown report:
# Documentation gaps — <workspace> (<period>)
Found N high-signal gaps. Prioritized by user demand.
## 1. <Cluster topic> — score: <X>
- Action: create_new | expand_existing → <path>
- Signals:
  - failed_search: <n> hits (e.g. "<example query>")
  - ai_unanswered: <n> hits (e.g. "<example question>")
  - popular_search: <n> hits
- Suggested path: docs/<slug>.md
- Draft outline:
  1. <heading 1>
  2. <heading 2>
  3. <heading 3>
## 2. ...
The draft outline is generated by the model from the cluster's example queries — it should answer the questions users actually asked.
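The suggested path in the template could be derived from the cluster topic along these lines. The slug rule here (lowercase, runs of non-alphanumeric characters collapsed to a single hyphen) is an assumption; the skill does not define how `<slug>` is built, and `slugify` and `suggested_path` are hypothetical helper names.

```python
import re


def slugify(topic: str) -> str:
    """Lowercase the topic and collapse anything non-alphanumeric into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower())
    return slug.strip("-")


def suggested_path(topic: str) -> str:
    """Fill in the docs/<slug>.md pattern from the report template."""
    return f"docs/{slugify(topic)}.md"
```

For example, `suggested_path("Custom domain setup")` gives `docs/custom-domain-setup.md`.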
Step 5: Optionally open GitHub Issues
If $ARGUMENTS.open_issues === true:
- Resolve the workspace's owner/repo (from get_workspace if not given).
- For each gap in the final list, create one issue via gh issue create (or the GitHub API) with:
  - Title: docs: <Cluster topic>
  - Body: the per-gap section from the report (signals + suggested path + draft outline), plus a footer:
    ---
    Generated by docs-gap-finder. Signal window: <period>. Priority score: <X>.
  - Labels: documentation, gap-finder.
- Print the created issue URLs at the end of the report.
If open_issues is false, just print the report — the user decides what to do with it.
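The issue-creation step might look roughly like this when driven through the GitHub CLI. The `gap` dictionary shape (`topic`, `score`, `section`) and both function names are hypothetical; the flags used (`--repo`, `--title`, `--body`, `--label`) are standard `gh issue create` options.

```python
import subprocess


def issue_command(repo: str, gap: dict, period: str) -> list:
    """Build the gh argv for one gap. `gap` is a hypothetical dict with
    'topic', 'score', and 'section' (the per-gap markdown from the report)."""
    footer = (
        "\n\n---\n"
        f"Generated by docs-gap-finder. Signal window: {period}. "
        f"Priority score: {gap['score']}."
    )
    return [
        "gh", "issue", "create",
        "--repo", repo,
        "--title", f"docs: {gap['topic']}",
        "--body", gap["section"] + footer,
        "--label", "documentation",
        "--label", "gap-finder",
    ]


def open_issue(repo: str, gap: dict, period: str) -> str:
    """Run gh and return what it prints, which should be the new issue URL."""
    result = subprocess.run(
        issue_command(repo, gap, period),
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with titles and bodies that contain quotes or newlines. The two labels may need to exist in the repository before the command is run.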
Notes
- Requires PRO+ plan (the underlying analytics tools get_failed_searches, get_ai_unanswered, and get_popular_searches are PRO+ features).
- Run this skill monthly or after major product launches; that's when search-miss patterns shift fastest.
- Pairs well with docs-analyze (quality of existing pages) and docs-stale-watcher (freshness). This skill answers a different question: what's missing entirely?