windags_skill_search (Shipping)
Six-stage cascade: BM25 + MiniLM + RRF + cross-encoder + local k-NN + cross-user global priors.
Returns ranked candidates with descriptions only (no full bodies; pair with skill_graft to load winners). The stages, in order (hedged sketches of Stages 2 through 6 follow this list):

- Stage 1 (BM25): wink-bm25 over name + description + tags + category.
- Stage 2 (bi-encoder): cosine similarity against ~800KB of bundled embeddings (all-MiniLM-L6-v2, 384-dim, one vector per skill, packed as a Float32 array and shipped with the brew formula).
- Stage 3 (fusion): reciprocal rank fusion of the two rankings (Cormack et al., K=60).
- Stage 4 (rerank): takes the RRF top-30 and reranks with Xenova/ms-marco-MiniLM-L-6-v2, a cross-encoder that scores (query, candidate) jointly, catching interactions that bi-encoder cosine misses.
- Stage 5 (attribution k-NN): every /next-move you run writes a triple to ~/.windags/triples/ with the prompt and the skills that got accepted; this stage embeds the historical prompts (cached after the first call), finds the k nearest to the current query, and boosts skills that worked in similar past sessions. Empty history → no-op.
- Stage 6 (global priors): blends in cross-user priors served by api.windags.ai: three nested tiers (manifest_match ⊆ exact_match ⊆ any_match) decomposed into exclusive subsets and capped at a 0.2 blend weight; fetched once at startup with adaptive freshness refresh. Network unreachable or WINDAGS_TELEMETRY=off → Stage 6 silently no-ops.

Both MiniLM models (~25MB each) download once and cache in ~/.cache/transformers-js/. Local-first, with one cheap public read per startup.
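A minimal sketch of the Stage 2 scan, assuming the bundled pack stores L2-normalized vectors back to back (the normalization and the names here are assumptions, not confirmed above):

```ts
// Stage 2 sketch: brute-force cosine over the packed Float32 array.
// Assumes vectors are L2-normalized, so dot product equals cosine similarity.
const DIM = 384; // all-MiniLM-L6-v2 output dimension

function cosineScores(queryVec: Float32Array, pack: Float32Array): number[] {
  const n = pack.length / DIM; // one 384-dim vector per skill
  const scores = new Array<number>(n);
  for (let i = 0; i < n; i++) {
    let dot = 0;
    const base = i * DIM;
    for (let d = 0; d < DIM; d++) dot += queryVec[d] * pack[base + d];
    scores[i] = dot;
  }
  return scores;
}
```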
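Stage 3's fusion has a standard closed form: each candidate scores the sum over rankers of 1/(K + rank). A sketch with K=60 and 1-based ranks:

```ts
// Reciprocal rank fusion (Cormack et al.): fused(d) = Σ_r 1 / (K + rank_r(d)).
const K = 60;

function rrf(rankings: string[][]): [string, number][] {
  const fused = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, idx) => {
      fused.set(id, (fused.get(id) ?? 0) + 1 / (K + idx + 1)); // idx + 1 = 1-based rank
    });
  }
  return [...fused.entries()].sort((a, b) => b[1] - a[1]);
}

// e.g. rrf([bm25Ids, cosineIds]).slice(0, 30) is what feeds Stage 4.
```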
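Stage 4's joint scoring follows the usual transformers.js sequence-classification pattern for this model; a sketch (model caching and batching elided, variable names illustrative):

```ts
import { AutoTokenizer, AutoModelForSequenceClassification } from '@xenova/transformers';

// Score each (query, candidate) pair jointly with the cross-encoder;
// higher logit = more relevant. `candidates` holds the RRF top-30's text.
async function crossEncoderScores(query: string, candidates: string[]): Promise<number[]> {
  const id = 'Xenova/ms-marco-MiniLM-L-6-v2';
  const tokenizer = await AutoTokenizer.from_pretrained(id);
  const model = await AutoModelForSequenceClassification.from_pretrained(id);

  const inputs = tokenizer(new Array(candidates.length).fill(query), {
    text_pair: candidates,
    padding: true,
    truncation: true,
  });
  const { logits } = await model(inputs); // one relevance logit per pair
  return Array.from(logits.data as Float32Array);
}
```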
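Stage 5 reduces to a k-NN vote over past sessions. A sketch, assuming each stored triple carries the prompt's embedding and the accepted skill ids (the field names and the similarity-weighted vote are assumptions):

```ts
// Attribution k-NN: boost skills that were accepted in similar past prompts.
type Triple = { promptVec: Float32Array; acceptedSkillIds: string[] }; // hypothetical shape

function knnBoosts(queryVec: Float32Array, history: Triple[], k = 5): Map<string, number> {
  const boosts = new Map<string, number>();
  if (history.length === 0) return boosts; // empty history → no-op, as specified

  const neighbors = history
    .map(t => ({ t, sim: dot(queryVec, t.promptVec) })) // cosine via unit-norm dot
    .sort((a, b) => b.sim - a.sim)
    .slice(0, k);

  for (const { t, sim } of neighbors) {
    for (const id of t.acceptedSkillIds) {
      boosts.set(id, (boosts.get(id) ?? 0) + sim); // similarity-weighted vote
    }
  }
  return boosts;
}

function dot(a: Float32Array, b: Float32Array): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}
```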
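Stage 6 is the only stage with a hard cap. A sketch of the exclusive tier decomposition and the capped mix; only the nesting and the 0.2 cap come from the description above, the per-tier weights are placeholders:

```ts
const MAX_BLEND = 0.2; // hard cap on the global-prior contribution

type GlobalTiers = {
  manifest_match: Set<string>;
  exact_match: Set<string>; // superset of manifest_match
  any_match: Set<string>;   // superset of exact_match
};

// Exclusive decomposition: a skill gets credit for its tightest tier only.
// The 1.0 / 0.6 / 0.3 weights are placeholders, not from the spec.
function tierWeight(skillId: string, tiers: GlobalTiers): number {
  if (tiers.manifest_match.has(skillId)) return 1.0;
  if (tiers.exact_match.has(skillId)) return 0.6;
  if (tiers.any_match.has(skillId)) return 0.3;
  return 0;
}

function blend(localScore: number, priorScore: number, w: number): number {
  const capped = Math.min(w, MAX_BLEND); // never more than 20% global influence
  return (1 - capped) * localScore + capped * priorScore;
}
```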
- query: string — Free-text task description
- limit: number = 10 — How many candidates to return
Array of { id, name, description, score, breakdown }, where breakdown records the candidate's rank at each stage.
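Written as a TypeScript shape (the breakdown keys are illustrative; the text above only says it records per-stage ranks):

```ts
interface SkillSearchHit {
  id: string;
  name: string;
  description: string;
  score: number; // final blended score
  breakdown: Record<string, number>; // per-stage rank, e.g. { bm25: 3, cosine: 1 } (keys illustrative)
}
```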
windags_skill_search({ query: "stripe webhook idempotency", limit: 5 })
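Since results carry no bodies, the intended flow pairs the search with skill_graft; skill_graft's actual parameters aren't documented here, so this is illustrative only:

```ts
// Hypothetical flow: search, then graft the winner's full body.
const hits = await windags_skill_search({ query: "stripe webhook idempotency", limit: 5 });
await skill_graft({ id: hits[0].id }); // skill_graft's signature is assumed, not documented here
```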