diff --git a/.claude/memory/MEMORY.md b/.claude/memory/MEMORY.md
index 1a3bf89..d095d10 100644
--- a/.claude/memory/MEMORY.md
+++ b/.claude/memory/MEMORY.md
@@ -2,3 +2,5 @@
 - [No nohup with pueue](feedback_no_nohup_with_pueue.md) — run `pueue follow|wait` directly as the bg task; nohup& orphans it from the harness.
 - [Burn down task list](feedback_burn_down_task_list.md) — when many asks are queued, do them all; don't stop to ask which first.
 - [Workshop paper goal](project_workshop_paper_goal.md) — current phase is ablations+seeds for a workshop paper; artifact tracker A1-A7 lives in docs/spec/20260602_writeup_spec.md.
+- [qmd prefer lexical](qmd-prefer-lexical.md) — search local papers with `qmd search`/`rg`, not vector (corpus ~93% unembedded, can't fit embeddings).
+- [Semantic Scholar keyed access](semantic-scholar-keyed-access.md) — S2 API key in semantic-search skill .env; use it to dodge 429s.
diff --git a/.claude/memory/qmd-prefer-lexical.md b/.claude/memory/qmd-prefer-lexical.md
new file mode 100644
index 0000000..1c8bad3
--- /dev/null
+++ b/.claude/memory/qmd-prefer-lexical.md
@@ -0,0 +1,22 @@
+---
+name: qmd-prefer-lexical
+description: "Default to lexical search (qmd search / rg) on the papers corpus, not vector/semantic"
+metadata:
+  node_type: memory
+  type: feedback
+  originSessionId: dfb6617b-8e6e-4008-96e0-81669fc600b4
+---
+
+For local paper search, default to lexical: `qmd search` (BM25) or `rg`, NOT
+`qmd vsearch`/`qmd query` (vector/HyDE/rerank).
+
+**Why:** (1) wassname finds vector search rarely helps him. (2) The big `papers`
+qmd collection (~48k files) is ~93% unembedded, so semantic modes fall back to
+junk there. (3) He cannot fit the embeddings on his PC, so `qmd embed` is not a
+real fix.
+
+**How to apply:** When dispatching search agents over the local corpus, instruct
+them to use `qmd search`/`rg` first and reach for `qmd query` only as a last
+resort on a small embedded collection (e.g. markdown-notes). A subagent once
+burned ~5 min and crashed (exit 144) running `qmd query` over `papers`; lexical
+returns in milliseconds. Do not suggest running `qmd embed` on his machine.
diff --git a/.claude/memory/semantic-scholar-keyed-access.md b/.claude/memory/semantic-scholar-keyed-access.md
new file mode 100644
index 0000000..479aa89
--- /dev/null
+++ b/.claude/memory/semantic-scholar-keyed-access.md
@@ -0,0 +1,25 @@
+---
+name: semantic-scholar-keyed-access
+description: Semantic Scholar API key lives in the semantic-search skill .env; use it to avoid 429s
+metadata:
+  node_type: memory
+  type: reference
+  originSessionId: 14deeefc-610a-40ee-b01c-03cf4f1f54b6
+---
+
+The keyless Semantic Scholar API (api.semanticscholar.org/graph/v1) 429s fast.
+A real S2 key (len 40) is stored at
+`~/.claude/skills/semantic-search/.env` as `SEMANTIC_SCHOLAR_API_KEY`.
+
+Use it for direct S2 calls:
+```sh
+set -a; . ~/.claude/skills/semantic-search/.env; set +a
+curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:?fields=title,authors,citationCount" \
+  -H "x-api-key: $SEMANTIC_SCHOLAR_API_KEY"
+```
+or just call the `semantic-search` skill, which loads the key itself.
+
+The `bibtex` MCP (DBLP/S2) sometimes returns 0 for brand-new arXiv papers
+(days old); arXiv `citation_author` meta tags are the authoritative author list
+and the keyed S2 API confirms them once indexed. See [[qmd-prefer-lexical]] for
+the analogous local-search gotcha.