add mini-context-graph skill (#1580)

* add mini-context-graph skill * remove pycache files * filename case update to SKILL.md * update readme
2026-06-20 22:47:43 +00:00 · 2026-05-05 09:34:37 +05:30
parent 1f96bce626
commit 746ba555b6
16 changed files with 2343 additions and 0 deletions
@@ -0,0 +1,163 @@
+# Retrieval Instructions
+
+This file defines how the agent answers queries using the two-layer retrieval strategy:
+**wiki-first** (fast path), then **graph traversal with evidence** (deep path).
+
+---
+
+## Overview
+
+Retrieval is a 7-step process:
+
+1. Parse the query
+2. **Check the wiki first** (fast path)
+3. Find seed nodes in the graph
+4. Expand the graph via BFS
+5. Prune noisy nodes
+6. Build the subgraph with provenance
+7. Return structured context
+
+---
+
+## Step 1: Parse the Query
+
+Read the query string and identify:
+- **Key noun phrases**: potential entity names (e.g., "system crash", "memory leak")
+- **Keywords**: individual meaningful words (e.g., "crash", "leak", "memory")
+- Normalize all terms to **lowercase**
+
+Ignore stopwords (e.g., "the", "a", "is", "why", "does", "how", "what").
+
+---
+
+## Step 2: Check the Wiki First (Fast Path)
+
+Before touching the graph, search the wiki. The wiki contains compiled knowledge —
+cross-references already resolved, contradictions flagged, syntheses written.
+
+```python
+from scripts.tools import wiki_store
+
+results = wiki_store.search_wiki(query)
+```
+
+For each relevant result, read the page:
+
+```python
+content = wiki_store.read_page_by_slug(result["slug"])
+```
+
+**If the wiki has a sufficient answer:**
+- Synthesize from wiki pages.
+- Cite the source pages (e.g., "According to [[memory-leak]] and [[system-crash]]...").
+- File the answer as a new wiki topic page if it's valuable and not already captured:
+  ```python
+  wiki_store.write_page(category="topic", title="Why System Crashes", content=..., summary=...)
+  ```
+- **Return early** — no graph traversal needed.
+
+**If the wiki answer is incomplete or missing:** proceed to Step 3.
+
+---
+
+## Step 3: Find Seed Nodes
+
+Call `index_store.search(query)` with the original query string.
+
+This returns node IDs matching entity names or keywords.
+
+If no seed nodes are found:
+- Try searching with individual keywords from Step 1.
+- If still no results, return an empty subgraph: "No relevant entities found."
+
+---
+
+## Step 4: Expand the Graph (BFS)
+
+Call `retrieval_engine.retrieve(seed_node_ids, depth=2)`.
+
+BFS from seed nodes:
+- **Depth 1**: direct neighbors
+- **Depth 2**: neighbors of neighbors
+
+Rules:
+- Only traverse edges with confidence ≥ MIN_CONFIDENCE (from config.py)
+- Do NOT traverse beyond depth 2
+- Collect all visited node IDs
+
+---
+
+## Step 5: Prune Nodes
+
+- Limit total nodes to MAX_NODES (from config.py)
+- Prioritize:
+  1. Seed nodes (always include)
+  2. Nodes at depth 1
+  3. Nodes at depth 2 (as space allows)
+- Remove nodes only weakly connected (edge confidence < MIN_CONFIDENCE)
+
+---
+
+## Step 6: Build the Subgraph with Provenance
+
+For a standard query, call:
+
+```python
+subgraph = skill.query(query)
+# Returns: {"nodes": {node_id: {name, type, source_document, source_chunks}},
+#           "edges": [{source, target, type, confidence, source_document, supporting_text, chunk_id}]}
+```
+
+For queries requiring evidence (citations, fact-checking), call:
+
+```python
+result = skill.query_with_evidence(query)
+# Returns:
+# {
+#   "query": str,
+#   "subgraph": {"nodes": {...}, "edges": [...]},
+#   "supporting_documents": [
+#     {
+#       "doc_id": str,
+#       "doc_title": str,
+#       "supporting_chunks": [{"chunk_id": str, "text": str}, ...]
+#     }
+#   ],
+#   "evidence_chain": "memory leak --[causes]--> system crash"
+# }
+```
+
+---
+
+## Step 7: Return Structured Context
+
+Return the result with:
+- **Subgraph**: nodes + edges (the graph answer)
+- **Supporting documents**: source chunks that prove each relation
+- **Evidence chain**: human-readable path summary
+- **Wiki references**: links to relevant wiki pages found in Step 2
+
+**If valuable, file the answer back into the wiki:**
+
+```python
+wiki_store.write_page(
+    category="topic",
+    title=query,
+    content=f"# {query}\n\n**Evidence chain:** {result['evidence_chain']}\n\n...",
+    summary="...",
+)
+```
+
+This way, future queries on the same topic find the answer instantly in the wiki.
+
+---
+
+## Rules
+
+- NEVER fabricate nodes or edges not present in the graph
+- NEVER traverse deeper than depth 2
+- ALWAYS check the wiki before the graph (wiki-first)
+- Always include seed nodes in the result, even if they have no edges
+- Prefer edges with higher confidence when pruning
+- File valuable answers back into the wiki as topic pages
+- Return an empty subgraph (not an error) if no relevant nodes are found