Partitioned Graph Synthesis
A coverage-optimized query architecture for large knowledge graphs. PGS overcomes the structural coverage ceiling of single-pass retrieval by decomposing graphs into topologically coherent partitions, sweeping each at full fidelity, and synthesizing across all outputs for genuine cross-domain reasoning.
The Problem PGS Solves
Standard RAG retrieves the top-K nodes by embedding similarity and fits them into a single LLM context window. For knowledge graphs with hundreds or thousands of nodes, this has a hard structural limit: nodes semantically distant from the query embedding are never examined, even when they contain directly relevant findings. The system silently omits this content and gives no indication that anything was missed.
PGS addresses three failure modes simultaneously: local scoring misses topologically distant content (embedding similarity misses structurally relevant nodes), context window exhaustion (graphs over ~500 nodes cannot fit in a single pass), and no mechanism to report gaps (standard RAG returns answers derived only from retrieved content, with no channel for what was not found).
The architecture is a four-phase pipeline: Partition the graph using community detection, Route the query to relevant partitions using centroid similarity, Sweep each partition independently with a full-fidelity LLM pass, and Synthesize across all sweep outputs in a final cross-domain reasoning step.
PGS is not a new algorithm. It combines established techniques -- Louvain community detection, cosine similarity routing, parallel LLM inference -- in an architecture oriented toward coverage completeness and explicit gap reporting. The novelty is the structured information contract between sweep and synthesis phases, which transforms disconnected per-partition outputs into material for genuine cross-domain reasoning.
The Four-Phase Pipeline
Phase 0: Partition
The graph is partitioned using the Louvain community detection algorithm -- pure JavaScript, no external dependencies. Cache validity is determined by a four-field brain hash (nodeCount:edgeCount:cycleCount:timestamp). Any change to the graph invalidates the cache and triggers re-partitioning.
The modularity gain formula for moving a node from one community to another:
// Modularity gain for moving node i to community d
gain = wTarget - (ki * sigmaTarget) / m2
netGain = gain - removeGain
// Where:
// wTarget = sum of edge weights between node i and community d
// ki = total strength of node i
// sigmaTarget = total strength of community d
// m2 = 2 * total graph weight
// A node moves only when netGain > 1e-10 (numerical stability)

The algorithm runs up to 20 iterations with randomized node processing order for better convergence, terminating early when a full pass produces no moves.
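The local-move pass can be sketched in plain JavaScript. This is an illustrative reduction, not the package's implementation: community ids are node-name strings, the randomized processing order is omitted, and `strength` and `sigma` are maps maintained by the caller.

```javascript
// Sketch of one Louvain local-move pass.
// graph: { nodes: [...ids], adj: { id: { neighborId: weight } } }
// community: node -> community id; strength: node -> total node strength;
// sigma: community -> total strength of its members; m2: 2 * total graph weight.
function localMovePass(graph, community, strength, sigma, m2) {
  let moved = 0;
  for (const node of graph.nodes) { // (the package randomizes this order)
    const current = community[node];
    // Edge weight from `node` into each neighboring community.
    const linksTo = {};
    for (const [nbr, w] of Object.entries(graph.adj[node] || {})) {
      if (nbr === node) continue;
      const c = community[nbr];
      linksTo[c] = (linksTo[c] || 0) + w;
    }
    // Gain of removing the node from its current community.
    const ki = strength[node];
    sigma[current] -= ki;
    const removeGain = (linksTo[current] || 0) - (ki * sigma[current]) / m2;
    // Pick the neighboring community with the best net modularity gain.
    let best = current;
    let bestNet = 0;
    for (const [c, wTarget] of Object.entries(linksTo)) {
      const net = wTarget - (ki * sigma[c]) / m2 - removeGain;
      if (net > bestNet + 1e-10) { best = c; bestNet = net; }
    }
    sigma[best] += ki;
    if (best !== current) { community[node] = best; moved++; }
  }
  return moved; // zero moves => this pass converged
}
```

On a toy graph of two triangles joined by a weak bridge, repeated passes settle into two communities, matching the termination rule described above.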
Post-Processing
After Louvain convergence, two adjustments are applied. Merge small communities: communities with fewer than 30 nodes (configurable) are merged into their most-connected neighbor by total edge weight, iteratively until no undersized communities remain. Split large communities: communities exceeding 1,800 nodes (configurable) are bisected using a greedy algorithm -- seed two groups, assign each remaining node to the group it shares more edge weight with, with a balance penalty to prevent degenerate splits. This is simple bisection, not hierarchical Louvain.
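The small-community merge can be sketched as follows. This is an illustrative sketch, not the package's implementation; it assumes communities as a map of id to a Set of node ids and the same adjacency shape as above.

```javascript
// Sketch: iteratively merge undersized communities into their
// most-connected neighbor by total cross-community edge weight.
function mergeSmall(communities, adj, minSize) {
  let changed = true;
  while (changed) {
    changed = false;
    for (const [cid, members] of Object.entries(communities)) {
      if (members.size >= minSize || Object.keys(communities).length === 1) continue;
      // Map each node to its owning community.
      const owner = {};
      for (const [c, ms] of Object.entries(communities))
        for (const n of ms) owner[n] = c;
      // Tally edge weight from this community to every other community.
      const linkWeight = {};
      for (const n of members)
        for (const [nbr, w] of Object.entries(adj[n] || {})) {
          const c = owner[nbr];
          if (c !== undefined && c !== cid) linkWeight[c] = (linkWeight[c] || 0) + w;
        }
      const target = Object.keys(linkWeight)
        .sort((x, y) => linkWeight[y] - linkWeight[x])[0];
      if (!target) continue; // isolated community: nothing to merge into
      for (const n of members) communities[target].add(n);
      delete communities[cid];
      changed = true;
      break; // restart after mutating the community map
    }
  }
  return communities;
}
```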
Partition Enrichment
Each partition is enriched with metadata before caching:
- Centroid embedding -- element-wise mean of all 512-dimensional node embeddings within the partition. Nodes without embeddings are excluded.
- Keywords -- term-frequency scoring across node concept texts, counted as document frequency (at most one count per node). Stop words filtered. Top 20 terms stored.
- Adjacent partitions -- cross-partition edges enumerated, top 5 neighbor partitions by shared edge count. Used for "peripheral vision" in sweep prompts.
- Quick summary -- no-LLM heuristic: top keywords plus the highest-weight node text (first 120 characters).
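The centroid computation reduces to an element-wise mean over the nodes that have embeddings. A minimal sketch, assuming node objects with an optional `embedding` array:

```javascript
// Sketch: element-wise mean centroid over node embeddings,
// skipping nodes without an embedding.
function partitionCentroid(nodes, dims = 512) {
  const withEmb = nodes.filter((n) => Array.isArray(n.embedding));
  if (withEmb.length === 0) return null; // no embeddable nodes in partition
  const centroid = new Array(dims).fill(0);
  for (const n of withEmb)
    for (let i = 0; i < dims; i++) centroid[i] += n.embedding[i];
  return centroid.map((v) => v / withEmb.length);
}
```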
Phase 1: Route
Given a query, PGS computes its embedding and scores each partition by cosine similarity to its centroid. Partitions above the relevance threshold (default 0.25) are selected, capped at maxSweepPartitions (default 15).
Broad query bypass: Six regex patterns detect comprehensive or open-ended queries (e.g., "what are the gaps," "comprehensive overview," "full sweep"). When matched, routing is skipped and all partitions are selected up to the max cap. This prevents unfocused queries from routing to zero partitions.
Without embedding provider: When no embedding provider is configured, routing cannot compute cosine similarity. PGS degrades gracefully by sweeping all partitions up to the configured maximum.
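The routing rule, including the no-provider fallback, can be sketched as follows. Illustrative shapes only; `partitions` carry the centroid produced during enrichment.

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}

// Sketch: score partitions against the query embedding, keep those above
// the threshold, cap the count; with no embedding, sweep all up to the cap.
function routePartitions(queryEmbedding, partitions, threshold = 0.25, maxSweep = 15) {
  if (!queryEmbedding)
    return partitions.slice(0, maxSweep).map((p) => ({ partition: p, similarity: null }));
  return partitions
    .map((p) => ({ partition: p, similarity: cosine(queryEmbedding, p.centroid) }))
    .filter((r) => r.similarity >= threshold)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, maxSweep);
}
```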
Phase 2: Sweep
Selected partitions are swept in batches of 5 concurrent LLM calls (configurable). Each batch uses Promise.allSettled for fault tolerance -- a failed sweep does not block or abort others in the batch.
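The batching rule maps directly onto Promise.allSettled. A minimal sketch, where `sweepOne` stands in for the per-partition sweep call:

```javascript
// Sketch: sweep partitions in batches of `concurrency` concurrent calls.
// Promise.allSettled means one failed sweep never aborts its batch;
// rejected sweeps are dropped and synthesis proceeds on the remainder.
async function sweepAll(partitions, sweepOne, concurrency = 5) {
  const results = [];
  for (let i = 0; i < partitions.length; i += concurrency) {
    const batch = partitions.slice(i, i + concurrency);
    const settled = await Promise.allSettled(batch.map((p) => sweepOne(p)));
    for (const s of settled)
      if (s.status === 'fulfilled') results.push(s.value);
  }
  return results;
}
```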
Full-fidelity context: Each sweep receives all nodes in its partition, sorted descending by weight, at full text -- no summarization, no compression, no tiering. A 500,000-character safety cap prevents context overflow for pathologically large partitions, but in practice no real brain has reached this limit.
Adjacent partition context: Summaries and keywords of up to 5 adjacent partitions are appended to the sweep prompt, giving each sweep awareness of neighboring domains without accessing those nodes directly.
Structured response contract: The sweep prompt specifies an exact four-section output structure:
Domain State
2-3 sentences on what this partition covers and its research state relative to the query.
Findings
Key discoveries with Node ID citations for traceability.
Outbound Flags
Specific, characterized connections to other partitions using adjacent partition context.
Absences
What was searched for and NOT found in this partition.
The Absences section is the critical innovation. Rather than silently omitting missing content, each sweep explicitly reports what it searched for and did not find. This creates a machine-readable gap signal that the synthesis phase aggregates.
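A consumer of this contract can recover the four sections with a simple line scan. The parser below is a hypothetical illustration, not the package's actual handling of sweep output; it assumes section names appear as standalone lines.

```javascript
// Sketch: split a sweep response into the four contract sections.
function parseSweepSections(text) {
  const names = ['Domain State', 'Findings', 'Outbound Flags', 'Absences'];
  const sections = {};
  let current = null;
  for (const line of text.split('\n')) {
    const heading = line.trim().replace(/^#+\s*/, ''); // tolerate markdown headings
    if (names.includes(heading)) {
      current = heading;
      sections[current] = [];
    } else if (current) {
      sections[current].push(line);
    }
  }
  for (const k of Object.keys(sections)) sections[k] = sections[k].join('\n').trim();
  return sections;
}
```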
Phase 3: Synthesize
A single LLM call receives all sweep outputs concatenated with partition metadata headers. The synthesis prompt specifies four explicit tasks:
- Cross-Domain Connection Discovery -- chase the outbound flags from the sweep phase. When Partition A flags a connection to Partition B's domain, determine whether the connection is genuine by consulting Partition B's sweep output.
- Absence Detection -- when multiple partitions report absence of the same content, that is high-confidence evidence of a real gap in the knowledge graph. When one partition flags an outbound connection but the target partition reports absence, that identifies a specific research opportunity.
- Convergence Identification -- findings appearing independently in multiple partitions are treated as stronger evidence than single-partition findings. Independent observation of the same phenomenon increases confidence.
- Thesis Formation -- the synthesizer is instructed not to survey findings but to make claims, commit to positions, and rank insights. The goal is a thesis, not a literature review.
Synthesis uses reasoningEffort: 'high' and streams output to the client as it generates.
Session Management
PGS maintains persistent session state, tracking which partitions have been examined across queries. This enables incremental coverage -- a user can progressively exhaust a knowledge graph rather than re-querying from scratch each time.
Three session modes control which partitions are swept:
- full (default) -- route the current query against all partitions, sweep up to the configured maximum. Session history is recorded but does not affect partition selection.
- continue -- skip partitions already searched. Sweep only the unsearched remainder. Falls back to full sweep if all partitions have been covered. This is the primary mode for incremental deep exploration.
- targeted -- re-route the query against only the unsearched partitions. Combines session continuity with relevance routing -- useful when a follow-up query is more focused and should not re-sweep irrelevant partitions from the previous run.
Session tracking accumulates across calls. If a full sweep covers 5 of 11 partitions, a subsequent continue will sweep the remaining 6. After both runs, all 11 partitions have been examined and the session is exhausted.
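The continue-mode rule reduces to a set difference with a full-sweep fallback. A minimal sketch:

```javascript
// Sketch: continue mode sweeps only partitions not yet searched in this
// session; when everything is covered, fall back to a full sweep.
function selectForContinue(allPartitionIds, searchedIds) {
  const searched = new Set(searchedIds);
  const remaining = allPartitionIds.filter((id) => !searched.has(id));
  return remaining.length > 0 ? remaining : allPartitionIds;
}
```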
Configuration Reference
All parameters have defaults. Environment variables override defaults; per-call pgsConfig overrides both.
| Parameter | Default | Description |
|---|---|---|
| maxConcurrentSweeps | 5 | Parallel LLM calls per sweep batch |
| minCommunitySize | 30 | Merge communities smaller than this |
| targetPartitionMax | 1800 | Split communities larger than this via bisection |
| maxSweepPartitions | 15 | Maximum partitions to sweep per query |
| minSweepPartitions | 0 | Minimum partitions to sweep (0 = relevance only) |
| partitionRelevanceThreshold | 0.25 | Cosine similarity threshold for routing |
| sweepMaxTokens | 6000 | Token budget per sweep LLM call |
| synthesisMaxTokens | 16000 | Token budget for synthesis LLM call |
| sweepFraction | null | Fraction of routed partitions to sweep (0.1-1.0); overrides maxSweepPartitions when set |
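The precedence rule (defaults, then environment variables, then per-call pgsConfig) can be sketched as follows; `PGS_MAX_SWEEP_PARTITIONS` is a hypothetical variable name used only for illustration.

```javascript
// Sketch of config resolution: defaults < environment < per-call pgsConfig.
const DEFAULTS = { maxSweepPartitions: 15, partitionRelevanceThreshold: 0.25 };

function resolveConfig(env, pgsConfig = {}) {
  const fromEnv = {};
  if (env.PGS_MAX_SWEEP_PARTITIONS !== undefined)
    fromEnv.maxSweepPartitions = Number(env.PGS_MAX_SWEEP_PARTITIONS);
  // Later spreads win, giving per-call config the final say.
  return { ...DEFAULTS, ...fromEnv, ...pgsConfig };
}
```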
API Reference
Constructor
const { PGSEngine } = require('pgs-engine');
const engine = new PGSEngine({
sweepProvider, // LLM for sweep (called N times in parallel)
synthesisProvider, // LLM for synthesis (called once)
embeddingProvider, // For query routing (optional)
storage, // Session/cache persistence (optional)
config, // Override defaults (optional)
onEvent, // Global event listener (optional)
});

execute(query, graph, options)
Run the full PGS pipeline: partition, route, sweep, synthesize.
const result = await engine.execute(query, graph, {
mode: 'full', // 'full' | 'continue' | 'targeted'
sessionId: 'default', // Session tracking identifier
fullSweep: false, // Bypass routing, sweep everything
sweepFraction: 0.5, // Sweep 50% of routed partitions
onChunk: (text) => {}, // Synthesis token streaming
});
// result.answer — The synthesis output
// result.metadata.pgs — { totalNodes, totalPartitions, sweptPartitions, elapsed, ... }

layeredSearch(query, graph, options)
The gold pattern: run a standard top-K query first, then feed that context into a PGS pass. The standard query tells PGS what is already known; PGS finds what is not.
const result = await engine.layeredSearch(query, graph, {
topK: 20, // Standard query retrieves top-K nodes first
standardProvider, // LLM for standard query
mode: 'full',
});
// result.answer — PGS synthesis (enhanced with standard context)
// result.standardAnswer — What standard retrieval found

partition(graph)
Run Louvain community detection and return partition structure. No LLM calls.
const partitions = engine.partition(graph);
// Array of { id, nodeIds, centroid, keywords, adjacentPartitions, summary }

route(query, graph, partitions)
Score partitions by cosine similarity to the query embedding. Returns partitions above the relevance threshold.
const routed = await engine.route(query, graph, partitions);
// Array of { partition, similarity } sorted by relevance

sweepPartition(query, partition, graph, partitions)
Run a single full-fidelity sweep on one partition. Returns the structured four-section output.
const sweepResult = await engine.sweepPartition(query, routed[0], graph, partitions);
// { content, partitionId, nodeCount, elapsed }

synthesize(query, sweepResults, context)
Single cross-domain reasoning pass over all sweep outputs.
const answer = await engine.synthesize(query, sweepResults, {
onChunk: (text) => process.stdout.write(text),
});
// Returns the synthesis string

Provider Interfaces
PGS is provider-agnostic. Implement three simple interfaces to use any LLM or embedding service.
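As one illustration of the LLMProvider shape, a provider over an OpenAI-compatible chat-completions endpoint might look like this. This is a sketch using Node's global fetch; `makeChatProvider` is not a helper shipped with the package.

```javascript
// Sketch: LLMProvider backed by an OpenAI-compatible /chat/completions
// endpoint. Streaming and reasoningEffort are omitted for brevity.
function makeChatProvider(baseUrl, apiKey, model) {
  return {
    generate: async ({ instructions, input, maxTokens }) => {
      const res = await fetch(`${baseUrl}/chat/completions`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify({
          model,
          max_tokens: maxTokens,
          messages: [
            { role: 'system', content: instructions },
            { role: 'user', content: input },
          ],
        }),
      });
      if (!res.ok) throw new Error(`LLM call failed: ${res.status}`);
      const data = await res.json();
      return { content: data.choices[0].message.content };
    },
  };
}
```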
LLMProvider (sweep and synthesis)
{
generate: async ({
instructions, // System prompt
input, // User content (graph nodes, query)
maxTokens, // Response token budget
reasoningEffort, // 'low' | 'medium' | 'high' (advisory)
onChunk, // Streaming callback (optional)
}) => ({ content: string })
}

EmbeddingProvider (optional)
{
embed: async (text) => number[] | null, // Returns embedding vector
dimensions: number // Output dimensions (e.g., 512)
}

StorageProvider (optional)
{
read: async (key) => string | null, // Read JSON string by key
write: async (key, data) => void // Write JSON string
}

Provider Examples
// OpenAI
const engine = new PGSEngine({
sweepProvider: makeOpenAIProvider('gpt-4o-mini'),
synthesisProvider: makeOpenAIProvider('gpt-4o'),
embeddingProvider: makeOpenAIEmbeddingProvider(),
});
// Anthropic
const engine = new PGSEngine({
sweepProvider: makeAnthropicProvider('claude-sonnet-4-20250514'),
synthesisProvider: makeAnthropicProvider('claude-opus-4-20250514'),
});
// xAI
const engine = new PGSEngine({
sweepProvider: makeOpenAICompatibleProvider('https://api.x.ai/v1', key, 'grok-3-mini'),
synthesisProvider: makeOpenAICompatibleProvider('https://api.x.ai/v1', key, 'grok-3'),
});

Comparison to Existing Systems
| Feature | Standard RAG | Microsoft GraphRAG | PGS |
|---|---|---|---|
| Retrieval method | Top-K by embedding | Leiden communities + pre-generated reports | Louvain + full-fidelity sweep per query |
| Coverage awareness | None | Partial (community reports) | Full (session tracking with %) |
| Absence reporting | No | No | Yes (structured per partition) |
| Cross-domain discovery | No | Via community summaries | Yes (outbound flags + synthesis chasing) |
| Convergence detection | No | No | Yes (independent findings across partitions) |
| Session continuity | No | No | Yes (continue/targeted modes) |
| Build mode | Batch (offline index) | Batch (offline index) | Online (partition at query time) |
| Community algorithm | N/A | Leiden (hierarchical) | Louvain (single-level + bisection) |
| Entity extraction | N/A | Explicit NER pipeline | Pre-existing (graph built by research agents) |
GraphRAG is designed as a batch pipeline: extract entities from a text corpus, build a graph, run Leiden, generate community reports, store. Queries run against stored community reports, which reflect the graph at index time and can become stale. PGS is a live query architecture: partitions are cached but sweep outputs are generated fresh per query against the current graph state.
PGS vs. HNSW Vector Stores
| Dimension | HNSW Stores (Pinecone, Weaviate) | PGS |
|---|---|---|
| Index type | Approximate nearest-neighbor vector index | Graph-topology-aware community partitioning |
| Retrieval scope | K nearest vectors in embedding space | Full partition content (all nodes, full text) |
| Structural relationships | None (vector proximity only) | 13 semantic edge types |
| Gap reporting | No | Structured absences |
| Coverage tracking | No | Session-based, incremental |
| Query latency | Milliseconds | Seconds to minutes (N sweep LLM calls) |
HNSW stores are read-optimized indexes designed for sub-millisecond retrieval of the K nearest embeddings. PGS trades latency for coverage: it examines entire topological communities at full fidelity rather than individual nearest neighbors. For small graphs or latency-critical applications, HNSW is strictly preferable. For research synthesis, gap analysis, and cross-domain discovery on graphs with 200+ nodes, PGS surfaces what vector similarity alone cannot.
Known Limitations
PGS has real constraints. These are documented honestly because understanding them is essential for using the system effectively.
- Singleton partition problem. Isolated nodes (zero edges or edges only to the dominant cluster) become their own partitions. Each singleton requires a separate sweep LLM call. For targeted queries, routing assigns low scores to most singletons, but broad queries or full sweeps generate many LLM calls that return only "no findings."
- Supercluster tendency. Research graphs are often dominated by one large community. When a single partition contains 98% of the nodes, a focused query degenerates to sweeping that one partition -- equivalent to one large standard RAG call. PGS is most beneficial for graphs with meaningful community structure (multiple research domains, merged runs, or deliberately broad topics).
- No hierarchical Louvain. The implementation runs only Phase 1 of Louvain (node moves). Phase 2 (graph contraction and recursion) is not implemented. For large dense graphs, Phase 1 alone may produce suboptimal community assignments. The bisection fallback is a workaround, not a substitute.
- Unweighted centroids. The centroid embedding is an unweighted mean. High-importance nodes influence the centroid equally to low-importance nodes. A weighted centroid would produce a more representative routing target, but this is not yet implemented.
- Latency. PGS makes N+1 LLM calls (N sweeps + 1 synthesis) versus 1 for standard RAG. A 5-partition sweep on local models takes approximately 78 seconds. Cloud APIs with higher throughput complete in 15-30 seconds for the same partition count.
Get the Code
PGS Engine is open source under MIT license. The package includes a real research brain (586 nodes of conformal field theory research) for exploration and testing without API keys.