r/LocalLLaMA · June 30, 2026 · 1 min read

Benchmarked Graph-RAG vs. Graph-Free Multi-Hop RAG: The graph mostly bought us a massive rebuild bill, not accuracy.

We kept hitting the same wall building multi-hop RAG: the systems with the best accuracy (GraphRAG, HippoRAG 2, RAPTOR) all lean on a knowledge graph built offline. The numbers are great until your data changes: every update means re-running an LLM indexing pass to rebuild the graph. For a corpus that moves daily (prices, filings, tickets, news), you're paying that rebuild cost constantly.

So we tested whether the graph is actually necessary. We ran a graph-free dense index with query-time orchestration instead (no graph, no GPU, every component behind a commodity API) against the graph-based systems on HotpotQA, 2WikiMultiHopQA, and MuSiQue.
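For anyone unsure what "query-time orchestration" can look like: the post doesn't spell out the actual pipeline, so this is only a minimal sketch of one common pattern, iterative retrieval, where each hop's evidence is folded back into the query. Everything here is an assumption for illustration: `tokens`, `retrieve`, and `multi_hop` are hypothetical names, term overlap stands in for dense similarity, and plain term expansion stands in for whatever LLM-driven query rewriting a real system would use.

```python
import re

def tokens(text):
    """Lowercased word set; a stand-in for a real dense embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query_terms, passages, exclude):
    """Index of the unseen passage sharing the most terms with the query, or None."""
    best, best_score = None, 0
    for i, passage in enumerate(passages):
        if i in exclude:
            continue
        score = len(query_terms & tokens(passage))
        if score > best_score:
            best, best_score = i, score
    return best

def multi_hop(question, passages, max_hops=2):
    """Retrieve one passage per hop, expanding the query with each hit."""
    terms = tokens(question)
    chain = []
    for _ in range(max_hops):
        hit = retrieve(terms, passages, set(chain))
        if hit is None:
            break
        chain.append(hit)
        # Assumption: a real system would have an LLM rewrite the query here.
        terms |= tokens(passages[hit])
    return [passages[i] for i in chain]

passages = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "Paris hosts the Louvre museum.",
]
question = "In which country was Marie Curie born?"
for step, passage in enumerate(multi_hop(question, passages), start=1):
    print(step, passage)
```

The second passage shares no words with the question, so a single retrieval pass would miss it. It only becomes reachable after the first hop adds "warsaw" to the query, and that happens at query time with no precomputed graph.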

Against the graph systems, it won on all three benchmarks:

Benchmark	MOTHRAG (ours)	GraphRAG	HippoRAG 2	RAPTOR
HotpotQA	78.1	68.6	75.5	69.5
2WikiMultiHop	76.3	58.6	71.0	52.1
MuSiQue	50.5	38.5	48.6	28.9

Updates are just embed-and-append, with no rebuild and no retraining. Cost is ~$0.03/query on commodity APIs, with no GPU anywhere.
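To make "embed-and-append" concrete, here is a rough sketch of an append-only index, not the actual implementation from the post. `AppendOnlyIndex` and `toy_embed` are hypothetical names, and the normalized bag-of-words vector is a stand-in for a call to a commodity embedding API. The point is only that adding a document touches nothing that is already stored.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in embedding: L2-normalized term counts (assumption, not a real model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}

def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

class AppendOnlyIndex:
    def __init__(self, embed=toy_embed):
        self.embed = embed
        self.texts = []
        self.vectors = []

    def add(self, text):
        """Embed one new document and append it; existing entries are untouched."""
        self.texts.append(text)
        self.vectors.append(self.embed(text))
        return len(self.texts) - 1

    def search(self, query, k=3):
        """Brute-force cosine search over everything appended so far."""
        q = self.embed(query)
        scored = sorted(
            ((dot(q, v), i) for i, v in enumerate(self.vectors)), reverse=True
        )
        return [(self.texts[i], score) for score, i in scored[:k]]

index = AppendOnlyIndex()
index.add("Acme Corp filed its quarterly report on Monday")
index.add("The ticket queue was cleared overnight")
# A data change is one more append, immediately searchable.
index.add("Acme Corp revised the quarterly report on Tuesday")
print(index.search("revised report Tuesday", k=1)[0][0])
```

A production setup would swap the brute-force scan for a vector store, but the update path keeps the same shape: one embedding call per new document.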

Against GPU-bound systems that use constrained decoding (NeocorRAG), it's not a clean win. We match them on HotpotQA (78.1 vs 78.3) and 2Wiki (76.3 vs 76.1), but we lose on MuSiQue (50.5 vs 52.6). MuSiQue is our weak spot (retrieval recall bottlenecks there), and we haven't solved it yet.
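For a quick read of the margins, this small script recomputes them from the scores quoted in the post. The helper name `margin` and the dict layout are just illustrative choices.

```python
# Scores copied from the post; "ours" is MOTHRAG.
scores = {
    "HotpotQA": {"MOTHRAG": 78.1, "GraphRAG": 68.6, "HippoRAG 2": 75.5,
                 "RAPTOR": 69.5, "NeocorRAG": 78.3},
    "2WikiMultiHop": {"MOTHRAG": 76.3, "GraphRAG": 58.6, "HippoRAG 2": 71.0,
                      "RAPTOR": 52.1, "NeocorRAG": 76.1},
    "MuSiQue": {"MOTHRAG": 50.5, "GraphRAG": 38.5, "HippoRAG 2": 48.6,
                "RAPTOR": 28.9, "NeocorRAG": 52.6},
}
GRAPH_SYSTEMS = ("GraphRAG", "HippoRAG 2", "RAPTOR")

def margin(row, rivals, ours="MOTHRAG"):
    """Strongest rival among `rivals` and our lead over it in points."""
    best = max(rivals, key=lambda name: row[name])
    return best, round(row[ours] - row[best], 1)

for bench, row in scores.items():
    graph_best, graph_lead = margin(row, GRAPH_SYSTEMS)
    _, neocor_lead = margin(row, ("NeocorRAG",))
    print(f"{bench}: {graph_lead:+.1f} vs {graph_best}, {neocor_lead:+.1f} vs NeocorRAG")
```

HippoRAG 2 is the strongest graph baseline on all three benchmarks, with leads of 2.6, 5.3, and 1.9 points. Against NeocorRAG the gaps are -0.2, +0.2, and -2.1.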

The takeaway for us: for multi-hop over changing data, the graph overhead mostly buys you a rebuild bill, not accuracy. A graph-free index with good query-time orchestration held up.

Curious where others landed on this: is the graph worth the rebuild cost for data that changes?

submitted by /u/Annual-Commercial563