# RAG Frameworks Comparison

Choose the right Retrieval-Augmented Generation (RAG) framework for your project on Clore.ai GPU servers.

{% hint style="info" %}
**RAG (Retrieval-Augmented Generation)** lets LLMs answer questions using your own documents. This guide compares the four leading frameworks: LangChain, LlamaIndex, Haystack, and RAGFlow — covering features, performance, and when to use each.
{% endhint %}

***

## Quick Decision Matrix

|                    | LangChain        | LlamaIndex    | Haystack          | RAGFlow         |
| ------------------ | ---------------- | ------------- | ----------------- | --------------- |
| **Best for**       | General LLM apps | Document Q\&A | Enterprise search | Self-hosted RAG |
| **Learning curve** | Medium           | Low-Medium    | Medium-High       | Low             |
| **Flexibility**    | Very high        | High          | High              | Medium          |
| **Built-in UI**    | No               | No            | No                | Yes             |
| **GitHub stars**   | 90K+             | 35K+          | 15K+              | 12K+            |
| **Language**       | Python           | Python        | Python            | Python          |
| **License**        | MIT              | MIT           | Apache 2.0        | Apache 2.0      |

***

## Overview

### LangChain

LangChain is the most popular LLM orchestration framework. It provides a unified interface for chains, agents, memory, and RAG pipelines.

**Philosophy**: Everything is a chain of composable components.

```python
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Build a RAG pipeline in a few lines
# (assumes `docs` is a list of already-loaded Document objects)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
chain = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)
result = chain.invoke({"query": "What is the capital of France?"})
```

### LlamaIndex

LlamaIndex (formerly GPT Index) is purpose-built for document indexing and retrieval. It excels at connecting LLMs to diverse data sources.

**Philosophy**: Index first, query intelligently.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load and index documents
documents = SimpleDirectoryReader("data/").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the main findings")
print(response)
```

### Haystack

Haystack (by deepset) is an enterprise-grade NLP framework focused on search and Q\&A pipelines. It has a component-based architecture and a visual pipeline builder.

**Philosophy**: Modular pipelines with enterprise reliability.

```python
# Haystack 1.x API shown here; the deep-dive below uses the 2.x API.
# Assumes an already-populated `document_store` (e.g. ElasticsearchDocumentStore)
from haystack.nodes import DensePassageRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

retriever = DensePassageRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
pipeline = ExtractiveQAPipeline(reader, retriever)
result = pipeline.run(
    query="What is machine learning?",
    params={"Retriever": {"top_k": 10}},
)
```

### RAGFlow

RAGFlow is an open-source RAG engine with a built-in web UI, document parsing, and knowledge base management. It's designed to be deployed as a complete solution.

**Philosophy**: Out-of-the-box RAG system, no coding required.

```yaml
# RAGFlow is deployed via Docker Compose
# Configuration via web UI at localhost:80
version: "3"
services:
  ragflow:
    image: infiniflow/ragflow:latest
    ports:
      - "80:80"
    volumes:
      - ./ragflow-data:/ragflow/data
```

***

## Feature Comparison

### Core RAG Features

| Feature              | LangChain | LlamaIndex | Haystack | RAGFlow  |
| -------------------- | --------- | ---------- | -------- | -------- |
| Vector store support | 50+       | 30+        | 20+      | Built-in |
| Document loaders     | 100+      | 50+        | 30+      | Built-in |
| Hybrid search        | ✅         | ✅          | ✅        | ✅        |
| Re-ranking           | ✅         | ✅          | ✅        | ✅        |
| Multi-modal          | ✅         | ✅          | Partial  | ✅        |
| Streaming            | ✅         | ✅          | ✅        | ✅        |
| Async support        | ✅         | ✅          | ✅        | ✅        |
| Agents               | ✅         | ✅          | ✅        | ❌        |
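Hybrid search typically fuses a BM25 result list with a dense-vector result list; reciprocal rank fusion (RRF) is the common recipe, and is what Haystack's `DocumentJoiner` applies later in this guide. A framework-free sketch of the formula:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists into one.

    rankings: list of lists of document IDs, best-first.
    k: smoothing constant (60 comes from the original RRF paper).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc_a", "doc_b", "doc_c"]    # keyword retriever output
dense = ["doc_b", "doc_d", "doc_a"]   # embedding retriever output
fused = reciprocal_rank_fusion([bm25, dense])
# doc_b ranks first: it appears high in both lists
```

Documents that appear in both lists accumulate score from each, which is why fusion usually beats either retriever alone.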

### Integration Ecosystem

| Integration Type | LangChain                                    | LlamaIndex                           | Haystack                          | RAGFlow                    |
| ---------------- | -------------------------------------------- | ------------------------------------ | --------------------------------- | -------------------------- |
| LLM providers    | 50+                                          | 30+                                  | 20+                               | 10+                        |
| Vector DBs       | Chroma, Pinecone, Weaviate, Qdrant, 40+ more | Chroma, Pinecone, Weaviate, 25+ more | Weaviate, Elasticsearch, 15+ more | Built-in (Infinity)        |
| Document types   | PDF, Web, CSV, JSON, 80+                     | PDF, Web, CSV, DB, 40+               | PDF, TXT, HTML, 20+               | PDF, Word, Excel, PPT, Web |
| Cloud storage    | S3, GCS, Azure                               | S3, GCS, Azure                       | S3, GCS                           | S3                         |

### Advanced RAG Features

| Feature                                 | LangChain | LlamaIndex        | Haystack | RAGFlow |
| --------------------------------------- | --------- | ----------------- | -------- | ------- |
| Query decomposition                     | ✅         | ✅                 | ✅        | ✅       |
| HyDE (Hypothetical Document Embeddings) | ✅         | ✅                 | ❌        | ❌       |
| Multi-hop retrieval                     | ✅         | ✅                 | Partial  | ✅       |
| Contextual compression                  | ✅         | ✅                 | ✅        | ✅       |
| Self-RAG                                | ✅         | ✅                 | ❌        | ❌       |
| GraphRAG                                | ✅         | ✅ (PropertyGraph) | ❌        | ✅       |
| Citation tracking                       | Partial   | ✅                 | Partial  | ✅       |
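HyDE, listed above, has the LLM draft a hypothetical answer first and then embeds that draft instead of the raw query, so the search vector lives in "answer space". A self-contained sketch with a stubbed LLM and a toy bag-of-words embedder (both are stand-ins, not real models):

```python
import math

def embed(text):
    # Toy bag-of-words embedding over a tiny vocabulary;
    # a real system would use a sentence-embedding model.
    vocab = ["paris", "capital", "france", "gpu", "rag"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def fake_llm(prompt):
    # Stand-in for a real LLM call that drafts a hypothetical answer
    return "Paris is the capital of France"

def hyde_retrieve(query, corpus):
    # Embed the hypothetical *answer*, not the query itself
    hypothetical = fake_llm(f"Write a short passage answering: {query}")
    qvec = embed(hypothetical)
    return max(corpus, key=lambda doc: cosine(embed(doc), qvec))

corpus = ["France and its capital Paris", "Renting a GPU for RAG workloads"]
best = hyde_retrieve("What is the capital of France?", corpus)
```

The hypothetical passage shares vocabulary with relevant documents even when the query itself does not, which is the whole trick.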

***

## Performance Benchmarks

### Retrieval Accuracy (RAG-Bench, 2024)

{% hint style="info" %}
Benchmarks vary significantly by dataset and configuration. These are approximate figures from community benchmarks.
{% endhint %}

| Framework         | HotpotQA (F1) | Natural Questions (EM) | TriviaQA (Acc) |
| ----------------- | ------------- | ---------------------- | -------------- |
| LangChain (RAG)   | \~68%         | \~42%                  | \~72%          |
| LlamaIndex (RAG)  | \~71%         | \~45%                  | \~74%          |
| Haystack (RAG)    | \~69%         | \~43%                  | \~71%          |
| RAGFlow (default) | \~65%         | \~40%                  | \~68%          |

*Results depend heavily on chosen LLM, embedding model, and chunk size*
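Chunk size and overlap are the biggest levers behind those numbers. A minimal fixed-size chunker with overlap shows the mechanic that splitters like `RecursiveCharacterTextSplitter` build on (minus the separator-aware recursion):

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size chunks with overlapping windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # each chunk starts `step` chars after the last
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# A 2500-char document with the defaults yields 4 chunks
chunks = chunk_text("x" * 2500, chunk_size=1000, overlap=200)
```

The overlap repeats the tail of each chunk at the head of the next so that facts straddling a boundary still land intact in at least one chunk.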

### Indexing Speed (10K documents, \~1KB each)

| Framework  | CPU Only  | GPU Embeddings |
| ---------- | --------- | -------------- |
| LangChain  | \~120 sec | \~18 sec       |
| LlamaIndex | \~110 sec | \~15 sec       |
| Haystack   | \~130 sec | \~20 sec       |
| RAGFlow    | \~150 sec | \~25 sec       |

*With text-embedding-ada-002 equivalent (1536 dims)*
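The 1536-dim figure also sets a floor on vector index memory: a flat float32 index needs roughly vectors × dims × 4 bytes. A back-of-the-envelope helper (ignores HNSW/IVF graph overhead and quantization):

```python
def flat_index_bytes(num_vectors, dims, bytes_per_float=4):
    # Raw storage for a flat (non-quantized) float32 vector index;
    # real indexes (HNSW, IVF) add graph/centroid overhead on top.
    return num_vectors * dims * bytes_per_float

# 10K chunks at 1536 dims, as in the benchmark above
mb = flat_index_bytes(10_000, 1536) / (1024 ** 2)
print(f"{mb:.0f} MB")  # → 59 MB
```

At this scale the vectors themselves are tiny; the RAM pressure in the requirements table further down comes from Elasticsearch and local LLMs, not the index.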

### Query Latency (P50/P99, with pre-built index)

| Framework  | P50   | P99  | Notes                    |
| ---------- | ----- | ---- | ------------------------ |
| LangChain  | 450ms | 1.2s | No re-ranking            |
| LlamaIndex | 400ms | 1.0s | No re-ranking            |
| Haystack   | 500ms | 1.5s | With pipeline overhead   |
| RAGFlow    | 600ms | 2.0s | Includes UI/API overhead |

***

## LangChain: Deep Dive

### Strengths

✅ **Largest ecosystem** — hundreds of integrations, massive community\
✅ **Agents and tools** — build autonomous AI agents\
✅ **LangSmith** — excellent observability and debugging\
✅ **LCEL** — LangChain Expression Language for composing chains\
✅ **Memory systems** — conversation history, entity memory
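The LCEL bullet above boils down to composing steps with the `|` operator. A toy re-implementation of that pattern in plain Python (not LangChain's actual classes) illustrates the idea:

```python
class Runnable:
    """Minimal stand-in for LCEL-style composition via the | operator."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its output to b
        return Runnable(lambda x: other.invoke(self.invoke(x)))

prompt = Runnable(lambda q: f"Answer briefly: {q}")
fake_llm = Runnable(lambda p: p.upper())  # stand-in for a real model call
parser = Runnable(lambda t: t.strip())

chain = prompt | fake_llm | parser
result = chain.invoke("why RAG?")
```

Real LCEL runnables add batching, streaming, and async on top, but the composition model is exactly this pipe.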

### Weaknesses

❌ **Complexity** — can be over-engineered for simple tasks\
❌ **Frequent breaking changes** — v0.1 vs v0.2 vs v0.3 migrations\
❌ **Heavy dependency** — large install size\
❌ **Abstraction leakage** — sometimes harder to debug

### Best Use Cases

* Multi-step LLM pipelines with complex logic
* AI agents that use tools (web search, code execution, APIs)
* Applications needing conversation memory
* Projects needing maximum flexibility

### Example: Advanced RAG with Sources

```python
from langchain.chains import RetrievalQAWithSourcesChain
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Setup
llm = ChatOpenAI(model="gpt-4", temperature=0)
embeddings = OpenAIEmbeddings()

# Index documents with metadata (assumes `documents` is already loaded)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)

vectorstore = Chroma.from_documents(
    chunks, 
    embeddings,
    persist_directory="./chroma_db"
)

# Build chain with source attribution
chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 5}),
    return_source_documents=True
)

result = chain.invoke({"question": "What are the main risks?"})
print(result["answer"])
print("Sources:", result["sources"])
```

***

## LlamaIndex: Deep Dive

### Strengths

✅ **Document-first design** — best for complex document indexing\
✅ **Index types** — Vector, Knowledge Graph, SQL, Keyword\
✅ **Sub-question engine** — automatically decomposes complex queries\
✅ **Structured outputs** — Pydantic integration\
✅ **Router query engine** — intelligently routes to right index

### Weaknesses

❌ **Less agent-focused** than LangChain\
❌ **Smaller ecosystem** than LangChain\
❌ **Documentation** can be inconsistent

### Best Use Cases

* Document Q\&A systems (PDFs, reports, wikis)
* Complex multi-document reasoning
* Knowledge graph construction
* Data-to-LLM bridges (databases, APIs)

### Example: Multi-Document Query Engine

```python
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings
)
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.tools import QueryEngineTool
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure globally
Settings.llm = OpenAI(model="gpt-4")
Settings.embed_model = OpenAIEmbedding()

# Create separate indices for different document sets
annual_reports = SimpleDirectoryReader("./annual_reports").load_data()
tech_docs = SimpleDirectoryReader("./tech_docs").load_data()

index_reports = VectorStoreIndex.from_documents(annual_reports)
index_tech = VectorStoreIndex.from_documents(tech_docs)

# Build router that selects correct index
tools = [
    QueryEngineTool.from_defaults(
        query_engine=index_reports.as_query_engine(),
        description="Annual financial reports and business metrics"
    ),
    QueryEngineTool.from_defaults(
        query_engine=index_tech.as_query_engine(),
        description="Technical documentation and API references"
    )
]

router = RouterQueryEngine.from_defaults(query_engine_tools=tools)
response = router.query("What was the revenue growth last year?")
```

***

## Haystack: Deep Dive

### Strengths

✅ **Enterprise-grade** — production reliability\
✅ **Visual pipeline builder** — Haystack Studio\
✅ **Annotation tool** — built-in labeling UI\
✅ **Strong NLP** — extractive QA, summarization\
✅ **deepset Cloud** — managed deployment option

### Weaknesses

❌ **Steeper learning curve** than competitors\
❌ **Smaller community** than LangChain/LlamaIndex\
❌ **Less flexible** for novel architectures

### Best Use Cases

* Enterprise document search and Q\&A
* Projects needing audit trails and observability
* Teams wanting visual pipeline design
* Production deployments with SLA requirements

### Example: Hybrid Search Pipeline

```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.components.joiners import DocumentJoiner
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.retrievers.in_memory import (
    InMemoryBM25Retriever,
    InMemoryEmbeddingRetriever,
)

# Assumes `store` is an already-populated InMemoryDocumentStore
template = """Answer the question using the context below.
Context: {{ documents | map(attribute='content') | join('\n') }}
Question: {{ query }}
Answer:"""

# Build hybrid search pipeline
pipeline = Pipeline()
pipeline.add_component("bm25_retriever", InMemoryBM25Retriever(document_store=store, top_k=10))
pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
pipeline.add_component("embedding_retriever", InMemoryEmbeddingRetriever(document_store=store, top_k=10))
pipeline.add_component("joiner", DocumentJoiner(join_mode="reciprocal_rank_fusion"))
pipeline.add_component("ranker", TransformersSimilarityRanker(top_k=5))
pipeline.add_component("prompt_builder", PromptBuilder(template=template))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4"))

# Connect components
pipeline.connect("text_embedder.embedding", "embedding_retriever.query_embedding")
pipeline.connect("bm25_retriever", "joiner")
pipeline.connect("embedding_retriever", "joiner")
pipeline.connect("joiner", "ranker")
pipeline.connect("ranker.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder", "llm")

query = "What is deep learning?"
result = pipeline.run({
    "bm25_retriever": {"query": query},
    "text_embedder": {"text": query},
    "ranker": {"query": query},
    "prompt_builder": {"query": query},
})
print(result["llm"]["replies"][0])
```

***

## RAGFlow: Deep Dive

### Strengths

✅ **Zero-code deployment** — full UI included\
✅ **Advanced document parsing** — tables, images, charts\
✅ **Knowledge base management** — visual interface\
✅ **API included** — REST API out of the box\
✅ **Agentic RAG** — built-in agents

### Weaknesses

❌ **Less customizable** than code-first frameworks\
❌ **Heavy resource requirements** (Elasticsearch + Infinity)\
❌ **Limited LLM support** vs LangChain\
❌ **Newer project** — smaller community

### Best Use Cases

* Non-developers needing RAG without coding
* Teams wanting a complete knowledge base product
* Internal enterprise wikis and documentation search
* Rapid prototyping of RAG applications

### Deployment on Clore.ai

```yaml
# docker-compose.yml for RAGFlow
version: "3"
services:
  ragflow:
    image: infiniflow/ragflow:v0.12.0
    container_name: ragflow
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./ragflow-logs:/ragflow/logs
      - ./ragflow-data:/ragflow/data
    depends_on:
      - elasticsearch
      - infinity

  elasticsearch:
    image: elasticsearch:8.11.3
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
      - xpack.security.enabled=false
    volumes:
      - es_data:/usr/share/elasticsearch/data

  infinity:
    image: infiniflow/infinity:v0.3.0
    volumes:
      - infinity_data:/var/infinity

volumes:
  es_data:
  infinity_data:
```

```bash
docker compose up -d
# Access web UI at http://<server-ip>:80
```

***

## When to Use Which

### Choose LangChain if:

* Building AI agents with tools (web search, code execution, APIs)
* Need maximum ecosystem flexibility
* Building complex multi-step pipelines
* Integrating with many different LLMs and data sources
* Team is comfortable with Python

### Choose LlamaIndex if:

* Primary use case is document Q\&A
* Working with complex document structures (tables, nested content)
* Need knowledge graphs or multi-index routing
* Want best-in-class document ingestion
* Building over structured data (databases, APIs)

### Choose Haystack if:

* Enterprise environment with compliance requirements
* Need visual pipeline building tools
* Building on top of Elasticsearch
* Want extractive (not just generative) QA
* Team needs NLP pipeline observability

### Choose RAGFlow if:

* Non-technical team needs self-service RAG
* Want a complete product, not a framework
* Rapid deployment is priority over customization
* Building an internal knowledge base
* Don't want to write Python code

***

## Running on Clore.ai: Resource Requirements

| Framework  | Min RAM     | Min VRAM        | Recommended GPU |
| ---------- | ----------- | --------------- | --------------- |
| LangChain  | 8GB         | 8GB (local LLM) | RTX 3080        |
| LlamaIndex | 8GB         | 8GB (local LLM) | RTX 3080        |
| Haystack   | 16GB        | 8GB (local LLM) | RTX 3090        |
| RAGFlow    | 32GB        | 16GB            | A6000 / A100    |

{% hint style="warning" %}
**RAGFlow needs more RAM**: it runs Elasticsearch, the Infinity vector database, and the app itself. Plan for at least 32GB of system RAM. Haystack with Elasticsearch also benefits from 16GB+ RAM.
{% endhint %}

***

## Useful Links

* [LangChain Docs](https://python.langchain.com)
* [LlamaIndex Docs](https://docs.llamaindex.ai)
* [Haystack Docs](https://docs.haystack.deepset.ai)
* [RAGFlow GitHub](https://github.com/infiniflow/ragflow)
* [RAG Survey Paper (arxiv)](https://arxiv.org/abs/2312.10997)

***

## Summary Recommendation

```
Simple document Q&A          → LlamaIndex
Complex AI agents            → LangChain
Enterprise search            → Haystack
No-code RAG product          → RAGFlow
Maximum flexibility          → LangChain
Best document understanding  → LlamaIndex
```

All four frameworks are excellent choices — the right one depends on your specific requirements, team skills, and deployment constraints. When in doubt, start with **LlamaIndex** for document-heavy use cases or **LangChain** if you need the broadest possible ecosystem.

***

## Clore.ai GPU Recommendations

| Use Case            | Recommended GPU | Est. Cost on Clore.ai |
| ------------------- | --------------- | --------------------- |
| Development/Testing | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| Production          | RTX 4090 (24GB) | \~$0.70/gpu/hr        |
| Large Scale         | A100 80GB       | \~$1.20/gpu/hr        |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers. Browse available GPUs and rent by the hour — no commitments, full root access.
