# RAG Frameworks Comparison

Choose the right Retrieval-Augmented Generation (RAG) framework for your project on Clore.ai GPU servers.

{% hint style="info" %}
**RAG (Retrieval-Augmented Generation)** lets LLMs answer questions using your own documents. This guide compares the four leading frameworks: LangChain, LlamaIndex, Haystack, and RAGFlow — covering features, performance, and when to use each.
{% endhint %}

***

## Quick Decision Matrix

|                    | LangChain        | LlamaIndex    | Haystack          | RAGFlow         |
| ------------------ | ---------------- | ------------- | ----------------- | --------------- |
| **Best for**       | General LLM apps | Document Q\&A | Enterprise search | Self-hosted RAG |
| **Learning curve** | Medium           | Low-Medium    | Medium-High       | Low             |
| **Flexibility**    | Very high        | High          | High              | Medium          |
| **Built-in UI**    | No               | No            | No                | Yes             |
| **GitHub stars**   | 90K+             | 35K+          | 15K+              | 12K+            |
| **Language**       | Python           | Python        | Python            | Python          |
| **License**        | MIT              | MIT           | Apache 2.0        | Apache 2.0      |

***

## Overview

### LangChain

LangChain is the most popular LLM orchestration framework. It provides a unified interface for chains, agents, memory, and RAG pipelines.

**Philosophy**: Everything is a chain of composable components.

```python
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Build a RAG pipeline in a few lines
# (assumes `docs` is a list of already-loaded Document objects)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
chain = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)
result = chain.invoke({"query": "What is the capital of France?"})
```

### LlamaIndex

LlamaIndex (formerly GPT Index) is purpose-built for document indexing and retrieval. It excels at connecting LLMs to diverse data sources.

**Philosophy**: Index first, query intelligently.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load and index documents
documents = SimpleDirectoryReader("data/").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the main findings")
print(response)
```

### Haystack

Haystack (by deepset) is an enterprise-grade NLP framework focused on search and Q\&A pipelines. It has a component-based architecture and a visual pipeline builder.

**Philosophy**: Modular pipelines with enterprise reliability.

```python
# Haystack 1.x API shown here; the deep-dive below uses the 2.x API.
# Assumes an already-populated `document_store` (e.g. ElasticsearchDocumentStore)
from haystack.nodes import DensePassageRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

retriever = DensePassageRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
pipeline = ExtractiveQAPipeline(reader, retriever)
result = pipeline.run(
    query="What is machine learning?",
    params={"Retriever": {"top_k": 10}},
)
```

### RAGFlow

RAGFlow is an open-source RAG engine with a built-in web UI, document parsing, and knowledge base management. It's designed to be deployed as a complete solution.

**Philosophy**: Out-of-the-box RAG system, no coding required.

```yaml
# RAGFlow is deployed via Docker Compose
# Configuration via web UI at localhost:80
version: "3"
services:
  ragflow:
    image: infiniflow/ragflow:latest
    ports:
      - "80:80"
    volumes:
      - ./ragflow-data:/ragflow/data
```

***

## Feature Comparison

### Core RAG Features

| Feature              | LangChain | LlamaIndex | Haystack | RAGFlow  |
| -------------------- | --------- | ---------- | -------- | -------- |
| Vector store support | 50+       | 30+        | 20+      | Built-in |
| Document loaders     | 100+      | 50+        | 30+      | Built-in |
| Hybrid search        | ✅         | ✅          | ✅        | ✅        |
| Re-ranking           | ✅         | ✅          | ✅        | ✅        |
| Multi-modal          | ✅         | ✅          | Partial  | ✅        |
| Streaming            | ✅         | ✅          | ✅        | ✅        |
| Async support        | ✅         | ✅          | ✅        | ✅        |
| Agents               | ✅         | ✅          | ✅        | ❌        |
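Hybrid search typically fuses a BM25 result list with a dense-vector result list; reciprocal rank fusion (RRF) is the common recipe, and is what Haystack's `DocumentJoiner` applies later in this guide. A framework-free sketch of the formula:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists into one.

    rankings: list of lists of document IDs, best-first.
    k: smoothing constant (60 comes from the original RRF paper).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc_a", "doc_b", "doc_c"]    # keyword retriever output
dense = ["doc_b", "doc_d", "doc_a"]   # embedding retriever output
fused = reciprocal_rank_fusion([bm25, dense])
# doc_b ranks first: it appears high in both lists
```

Documents that appear in both lists accumulate score from each, which is why fusion usually beats either retriever alone.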

### Integration Ecosystem

| Integration Type | LangChain                                    | LlamaIndex                           | Haystack                          | RAGFlow                    |
| ---------------- | -------------------------------------------- | ------------------------------------ | --------------------------------- | -------------------------- |
| LLM providers    | 50+                                          | 30+                                  | 20+                               | 10+                        |
| Vector DBs       | Chroma, Pinecone, Weaviate, Qdrant, 40+ more | Chroma, Pinecone, Weaviate, 25+ more | Weaviate, Elasticsearch, 15+ more | Built-in (Infinity)        |
| Document types   | PDF, Web, CSV, JSON, 80+                     | PDF, Web, CSV, DB, 40+               | PDF, TXT, HTML, 20+               | PDF, Word, Excel, PPT, Web |
| Cloud storage    | S3, GCS, Azure                               | S3, GCS, Azure                       | S3, GCS                           | S3                         |

### Advanced RAG Features

| Feature                                 | LangChain | LlamaIndex        | Haystack | RAGFlow |
| --------------------------------------- | --------- | ----------------- | -------- | ------- |
| Query decomposition                     | ✅         | ✅                 | ✅        | ✅       |
| HyDE (Hypothetical Document Embeddings) | ✅         | ✅                 | ❌        | ❌       |
| Multi-hop retrieval                     | ✅         | ✅                 | Partial  | ✅       |
| Contextual compression                  | ✅         | ✅                 | ✅        | ✅       |
| Self-RAG                                | ✅         | ✅                 | ❌        | ❌       |
| GraphRAG                                | ✅         | ✅ (PropertyGraph) | ❌        | ✅       |
| Citation tracking                       | Partial   | ✅                 | Partial  | ✅       |
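HyDE, listed above, has the LLM draft a hypothetical answer first and then embeds that draft instead of the raw query, so the search vector lives in "answer space". A self-contained sketch with a stubbed LLM and a toy bag-of-words embedder (both are stand-ins, not real models):

```python
import math

def embed(text):
    # Toy bag-of-words embedding over a tiny vocabulary;
    # a real system would use a sentence-embedding model.
    vocab = ["paris", "capital", "france", "gpu", "rag"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def fake_llm(prompt):
    # Stand-in for a real LLM call that drafts a hypothetical answer
    return "Paris is the capital of France"

def hyde_retrieve(query, corpus):
    # Embed the hypothetical *answer*, not the query itself
    hypothetical = fake_llm(f"Write a short passage answering: {query}")
    qvec = embed(hypothetical)
    return max(corpus, key=lambda doc: cosine(embed(doc), qvec))

corpus = ["France and its capital Paris", "Renting a GPU for RAG workloads"]
best = hyde_retrieve("What is the capital of France?", corpus)
```

The hypothetical passage shares vocabulary with relevant documents even when the query itself does not, which is the whole trick.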

***

## Performance Benchmarks

### Retrieval Accuracy (RAG-Bench, 2024)

{% hint style="info" %}
Benchmarks vary significantly by dataset and configuration. These are approximate figures from community benchmarks.
{% endhint %}

| Framework         | HotpotQA (F1) | Natural Questions (EM) | TriviaQA (Acc) |
| ----------------- | ------------- | ---------------------- | -------------- |
| LangChain (RAG)   | \~68%         | \~42%                  | \~72%          |
| LlamaIndex (RAG)  | \~71%         | \~45%                  | \~74%          |
| Haystack (RAG)    | \~69%         | \~43%                  | \~71%          |
| RAGFlow (default) | \~65%         | \~40%                  | \~68%          |

*Results depend heavily on chosen LLM, embedding model, and chunk size*
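Chunk size and overlap are the biggest levers behind those numbers. A minimal fixed-size chunker with overlap shows the mechanic that splitters like `RecursiveCharacterTextSplitter` build on (minus the separator-aware recursion):

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size chunks with overlapping windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # each chunk starts `step` chars after the last
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# A 2500-char document with the defaults yields 4 chunks
chunks = chunk_text("x" * 2500, chunk_size=1000, overlap=200)
```

The overlap repeats the tail of each chunk at the head of the next so that facts straddling a boundary still land intact in at least one chunk.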

### Indexing Speed (10K documents, \~1KB each)

| Framework  | CPU Only  | GPU Embeddings |
| ---------- | --------- | -------------- |
| LangChain  | \~120 sec | \~18 sec       |
| LlamaIndex | \~110 sec | \~15 sec       |
| Haystack   | \~130 sec | \~20 sec       |
| RAGFlow    | \~150 sec | \~25 sec       |

*With text-embedding-ada-002 equivalent (1536 dims)*
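The 1536-dim figure also sets a floor on vector index memory: a flat float32 index needs roughly vectors × dims × 4 bytes. A back-of-the-envelope helper (ignores HNSW/IVF graph overhead and quantization):

```python
def flat_index_bytes(num_vectors, dims, bytes_per_float=4):
    # Raw storage for a flat (non-quantized) float32 vector index;
    # real indexes (HNSW, IVF) add graph/centroid overhead on top.
    return num_vectors * dims * bytes_per_float

# 10K chunks at 1536 dims, as in the benchmark above
mb = flat_index_bytes(10_000, 1536) / (1024 ** 2)
print(f"{mb:.0f} MB")  # → 59 MB
```

At this scale the vectors themselves are tiny; the RAM pressure in the requirements table further down comes from Elasticsearch and local LLMs, not the index.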

### Query Latency (P50/P99, with pre-built index)

| Framework  | P50   | P99  | Notes                    |
| ---------- | ----- | ---- | ------------------------ |
| LangChain  | 450ms | 1.2s | No re-ranking            |
| LlamaIndex | 400ms | 1.0s | No re-ranking            |
| Haystack   | 500ms | 1.5s | With pipeline overhead   |
| RAGFlow    | 600ms | 2.0s | Includes UI/API overhead |

***

## LangChain: Deep Dive

### Strengths

✅ **Largest ecosystem** — hundreds of integrations, massive community\
✅ **Agents and tools** — build autonomous AI agents\
✅ **LangSmith** — excellent observability and debugging\
✅ **LCEL** — LangChain Expression Language for composing chains\
✅ **Memory systems** — conversation history, entity memory
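The LCEL bullet above boils down to composing steps with the `|` operator. A toy re-implementation of that pattern in plain Python (not LangChain's actual classes) illustrates the idea:

```python
class Runnable:
    """Minimal stand-in for LCEL-style composition via the | operator."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its output to b
        return Runnable(lambda x: other.invoke(self.invoke(x)))

prompt = Runnable(lambda q: f"Answer briefly: {q}")
fake_llm = Runnable(lambda p: p.upper())  # stand-in for a real model call
parser = Runnable(lambda t: t.strip())

chain = prompt | fake_llm | parser
result = chain.invoke("why RAG?")
```

Real LCEL runnables add batching, streaming, and async on top, but the composition model is exactly this pipe.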

### Weaknesses

❌ **Complexity** — can be over-engineered for simple tasks\
❌ **Frequent breaking changes** — v0.1 vs v0.2 vs v0.3 migrations\
❌ **Heavy dependency** — large install size\
❌ **Abstraction leakage** — sometimes harder to debug

### Best Use Cases

* Multi-step LLM pipelines with complex logic
* AI agents that use tools (web search, code execution, APIs)
* Applications needing conversation memory
* Projects needing maximum flexibility

### Example: Advanced RAG with Sources

```python
from langchain.chains import RetrievalQAWithSourcesChain
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Setup
llm = ChatOpenAI(model="gpt-4", temperature=0)
embeddings = OpenAIEmbeddings()

# Index documents with metadata (assumes `documents` is already loaded)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)

vectorstore = Chroma.from_documents(
    chunks, 
    embeddings,
    persist_directory="./chroma_db"
)

# Build chain with source attribution
chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 5}),
    return_source_documents=True
)

result = chain.invoke({"question": "What are the main risks?"})
print(result["answer"])
print("Sources:", result["sources"])
```

***

## LlamaIndex: Deep Dive

### Strengths

✅ **Document-first design** — best for complex document indexing\
✅ **Index types** — Vector, Knowledge Graph, SQL, Keyword\
✅ **Sub-question engine** — automatically decomposes complex queries\
✅ **Structured outputs** — Pydantic integration\
✅ **Router query engine** — intelligently routes to right index

### Weaknesses

❌ **Less agent-focused** than LangChain\
❌ **Smaller ecosystem** than LangChain\
❌ **Documentation** can be inconsistent

### Best Use Cases

* Document Q\&A systems (PDFs, reports, wikis)
* Complex multi-document reasoning
* Knowledge graph construction
* Data-to-LLM bridges (databases, APIs)

### Example: Multi-Document Query Engine

```python
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings
)
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.tools import QueryEngineTool
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure globally
Settings.llm = OpenAI(model="gpt-4")
Settings.embed_model = OpenAIEmbedding()

# Create separate indices for different document sets
annual_reports = SimpleDirectoryReader("./annual_reports").load_data()
tech_docs = SimpleDirectoryReader("./tech_docs").load_data()

index_reports = VectorStoreIndex.from_documents(annual_reports)
index_tech = VectorStoreIndex.from_documents(tech_docs)

# Build router that selects correct index
tools = [
    QueryEngineTool.from_defaults(
        query_engine=index_reports.as_query_engine(),
        description="Annual financial reports and business metrics"
    ),
    QueryEngineTool.from_defaults(
        query_engine=index_tech.as_query_engine(),
        description="Technical documentation and API references"
    )
]

router = RouterQueryEngine.from_defaults(query_engine_tools=tools)
response = router.query("What was the revenue growth last year?")
```

***

## Haystack: Deep Dive

### Strengths

✅ **Enterprise-grade** — production reliability\
✅ **Visual pipeline builder** — Haystack Studio\
✅ **Annotation tool** — built-in labeling UI\
✅ **Strong NLP** — extractive QA, summarization\
✅ **deepset Cloud** — managed deployment option

### Weaknesses

❌ **Steeper learning curve** than competitors\
❌ **Smaller community** than LangChain/LlamaIndex\
❌ **Less flexible** for novel architectures

### Best Use Cases

* Enterprise document search and Q\&A
* Projects needing audit trails and observability
* Teams wanting visual pipeline design
* Production deployments with SLA requirements

### Example: Hybrid Search Pipeline

```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.components.joiners import DocumentJoiner
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.retrievers.in_memory import (
    InMemoryBM25Retriever,
    InMemoryEmbeddingRetriever,
)

# Assumes `store` is an already-populated InMemoryDocumentStore
template = """Answer the question using the context below.
Context: {{ documents | map(attribute='content') | join('\n') }}
Question: {{ query }}
Answer:"""

# Build hybrid search pipeline
pipeline = Pipeline()
pipeline.add_component("bm25_retriever", InMemoryBM25Retriever(document_store=store, top_k=10))
pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
pipeline.add_component("embedding_retriever", InMemoryEmbeddingRetriever(document_store=store, top_k=10))
pipeline.add_component("joiner", DocumentJoiner(join_mode="reciprocal_rank_fusion"))
pipeline.add_component("ranker", TransformersSimilarityRanker(top_k=5))
pipeline.add_component("prompt_builder", PromptBuilder(template=template))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4"))

# Connect components
pipeline.connect("text_embedder.embedding", "embedding_retriever.query_embedding")
pipeline.connect("bm25_retriever", "joiner")
pipeline.connect("embedding_retriever", "joiner")
pipeline.connect("joiner", "ranker")
pipeline.connect("ranker.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder", "llm")

query = "What is deep learning?"
result = pipeline.run({
    "bm25_retriever": {"query": query},
    "text_embedder": {"text": query},
    "ranker": {"query": query},
    "prompt_builder": {"query": query},
})
print(result["llm"]["replies"][0])
```

***

## RAGFlow: Deep Dive

### Strengths

✅ **Zero-code deployment** — full UI included\
✅ **Advanced document parsing** — tables, images, charts\
✅ **Knowledge base management** — visual interface\
✅ **API included** — REST API out of the box\
✅ **Agentic RAG** — built-in agents

### Weaknesses

❌ **Less customizable** than code-first frameworks\
❌ **Heavy resource requirements** (Elasticsearch + Infinity)\
❌ **Limited LLM support** vs LangChain\
❌ **Newer project** — smaller community

### Best Use Cases

* Non-developers needing RAG without coding
* Teams wanting a complete knowledge base product
* Internal enterprise wikis and documentation search
* Rapid prototyping of RAG applications

### Deployment on Clore.ai

```yaml
# docker-compose.yml for RAGFlow
version: "3"
services:
  ragflow:
    image: infiniflow/ragflow:v0.12.0
    container_name: ragflow
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./ragflow-logs:/ragflow/logs
      - ./ragflow-data:/ragflow/data
    depends_on:
      - elasticsearch
      - infinity

  elasticsearch:
    image: elasticsearch:8.11.3
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
      - xpack.security.enabled=false
    volumes:
      - es_data:/usr/share/elasticsearch/data

  infinity:
    image: infiniflow/infinity:v0.3.0
    volumes:
      - infinity_data:/var/infinity

volumes:
  es_data:
  infinity_data:
```

```bash
docker compose up -d
# Access web UI at http://<server-ip>:80
```

***

## When to Use Which

### Choose LangChain if:

* Building AI agents with tools (web search, code execution, APIs)
* Need maximum ecosystem flexibility
* Building complex multi-step pipelines
* Integrating with many different LLMs and data sources
* Team is comfortable with Python

### Choose LlamaIndex if:

* Primary use case is document Q\&A
* Working with complex document structures (tables, nested content)
* Need knowledge graphs or multi-index routing
* Want best-in-class document ingestion
* Building over structured data (databases, APIs)

### Choose Haystack if:

* Enterprise environment with compliance requirements
* Need visual pipeline building tools
* Building on top of Elasticsearch
* Want extractive (not just generative) QA
* Team needs NLP pipeline observability

### Choose RAGFlow if:

* Non-technical team needs self-service RAG
* Want a complete product, not a framework
* Rapid deployment is priority over customization
* Building an internal knowledge base
* Don't want to write Python code

***

## Running on Clore.ai: Resource Requirements

| Framework  | Min RAM     | Min VRAM        | Recommended GPU |
| ---------- | ----------- | --------------- | --------------- |
| LangChain  | 8GB         | 8GB (local LLM) | RTX 3080        |
| LlamaIndex | 8GB         | 8GB (local LLM) | RTX 3080        |
| Haystack   | 16GB        | 8GB (local LLM) | RTX 3090        |
| RAGFlow    | 32GB        | 16GB            | A6000 / A100    |

{% hint style="warning" %}
**RAGFlow needs more RAM**: it runs Elasticsearch, the Infinity vector database, and the app itself. Plan for at least 32GB of system RAM. Haystack with Elasticsearch also benefits from 16GB+ RAM.
{% endhint %}

***

## Useful Links

* [LangChain Docs](https://python.langchain.com)
* [LlamaIndex Docs](https://docs.llamaindex.ai)
* [Haystack Docs](https://docs.haystack.deepset.ai)
* [RAGFlow GitHub](https://github.com/infiniflow/ragflow)
* [RAG Survey Paper (arxiv)](https://arxiv.org/abs/2312.10997)

***

## Summary Recommendation

```
Simple document Q&A          → LlamaIndex
Complex AI agents            → LangChain
Enterprise search            → Haystack
No-code RAG product          → RAGFlow
Maximum flexibility          → LangChain
Best document understanding  → LlamaIndex
```

All four frameworks are excellent choices — the right one depends on your specific requirements, team skills, and deployment constraints. When in doubt, start with **LlamaIndex** for document-heavy use cases or **LangChain** if you need the broadest possible ecosystem.

***

## Clore.ai GPU Recommendations

| Use Case            | Recommended GPU | Est. Cost on Clore.ai |
| ------------------- | --------------- | --------------------- |
| Development/Testing | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| Production          | RTX 4090 (24GB) | \~$0.70/gpu/hr        |
| Large Scale         | A100 80GB       | \~$1.20/gpu/hr        |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers. Browse available GPUs and rent by the hour — no commitments, full root access.
