# CrewAI Multi-Agent Framework

## Overview

[CrewAI](https://github.com/crewAIInc/crewAI) is a cutting-edge framework for orchestrating **role-playing autonomous AI agents**, with **44K+ GitHub stars**. Unlike single-agent systems, CrewAI lets you define specialized agents (Researcher, Writer, Coder, Analyst...) that collaborate as a "crew" to complete complex tasks — each agent with its own role, goal, backstory, and toolkit.

On **Clore.ai**, CrewAI can be deployed in a Dockerized environment for as little as **$0.05–0.20/hr**. While CrewAI itself is CPU-bound (it orchestrates API calls), combining it with a local Ollama or vLLM server on the same GPU node gives you a fully private, offline-capable multi-agent system.

**Key capabilities:**

* 👥 **Multi-agent crews** — define agent personas with roles, goals, and backstories
* 🎯 **Task delegation** — manager agent automatically assigns tasks to the right specialist
* 🛠️ **Tool ecosystem** — web search, file I/O, code execution, database access, custom tools
* 🔁 **Sequential & Parallel** — execute tasks in order or run independent tasks simultaneously
* 🧠 **Agent memory** — short-term, long-term, entity, and contextual memory types
* 🔌 **LLM-agnostic** — works with OpenAI, Anthropic, Google, Ollama, Groq, Azure, and more
* 📊 **CrewAI Studio** — visual interface for building crews without code (enterprise)
* 🚀 **Pipelines** — chain multiple crews for complex multi-stage workflows

***

## Requirements

CrewAI is a Python library. It runs on CPU and requires only a system Python 3.10+ environment or Docker. GPU is optional but unlocks powerful local model inference.

| Configuration                | GPU             | VRAM  | System RAM | Disk   | Clore.ai Price   |
| ---------------------------- | --------------- | ----- | ---------- | ------ | ---------------- |
| **Minimal** (cloud APIs)     | None / CPU      | —     | 2 GB       | 10 GB  | \~$0.03/hr (CPU) |
| **Standard**                 | None / CPU      | —     | 4 GB       | 20 GB  | \~$0.05/hr       |
| **+ Local LLM (small)**      | RTX 3080        | 10 GB | 8 GB       | 40 GB  | \~$0.15/hr       |
| **+ Local LLM (large)**      | RTX 3090 / 4090 | 24 GB | 16 GB      | 60 GB  | $0.20–0.35/hr    |
| **+ High-quality local LLM** | A100 40 GB      | 40 GB | 32 GB      | 100 GB | \~$0.80/hr       |

### API Keys

CrewAI works with most major LLM providers. You need at least one:

* **OpenAI** — GPT-4o (best reasoning for complex tasks)
* **Anthropic** — Claude 3.5 Sonnet (excellent for writing-heavy crews)
* **Groq** — Free tier, fast inference (Llama 3 70B)
* **Ollama** — Fully local, no API key needed (see [GPU Acceleration](#gpu-acceleration))

***

## Quick Start

### 1. Rent a Clore.ai server

Log in to [clore.ai](https://clore.ai):

* **CPU-only** if using cloud LLM APIs
* **RTX 3090/4090** for local Ollama inference
* SSH access enabled
* No special port requirements for CLI usage (expose ports only for web UIs)

### 2. Connect and prepare

```bash
ssh root@<clore-server-ip> -p <ssh-port>

# Update system
apt-get update && apt-get upgrade -y

# Install Python 3.11 (if not present)
apt-get install -y python3.11 python3.11-venv python3-pip
python3.11 --version   # Confirm 3.11 is available (CrewAI needs 3.10+)

# Verify Docker
docker --version
```

### 3. Option A — Direct pip install (fastest)

```bash
# Create a virtual environment
python3.11 -m venv crewai-env
source crewai-env/bin/activate

# Install CrewAI with all tools
pip install crewai crewai-tools

# Verify installation
python -c "import crewai; print(crewai.__version__)"
crewai --version
```

### 4. Option B — Docker container (recommended for reproducibility)

```bash
# Create project directory
mkdir my-crew && cd my-crew

# Write Dockerfile
cat > Dockerfile << 'EOF'
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    git \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Install CrewAI and common tools
RUN pip install --no-cache-dir \
    crewai \
    crewai-tools \
    langchain-openai \
    langchain-anthropic \
    python-dotenv

# Copy application code
COPY . .

# Default command
CMD ["python", "main.py"]
EOF

# Build the image
docker build -t my-crewai-app .
```

### 5. Create your first crew

```bash
# Initialize a new CrewAI project using the CLI
source crewai-env/bin/activate
crewai create crew my-research-crew
cd my-research-crew

# Configure API keys
cat > .env << 'EOF'
OPENAI_API_KEY=sk-...
# Or for Anthropic:
# ANTHROPIC_API_KEY=sk-ant-...
# Or for Ollama (no key needed):
# OPENAI_API_BASE=http://localhost:11434/v1
# OPENAI_API_KEY=ollama
EOF

# Install project dependencies
crewai install

# Run the crew
crewai run
```

***

## Configuration

### Project structure (from `crewai create`)

```
my-research-crew/
├── .env                    # API keys and settings
├── pyproject.toml          # Dependencies
├── src/
│   └── my_research_crew/
│       ├── config/
│       │   ├── agents.yaml   # Agent definitions
│       │   └── tasks.yaml    # Task definitions
│       ├── tools/
│       │   └── custom_tool.py  # Your custom tools
│       ├── crew.py           # Crew assembly
│       └── main.py           # Entry point
```
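
The generated `main.py` wires runtime inputs into the crew: values in the `inputs` dict replace `{topic}`-style placeholders in `agents.yaml` and `tasks.yaml`. A trimmed sketch of the entry point (exact scaffold contents vary by CrewAI version):

```python
#!/usr/bin/env python

def run():
    # Imported lazily here so the sketch stays importable on its own;
    # the real scaffold imports at module level
    from my_research_crew.crew import MyResearchCrew

    # Values here fill the {topic} placeholders in the YAML configs
    MyResearchCrew().crew().kickoff(inputs={"topic": "AI agents"})

if __name__ == "__main__":
    run()
```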

### agents.yaml — Define your agents

```yaml
researcher:
  role: >
    Senior Research Analyst
  goal: >
    Uncover cutting-edge developments in {topic} and synthesize actionable insights
  backstory: >
    You are an expert researcher with a talent for finding and connecting information
    from diverse sources. You have a keen eye for credibility and relevance.
  tools:
    - SerperDevTool
    - ScrapeWebsiteTool
  llm: gpt-4o                # Can be set per-agent
  verbose: true
  max_iter: 15               # Max reasoning iterations
  memory: true               # Enable agent memory

writer:
  role: >
    Expert Technical Writer
  goal: >
    Transform complex research into clear, compelling, and accurate content
  backstory: >
    You craft explanations that make complex topics accessible without sacrificing depth.
    You cite sources and structure content for maximum readability.
  llm: claude-3-5-sonnet-20241022
  verbose: true
```

### tasks.yaml — Define tasks

```yaml
research_task:
  description: >
    Research the latest developments in {topic} from the past 6 months.
    Identify key trends, breakthroughs, and implications.
    Compile at least 10 credible sources.
  expected_output: >
    A detailed research report with findings organized by theme,
    including source URLs and publication dates.
  agent: researcher

writing_task:
  description: >
    Using the research report, write a comprehensive blog post about {topic}.
    Target length: 1500-2000 words. Include a title, introduction, main sections,
    and a conclusion with future outlook.
  expected_output: >
    A publication-ready blog post in Markdown format.
  agent: writer
  context:
    - research_task    # Writing task receives researcher's output
  output_file: output/blog_post.md
```

### crew\.py — Assemble the crew

```python
from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task
from crewai_tools import SerperDevTool, ScrapeWebsiteTool

@CrewBase
class MyResearchCrew():
    """Research and writing crew"""
    agents_config = 'config/agents.yaml'
    tasks_config = 'config/tasks.yaml'

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['researcher'],
            tools=[SerperDevTool(), ScrapeWebsiteTool()],
            verbose=True
        )

    @agent
    def writer(self) -> Agent:
        return Agent(
            config=self.agents_config['writer'],
            verbose=True
        )

    @task
    def research_task(self) -> Task:
        return Task(config=self.tasks_config['research_task'])

    @task
    def writing_task(self) -> Task:
        return Task(
            config=self.tasks_config['writing_task'],
            output_file='output/blog_post.md'
        )

    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,  # or Process.hierarchical
            verbose=True,
            memory=True,
            max_rpm=10   # Rate limit API calls
        )
```

### Running with Docker Compose (with Ollama)

```yaml
# docker-compose.yml

services:
  crewai:
    build: .
    volumes:
      - ./src:/app/src
      - ./output:/app/output
    environment:
      - OPENAI_API_BASE=http://ollama:11434/v1
      - OPENAI_API_KEY=ollama
      - OPENAI_MODEL_NAME=llama3.1:70b
      - SERPER_API_KEY=${SERPER_API_KEY}
    depends_on:
      - ollama

  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama_data:
```

```bash
# Start the stack
docker compose up -d ollama

# Pull your model
docker compose exec ollama ollama pull llama3.1:70b

# Run your crew
docker compose run --rm crewai python src/my_research_crew/main.py
```

***

## GPU Acceleration

CrewAI itself doesn't use the GPU — but the LLM it calls does. Run Ollama or vLLM on the same Clore server for GPU-accelerated local inference.

### Ollama setup (recommended for ease)

```bash
# Install Ollama on the host
curl -fsSL https://ollama.com/install.sh | sh

# Verify GPU detection
nvidia-smi
ollama run llama3.1:8b "Test"   # Should use GPU

# Pull models for different agent needs
ollama pull llama3.1:8b          # Fast, lightweight reasoning
ollama pull llama3.1:70b         # Complex reasoning (needs A100+)
ollama pull codestral:latest     # Specialized for code agents
ollama pull nomic-embed-text     # For memory/RAG embeddings

# Configure CrewAI to use Ollama
export OPENAI_API_BASE=http://localhost:11434/v1
export OPENAI_API_KEY=ollama
export OPENAI_MODEL_NAME=llama3.1:8b
```

### Configure CrewAI LLM per agent

```python
from crewai import Agent, LLM

# Use Ollama for a specific agent
local_llm = LLM(
    model="ollama/llama3.1:8b",
    base_url="http://localhost:11434"
)

# Use cloud API for another agent (e.g., complex reasoning)
cloud_llm = LLM(
    model="gpt-4o",
    api_key="sk-..."
)

researcher = Agent(
    role="Researcher",
    goal="Find information",
    backstory="Expert researcher",
    llm=local_llm    # Uses local GPU
)

strategist = Agent(
    role="Strategist",
    goal="Plan actions",
    backstory="Strategic thinker",
    llm=cloud_llm    # Uses cloud API
)
```

### Model recommendations for agent tasks

| Task Type             | Recommended Model | VRAM   | Notes                |
| --------------------- | ----------------- | ------ | -------------------- |
| Research + web search | Llama 3.1 70B     | 40 GB  | Best local reasoning |
| Code generation       | Codestral 22B     | 13 GB  | Code-specialized     |
| Writing               | Llama 3.1 8B      | 6 GB   | Fast, good quality   |
| Complex orchestration | GPT-4o (API)      | —      | Best overall         |
| Embeddings/memory     | nomic-embed-text  | < 1 GB | Required for memory  |

> See [Ollama on Clore.ai](https://docs.clore.ai/guides/language-models/ollama) and [vLLM on Clore.ai](https://docs.clore.ai/guides/language-models/vllm) for full inference setup guides.

***

## Tips & Best Practices

### Cost optimization

```python
# Track LLM API costs with CrewAI's built-in usage tracking
result = crew.kickoff(inputs={"topic": "AI agents"})
print(f"Total tokens used: {result.token_usage}")

# Use cheaper models for simple tasks, premium models for complex ones:
# Llama 3.1 8B on Ollama costs $0 in API fees; GPT-4o bills per token
# (check current provider pricing)

# Set limits to prevent runaway agents
Agent(max_iter=10, max_execution_time=120)  # 2-minute timeout

# Tool-result caching is enabled by default and avoids repeated identical calls
Agent(cache=True)
```

### Running crews as a persistent service

```bash
# Serve crews via a FastAPI endpoint
pip install fastapi uvicorn

cat > server.py << 'EOF'
from fastapi import FastAPI, BackgroundTasks
from my_research_crew.crew import MyResearchCrew

app = FastAPI()
results = {}

def execute_crew(job_id: str, topic: str):
    # Plain (sync) function: FastAPI runs it in a worker thread,
    # so the blocking kickoff() doesn't stall the event loop
    crew = MyResearchCrew().crew()
    result = crew.kickoff(inputs={"topic": topic})
    results[job_id] = str(result)

@app.post("/run/{topic}")
async def run_crew(topic: str, background_tasks: BackgroundTasks):
    job_id = f"job-{topic}-{len(results)}"
    background_tasks.add_task(execute_crew, job_id, topic)
    return {"job_id": job_id, "status": "started"}

@app.get("/result/{job_id}")
async def get_result(job_id: str):
    return {"result": results.get(job_id, "pending")}
EOF

# Run the server
uvicorn server:app --host 0.0.0.0 --port 8080 &

# Trigger a crew run
curl -X POST http://<clore-ip>:8080/run/quantum-computing
```

### Useful built-in CrewAI tools

```python
from crewai_tools import (
    SerperDevTool,        # Google search via Serper API
    ScrapeWebsiteTool,    # Web scraping
    FileReadTool,         # Read local files
    FileWriterTool,       # Write files
    DirectoryReadTool,    # List directory contents
    CodeInterpreterTool,  # Execute Python code
    GithubSearchTool,     # Search GitHub repos
    YoutubeVideoSearchTool, # Search YouTube
    PGSearchTool,         # Query PostgreSQL
)
```

### Implementing human-in-the-loop

```python
# Ask for human input at a specific task
task = Task(
    description="Research {topic}",
    expected_output="Research report",
    agent=researcher,
    human_input=True    # Pauses and prompts user for feedback
)
```

***

## Troubleshooting

### "openai.AuthenticationError" even with valid key

```bash
# Check your API key is loaded
python3 -c "import os; from dotenv import load_dotenv; load_dotenv(); print((os.getenv('OPENAI_API_KEY') or 'NOT SET')[:10])"

# If using Ollama, ensure these are set:
export OPENAI_API_BASE=http://localhost:11434/v1
export OPENAI_API_KEY=ollama
export OPENAI_MODEL_NAME=llama3.1:8b

# Verify Ollama is accessible
curl http://localhost:11434/v1/models
```

### Agent stuck in reasoning loop

```python
# Set a hard limit on iterations
Agent(
    max_iter=10,            # Max reasoning steps
    max_execution_time=300  # 5-minute hard timeout
)

# Enable verbose to see where it's looping
Agent(verbose=True)

# Check if the model supports tool calling
# Some smaller models don't; use llama3.1 or mistral-nemo
```

### CrewAI tools fail (SerperDevTool 403)

```bash
# SerperDevTool requires a free API key from serper.dev
export SERPER_API_KEY=your-serper-key
```

```python
# Alternative: use DuckDuckGo (no API key needed)
from langchain_community.tools import DuckDuckGoSearchRun
from crewai.tools import tool

@tool("DuckDuckGo Search")
def ddg_search(query: str) -> str:
    """Search the web using DuckDuckGo."""
    return DuckDuckGoSearchRun().run(query)
```

### Memory errors (ChromaDB / embeddings)

```bash
# CrewAI memory uses ChromaDB for vector storage
# If it fails, check disk space and permissions
df -h
ls -la ~/.local/share/crewai/

# Clear corrupted memory
rm -rf ~/.local/share/crewai/

# Or disable memory if not needed (set in crew.py):
#   Crew(memory=False)

# If using Ollama embeddings, ensure the model is pulled
docker compose exec ollama ollama pull nomic-embed-text
```

### Docker build fails on ARM/x86 mismatch

```bash
# Clore.ai servers are x86_64; specify platform explicitly:
docker build --platform linux/amd64 -t my-crewai-app .

# Or in docker-compose.yml:
services:
  crewai:
    platform: linux/amd64
    build: .
```

### Rate limiting from LLM APIs

```python
# Reduce requests per minute
Crew(max_rpm=5)   # 5 requests per minute

# Add retry logic in LLM config
LLM(
    model="gpt-4o",
    max_retries=3,
    timeout=60
)
```

***

## Further Reading

* [CrewAI Official Documentation](https://docs.crewai.com)
* [CrewAI GitHub Repository](https://github.com/crewAIInc/crewAI)
* [CrewAI Tools Documentation](https://docs.crewai.com/concepts/tools)
* [CrewAI Examples Repository](https://github.com/crewAIInc/crewAI-examples)
* [Running Ollama on Clore.ai](https://docs.clore.ai/guides/language-models/ollama)
* [Running vLLM on Clore.ai](https://docs.clore.ai/guides/language-models/vllm)
* [Clore.ai GPU Comparison](https://docs.clore.ai/guides/getting-started/gpu-comparison)
* [CrewAI Community Discord](https://discord.com/invite/X4JWnZnxPb)
