> For the complete documentation index, see [llms.txt](https://docs.clore.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.clore.ai/guides/guides_v2-ru/prodvinutye/api-integration.md).

# Интеграция API

> 💡 **Рекомендуется:** Используйте официальный [clore-ai Python SDK](/guides/guides_v2-ru/prodvinutye/python-sdk.md) вместо прямых HTTP-запросов для управления серверами и заказами Clore.ai. Встроенное управление частотой запросов, повторные попытки, безопасность типов и поддержка async.

Интегрируйте AI-модели, запущенные на CLORE.AI, в ваши приложения.

{% hint style="success" %}
Разверните API-серверы на [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Быстрый старт

Большинство AI-сервисов на CLORE.AI предоставляют совместимые с OpenAI API. Замените базовый URL и вы готовы.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://<your-clore-server>:8000/v1",
    api_key="not-needed"  # Большинству self-host решений ключ не нужен
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

***

## API для LLM

### vLLM (совместим с OpenAI)

**Настройка сервера:**

```bash
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --host 0.0.0.0 --port 8000
```

**Клиент Python:**

```python
from openai import OpenAI

client = OpenAI(base_url="http://server:8000/v1", api_key="dummy")

# Chat completion
response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a poem about coding"}
    ],
    temperature=0.7,
    max_tokens=500
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

**Клиент Node.js:**

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
    baseURL: 'http://server:8000/v1',
    apiKey: 'dummy'
});

async function chat(message) {
    const response = await client.chat.completions.create({
        model: 'meta-llama/Llama-3.1-8B-Instruct',
        messages: [{ role: 'user', content: message }]
    });
    return response.choices[0].message.content;
}

// Streaming
async function streamChat(message) {
    const stream = await client.chat.completions.create({
        model: 'meta-llama/Llama-3.1-8B-Instruct',
        messages: [{ role: 'user', content: message }],
        stream: true
    });

    for await (const chunk of stream) {
        process.stdout.write(chunk.choices[0]?.delta?.content || '');
    }
}
```

**cURL:**

```bash
curl http://server:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
    }'
```

### Ollama API

**Python:**

```python
import requests

# Generate
response = requests.post('http://server:11434/api/generate', json={
    'model': 'llama3.2',
    'prompt': 'Why is the sky blue?',
    'stream': False
})
print(response.json()['response'])

# Chat
response = requests.post('http://server:11434/api/chat', json={
    'model': 'llama3.2',
    'messages': [
        {'role': 'user', 'content': 'Hello!'}
    ],
    'stream': False
})
print(response.json()['message']['content'])

# Streaming
response = requests.post('http://server:11434/api/chat', json={
    'model': 'llama3.2',
    'messages': [{'role': 'user', 'content': 'Tell me a story'}],
    'stream': True
}, stream=True)

for line in response.iter_lines():
    if line:
        data = json.loads(line)
        print(data['message']['content'], end='', flush=True)
```

**Ollama также поддерживает формат OpenAI:**

```python
from openai import OpenAI

client = OpenAI(base_url='http://server:11434/v1', api_key='ollama')
# Используйте тот же код, что и в примерах vLLM
```

### TGI API

**Python:**

```python
import requests

# Generate
response = requests.post('http://server:8080/generate', json={
    'inputs': 'What is machine learning?',
    'parameters': {
        'max_new_tokens': 200,
        'temperature': 0.7,
        'do_sample': True
    }
})
print(response.json()['generated_text'])

# Streaming
response = requests.post('http://server:8080/generate_stream', json={
    'inputs': 'Explain quantum computing',
    'parameters': {'max_new_tokens': 500}
}, stream=True)

for line in response.iter_lines():
    if line:
        data = json.loads(line.decode().replace('data:', ''))
        print(data.get('token', {}).get('text', ''), end='', flush=True)
```

***

## API для генерации изображений

### Stable Diffusion WebUI API

**Включить API:** Добавьте `--api` в команду запуска.

**Python:**

```python
import requests
import base64
from PIL import Image
from io import BytesIO

def txt2img(prompt, negative_prompt="", steps=20, width=512, height=512):
    response = requests.post('http://server:7860/sdapi/v1/txt2img', json={
        'prompt': prompt,
        'negative_prompt': negative_prompt,
        'steps': steps,
        'width': width,
        'height': height,
        'sampler_name': 'DPM++ 2M Karras',
        'cfg_scale': 7
    })

    # Декодировать изображение из base64
    image_data = base64.b64decode(response.json()['images'][0])
    return Image.open(BytesIO(image_data))

# Generate
image = txt2img(
    prompt="A beautiful sunset over mountains, photorealistic, 8k",
    negative_prompt="blurry, low quality"
)
image.save("output.png")

# img2img
def img2img(prompt, image_path, denoising=0.5):
    with open(image_path, 'rb') as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = requests.post('http://server:7860/sdapi/v1/img2img', json={
        'prompt': prompt,
        'init_images': [image_b64],
        'denoising_strength': denoising,
        'steps': 30
    })

    image_data = base64.b64decode(response.json()['images'][0])
    return Image.open(BytesIO(image_data))
```

**Node.js:**

```javascript
const axios = require('axios');
const fs = require('fs');

async function txt2img(prompt) {
    const response = await axios.post('http://server:7860/sdapi/v1/txt2img', {
        prompt: prompt,
        steps: 20,
        width: 512,
        height: 512
    });

    const imageBuffer = Buffer.from(response.data.images[0], 'base64');
    fs.writeFileSync('output.png', imageBuffer);
}
```

### ComfyUI API

**Python:**

```python
import json
import urllib.request
import urllib.parse
import websocket
import uuid

SERVER = "server:8188"

def queue_prompt(workflow):
    """Добавить workflow в очередь на выполнение"""
    data = json.dumps({"prompt": workflow}).encode('utf-8')
    req = urllib.request.Request(f"http://{SERVER}/prompt", data=data)
    return json.loads(urllib.request.urlopen(req).read())

def get_image(filename, subfolder, folder_type):
    """Загрузить сгенерированное изображение"""
    params = urllib.parse.urlencode({
        "filename": filename,
        "subfolder": subfolder,
        "type": folder_type
    })
    with urllib.request.urlopen(f"http://{SERVER}/view?{params}") as response:
        return response.read()

# Загрузить workflow из файла
with open('workflow.json') as f:
    workflow = json.load(f)

# Изменить prompt
workflow["6"]["inputs"]["text"] = "A cat wearing a hat"

# Добавить в очередь и получить результат
result = queue_prompt(workflow)
print(f"Queued: {result}")
```

**WebSocket для прогресса:**

```python
import websocket
import json

def on_message(ws, message):
    data = json.loads(message)
    if data['type'] == 'progress':
        print(f"Progress: {data['data']['value']}/{data['data']['max']}")
    elif data['type'] == 'executed':
        print("Генерация завершена!")

ws = websocket.WebSocketApp(
    f"ws://{SERVER}/ws",
    on_message=on_message
)
ws.run_forever()
```

### FLUX с Diffusers

```python
import torch
from diffusers import FluxPipeline
import base64
from io import BytesIO

# Загрузить модель (один раз)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16
)
pipe.to("cuda")

def generate_image(prompt, height=1024, width=1024):
    image = pipe(
        prompt,
        height=height,
        width=width,
        num_inference_steps=4,
        guidance_scale=0.0
    ).images[0]
    return image

# Простейшая обертка API на Flask
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/generate', methods=['POST'])
def generate():
    data = request.json
    image = generate_image(data['prompt'])

    # Конвертировать в base64
    buffer = BytesIO()
    image.save(buffer, format='PNG')
    img_b64 = base64.b64encode(buffer.getvalue()).decode()

    return jsonify({'image': img_b64})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

***

## Аудио API

### Транскрипция Whisper

**С использованием whisper-asr-webservice:**

```python
import requests

def transcribe(audio_path):
    with open(audio_path, 'rb') as f:
        response = requests.post(
            'http://server:9000/asr',
            files={'audio_file': f},
            data={
                'task': 'transcribe',
                'language': 'en',
                'output': 'json'
            }
        )
    return response.json()['text']

text = transcribe('audio.mp3')
print(text)
```

**Прямой Whisper API:**

```python
import whisper
from flask import Flask, request, jsonify

model = whisper.load_model("large-v3")

app = Flask(__name__)

@app.route('/transcribe', methods=['POST'])
def transcribe():
    audio = request.files['audio']
    audio.save('/tmp/audio.mp3')

    result = model.transcribe('/tmp/audio.mp3')
    return jsonify({'text': result['text']})
```

### Text-to-Speech (Bark)

```python
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
import base64
from flask import Flask, request, jsonify

preload_models()

app = Flask(__name__)

@app.route('/tts', methods=['POST'])
def text_to_speech():
    text = request.json['text']
    audio = generate_audio(text)

    # Сохранить в файл
    write_wav('/tmp/output.wav', SAMPLE_RATE, audio)

    # Вернуть в base64
    with open('/tmp/output.wav', 'rb') as f:
        audio_b64 = base64.b64encode(f.read()).decode()

    return jsonify({'audio': audio_b64})
```

***

## Создание приложений

### Чат-приложение

```python
from flask import Flask, request, jsonify, Response
from openai import OpenAI
import json

app = Flask(__name__)
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

@app.route('/chat', methods=['POST'])
def chat():
    messages = request.json.get('messages', [])

    response = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=messages,
        temperature=0.7
    )

    return jsonify({
        'response': response.choices[0].message.content
    })

@app.route('/chat/stream', methods=['POST'])
def chat_stream():
    messages = request.json.get('messages', [])

    def generate():
        stream = client.chat.completions.create(
            model="meta-llama/Llama-3.1-8B-Instruct",
            messages=messages,
            stream=True
        )
        for chunk in stream:
            if chunk.choices[0].delta.content:
                yield f"data: {json.dumps({'content': chunk.choices[0].delta.content})}\n\n"
        yield "data: [DONE]\n\n"

    return Response(generate(), mimetype='text/event-stream')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

### Сервис генерации изображений

```python
from flask import Flask, request, jsonify, send_file
import requests
import base64
from io import BytesIO

app = Flask(__name__)
SD_API = "http://localhost:7860"

@app.route('/generate', methods=['POST'])
def generate():
    data = request.json

    response = requests.post(f'{SD_API}/sdapi/v1/txt2img', json={
        'prompt': data['prompt'],
        'negative_prompt': data.get('negative_prompt', ''),
        'steps': data.get('steps', 20),
        'width': data.get('width', 512),
        'height': data.get('height', 512)
    })

    image_b64 = response.json()['images'][0]

    if data.get('return_base64'):
        return jsonify({'image': image_b64})

    # Вернуть как файл
    image_data = base64.b64decode(image_b64)
    return send_file(BytesIO(image_data), mimetype='image/png')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

### Мультимодальный конвейер

```python
from flask import Flask, request, jsonify
from openai import OpenAI
import requests
import base64

app = Flask(__name__)
llm_client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
SD_API = "http://localhost:7860"

@app.route('/create-image-from-description', methods=['POST'])
def create_image():
    description = request.json['description']

    # Шаг 1: Сгенерировать детализированный prompt с помощью LLM
    prompt_response = llm_client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{
            "role": "user",
            "content": f"Create a detailed image generation prompt for: {description}. Include style, lighting, and composition details. Return only the prompt, no explanation."
        }]
    )
    detailed_prompt = prompt_response.choices[0].message.content

    # Шаг 2: Сгенерировать изображение
    image_response = requests.post(f'{SD_API}/sdapi/v1/txt2img', json={
        'prompt': detailed_prompt,
        'steps': 25,
        'width': 1024,
        'height': 1024
    })

    return jsonify({
        'prompt_used': detailed_prompt,
        'image': image_response.json()['images'][0]
    })

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

***

## Обработка ошибок

```python
from openai import OpenAI, APIError, APIConnectionError
import time

client = OpenAI(base_url="http://server:8000/v1", api_key="dummy")

def chat_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="meta-llama/Llama-3.1-8B-Instruct",
                messages=messages,
                timeout=60
            )
            return response.choices[0].message.content

        except APIConnectionError as e:
            print(f"Ошибка соединения (попытка {attempt + 1}): {e}")
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Экспоненциальная задержка
            else:
                raise

        except APIError as e:
            print(f"Ошибка API: {e}")
            raise

# Использование
try:
    result = chat_with_retry([{"role": "user", "content": "Hello"}])
    print(result)
except Exception as e:
    print(f"Не удалось после повторных попыток: {e}")
```

***

## Рекомендуемые практики

1. **Пул соединений** - Повторно используйте HTTP-соединения
2. **Асинхронные запросы** - Используйте aiohttp для параллельных вызовов
3. **Таймауты** - Всегда задавайте таймауты запросов
4. **Логика повторных попыток** - Обрабатывайте временные сбои
5. **Ограничение скорости** - Не перегружайте сервер
6. **Проверки здоровья** - Мониторьте доступность сервера

***

## Следующие шаги

* [Пакетная обработка](/guides/guides_v2-ru/prodvinutye/batch-processing.md) - Обрабатывать большие объёмы задач
* [Мульти-GPU настройка](/guides/guides_v2-ru/prodvinutye/multi-gpu-setup.md) - Масштабируйте ваше развертывание
* [Сравнение LLM](/guides/guides_v2-ru/sravneniya/llm-serving-comparison.md) - Выберите подходящий сервер


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://docs.clore.ai/guides/guides_v2-ru/prodvinutye/api-integration.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.