Model catalog

Open-source and free-tier models across eight modalities — all behind one API.

Text Generation

Chat and completion with open-source LLMs. OpenAI-compatible.

Featured

Gemini 2.5 Flash

Google's fast multimodal model with a large context window.

Text GenerationGoogle Gemini
1,000K context2 cr per 1K tokens
Featured

Llama 3.1 8B Instant

Fast, capable general-purpose LLM. Great default for most tasks.

Text GenerationGroq
131.072K context1 cr per 1K tokens
Featured

Llama 3.3 70B Versatile

High-quality reasoning and generation for complex tasks.

Text GenerationGroq
131.072K context4 cr per 1K tokens

GPT-OSS 20B

Open-weight model with strong instruction following.

Text GenerationGroq
131.072K context2 cr per 1K tokens

Qwen3 32B

Multilingual model with solid coding and math abilities.

Text GenerationGroq
131.072K context3 cr per 1K tokens

Image to Text

Vision understanding, captioning, and OCR from images.

Featured

Gemini 2.5 Flash (Vision)

Image understanding, captioning, and OCR.

Image to TextGoogle Gemini
 2 cr per 1K tokens

Llama 4 Scout (Vision)

Open multimodal model for visual question answering.

Image to TextGroq
 3 cr per 1K tokens

Text to Image

Generate images from prompts with FLUX and SDXL.

Featured

FLUX.1 [schnell]

Ultra-fast, high-quality text-to-image generation.

Text to ImageHugging Face
 20 cr per image

FLUX.1 [dev]

Highest-detail FLUX model for photorealistic images.

Text to ImageHugging Face
 40 cr per image

Stable Diffusion XL

Versatile open image model with broad style support.

Text to ImageHugging Face
 25 cr per image

Image to Image

Edit and transform images with a guiding prompt.

Featured

FLUX.1 Kontext [dev]

Prompt-guided image editing and transformation.

Image to ImageHugging Face
 40 cr per image

Text to Video

Create short video clips from text prompts.

Featured

LTX Video

Generate short video clips from a text prompt.

Text to VideoHugging Face
 200 cr per request

Text to Speech

Natural-sounding speech synthesis from text.

Featured

PlayAI TTS

Natural English speech synthesis.

Text to SpeechGroq
 5 cr per 1K characters

Kokoro 82M

Lightweight open-source TTS via Hugging Face.

Text to SpeechHugging Face
 4 cr per 1K characters

Speech to Text

Fast, multilingual transcription with Whisper.

Featured

Whisper Large v3 Turbo

Fast multilingual transcription (216x real-time).

Speech to TextGroq
 1 cr per second

Whisper Large v3

State-of-the-art accuracy for transcription & translation.

Speech to TextGroq
 2 cr per second

Embeddings

Vector embeddings for search and RAG (entity-to-entity).

Featured

BGE Base EN v1.5

Compact, high-quality English text embeddings.

EmbeddingsHugging Face
 1 cr per 1K tokens

Multilingual E5 Large

Multilingual embeddings for cross-language search.

EmbeddingsHugging Face
 1 cr per 1K tokens