Major LLM Providers

  • OpenAI
  • Anthropic
  • Google AI
  • Cohere

OpenAI

GPT-5, GPT-5 Mini, GPT-5 Nano, o3, DALL-E, Whisper

Industry leader in large language models. GPT-5 for coding & agents, o3 for complex reasoning.

Best for: Production AI apps, coding, agentic tasks, embeddings

Models (2025):
  • GPT-5 - State-of-the-art coding & agents (74.9% SWE-bench, 88% Aider)
  • GPT-5 Mini - Balanced performance & cost
  • GPT-5 Nano - Ultra-fast & affordable
  • GPT-5 Pro - Highest quality with scaled reasoning
  • o3 - Advanced reasoning model
  • DALL-E 3 - Image generation
  • Whisper - Speech-to-text
Context: 272K input tokens, 128K output tokens

Pricing (per 1M tokens):
  • GPT-5: $1.25 input / $10 output
  • GPT-5 Mini: $0.25 input / $2 output
  • GPT-5 Nano: $0.05 input / $0.40 output
  • Prompt caching: $0.125/1M tokens
Key Features:
  • 45% fewer factual errors with web search
  • 80% fewer hallucinations vs o3
  • 4 reasoning levels: minimal, low, medium, high
  • Parallel & sequential tool calling
  • Available to free users
Documentation | API
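As a rough illustration of the features above, here is a minimal sketch of a GPT-5 Chat Completions call in TypeScript. The model id and the reasoning_effort parameter follow the lists above, but verify exact parameter names against OpenAI's current API reference before relying on them.

// Minimal GPT-5 call sketch (verify parameter names against OpenAI's docs)
const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "gpt-5-mini",                // or "gpt-5" / "gpt-5-nano" per the list above
    reasoning_effort: "minimal",        // one of the 4 levels: minimal, low, medium, high
    messages: [{ role: "user", content: "Summarize this repo's README." }]
  })
})
const data = await response.json()
console.log(data.choices[0].message.content)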

Fast Inference Providers

Groq

Ultra-fast LLM inference with LPU

500+ tokens/sec, lowest latency in the industry. Perfect for real-time applications.

Speed: Fastest (LPU technology)
Models: Llama 3.1, Mixtral, Gemma
Best for: Real-time AI, chatbots, low-latency requirements
Pricing: $ (generous free tier)

Get Started
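Groq exposes an OpenAI-compatible Chat Completions endpoint, so switching usually means changing only the base URL, API key, and model name. A minimal sketch; the model id is illustrative, so check Groq's current model catalog.

// Groq sketch: OpenAI-compatible Chat Completions endpoint
const res = await fetch("https://api.groq.com/openai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.GROQ_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "llama-3.1-8b-instant",   // illustrative model id; see Groq's catalog
    messages: [{ role: "user", content: "Hello!" }]
  })
})
console.log((await res.json()).choices[0].message.content)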

Together AI

Fast inference + fine-tuning

Open-source models with custom fine-tuning capabilities.

Best for: Custom models, fine-tuning, open-source LLMs

Features:
  • Fine-tuning support
  • Open-source models
  • Competitive pricing
  • Fast inference
Pricing: $$ (mid-range)

Get Started

Fireworks AI

Production-grade inference

Fast inference for open models with function calling support.

Best for: Serverless AI, function calling, production scale

Features:
  • Serverless deployment
  • Function calling
  • Open-source models
  • Fast performance
Pricing: $$ (pay-per-use)

Get Started

Aggregators & Multi-Model Platforms

OpenRouter

Unified API for 300+ models

Access OpenAI, Anthropic, Google, Groq, Meta through one API with automatic fallbacks.

Best for: Multi-model apps, cost optimization, avoiding vendor lock-in

Features:
  • 300+ models
  • Automatic fallbacks
  • Cost optimization
  • Single API
Pricing: 5% markup on base costs

Get Started
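The automatic fallback behavior can be sketched roughly as below. The models array is how OpenRouter documents fallback routing at the time of writing, and the model slugs are illustrative, so treat the exact field names and ids as things to verify.

// OpenRouter sketch: one endpoint, a primary model plus fallbacks
const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "openai/gpt-5-mini",                          // primary choice (illustrative slug)
    models: ["anthropic/claude-sonnet-4.5",              // fallbacks tried in order
             "meta-llama/llama-3.1-70b-instruct"],       // (verify field name and slugs)
    messages: [{ role: "user", content: "Hello!" }]
  })
})
console.log((await res.json()).choices[0].message.content)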

Replicate

Run open-source models

Stable Diffusion, LLaMA, Whisper, and more. Pay-per-use serverless.

Best for: Open-source models, image/video generation, experimentation

Features:
  • Open-source models
  • Image generation
  • Video generation
  • Serverless deployment
Pricing: $ (pay-per-use)

Get Started
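A minimal sketch with Replicate's official JavaScript client. The model slug and input fields are illustrative; every model on Replicate documents its own input schema.

// Replicate sketch: run a hosted open-source model, pay per use
import Replicate from "replicate"

const replicate = new Replicate()  // reads REPLICATE_API_TOKEN from the environment

// Model slug and input shape are illustrative; check the model's page for its inputs
const output = await replicate.run("stability-ai/sdxl", {
  input: { prompt: "a watercolor fox, studio lighting" }
})
console.log(output)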

AI Frameworks & Orchestration

LangChain

Framework for building LLM apps

v1.0 alpha released! New create_agent built on LangGraph runtime.

Best for: RAG, chatbots, complex LLM workflows

Latest Features (2025):
  • LangChain 1.0 alpha with unified agent
  • Built on LangGraph runtime
  • Python & JavaScript support
  • Enhanced observability in LangSmith
  • Open Agent Platform (no-code builder)
Documentation
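In TypeScript, the core building block is a chat model you can invoke directly or compose into chains; a minimal sketch follows. Package and option names reflect the current @langchain scoped layout, and the 1.0 create_agent API itself is not shown here, so double-check against the docs for your installed version.

// LangChain JS sketch: invoke a chat model directly
import { ChatOpenAI } from "@langchain/openai"

const model = new ChatOpenAI({ model: "gpt-5-mini", temperature: 0 })

// .invoke() accepts a plain string or a list of messages
const reply = await model.invoke("Explain RAG in one sentence.")
console.log(reply.content)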

LangGraph

Build stateful AI agents (GA)

Production-ready platform for deploying long-running autonomous agents.

Best for: Complex agents, multi-step workflows, production deployments

Latest Features (2025):
  • LangGraph Platform (Generally Available)
  • Node caching & deferred nodes
  • Revision queueing for smooth deploys
  • Trace mode with LangSmith integration
  • Dynamic tool calling
  • Pre/Post model hooks
  • Built-in web search & RemoteMCP
  • Studio v2 (runs locally)
  • 1-click GitHub deployment
Documentation
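The core idea is a graph of nodes that read and update shared state. A rough sketch with the JavaScript package is below; the Annotation-based state API has shifted between releases, so treat the exact surface as an assumption and confirm it against the LangGraph JS docs.

// LangGraph JS sketch: a two-node stateful graph (API surface may differ by version)
import { StateGraph, Annotation, START, END } from "@langchain/langgraph"

// Shared state carried between nodes
const State = Annotation.Root({
  question: Annotation<string>(),
  answer: Annotation<string>(),
})

const graph = new StateGraph(State)
  .addNode("research", async (state) => ({ answer: `notes about ${state.question}` }))
  .addNode("summarize", async (state) => ({ answer: state.answer.toUpperCase() }))
  .addEdge(START, "research")
  .addEdge("research", "summarize")
  .addEdge("summarize", END)
  .compile()

console.log(await graph.invoke({ question: "vector databases" }))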

CrewAI

Multi-agent orchestration

Role-based agents collaborating on tasks. Hierarchical or sequential workflows.

Best for: Multi-agent systems, collaborative AI, task delegation
  • Multi-agent
  • Role-based
  • Collaborative workflows
  • Easy setup
Documentation

Vercel AI SDK

TypeScript AI framework

Streaming responses, function calling, React hooks. Built for Next.js.

Best for: Next.js apps, edge AI, streaming UIs, React integration
  • TypeScript-first
  • Streaming support
  • React hooks
  • Edge compatible
Documentation
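A minimal streaming sketch with the AI SDK: streamText and the @ai-sdk/openai provider are the documented entry points, though the model id here is illustrative.

// Vercel AI SDK sketch: stream a completion token by token
import { streamText } from "ai"
import { openai } from "@ai-sdk/openai"

const result = streamText({
  model: openai("gpt-5-mini"),          // illustrative model id
  prompt: "Write a haiku about edge functions.",
})

// Consume the stream as it arrives; in a Next.js route handler you would
// typically return result.toTextStreamResponse() instead
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}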

LlamaIndex

Data framework for LLMs

Data connectors, indexes, query engines. Optimized for RAG.

Best for: RAG applications, data ingestion, structured retrieval
  • Data connectors
  • RAG-optimized
  • Query engines
  • Indexing strategies
Documentation
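LlamaIndex also ships a TypeScript package (llamaindex). A rough RAG sketch is below; the class and method names match the TS docs as I recall them, but verify them against the current release.

// LlamaIndex.TS sketch: index a document and query it
import { Document, VectorStoreIndex } from "llamaindex"

const docs = [new Document({ text: "Groq uses LPUs for fast inference." })]

// Build an in-memory vector index and ask a question against it
const index = await VectorStoreIndex.fromDocuments(docs)
const queryEngine = index.asQueryEngine()

const answer = await queryEngine.query({ query: "What hardware does Groq use?" })
console.log(answer.toString())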

AutoGen

Multi-agent framework (Microsoft)

Conversable agents with human-in-the-loop and code execution.

Best for: Research, multi-agent collaboration, code generation
  • Multi-agent
  • Code execution
  • Human-in-loop
  • Microsoft Research
Documentation

Provider Comparison (2025)

Provider              | Speed     | Cost | Context | Best For
OpenAI GPT-5          | Fast      | $$   | 272K    | Coding, agents, production AI
Anthropic Sonnet 4.5  | Medium    | $$$  | 1M      | Best coding, autonomous agents
Google Gemini 2.5 Pro | Medium    | $$   | 1M-2M   | Reasoning, multimodal, cost-effective
Groq                  | ⚡ Fastest | $    | Varies  | Real-time, low latency
OpenRouter            | Fast      | $$   | Varies  | Multi-model flexibility
Together AI           | Fast      | $$   | Varies  | Open models, fine-tuning

Pricing Comparison (per 1M tokens - 2025)

Input Tokens:
  • GPT-5: $1.25
  • GPT-5 Mini: $0.25
  • GPT-5 Nano: $0.05
  • Claude Sonnet 4.5: $3.00
  • Claude Opus 4.1: $15.00
  • Gemini 2.5 Pro (standard): $1.25
  • Gemini 2.5 Pro (long context): $2.50
  • Gemini 2.0 Flash: $0.10
Output Tokens:
  • GPT-5: $10.00
  • GPT-5 Mini: $2.00
  • GPT-5 Nano: $0.40
  • Claude Sonnet 4.5: $15.00
  • Claude Opus 4.1: $75.00
  • Gemini 2.5 Pro (standard): $10.00
  • Gemini 2.5 Pro (long context): $15.00
  • Gemini 2.0 Flash: $0.40
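To turn these rates into a per-request estimate, multiply token counts by the price per million tokens. A small sketch using the GPT-5 numbers above:

// Cost estimate sketch using the per-1M-token rates listed above
function estimateCostUSD(inputTokens: number, outputTokens: number,
                         inputPerM: number, outputPerM: number): number {
  return (inputTokens / 1_000_000) * inputPerM + (outputTokens / 1_000_000) * outputPerM
}

// Example: a 3,000-token prompt with an 800-token reply on GPT-5 ($1.25 in / $10 out)
console.log(estimateCostUSD(3_000, 800, 1.25, 10.0))  // ≈ $0.01175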

Edge Functions vs Custom SDKs

  • Supabase Edge Functions
  • Provider SDKs
Deno-based serverless functions

Pros:
  • Direct database access
  • Global deployment
  • TypeScript/JavaScript
  • Built-in secrets management
  • Free tier included
Cons:
  • Deno runtime only
  • Supabase ecosystem
Best for: Supabase users, TypeScript developers, database-connected AI
// Supabase Edge Function example
// Deno.serve is built into the Supabase Edge runtime, so no HTTP server import is needed

Deno.serve(async (req) => {
  // Read the prompt from the request body
  const { prompt } = await req.json()

  // Forward it to the OpenAI Chat Completions API using a secret stored in Supabase
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [{ role: "user", content: prompt }]
    })
  })

  // Return the raw completion JSON to the caller
  return new Response(JSON.stringify(await response.json()), {
    headers: { "Content-Type": "application/json" }
  })
})

Memory & State Management

Zustand

Lightweight React state

Perfect for managing AI conversation state in React apps.
  • Minimal boilerplate
  • React integration
  • TypeScript support
  • Persistent storage
Documentation
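A minimal conversation-state store sketch with Zustand; the message shape and store name are illustrative.

// Zustand sketch: a tiny store for AI chat messages
import { create } from "zustand"

type Message = { role: "user" | "assistant"; content: string }

type ChatState = {
  messages: Message[]
  addMessage: (message: Message) => void
  clear: () => void
}

export const useChatStore = create<ChatState>((set) => ({
  messages: [],
  addMessage: (message) => set((state) => ({ messages: [...state.messages, message] })),
  clear: () => set({ messages: [] }),
}))

// In a component: const messages = useChatStore((s) => s.messages)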

LangChain Memory

Built-in conversation memory

Buffer, summary, entity, and vector store memory types.
  • Conversation history
  • Summary memory
  • Entity extraction
  • Vector store integration
Documentation

Upstash Redis

Serverless Redis for state

Perfect for storing conversation history and session data.
  • Serverless
  • Low latency
  • Global replication
  • REST API
Documentation
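Storing a conversation as a Redis list is one common pattern. A sketch with the @upstash/redis client; the key naming scheme is illustrative.

// Upstash Redis sketch: persist and reload a conversation history
import { Redis } from "@upstash/redis"

const redis = Redis.fromEnv()  // reads UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN

const key = "chat:session-123"  // illustrative key scheme

// Append a turn, then read the full history back in order
await redis.rpush(key, JSON.stringify({ role: "user", content: "Hi!" }))
const history = await redis.lrange(key, 0, -1)
console.log(history)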

Vercel KV

Edge-compatible key-value store

Built on Upstash, perfect for Next.js edge functions.
  • Edge-compatible
  • Low latency
  • Simple API
  • Vercel integration
Documentation

Decision Trees

Choosing a provider:
  • Need industry-leading performance? → OpenAI (GPT-5)
  • Long documents & safety-critical? → Anthropic (Claude)
  • Real-time, low-latency chatbots? → Groq
  • Want flexibility across providers? → OpenRouter
  • Open-source models & fine-tuning? → Together AI
  • Image/video generation? → Replicate
  • Cost-effectiveness & multimodal? → Google AI (Gemini)

Choosing a framework:
  • Building RAG applications? → LangChain + LlamaIndex
  • Complex multi-step agents? → LangGraph
  • Multi-agent collaboration? → CrewAI
  • Next.js with streaming UI? → Vercel AI SDK
  • Research & experimentation? → AutoGen
  • Simple chat integration? → Direct SDK (OpenAI, Anthropic)
Example Stacks

  1. Production RAG App: OpenAI (embeddings) + Pinecone (vectors) + LangChain (orchestration)
  2. Real-time Chatbot: Groq (inference) + Upstash Redis (state) + Vercel AI SDK (streaming)
  3. Cost-Optimized: OpenRouter (multi-model) + Chroma (vectors) + Supabase Edge Functions
  4. Enterprise AI: Anthropic Claude (safety) + Weaviate (hybrid search) + LangGraph (agents)