Major LLM Providers
- OpenAI
- Anthropic
- Google AI
- Cohere
OpenAI
GPT-5, GPT-5 Mini, GPT-5 Nano, o3, DALL-E, Whisper
Industry leader in large language models. GPT-5 for coding & agents, o3 for complex reasoning.
Best for: Production AI apps, coding, agentic tasks, embeddings
Models (2025):
- GPT-5 - State-of-the-art coding & agents (74.9% SWE-bench, 88% Aider)
- GPT-5 Mini - Balanced performance & cost
- GPT-5 Nano - Ultra-fast & affordable
- GPT-5 Pro - Highest quality with scaled reasoning
- o3 - Advanced reasoning model
- DALL-E 3 - Image generation
- Whisper - Speech-to-text
Pricing (per 1M tokens):
- GPT-5: $1.25 input / $10.00 output
- GPT-5 Mini: $0.25 input / $2.00 output
- GPT-5 Nano: $0.05 input / $0.40 output
- Prompt caching: $0.125/1M tokens
- 45% fewer factual errors with web search
- 80% fewer hallucinations vs o3
- 4 reasoning levels: minimal, low, medium, high
- Parallel & sequential tool calling
- Available to free users
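As a rough sketch of what a call to one of these models looks like, the request body below mirrors the OpenAI Chat Completions shape; the `reasoning_effort` values come from the four levels listed above, but treat field names and the `gpt-5` model string as assumptions to verify against current OpenAI docs:

```python
import json

# Minimal sketch: build a Chat Completions-style request body for GPT-5.
# The "reasoning_effort" values mirror the four levels listed above
# (minimal, low, medium, high); field names are illustrative, not verified.
def build_chat_request(prompt: str, effort: str = "medium") -> dict:
    allowed = {"minimal", "low", "medium", "high"}
    if effort not in allowed:
        raise ValueError(f"effort must be one of {sorted(allowed)}")
    return {
        "model": "gpt-5",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

# The dict would be serialized and POSTed with your HTTP client of choice.
body = json.dumps(build_chat_request("Summarize this repo", effort="low"))
```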
Fast Inference Providers
Groq
Ultra-fast LLM inference with LPUs. 500+ tokens/sec, among the lowest latency in the industry. Perfect for real-time applications.
Speed: Fastest (LPU technology)
Models: Llama 3.1, Mixtral, Gemma
Best for: Real-time AI, chatbots, low-latency requirements
Pricing: $ (generous free tier)
Together AI
Fast inference + fine-tuning
Open-source models with custom fine-tuning capabilities.
Best for: Custom models, fine-tuning, open-source LLMs
Features:
- Fine-tuning support
- Open-source models
- Competitive pricing
- Fast inference
Fireworks AI
Production-grade inference
Fast inference for open models with function calling support.
Best for: Serverless AI, function calling, production scale
Features:
- Serverless deployment
- Function calling
- Open-source models
- Fast performance
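Function calling, regardless of provider, boils down to: the model returns a tool name plus JSON-encoded arguments, and your code dispatches to the matching function. A provider-agnostic sketch (the tool name and the response shape here are invented for illustration, not any provider's actual schema):

```python
import json

# Registry of callable tools the model is allowed to invoke.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch_tool_call(tool_call: dict) -> str:
    """tool_call is the model's output: {"name": ..., "arguments": "<json string>"}."""
    fn = TOOLS[tool_call["name"]]          # look up the registered tool
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    return fn(**args)
```

The result is then sent back to the model as a tool message so it can finish its answer.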
Aggregators & Multi-Model Platforms
OpenRouter
Unified API for 300+ models
Access OpenAI, Anthropic, Google, Groq, Meta through one API with automatic fallbacks.
Best for: Multi-model apps, cost optimization, avoiding vendor lock-in
Features:
- 300+ models
- Automatic fallbacks
- Cost optimization
- Single API
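The automatic-fallback idea can be sketched in plain Python: try each model in preference order and move on when a call fails. The `call_model` callable below is a stand-in for whatever client you actually use, not OpenRouter's API:

```python
# Sketch of OpenRouter-style fallback: try models in order until one succeeds.
def complete_with_fallback(call_model, prompt, models):
    errors = {}
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # a real client would catch specific error types
            errors[model] = exc
    raise RuntimeError(f"all models failed: {errors}")
```

OpenRouter does this server-side; the sketch just shows the control flow you would otherwise write yourself.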
Replicate
Run open-source models
Stable Diffusion, LLaMA, Whisper, and more. Pay-per-use serverless.
Best for: Open-source models, image/video generation, experimentation
Features:
- Open-source models
- Image generation
- Video generation
- Serverless deployment
AI Frameworks & Orchestration
LangChain
Framework for building LLM apps
v1.0 alpha released! New create_agent built on LangGraph runtime.
Best for: RAG, chatbots, complex LLM workflows
Latest Features (2025):
- LangChain 1.0 alpha with unified agent
- Built on LangGraph runtime
- Python & JavaScript support
- Enhanced observability in LangSmith
- Open Agent Platform (no-code builder)
LangGraph
Build stateful AI agents (GA)
Production-ready platform for deploying long-running autonomous agents.
Best for: Complex agents, multi-step workflows, production deployments
Latest Features (2025):
- LangGraph Platform (Generally Available)
- Node caching & deferred nodes
- Revision queueing for smooth deploys
- Trace mode with LangSmith integration
- Dynamic tool calling
- Pre/Post model hooks
- Built-in web search & RemoteMCP
- Studio v2 (runs locally)
- 1-click GitHub deployment
CrewAI
Multi-agent orchestration
Role-based agents collaborating on tasks. Hierarchical or sequential workflows.
Best for: Multi-agent systems, collaborative AI, task delegation
- Multi-agent
- Role-based
- Collaborative workflows
- Easy setup
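The sequential-workflow idea needs no framework to demonstrate: each role-named agent transforms the output of the previous one. The agent functions below are placeholders, not CrewAI's API:

```python
from typing import Callable

# Sequential multi-agent pipeline: each "agent" is a (role, function) pair,
# and each one receives the previous agent's output as its input.
def run_crew(task: str, agents: list[tuple[str, Callable[[str], str]]]) -> str:
    result = task
    for role, agent in agents:
        result = agent(result)
    return result

# Toy crew: in a real system each lambda would be an LLM call with a role prompt.
crew = [
    ("researcher", lambda t: t + " | findings gathered"),
    ("writer", lambda t: t + " | draft written"),
]
```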
Vercel AI SDK
TypeScript AI framework
Streaming responses, function calling, React hooks. Built for Next.js.
Best for: Next.js apps, edge AI, streaming UIs, React integration
- TypeScript-first
- Streaming support
- React hooks
- Edge compatible
LlamaIndex
Data framework for LLMs
Data connectors, indexes, query engines. Optimized for RAG.
Best for: RAG applications, data ingestion, structured retrieval
- Data connectors
- RAG-optimized
- Query engines
- Indexing strategies
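The index-then-query-engine idea behind these RAG frameworks can be illustrated in stdlib Python. This is not LlamaIndex's API; real systems score chunks with embeddings, while this toy version uses word overlap:

```python
# Toy illustration of RAG retrieval: score each document chunk by word
# overlap with the query and return the top-k chunks for the prompt.
def top_k_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]
```

The retrieved chunks would then be pasted into the LLM prompt as context; data connectors and indexing strategies exist to make the `chunks` list good.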
AutoGen
Multi-agent framework (Microsoft)
Conversable agents with human-in-the-loop and code execution.
Best for: Research, multi-agent collaboration, code generation
- Multi-agent
- Code execution
- Human-in-loop
- Microsoft Research
Provider Comparison (2025)
| Provider | Speed | Cost | Context | Best For |
|---|---|---|---|---|
| OpenAI GPT-5 | Fast | $$ | 272K | Coding, agents, production AI |
| Anthropic Sonnet 4.5 | Medium | $$$ | 1M | Best coding, autonomous agents |
| Google Gemini 2.5 Pro | Medium | $$ | 1M-2M | Reasoning, multimodal, cost-effective |
| Groq | ⚡ Fastest | $ | Varies | Real-time, low latency |
| OpenRouter | Fast | $$ | Varies | Multi-model flexibility |
| Together AI | Fast | $$ | Varies | Open models, fine-tuning |
Pricing Comparison (per 1M tokens - 2025)
Input Tokens:
- GPT-5: $1.25
- GPT-5 Mini: $0.25
- GPT-5 Nano: $0.05
- Claude Sonnet 4.5: $3.00
- Claude Opus 4.1: $15.00
- Gemini 2.5 Pro (standard): $1.25
- Gemini 2.5 Pro (long context): $2.50
- Gemini 2.0 Flash: $0.10
Output Tokens:
- GPT-5: $10.00
- GPT-5 Mini: $2.00
- GPT-5 Nano: $0.40
- Claude Sonnet 4.5: $15.00
- Claude Opus 4.1: $75.00
- Gemini 2.5 Pro (standard): $10.00
- Gemini 2.5 Pro (long context): $15.00
- Gemini 2.0 Flash: $0.40
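Using the numbers above, blended cost for a workload is just `input_tokens × input_rate + output_tokens × output_rate`, with rates quoted per 1M tokens. A quick calculator (prices hardcoded from this 2025 table; check current pricing pages before relying on them):

```python
# (input $/1M tokens, output $/1M tokens), from the table above.
PRICES = {
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
    "claude-sonnet-4.5": (3.00, 15.00),
    "gemini-2.0-flash": (0.10, 0.40),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# e.g. a 10K-in / 2K-out GPT-5 request:
print(round(cost_usd("gpt-5", 10_000, 2_000), 4))  # → 0.0325
```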
Edge Functions vs Custom SDKs
Supabase Edge Functions
Deno-based serverless functions.
Pros:
- Direct database access
- Global deployment
- TypeScript/JavaScript
- Built-in secrets management
- Free tier included
Cons:
- Deno runtime only
- Supabase ecosystem lock-in
Memory & State Management
Zustand
Lightweight React state
Perfect for managing AI conversation state in React apps.
- Minimal boilerplate
- React integration
- TypeScript support
- Persistent storage
LangChain Memory
Built-in conversation memory
Buffer, summary, entity, and vector store memory types.
- Conversation history
- Summary memory
- Entity extraction
- Vector store integration
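Buffer memory is the simplest of the types above: keep the last N messages and drop the rest. A stdlib sketch of that idea (not LangChain's actual classes):

```python
from collections import deque

# Sliding-window conversation buffer: keeps only the most recent messages,
# which is roughly what "buffer memory" means in these frameworks.
class BufferMemory:
    def __init__(self, max_messages: int = 10):
        self.messages = deque(maxlen=max_messages)  # old entries fall off the left

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def context(self) -> list[dict]:
        """Messages to prepend to the next LLM request."""
        return list(self.messages)
```

Summary and entity memory trade this simplicity for compression: instead of dropping old turns, they ask an LLM to condense them.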
Upstash Redis
Serverless Redis for state
Perfect for storing conversation history and session data.
- Serverless
- Low latency
- Global replication
- REST API
Vercel KV
Edge-compatible key-value store
Built on Upstash, perfect for Next.js edge functions.
- Edge-compatible
- Low latency
- Simple API
- Vercel integration
Decision Trees
Choosing an LLM Provider
- Need industry-leading performance? → OpenAI (GPT-5)
- Long documents & safety-critical? → Anthropic (Claude)
- Real-time, low-latency chatbots? → Groq
- Want flexibility across providers? → OpenRouter
- Open-source models & fine-tuning? → Together AI
- Image/video generation? → Replicate
- Cost-effectiveness & multimodal? → Google AI (Gemini)
Choosing an AI Framework
- Building RAG applications? → LangChain + LlamaIndex
- Complex multi-step agents? → LangGraph
- Multi-agent collaboration? → CrewAI
- Next.js with streaming UI? → Vercel AI SDK
- Research & experimentation? → AutoGen
- Simple chat integration? → Direct SDK (OpenAI, Anthropic)
Recommended Combinations
1. Production RAG App: OpenAI (embeddings) + Pinecone (vectors) + LangChain (orchestration)
2. Real-time Chatbot: Groq (inference) + Upstash Redis (state) + Vercel AI SDK (streaming)
3. Cost-Optimized: OpenRouter (multi-model) + Chroma (vectors) + Supabase Edge Functions
4. Enterprise AI: Anthropic Claude (safety) + Weaviate (hybrid search) + LangGraph (agents)

