Foundation models are large AI models trained on broad data that can be adapted to many downstream tasks. The term encompasses LLMs, vision models, and multimodal systems.
Commercial Models
Anthropic Claude
| Model | Context | Strengths |
|---|---|---|
| Claude 4.5 Sonnet | 200K (1M beta) | Best for coding, agentic tasks |
| Claude 4.5 Opus | 200K | Premium intelligence, deep reasoning |
| Claude 4.5 Haiku | 200K | Fastest, near-frontier intelligence |
Extended thinking for complex reasoning
64K max output tokens
Strong instruction following and tool use
Constitutional AI training
API Documentation
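As a concrete illustration of extended thinking, the following sketch builds a Messages API request payload with a thinking budget. It constructs the payload only and makes no HTTP call; the model id and the exact shape of the `thinking` parameter are assumptions based on current documentation, so verify them against the API reference.

```python
# Sketch of an Anthropic Messages API payload with extended thinking.
# The model id and "thinking" parameter shape are assumptions -- check
# the official API reference before relying on them.

def build_claude_request(prompt: str, thinking_budget: int = 8000) -> dict:
    """Build a Messages API payload; does not perform the HTTP call."""
    return {
        "model": "claude-sonnet-4-5",       # assumed model id
        "max_tokens": 16000,                # must exceed the thinking budget
        "thinking": {                       # extended thinking (assumed shape)
            "type": "enabled",
            "budget_tokens": thinking_budget,
        },
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_claude_request("Plan a refactor of a 50k-line codebase.")
```

The payload would normally be passed to the provider's SDK or POSTed to the Messages endpoint with an API key.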
OpenAI GPT
| Model | Context | Strengths |
|---|---|---|
| GPT-5.2 | 128K+ | Best for coding, agentic tasks |
| GPT-5 mini | 128K | Faster, cost-efficient |
| GPT-5 nano | 128K | Fastest, most cost-efficient |
| GPT-4.1 | 128K | Smartest non-reasoning model |
| o3 / o4-mini | 128K | Deep reasoning, STEM |
Open-weight models available (gpt-oss-120b, gpt-oss-20b)
Sora 2 for video generation
Strong function calling and tool use
API Documentation
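To make "function calling and tool use" concrete, here is a sketch of a tool definition in the JSON-schema shape used by OpenAI-style chat APIs. The weather function itself is hypothetical; only the surrounding structure follows the `tools` parameter convention.

```python
# Sketch of an OpenAI-style tool (function-calling) definition.
# The get_weather function is hypothetical; the schema shape follows
# the Chat Completions "tools" parameter.

get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",              # hypothetical tool name
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}
```

The model reads the schema, decides when to call the tool, and returns the arguments as JSON; your code executes the function and feeds the result back.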
Google Gemini
| Model | Context | Strengths |
|---|---|---|
| Gemini 3 Pro | 1M+ | Most intelligent, complex tasks |
| Gemini 3 Flash | 1M | Frontier intelligence at speed |
| Gemini 2.5 Flash-Lite | 1M | High volume, cost efficient |
State-of-the-art reasoning and multimodal
Extended thinking capabilities
Strong agentic and coding performance
API Documentation
Others
Cohere Command — Enterprise focus, RAG-optimised
Amazon Nova — AWS Bedrock integration
xAI Grok — Strong reasoning, real-time data
Open-Source Models
Meta Llama

| Model | Parameters | Context | Notes |
|---|---|---|---|
| Llama 4 Scout | 17B (16 experts) | 128K | Multimodal MoE |
| Llama 4 Maverick | 17B (128 experts) | 128K | Larger expert pool |
| Llama 3.3 70B | 70B | 128K | Text-only instruct |
Mixture-of-experts architecture
Native multimodal (text + images)
Open weights under a community license (with restrictions)
Llama Downloads
Mistral
| Model | Parameters | Context | Notes |
|---|---|---|---|
| Mistral Large | ~100B | 128K | Commercial flagship |
| Mixtral 8x22B | MoE | 64K | Mixture of experts |
| Mistral OCR 3 | — | — | Document processing |
European AI company
Strong efficiency/performance ratio
Le Chat consumer product
Mistral AI
Qwen (Alibaba)
| Model | Parameters | Context | Notes |
|---|---|---|---|
| Qwen3 | Various | 128K | Strong all-round |
| Qwen3-VL | Various | 128K | Vision-language |
| Qwen3-TTS | Various | — | Text-to-speech |
Strong multilingual (especially Chinese)
Extensive model family (400+ variants)
Embedding, reranking, and omni models
Qwen
DeepSeek
| Model | Parameters | Notes |
|---|---|---|
| DeepSeek-V3.2 | 685B MoE | Latest flagship |
| DeepSeek R1 | Various | Strong reasoning |
| DeepSeek Coder | Various | Code-specialised |
Competitive with frontier models
Cost-efficient training and inference
Open weights available
DeepSeek
Others
Yi (01.AI) — Strong multilingual
Phi (Microsoft) — Small but capable
Gemma 3 (Google) — Open weights, research-friendly
OLMo (AI2) — Fully open including training data
Grok (xAI) — Available via API
Model Comparison Factors
Capability Benchmarks
See Evaluation & Benchmarking for details.
MMLU — Broad knowledge
HumanEval — Coding
GSM8K — Math reasoning
GPQA — Graduate-level science
Practical Considerations
| Factor | Considerations |
|---|---|
| Latency | Time to first token, tokens/second |
| Cost | Per-token pricing, volume discounts |
| Context | How much text can be processed |
| Reliability | Uptime, consistency |
| Privacy | Data handling, compliance |
| Ecosystem | SDKs, documentation, support |
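Per-token pricing is easy to reason about with a small helper: multiply token counts by the per-million-token rate for each direction. The prices below are placeholders, not real rates; substitute the provider's current pricing.

```python
# Estimate request cost from token counts and per-million-token prices.
# The example prices are hypothetical placeholders.

def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars for one request at per-million-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# 10k input / 2k output tokens at a hypothetical $3 / $15 per million:
cost = estimate_cost(10_000, 2_000, 3.0, 15.0)   # 0.03 + 0.03 = $0.06
```

Note that output tokens are typically priced several times higher than input tokens, so verbose responses dominate cost for short prompts.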
License Types
Proprietary API — No access to weights (GPT-5, Claude)
Gated open — Weights available with restrictions (Llama 4)
Permissive open — Few restrictions (Mistral, Qwen, DeepSeek)
Fully open — Weights, code, and training data (OLMo)
API Providers
Model Providers
Direct from the source: Anthropic, OpenAI, Google, Mistral AI, DeepSeek, and the other labs covered above.
Aggregators / Routers
Access multiple models through one API: OpenRouter, Amazon Bedrock, Google Vertex AI, Azure AI Foundry, Together AI.
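Most routers expose an OpenAI-compatible endpoint, so switching providers is largely a base-URL and model-name change. The sketch below builds the URL and request body only (no network call); the router URL and model string are illustrative, not verified values.

```python
# Sketch of a request to an OpenAI-compatible router endpoint.
# The base URL and model name are hypothetical placeholders.

def build_router_request(model: str, prompt: str) -> tuple[str, dict]:
    """Return (endpoint URL, JSON body) for a chat completion request."""
    base_url = "https://router.example.com/v1"   # hypothetical router URL
    payload = {
        "model": model,                          # e.g. "vendor/model-name"
        "messages": [{"role": "user", "content": prompt}],
    }
    return f"{base_url}/chat/completions", payload

url, body = build_router_request("vendor/model-name", "Hello")
```

Because the request shape is shared, the same client code can be pointed at different routers or providers by changing the base URL and credentials.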
Choosing a Model
Decision Framework
Task requirements — What capability is most important?
Latency needs — Real-time vs batch processing
Cost constraints — Budget per million tokens
Privacy requirements — Can data leave your environment?
Context needs — How much text per request?
Compliance — Regulatory requirements
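The framework above can be encoded as a toy routing function: check the hard constraints first (privacy), then capability needs, then cost/volume. The tier names are placeholders for whatever models you shortlist.

```python
# Toy encoding of the decision framework. Tier names are placeholders,
# not recommendations for specific models.

def pick_tier(privacy_sensitive: bool, high_volume: bool,
              needs_deep_reasoning: bool) -> str:
    """Map coarse requirements to a model tier; hard constraints first."""
    if privacy_sensitive:
        return "self-hosted open model"   # data never leaves your environment
    if needs_deep_reasoning:
        return "frontier model"           # capability first, then optimise
    if high_volume:
        return "small/fast variant"       # mini / nano / Flash tier
    return "mid-tier general model"
```

Ordering matters: privacy and compliance are eliminating constraints, while cost and latency are optimisation targets you tune after the task works.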
Rules of Thumb
Start with a capable model (Claude 4.5 Sonnet, GPT-5.2, Gemini 3 Pro)
Optimise for cost/speed once it works (mini/nano/Flash variants)
Open models for privacy-sensitive use cases (Llama 4, Qwen3, DeepSeek)
Smaller models for high-volume, simple tasks
Staying Current
The landscape changes rapidly. Track developments via provider release notes, public leaderboards (e.g. Chatbot Arena), and the Hugging Face model hub.
Resources