Large language models (LLMs) are large neural networks (usually transformer-based) trained to take natural language as input and produce natural language as output. They generate text by probabilistic next-token prediction, and at sufficient scale they exhibit emergent reasoning behavior.
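To make "probabilistic next-token prediction" concrete, here is a minimal sketch of the sampling loop. It is not how any particular model is implemented: `toy_model`, its hard-coded logits, and the `<eos>` marker are hypothetical stand-ins for a trained transformer and its vocabulary. Only the softmax-then-sample step reflects the general technique.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over tokens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(context):
    """Hypothetical stand-in for a trained network.

    A real LLM scores every vocabulary token given the whole context;
    this toy only looks at the last token and uses made-up logits.
    """
    table = {
        "the": {"cat": 2.0, "dog": 1.5, "<eos>": -1.0},
        "cat": {"sat": 2.5, "ran": 1.0, "<eos>": 0.0},
        "dog": {"ran": 2.0, "sat": 1.0, "<eos>": 0.0},
        "sat": {"<eos>": 3.0, "the": 0.5},
        "ran": {"<eos>": 3.0, "the": 0.5},
    }
    return table.get(context[-1], {"<eos>": 0.0})


def sample_next_token(logits_by_token, temperature=1.0, rng=random):
    """Sample one token in proportion to its softmax probability."""
    tokens = list(logits_by_token)
    probs = softmax([logits_by_token[t] for t in tokens], temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


def generate(prompt, max_new_tokens=5, temperature=1.0, seed=0):
    """Autoregressive loop: predict, sample, append, repeat."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        nxt = sample_next_token(toy_model(tokens), temperature, rng)
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return tokens


if __name__ == "__main__":
    print(generate(["the"]))
```

Lowering `temperature` concentrates probability on the highest-scoring token, so output becomes more deterministic; raising it flattens the distribution and makes output more varied.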
| Model Name | Company | Year | Parameters (Est.) | Context Length (tokens) | Performance (MT-Bench / MMLU / HumanEval) | Multimodal |
|---|---|---|---|---|---|---|
| Claude 3 Opus | Anthropic | 2024 | Undisclosed | 200k | MT-Bench: 9.9 / MMLU: 86.8% / HumanEval: 84.9% | Yes |
| GPT-4 Turbo | OpenAI | 2023 | ~1.8T (MoE, unconfirmed) | 128k | MT-Bench: ~9.9 / MMLU: ~87% / HumanEval: ~83% | Yes |
| Gemini 1.5 Pro | Google DeepMind | 2024 | Undisclosed | 1M | MT-Bench: ~9.7 / MMLU: ~86% / HumanEval: ~80% | Yes |
| LLaMA 3 70B | Meta | 2024 | 70B | 8k | MT-Bench: ~8.9 / MMLU: 82.0% / HumanEval: 81.7% | No |
| Grok-1.5 | xAI | 2024 | Undisclosed (Grok-1: 314B, MoE) | 128k | MT-Bench: ~8.7 / MMLU: ~80% / HumanEval: ~72% | No |
| Mistral Large | Mistral | 2024 | Undisclosed | 32k | MT-Bench: 8.6 / MMLU: ~81% / HumanEval: ~69% | No |
| Mixtral 8x7B | Mistral | 2023 | ~46.7B total, ~12.9B active per token (MoE) | 32k | MT-Bench: 8.3 / MMLU: 70.6% / HumanEval: 40.2% | No |
| Command R+ | Cohere | 2024 | 104B | 128k | MT-Bench: ~8.4 / MMLU: ~79% / HumanEval: ~66% | No |
| Phi-3-mini (3.8B) | Microsoft | 2024 | 3.8B | 128k | MMLU: ~71% (no MT-Bench) | No |
| LLaMA 3 400B | Meta (internal) | 2024 | 400B | 128k (rumored) | Not benchmarked | Yes |
LLMs in different verticals
1. General-Purpose / Chat Assistants (Horizontal)
| Company | Product | Underlying Model | Primary Use | Differentiation |
|---|---|---|---|---|
| OpenAI | ChatGPT | GPT-4.1 / GPT-4o | General chat, reasoning | Best ecosystem + tool use |
| Anthropic | Claude | Claude 3.x | Long-context chat | Strong safety + context window |
| Google | Gemini | Gemini 1.5 | Multimodal assistant | Native search + YouTube |
| Microsoft | Copilot | GPT-4 (Azure) | Enterprise chat | Deep M365 integration |
| Meta | Meta AI | LLaMA 3 | Consumer chat | Distribution via WhatsApp |
2. Coding / Software Development
| Company | Product | Model | Target User | Notes |
|---|---|---|---|---|
| OpenAI | Codex | Codex / GPT-4 | Developers | Code-native reasoning |
| GitHub | Copilot | GPT-4 | Developers | IDE-embedded, massive adoption |
| Anthropic | Claude Code | Claude 3 | Backend / infra | Strong refactoring |
| Google | Gemini Code Assist | Gemini | Enterprise devs | GCP + IDEs |
| Replit | Replit AI | GPT-4 / Claude | Solo builders | End-to-end app build |
3. Enterprise Productivity / Knowledge Work
| Company | Product | Vertical | Differentiation |
|---|---|---|---|
| Microsoft | Copilot for M365 | Docs, Excel, Email | Deep workflow lock-in |
| Google | Gemini for Workspace | Docs, Sheets | Search + data leverage |
| OpenAI | GPTs (Custom) | Internal tools | Low-code AI apps |
| Notion | Notion AI | Knowledge mgmt | Context-aware writing |
4. Medical / Healthcare (Regulated & Semi-Regulated)