Grok-2 Vision
by xAI
Multimodal version of Grok-2 with image understanding capabilities. Processes diagrams, screenshots, and documents.
Specifications
- Context Window: 32,000 tokens
- Released: August 2024
Pricing
- Input: $2/MTok
- Output: $10/MTok
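As a rough illustration of how the listed rates turn into a per-request cost, here is a minimal sketch. It assumes "MTok" means one million tokens and that the caller already knows the token counts. The page does not say how an image is converted into input tokens, so the function takes counts directly. The function and constant names are hypothetical, not part of any xAI SDK.

```python
# Rates copied from the Pricing list; all names here are illustrative only.
INPUT_USD_PER_MTOK = 2.0    # $2 per million input tokens
OUTPUT_USD_PER_MTOK = 10.0  # $10 per million output tokens
TOKENS_PER_MTOK = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars from token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 20,000 input tokens and 1,000 output tokens.
    print(f"${estimate_cost_usd(20_000, 1_000):.4f}")
```

Because output tokens cost five times as much as input tokens at these rates, long generated answers can dominate the bill even when the prompt is much larger.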
Capabilities
- Vision
- Image understanding
- Document analysis
- Real-time knowledge
Related Models
Claude Sonnet 4
200K ctx, by Anthropic
The best combination of performance and speed for efficient, high-throughput tasks. Excellent balance of intelligence and cost-effectiveness.
GPT-4o
128K ctx, by OpenAI
OpenAI's flagship multimodal model with advanced reasoning, vision, and audio capabilities. Fast and versatile for most tasks.
Gemini 2.0 Flash
1000K ctx, by Google
Google's latest multimodal model with native tool use, code execution, and agentic capabilities. Fast and efficient.
Gemini 1.5 Pro
2000K ctx, by Google
Mid-size multimodal model optimized for complex reasoning and long context tasks with up to 2M token context.