EyesInAI
AI Benchmark Intelligence
deepinfra
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
Last tested Jun 11, 2026
Overall pass: 71%
Avg latency: 5475 ms
Context: 1000k
Tools: Yes
Input $/1M: $0.15
Output $/1M: $0.60
Tests run: 7
Passed: 5/7
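The per-million-token prices listed above can be turned into a rough per-request cost. The following is a minimal sketch, not something taken from the page: the function name `estimate_cost_usd` and the example token counts are hypothetical, and it assumes billing is strictly linear in input and output tokens with no minimums or discounts.

```python
# Prices copied from the listing above (USD per one million tokens).
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 0.60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost, assuming purely linear per-token pricing."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
print(f"${estimate_cost_usd(2_000, 500):.4f}")
```

Because output tokens are priced at four times the input rate, the 500 completion tokens in this example cost as much as the 2,000 prompt tokens.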
Test results
Ping: Latency & availability, single-word reply. Result: 964 ms
Reasoning: Basic math reasoning, show work, give answer. Result: 3297 ms
JSON Output: Structured output compliance, valid JSON with required keys. Result: 2504 ms
Code Gen: Python function generation with docstring. Result: 7143 ms
Throughput: Token generation speed, 500-token long-form response. Result: fail
Context Recall: Retrieval from in-context data, 20-item list Q&A. Result: 3218 ms
Tool Use: Function/tool calling, get_weather invocation. Result: fail
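The summary figures can be cross-checked against the individual test results. The sketch below recomputes the pass rate and the mean latency of the passing tests from the values listed above; the dictionary layout and the use of `None` for a failed test are illustrative choices, not the site's data format.

```python
# Test outcomes as listed on the page; None marks a test reported as "fail"
# (the page shows no timing for failed tests).
results = {
    "Ping": 964,
    "Reasoning": 3297,
    "JSON Output": 2504,
    "Code Gen": 7143,
    "Throughput": None,
    "Context Recall": 3218,
    "Tool Use": None,
}

# Latencies (ms) of the tests that passed.
passed = [ms for ms in results.values() if ms is not None]

pass_rate = round(100 * len(passed) / len(results))
mean_passed_ms = sum(passed) / len(passed)

print(f"{len(passed)}/{len(results)} passed ({pass_rate}%)")
print(f"mean latency of passing tests: {mean_passed_ms:.1f} ms")
```

This reproduces the 5/7 and 71% figures. The mean over the five passing tests comes to 3425.2 ms, which is lower than the 5475 ms average shown in the summary; presumably the summary also counts the failed runs, whose timings are not displayed, but that is an assumption.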