beam

Eval & Benchmark

[TOOLS_TRACKED] 11
[ACCELERATING] 0
[DYING] 2
[AVG_VELOCITY] 0.00/10
[ACCELERATING]

Accelerating

0

No accelerating tools right now.

[STABLE]

Stable

1
Tool · Velocity · Trend 30d · Δ 7d · Stars · Class
  • EvolvingLMMs-Lab/lmms-eval: One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
    Velocity 1.24 · Δ 7d ↑ +22 · Stars 4.1k · Class Stable
[STALLING]

Stalling and dying

3
Tool · Velocity · Trend 30d · Δ 7d · Stars · Class
  • open-compass/VLMEvalKit: Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
    Velocity 0.94 · Δ 7d ↑ +9 · Stars 4.1k · Class Stalling
  • evalplus/evalplus: Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
    Velocity 0.02 · Δ 7d ↑ +7 · Stars 1.7k · Class Dying
  • huggingface/evaluation-guidebook: Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
    Velocity 0.02 · Δ 7d ↑ +7 · Stars 2.1k · Class Dying