beam

Inference & Serving

Tools tracked: 36
Accelerating: 0
Dying: 2
Avg velocity: 0.04/10
Accelerating (0)

No accelerating tools right now.

Stable (10)

| Tool | Description | Velocity | Trend 30d | Δ 7d | Stars | Class |
|------|-------------|----------|-----------|------|-------|-------|
| ray-project/ray | Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI libraries for accelerating ML workloads. | 2.25 | ↑ | +80 | 42k | Stable |
| GeeeekExplorer/nano-vllm | Nano vLLM | 1.17 | ↑ | +109 | 13k | Stable |
| OpenRLHF/OpenRLHF | An easy-to-use, scalable, high-performance agentic RL framework based on Ray (PPO, DAPO, REINFORCE++, VLM, TIS, vLLM, async RL) | 0.92 | ↑ | +34 | 9.5k | Stable |
| bentoml/BentoML | The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more. | 0.90 | ↑ | +18 | 8.6k | Stable |
| xorbitsai/inference | Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop, all through one unified, production-ready inference API. | 0.65 | ↑ | +12 | 9.3k | Stable |
| NVIDIA/kvpress | LLM KV cache compression made easy | 0.64 | ↑ | +9 | 1.1k | Stable |
| Lightning-AI/litgpt | 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale. | 0.50 | ↑ | +10 | 13k | Stable |
| stas00/ml-engineering | Machine Learning Engineering Open Book | 0.29 | ↑ | +47 | 18k | Stable |
| SafeAILab/EAGLE | Official implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25). | 0.21 | ↑ | +14 | 2.3k | Stable |
| beam-cloud/beta9 | Ultrafast serverless GPU inference, sandboxes, and background jobs | 0.16 | ↑ | +8 | 1.6k | Stable |
Stalling and dying (5)

| Tool | Description | Velocity | Trend 30d | Δ 7d | Stars | Class |
|------|-------------|----------|-----------|------|-------|-------|
| jd-opensource/xllm | A high-performance inference engine for LLM, VLM, DiT, and REC models, optimized for diverse AI accelerators. | 0.92 | ↑ | +7 | 1.3k | Stalling |
| Tiiny-AI/PowerInfer | High-speed large language model serving for local deployment | 0.10 | ↑ | +28 | 9.4k | Dying |
| adithya-s-k/AI-Engineering.academy | Mastering Applied AI, One Concept at a Time | 0.09 | ↑ | +4 | 2.2k | Stalling |
| meta-llama/llama-cookbook | Welcome to the Llama Cookbook! Your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG, plus end-to-end examples using the Llama model family across various provider services. | 0.06 | ↑ | +18 | 18k | Dying |
| RunanywhereAI/runanywhere-sdks | Production-ready toolkit to run AI locally | 0.05 | ↑ | +1 | 10k | Stalling |
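The header stats can be cross-checked against the class labels in the rows above. A minimal sketch using only the 15 tools listed on this page (the niche tracks 36 in total, so the remaining rows are not shown here):

```python
from collections import Counter

# Class labels for the 15 tools listed on this page; the niche tracks 36
# tools overall, but the other rows are not visible in this view.
classes = (
    ["Stable"] * 10                      # the ten rows in the Stable table
    + ["Stalling", "Dying", "Stalling",  # xllm, PowerInfer, AI-Engineering.academy
       "Dying", "Stalling"]              # llama-cookbook, runanywhere-sdks
)

counts = Counter(classes)
print(counts["Stable"])   # 10
print(counts["Dying"])    # 2 -- matches the Dying count in the niche stats
```

The two Dying rows (PowerInfer and llama-cookbook) account for the full "Dying: 2" figure in the summary, which suggests every dying tool in the niche is shown in this table.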